Evaluating Machine Translations of Company Reports for Grammatical and Lexical Accuracy
2024 (English) Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
Abstract [en]
This thesis addresses a gap in machine translation evaluation metrics, particularly for translating complex company reports. Traditional metrics each capture only part of translation quality: NIST focuses on n-gram informativeness; METEOR on precision, recall, and matching techniques; and ChrF++ on lexical accuracy through character- and word-level n-grams. None of them captures the full scope of grammatical accuracy and lexical cohesion. The aim of this thesis is to evaluate machine translations of company reports for grammatical and lexical accuracy. To achieve this, a new combined score was introduced that incorporates grammatical and lexical features. Using data from the GRI's Sustainability Disclosure Database, this thesis analyzed translations of reports in English, French, Chinese, and Japanese. The findings indicate that while traditional metrics offer valuable insights, they lack comprehensive linguistic evaluative capacity. The thesis concludes that extending traditional metrics to include deeper linguistic features is essential for improving translation quality assessment. Future research should explore advanced models such as BERT and GPT-4, incorporate human review, and expand cross-linguistic validation to enhance translation quality and reliability.
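The abstract does not give the exact formulation of the combined score, but the keywords name the entropy-weight method for weighting the component metrics. A minimal sketch, assuming metric scores are aggregated as an entropy-weighted sum (the metric names and score values below are hypothetical, for illustration only):

```python
import math

def entropy_weights(scores):
    """Entropy-weight method: given a matrix of positive scores
    (rows = translations, columns = metrics), metrics whose scores
    vary more across translations carry lower entropy and thus
    receive higher weight."""
    n = len(scores)        # number of translations
    m = len(scores[0])     # number of metrics
    k = 1.0 / math.log(n)
    raw = []
    for j in range(m):
        col = [scores[i][j] for i in range(n)]
        total = sum(col)
        p = [c / total for c in col]
        entropy = -k * sum(pi * math.log(pi) for pi in p if pi > 0)
        raw.append(1.0 - entropy)          # divergence of metric j
    s = sum(raw)
    return [w / s for w in raw]            # normalize to sum to 1

def combined_score(metric_row, weights):
    """Weighted sum of one translation's metric scores."""
    return sum(x * w for x, w in zip(metric_row, weights))

# Hypothetical NIST / METEOR / ChrF++ scores for three translations.
scores = [
    [0.62, 0.70, 0.55],
    [0.58, 0.65, 0.60],
    [0.66, 0.72, 0.52],
]
w = entropy_weights(scores)
combined = [combined_score(row, w) for row in scores]
```

The thesis also lists Ordinary Least Squares among its methods; an OLS regression could alternatively fit the component weights against reference quality judgments, rather than deriving them from score entropy alone.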
Place, publisher, year, edition, pages
2024.
Keywords [en]
Machine Translation (MT), Translation Evaluation, Neural Machine Translation (NMT), NIST, METEOR, ChrF++, Grammatical Metrics, Lexical Metrics, Entropy Weight, Ordinary Least Squares
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
URN: urn:nbn:se:du-48954
OAI: oai:DiVA.org:du-48954
DiVA, id: diva2:1881932
Subject / course
Microdata Analysis
2024-07-04