Evaluating Machine Translations of Company Reports for Grammatical and Lexical Accuracy
Dalarna University, School of Information and Engineering, Microdata Analysis.
2024 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
Abstract [en]

This thesis addresses a gap in machine translation evaluation metrics, particularly for translations of complex company reports. Traditional metrics each have a particular focus: NIST on n-gram informativeness, METEOR on precision, recall, and matching techniques, and ChrF++ on lexical accuracy through character- and word-level n-grams. However, they fail to capture the full scope of grammatical accuracy and lexical cohesion. The aim of this thesis is to evaluate machine translations of company reports for grammatical and lexical accuracy. To achieve this, a new combined score that incorporates grammatical and lexical features was introduced. Using data from the GRI’s Sustainability Disclosure Dataset, the thesis analyzed translations of reports in English, French, Chinese, and Japanese. The findings indicate that while traditional metrics offer valuable insights, they lack comprehensive linguistic evaluative capacity. The thesis concludes that enhancing traditional metrics to include deeper linguistic elements is essential for improving translation quality. Future research should explore advanced models like BERT and GPT-4, incorporate human reviews, and expand cross-linguistic validation to enhance translation quality and reliability.

Place, publisher, year, edition, pages
2024.
Keywords [en]
Machine Translation (MT), Translation Evaluation, Neural Machine Translation (NMT), NIST, METEOR, ChrF++, Grammatical Metrics, Lexical Metrics, Entropy Weight, Ordinary Least Squares
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
URN: urn:nbn:se:du-48954
OAI: oai:DiVA.org:du-48954
DiVA, id: diva2:1881932
Subject / course
Microdata Analysis
Available from: 2024-07-04 Created: 2024-07-04

Open Access in DiVA

fulltext (941 kB), 154 downloads
File information
File name: FULLTEXT01.pdf
File size: 941 kB
Checksum (SHA-512): d052f06ec4d9fe29cbd7dd38e5ed92c7b6902f05b29b9fc320f875d97bb9f65028e5400e40e5ae06c62fa6e834519cfee5c6091a6cc972962f895225511d0335
Type: fulltext
Mimetype: application/pdf


Total: 154 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 475 hits