Evaluating the Accuracy of Automated Transcription using Open AI's Whisper Model for Swedish IT-Related Interviews: A Comparative Analysis with Manual Transcription
Dalarna University, School of Information and Engineering.
2024 (English). Independent thesis, Advanced level (degree of Master (One Year)), 10 credits / 15 HE credits. Student thesis.
Abstract [en]

This study investigates the accuracy of OpenAI's Whisper model as an automated transcription tool for Swedish-language interviews in a specialized IT domain. The research compares Whisper-generated transcripts against manual transcriptions, using Word Error Rate (WER) as the primary accuracy metric. Five interviews conducted with IT professionals at Trafikverket, the Swedish Transport Administration, serve as the basis for this analysis. The study reveals variations in transcription accuracy, with WER ranging from levels close to human performance to 5-6 times worse. A qualitative examination of the types of transcription errors, including substitutions, insertions, and deletions, provides deeper insight into the model's strengths and limitations. The findings suggest that while Whisper shows potential as a time-saving tool, its performance varies considerably. This variability highlights the importance of ongoing research to better understand and improve its reliability, particularly for smaller languages like Swedish. While these tools can be integrated into qualitative research, it is crucial to be mindful of their current limitations.
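As an illustration of the metric named in the abstract, WER is conventionally defined as (substitutions + deletions + insertions) divided by the number of words in the reference (manual) transcript. The sketch below computes it with a word-level edit distance. It is not the thesis's own implementation: the function name `word_error_rate`, the `WerResult` type, the lowercasing and whitespace tokenization, and the example sentences are all assumptions made for illustration, and the record does not state which text normalization the study applied.

```python
from typing import NamedTuple


class WerResult(NamedTuple):
    """Error counts from aligning a hypothesis against a reference."""
    substitutions: int
    deletions: int
    insertions: int
    reference_length: int

    @property
    def wer(self) -> float:
        errors = self.substitutions + self.deletions + self.insertions
        return errors / self.reference_length


def word_error_rate(reference: str, hypothesis: str) -> WerResult:
    """Word-level Levenshtein alignment (hypothetical helper).

    Assumption: words are lowercased and split on whitespace only;
    punctuation handling is left out of this sketch.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # Each cell holds (total_errors, substitutions, deletions, insertions)
    # for aligning the first i reference words with the first j hypothesis words.
    prev = [(j, 0, 0, j) for j in range(len(hyp) + 1)]
    for i in range(1, len(ref) + 1):
        curr = [(i, 0, i, 0)]
        for j in range(1, len(hyp) + 1):
            if ref[i - 1] == hyp[j - 1]:
                best = prev[j - 1]  # words match, no new error
            else:
                t, s, d, n = prev[j - 1]
                substitution = (t + 1, s + 1, d, n)
                t, s, d, n = prev[j]
                deletion = (t + 1, s, d + 1, n)
                t, s, d, n = curr[j - 1]
                insertion = (t + 1, s, d, n + 1)
                best = min(substitution, deletion, insertion)
            curr.append(best)
        prev = curr

    _, subs, dels, ins = prev[-1]
    return WerResult(subs, dels, ins, len(ref))


# Invented example sentences, not taken from the thesis data.
manual = "vi använder ett nytt system för ärendehantering"
automatic = "vi använder nytt system för ärende hantering"
result = word_error_rate(manual, automatic)
print(result, round(result.wer, 3))
```

In this made-up pair, a dropped word counts as a deletion and a compound split into two words counts as one substitution plus one insertion, which shows how a single compound-word error can weigh twice in the score.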

Place, publisher, year, edition, pages
2024.
Keywords [en]
Automated Transcription, OpenAI Whisper Model, Word Error Rate (WER), Swedish, Audio-to-Text Conversion, Domain-Specific Language.
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:du-49741
OAI: oai:DiVA.org:du-49741
DiVA, id: diva2:1918314
Subject / course
Microdata Analysis
Available from: 2024-12-04 Created: 2024-12-04 Last updated: 2025-10-09

Open Access in DiVA

fulltext (673 kB), 539 downloads
File information
File name: FULLTEXT01.pdf
File size: 673 kB
Checksum (SHA-512): 58a41a4d78fa038d2063373889044cb89aa1425cc7f3851bf60a74b72743200b532b3d93330f8bac21f38afebe979390b7f70d83d1f66855572a68ad459fe1ed
Type: fulltext
Mimetype: application/pdf
