Improving Social Media Sentiment Analysis with Multi-Modal Data and Deep Learning
Dalarna University, School of Culture and Society.
2025 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 10 credits / 15 HE credits. Student thesis
Abstract [en]

Social media memes often combine images and text, which makes hateful content difficult to detect with traditional sentiment analysis methods. Sentiment analysis is a natural language processing task that determines the emotional tone (positive, negative, or neutral) of a body of text. This research addressed that challenge by evaluating and comparing unimodal and multimodal deep learning models on the Facebook Hateful Memes dataset. Three models were implemented: a text-only BERT model, an image-only ResNet-50 model, and a fused multimodal model (BERT + ResNet-50) that integrates both text and image features.
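The abstract does not say how the text and image features are combined. As a minimal sketch, assuming late fusion by concatenation followed by a single linear layer with a sigmoid, the idea can be illustrated as follows. The feature vectors are random stand-ins for real encoder outputs, and the function names `fuse` and `predict_hateful_probability` are hypothetical; only the dimensions (768 for BERT-base, 2048 for ResNet-50 pooled features) reflect the standard architectures.

```python
import math
import random

TEXT_DIM = 768    # pooled output size of BERT-base
IMAGE_DIM = 2048  # pooled feature size of ResNet-50


def fuse(text_features, image_features):
    """Late fusion by concatenation (an assumed strategy, not confirmed by the thesis)."""
    return list(text_features) + list(image_features)


def predict_hateful_probability(fused, weights, bias):
    """Apply a linear layer and a sigmoid to obtain P(hateful) for one meme."""
    logit = sum(w * x for w, x in zip(weights, fused)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


rng = random.Random(0)
# Random stand-ins for the encoder outputs and for untrained classifier weights.
text_features = [rng.gauss(0, 1) for _ in range(TEXT_DIM)]
image_features = [rng.gauss(0, 1) for _ in range(IMAGE_DIM)]
weights = [rng.gauss(0, 0.01) for _ in range(TEXT_DIM + IMAGE_DIM)]

fused = fuse(text_features, image_features)
probability = predict_hateful_probability(fused, weights, bias=0.0)
print(len(fused), round(probability, 3))
```

In a real system the stand-in vectors would come from the two pretrained encoders and the weights would be learned on the labelled memes; the sketch only shows how a single classifier can see both modalities at once.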

Results showed that the unimodal models performed reasonably well in specific cases: BERT excelled on text-heavy memes, while ResNet-50 captured visual cues. However, neither generalized effectively when sentiment relied on cross-modal interactions. The fused model significantly improved classification performance by using complementary information from both modalities. ROC-AUC, or Area Under the Receiver Operating Characteristic Curve, is a performance metric that evaluates the ability of a binary classification model to distinguish between classes. The multimodal model achieved an accuracy of 89%, an F1-score of 0.905, a precision of 91%, a recall of 90%, and a ROC-AUC of 0.92, substantially outperforming the unimodal baselines.
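The reported metrics can be related to each other with a short sketch. The F1-score is the harmonic mean of precision and recall, so the reported 91% and 90% are consistent with the reported 0.905. ROC-AUC can be read as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The toy labels and scores below are invented for illustration and are not taken from the thesis.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def roc_auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Precision and recall as reported in the abstract.
print(round(f1_score(0.91, 0.90), 3))  # 0.905

# Hypothetical scores for six memes (1 = hateful, 0 = not hateful).
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]
print(round(roc_auc(toy_labels, toy_scores), 3))  # 0.889
```

The pairwise loop is quadratic in the number of examples and is meant only to make the definition concrete; library implementations compute the same quantity from a sorted ranking.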

This study demonstrated that multimodal fusion captures complex semantics often missed by single-modality models. Hate speech is offensive or threatening content that targets individuals or groups based on attributes such as race, religion, or gender. The fused model reduced false negatives, a crucial improvement for hate speech detection systems. These findings underscore the necessity of multimodal approaches in the sentiment analysis of modern social media content. This research offers actionable insights for building robust, ethical AI systems capable of analysing real-world multimodal data.

Place, publisher, year, edition, pages
2025.
Keywords [en]
Sentiment Analysis, Multimodal Learning, Hateful Memes, BERT, ResNet-50
National Category
Information Systems
Identifiers
URN: urn:nbn:se:du-51246
OAI: oai:DiVA.org:du-51246
DiVA, id: diva2:1998419
Subject / course
Business Administration and Management
Available from: 2025-09-16 Created: 2025-09-16 Last updated: 2025-10-09

Open Access in DiVA

fulltext (2212 kB), 260 downloads
File information
File name: FULLTEXT01.pdf
File size: 2212 kB
Checksum (SHA-512): 970f3695e5ec11e1d8fcd8a7857b2858487dddf9022043dc4575a830921247bb14d489e616c47ed2e69ff9b2208e675b487dda11bd0de98ff5c7c8588d6e2d39
Type: fulltext
Mimetype: application/pdf
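
A downloaded copy of the full text can be checked against the SHA-512 checksum listed in the file information. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `sha512_of_file` and `matches_record` are hypothetical, and the local file path is whatever the reader saved the PDF as.

```python
import hashlib

# Checksum as listed in the file information of this record.
EXPECTED_SHA512 = (
    "970f3695e5ec11e1d8fcd8a7857b2858487dddf9022043dc4575a830921247bb"
    "14d489e616c47ed2e69ff9b2208e675b487dda11bd0de98ff5c7c8588d6e2d39"
)


def sha512_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal SHA-512 digest of a file, read in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path):
    """True if the file at `path` has the checksum listed in the record."""
    return sha512_of_file(path) == EXPECTED_SHA512
```

For example, `matches_record("FULLTEXT01.pdf")` returns `True` only if the local file is byte-for-byte identical to the one described in the record.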

By organisation
School of Culture and Society
Information Systems

Total: 260 downloads
The number of downloads is the sum of all downloads of full texts. It may include, for example, previous versions that are no longer available.

Total: 223 hits