Improving Social Media Sentiment Analysis with Multi-Modal Data and Deep Learning
2025 (English) Independent thesis Advanced level (degree of Master (Two Years)), 10 credits / 15 HE credits
Student thesis
Abstract [en]
Social media memes often combine images and text, which makes hateful content difficult to detect with traditional sentiment analysis methods. Sentiment analysis is a natural language processing task that determines the emotional tone (positive, negative, or neutral) of a body of text. This research addressed that challenge by evaluating and comparing unimodal and multimodal deep learning models on the Facebook Hateful Memes dataset. Three models were implemented: a text-only BERT model, an image-only ResNet-50 model, and a fused multimodal model (BERT + ResNet-50) that integrates both text and image features.
Results showed that the unimodal models performed reasonably well in specific cases: BERT excelled on text-heavy memes, while ResNet-50 captured visual cues. However, neither model generalized effectively when sentiment relied on cross-modal interactions. The fused model significantly improved classification performance by using complementary information from both modalities. ROC-AUC (Area Under the Receiver Operating Characteristic Curve) is a performance metric that evaluates how well a binary classification model distinguishes between classes. The multimodal model achieved an accuracy of 89%, an F1-score of 0.905, a precision of 91%, a recall of 90%, and a ROC-AUC of 0.92, substantially outperforming the unimodal baselines.
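As a minimal sketch of how the metrics reported above relate to one another, the following Python computes F1 as the harmonic mean of precision and recall, and ROC-AUC as the fraction of positive/negative pairs that the classifier ranks correctly. The function names and the toy labels and scores are hypothetical and are not taken from the thesis; only the precision (0.91) and recall (0.90) values come from the abstract.

```python
def f1_from_precision_recall(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0


def roc_auc(labels, scores):
    """ROC-AUC via pairwise ranking.

    Equals the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative one
    (ties count as half a win). O(n^2), fine for illustration only.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Precision and recall as reported in the abstract.
reported_f1 = f1_from_precision_recall(0.91, 0.90)
print(round(reported_f1, 3))

# Hypothetical toy data: 1 = hateful, 0 = not hateful.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
print(roc_auc(toy_labels, toy_scores))
```

With the abstract's precision and recall, the harmonic mean comes out at roughly 0.905, which agrees with the reported F1-score. The pairwise ROC-AUC here is meant only to illustrate the definition; a real evaluation would normally rely on an established library implementation.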
This study demonstrated that multimodal fusion captures complex semantics often missed by single-modality models. Hate speech is offensive or threatening content that targets individuals or groups based on attributes such as race, religion, or gender. The fused model reduced false negatives, which is a crucial improvement for hate speech detection systems. These findings underscore the necessity of multimodal approaches in the sentiment analysis of modern social media content. This research offers actionable insights for building robust, ethical AI systems capable of analysing real-world multimodal data.
Place, publisher, year, edition, pages
2025.
Keywords [en]
Sentiment Analysis, Multimodal Learning, Hateful Memes, BERT, ResNet-50
National Category
Information Systems
Identifiers
URN: urn:nbn:se:du-51246 OAI: oai:DiVA.org:du-51246 DiVA, id: diva2:1998419
Subject / course
Business Administration and Management
2025-09-16 2025-09-16 2025-10-09