Exploration of an Automated Motivation Letter Scoring System to Emulate Human Judgement
2020 (English)
Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
Abstract [en]
As the popularity of the master’s in data science at Dalarna University increases, so does the number of applicants. The aim of this thesis was to explore different approaches to building an automated motivation letter scoring system that could emulate human judgement and automate the candidate selection process. Several steps, such as image processing and text processing, were required to enable the authors to retrieve numerous features which could lead to the identification of the factors graded by the program managers. Grammar-based features and advanced textual features were extracted from the motivation letters, followed by the application of topic modelling methods to estimate the probability of each topic occurring within a motivation letter. Furthermore, correlation analysis was applied to quantify the association between the features and the different factors graded by the program managers, followed by Ordinal Logistic Regression and Random Forest to build models with the most impactful variables. Finally, the Naïve Bayes algorithm, Random Forest and Support Vector Machine were used, first for classification and then for prediction purposes. The results were not promising, as the factors were not accurately identified. Nevertheless, the authors suspected that the factors may be strongly related to the emphasis placed on specific topics within a motivation letter, which could motivate further research.
Place, publisher, year, edition, pages
2020.
Keywords [en]
Natural Language Processing, Machine Learning, Supervised Learning, Unsupervised Learning, Automation, Feature Extraction, Image Processing, Text Processing, Text Exploration, Motivation Letter, Dalarna University, Student Application, Topic Modelling, Business Intelligence, Data Science
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:du-34563
OAI: oai:DiVA.org:du-34563
DiVA, id: diva2:1452809
2020-07-07 2020-07-07