Smart Search Engine: A Design and Test of Intelligent Search of News with Classification
2021 (English) Independent thesis, Basic level (degree of Bachelor), 10 credits / 15 HE credits
Student thesis
Abstract [en]
Background
Google, Bing, and Baidu are among the most widely used search engines in the world, but they have limitations. For example, a search for "Jaguar" returns mostly results about cars rather than the animal. This is the problem of polysemy: search engines tend to return the most popular interpretation of a query, which is not necessarily the one the user intended.
Aim
We aim to design and implement a search function and to investigate whether classifying news articles can improve precision when users search for news.
Method
In this research, we collect data with a web crawler that crawls news articles from BBC News. We then use NLTK and an inverted index for data pre-processing, and BM25 for data processing (scoring articles against a query).
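The pre-processing and scoring steps can be illustrated with a minimal sketch. It is not the thesis implementation: a simple regular-expression tokenizer stands in for NLTK, the three sample documents are invented, and the function names and the BM25 parameters k1=1.5 and b=0.75 are assumptions chosen for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Stand-in for NLTK tokenization (assumption): lowercase alphanumeric words.
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Build an inverted index and document lengths from a {doc_id: text} mapping."""
    index = defaultdict(dict)  # term -> {doc_id: term frequency}
    lengths = {}
    for doc_id, text in docs.items():
        tokens = tokenize(text)
        lengths[doc_id] = len(tokens)
        for term, tf in Counter(tokens).items():
            index[term][doc_id] = tf
    return index, lengths


def bm25_search(query, index, lengths, k1=1.5, b=0.75):
    """Rank documents for a query with BM25; returns [(doc_id, score)], best first."""
    if not lengths:
        return []
    n_docs = len(lengths)
    avg_len = sum(lengths.values()) / n_docs
    scores = defaultdict(float)
    for term in set(tokenize(query)):
        postings = index.get(term, {})
        if not postings:
            continue
        # Inverse document frequency, in the non-negative "+1" variant.
        idf = math.log((n_docs - len(postings) + 0.5) / (len(postings) + 0.5) + 1)
        for doc_id, tf in postings.items():
            # Term frequency saturates with k1; b controls length normalization.
            norm = tf + k1 * (1 - b + b * lengths[doc_id] / avg_len)
            scores[doc_id] += idf * tf * (k1 + 1) / norm
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Invented sample documents, not data from the thesis.
docs = {
    "a": "The jaguar is a big cat found in the Americas",
    "b": "Jaguar unveils a new electric car",
    "c": "Stock markets rise after the election",
}
index, lengths = build_index(docs)
results = bm25_search("jaguar car", index, lengths)
```

In this sketch only documents that contain at least one query term receive a score, so the inverted index limits the work to the postings of the query terms instead of scanning every article.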
Results
Compared with the normal search function, our function has a lower recall rate and a higher precision.
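For reference, precision is the share of retrieved items that are relevant, and recall is the share of relevant items that are retrieved. The sketch below shows how narrowing a result list can raise precision while lowering recall. The document identifiers and result lists are invented for illustration and are not results from the thesis; the function name is an assumption.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a retrieved set against a relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: the user wants articles about the animal.
relevant = {"animal1", "animal2", "animal3"}

# A plain search returns a mix of car and animal articles.
plain = ["car1", "car2", "animal1", "animal2"]

# A search restricted to one news category returns fewer, more focused articles.
classified = ["animal1"]

plain_p, plain_r = precision_recall(plain, relevant)
class_p, class_r = precision_recall(classified, relevant)
```

With these invented lists, the restricted search has higher precision and lower recall than the plain search, which is the kind of trade-off the result above describes.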
Conclusions
This search function can improve precision when people search for news.
Implications
This search function could be used to search not only news but also other kinds of content. It has considerable potential in search engines. It could be combined with machine learning to analyze users' search habits so that searching and classification become more accurate.
Place, publisher, year, edition, pages
2021.
Keywords [en]
Smart search, precision, recall rate, NLTK, inverted index, BM25
National Category
Information Systems
Identifiers
URN: urn:nbn:se:du-37601
OAI: oai:DiVA.org:du-37601
DiVA, id: diva2:1577981
Subject / course
Information Systems
2021-07-05 2021-07-05 2025-10-09