A data mining application on predicting relevance of search results from E-commerce platforms

Bibliographic Details
Main Author: Li, Yuanrui
Other Authors: Xiao Xiaokui
Format: Final Year Project
Language: English
Published: 2017
Online Access: http://hdl.handle.net/10356/70196
Institution: Nanyang Technological University
Description
Summary: Search engines like google.com have become the dominant model of online search. E-commerce sites, large and small, provide built-in search so that visitors can examine the products on offer. While most large businesses are able to hire the skills needed to build advanced search engines, small online businesses still lack the capability to evaluate the results of their search engines, and so lose the opportunity to compete with larger businesses. The purpose of this project is to build an open-source solution that measures the relevance of search results for online businesses as well as the accuracy of their underlying algorithms. The data set is taken from the ‘CrowdFlower Search Result Relevance’ competition on Kaggle.com. A data mining application is implemented in Python and R, and is designed to automate feature engineering and model parameter tuning. As a result, the application reduces the time needed to find the best model while maintaining good prediction accuracy.
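
The summary does not say how the parameter tuning is automated. As a rough illustration only, the sketch below runs an exhaustive grid search over a single parameter of a toy relevance rule. The names `grid_search`, `overlap`, `accuracy` and `TOY`, the token-overlap rule, and the four labelled pairs are all invented for this example and are not taken from the project.

```python
from itertools import product


def grid_search(score_fn, param_grid):
    """Try every parameter combination; return (best_params, best_score)."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        # Strict comparison: on ties the earliest combination is kept.
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def overlap(query, title):
    """Fraction of query tokens that also appear in the product title."""
    q = query.lower().split()
    t = set(title.lower().split())
    return sum(tok in t for tok in q) / len(q) if q else 0.0


# Toy (query, product title, is_relevant) pairs, made up for illustration.
TOY = [
    ("red shoes", "red running shoes", 1),
    ("red shoes", "blue cotton shirt", 0),
    ("usb cable", "usb charging cable 2m", 1),
    ("usb cable", "hdmi cable", 0),
]


def accuracy(threshold):
    """Share of toy pairs classified correctly by the overlap rule."""
    hits = sum((overlap(q, t) >= threshold) == bool(y) for q, t, y in TOY)
    return hits / len(TOY)


best_params, best_score = grid_search(
    accuracy, {"threshold": [0.25, 0.5, 0.75, 1.0]}
)
```

In a real application the scoring function would presumably train and validate a model on held-out data instead of applying a fixed rule, but the search loop would keep the same shape.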