Online shopping sites crawler

Bibliographic Details
Main Author: Leong, Letitia Justina Si En
Other Authors: Liang Qian Hui
Format: Final Year Project
Language: English
Published: 2016
Subjects:
Online Access: http://hdl.handle.net/10356/69155
Institution: Nanyang Technological University
Description
Summary: The advancement of technology has brought a wide range of benefits to society but has also contributed to fast-paced lifestyles. Increasingly, people prefer to shop online and, at the same time, look for new ways to obtain the best bargain. The aim of this project is therefore to collect merchants’ data from multiple shopping sites and display it on a platform that enables shoppers to compare products. Firstly, a shopping site crawler was developed using the Scrapy framework to crawl and scrape different shopping sites. Because every website is structured differently, the crawler cannot use one set of extraction rules for all of them: to extract specific data from a website, the XPaths of the target elements must be specified for that site. A Tkinter program was therefore created to reduce this code rework and to make it convenient to configure new and existing web spiders. Secondly, the collected merchants’ data undergoes text mining, which consists of preprocessing, clustering and topic modelling. Clustering and topic modelling were used to detect patterns for grouping similar products together and to discover attractive topics. These results are presented to shoppers in a way that allows them to search for their desired products easily and efficiently. Thirdly, a frontend web application was built to display recommended products, appealing product themes, and all merchants that offer the same product or the same kind of product. Filters were also implemented so that users can search according to their preferences. Lastly, a backend web application was set up to manage product-related data within the database. By the end of the project, all objectives were accomplished. Some limitations in the developed system remain unresolved because of time constraints and limited manpower. However, these limitations, along with the suggestions for further enhancement, can be addressed in future work.
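The idea of keeping per-site XPaths in configuration rather than in spider code can be illustrated with a minimal sketch. It is not the project's implementation: it uses Python's standard-library ElementTree (which supports only a limited XPath subset and requires well-formed markup) in place of Scrapy, and the site names, field names, XPaths and sample pages are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical per-site configuration: each site maps field names to XPaths.
# Supporting a new site means adding an entry here, not rewriting the scraper.
SITE_XPATHS = {
    "shop_a": {
        "item": ".//div[@class='product']",
        "name": "./h2",
        "price": "./span[@class='price']",
    },
    "shop_b": {
        "item": ".//li[@class='listing']",
        "name": "./a",
        "price": "./em",
    },
}


def scrape(site, markup):
    """Extract product records from well-formed markup using the site's XPaths."""
    config = SITE_XPATHS[site]
    root = ET.fromstring(markup)
    records = []
    for node in root.findall(config["item"]):
        records.append({
            "site": site,
            "name": node.findtext(config["name"], default="").strip(),
            "price": node.findtext(config["price"], default="").strip(),
        })
    return records


# Two made-up pages that list the same product with different markup.
PAGE_A = (
    "<html><body><div class='product'><h2>Kettle</h2>"
    "<span class='price'>19.90</span></div></body></html>"
)
PAGE_B = (
    "<html><body><ul><li class='listing'><a>Kettle</a>"
    "<em>18.50</em></li></ul></body></html>"
)

products = scrape("shop_a", PAGE_A) + scrape("shop_b", PAGE_B)
# Simple product comparison: pick the merchant with the lowest price.
cheapest = min(products, key=lambda p: float(p["price"]))
```

In this sketch the same `scrape` function serves both sites, and only the configuration dictionary differs between them, which is the kind of separation a configuration tool such as the Tkinter program described above could edit.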