Online shopping sites crawler

The advancement of technology brought about a wide range of benefits for society but also inevitably contributed to fast-paced lifestyles. Increasingly, people now prefer to carry out their shopping activities online and at the same time, look for innovative new ways to obtain the best bargain. Ther...

Bibliographic Details
Main Author: Leong, Letitia Justina Si En
Other Authors: Liang Qian Hui
Format: Final Year Project
Language: English
Published: 2016
Subjects: DRNTU::Engineering
Online Access: http://hdl.handle.net/10356/69155
Institution: Nanyang Technological University
id sg-ntu-dr.10356-69155
record_format dspace
spelling sg-ntu-dr.10356-69155 2023-03-03T20:56:59Z Online shopping sites crawler Leong, Letitia Justina Si En Liang Qian Hui School of Computer Engineering DRNTU::Engineering Bachelor of Engineering (Computer Science) 2016-11-11T08:48:40Z 2016-11-11T08:48:40Z 2016 Final Year Project (FYP) http://hdl.handle.net/10356/69155 en Nanyang Technological University 93 p. application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering
spellingShingle DRNTU::Engineering
Leong, Letitia Justina Si En
Online shopping sites crawler
description Advances in technology have brought a wide range of benefits to society but have also contributed to fast-paced lifestyles. Increasingly, people prefer to shop online while looking for new ways to obtain the best bargain. The aim of this project is therefore to collect merchants’ data from multiple shopping sites and display it on a platform that enables shoppers to compare products. Firstly, a shopping site crawler was developed using the Scrapy framework to crawl and scrape different shopping sites. Because every website is structured differently, scraping is more complicated: for the web crawler to extract specific data from a website, the XPaths of that data must be specified. A Tkinter program was therefore created to reduce code rework and to make configuring new and existing web spiders more convenient. Secondly, the collected merchants’ data undergoes text mining, which comprises preprocessing, clustering and topic modelling. Clustering and topic modelling were used to detect patterns for grouping similar products together and to discover topics of interest. These results are presented to shoppers in a way that allows them to search for their desired products easily and efficiently. Thirdly, a frontend web application was built to display recommended products, appealing product themes, and all merchants that provide the same product or the same kind of product. Filters were also implemented to let users search according to their preferences. Lastly, a backend web application was set up to manage product-related data within the database. By the end of the project, all objectives had been accomplished. Some limitations of the developed system remain unresolved owing to time constraints and limited manpower. However, these limitations, along with the suggestions for further enhancement, can be addressed in the future.
author2 Liang Qian Hui
author_facet Liang Qian Hui
Leong, Letitia Justina Si En
format Final Year Project
author Leong, Letitia Justina Si En
author_sort Leong, Letitia Justina Si En
title Online shopping sites crawler
title_short Online shopping sites crawler
title_full Online shopping sites crawler
title_fullStr Online shopping sites crawler
title_full_unstemmed Online shopping sites crawler
title_sort online shopping sites crawler
publishDate 2016
url http://hdl.handle.net/10356/69155
_version_ 1759858235912749056