Web crawler and NLP enabled data mining : a statistical study on the formation of hotel ratings
Main Authors: | , , |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | Chinese |
Published: | 2014 |
Subjects: | |
Online Access: | http://hdl.handle.net/10356/55818 |
Institution: | Nanyang Technological University |
Summary: | Web crawlers are regarded as one of the most effective ways of extracting large amounts of data from websites, and natural language processing (NLP) programs can now understand human language to some extent.
In this report, web crawling and NLP were used to extract reviewer opinions from Tripadvisor webpages. We studied opinions of 50 hotels located in Las Vegas, United States of America, and constructed a model to predict customer ratings from reviewer opinions, reviewer experience and hotel ranking. Reviewer ratings of a given hotel were found to correlate positively with both reviewer opinions and reviewer experience, and negatively with hotel ranking.
Future research directions include improving the accuracy of the NLP and applying the approach to other industries such as entertainment and consumer goods. |
---|