A framework for mining opinions from user generated content
In this thesis we have presented a scalable framework for mining features and opinions from online reviews. Large scale opinion mining requires scalable components for data storage, along with unsupervised learning solutions for extracting features and opinions, with the ultimate goal of generating...
Main Author: | Amit Kumar Saini |
---|---|
Other Authors: | Chang Kuiyu |
Format: | Theses and Dissertations |
Language: | English |
Published: | 2013 |
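Subjects: | DRNTU::Engineering::Computer science and engineering::Information systems::Information systems applications; DRNTU::Engineering::Computer science and engineering::Computing methodologies::Document and text processing |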
Online Access: | http://hdl.handle.net/10356/54823 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-54823 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-54823; 2023-03-04T00:34:45Z; A framework for mining opinions from user generated content; Amit Kumar Saini; Chang Kuiyu; School of Computer Engineering; DRNTU::Engineering::Computer science and engineering::Information systems::Information systems applications; DRNTU::Engineering::Computer science and engineering::Computing methodologies::Document and text processing; Master of Engineering (SCE); 2013-08-30T03:03:27Z; 2013; Thesis; http://hdl.handle.net/10356/54823; en; 97 p.; application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
DRNTU::Engineering::Computer science and engineering::Information systems::Information systems applications; DRNTU::Engineering::Computer science and engineering::Computing methodologies::Document and text processing |
description |
This thesis presents a scalable framework for mining features and opinions from online reviews. Large-scale opinion mining requires scalable components for data storage, along with unsupervised learning solutions for extracting features and opinions, with the ultimate goal of generating meaningful summaries. We built our system using travel reviews, but it can be applied to any domain with minimal changes.
Our focus is a highly scalable framework: a system that can scale both horizontally and vertically for deployment on large-scale distributed systems. Hence, we present an architecture developed by carefully examining every component used in the system, including the database for storing reviews. We compared various choices and chose state-of-the-art open-source technologies that use a distributed multi-node architecture. As a result, millions of reviews can be stored and indexed.
We implemented a dynamic feature extraction engine that uses unsupervised learning to associate extracted features and opinions, starting with only one domain seed feature. For example, the feature seed word 'hotel' is all that is needed to extract a list of related hotel feature words such as 'room' and 'service'. Next, we extract the opinions expressed on the dynamically extracted features and perform sentence-level sentiment analysis. To present the results to the end user in an intuitive manner, we created a web interface and experimented with new visualization techniques.
Experiments were conducted to evaluate the system and the proposed methods. From the analysis of the results, we discuss the drawbacks of our current approach and future directions for the research. Finally, a fully functioning prototype was created to demonstrate the end-to-end system. |
author2 |
Chang Kuiyu |
author_facet |
Chang Kuiyu Amit Kumar Saini |
format |
Theses and Dissertations |
author |
Amit Kumar Saini |
author_sort |
Amit Kumar Saini |
title |
A framework for mining opinions from user generated content |
title_sort |
framework for mining opinions from user generated content |
publishDate |
2013 |
url |
http://hdl.handle.net/10356/54823 |
_version_ |
1759853350682099712 |
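
The description above says that a single seed word such as 'hotel' is enough to extract related feature words like 'room' and 'service'. The record does not say how the thesis does this, so the following is only a minimal sketch of one possible approach: bootstrapping by sentence-level co-occurrence. The function name `expand_features`, the sample `REVIEWS`, the stopword list, and the opinion-word list are all hypothetical and chosen for illustration; they are not taken from the thesis.

```python
import re
from collections import Counter

# Hypothetical word lists, assumed for this sketch only.
STOPWORDS = {"the", "was", "a", "at", "and", "our", "had"}
OPINION_WORDS = {"clean", "friendly", "great", "slow", "quiet",
                 "comfortable", "closed"}

# Hypothetical sample data standing in for a review corpus.
REVIEWS = [
    "The hotel room was clean.",
    "The hotel service was friendly.",
    "Our hotel room had a great view.",
    "Hotel service at the front desk was slow.",
    "The room was quiet and the bed was comfortable.",
    "The pool was closed.",
]


def candidate_words(sentence):
    """Lowercase the sentence and keep each distinct word that is
    neither a stopword nor a known opinion word, in order of appearance."""
    words = re.findall(r"[a-z]+", sentence.lower())
    kept = [w for w in words if w not in STOPWORDS and w not in OPINION_WORDS]
    return list(dict.fromkeys(kept))  # de-duplicate, preserving order


def expand_features(sentences, seed, rounds=2, top_k=2, min_count=2):
    """Grow a feature list from one seed word.

    In each round, count the candidate words that share a sentence with
    any known feature, then promote the most frequent ones to features.
    """
    features = [seed]
    for _ in range(rounds):
        counts = Counter()
        for sentence in sentences:
            words = candidate_words(sentence)
            if any(w in features for w in words):
                counts.update(w for w in words if w not in features)
        promoted = [w for w, c in counts.most_common(top_k) if c >= min_count]
        if not promoted:
            break  # nothing frequent enough; stop early
        features.extend(promoted)
    return features


if __name__ == "__main__":
    print(expand_features(REVIEWS, "hotel"))
```

The `min_count` threshold keeps one-off co-occurrences from being promoted, and the hand-written opinion-word list stands in for whatever mechanism a real system would use to separate feature words from opinion words (for example, part-of-speech tagging).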