A framework for mining opinions from user generated content

In this thesis we have presented a scalable framework for mining features and opinions from online reviews. Large-scale opinion mining requires scalable components for data storage, along with unsupervised learning solutions for extracting features and opinions, with the ultimate goal of generating...

Bibliographic Details
Main Author: Amit Kumar Saini
Other Authors: Chang Kuiyu
Format: Theses and Dissertations
Language: English
Published: 2013
Subjects:
Online Access: http://hdl.handle.net/10356/54823
Institution: Nanyang Technological University
id sg-ntu-dr.10356-54823
record_format dspace
spelling sg-ntu-dr.10356-54823 2023-03-04T00:34:45Z A framework for mining opinions from user generated content Amit Kumar Saini Chang Kuiyu School of Computer Engineering DRNTU::Engineering::Computer science and engineering::Information systems::Information systems applications DRNTU::Engineering::Computer science and engineering::Computing methodologies::Document and text processing This thesis presents a scalable framework for mining features and opinions from online reviews. Large-scale opinion mining requires scalable components for data storage, along with unsupervised learning solutions for extracting features and opinions, with the ultimate goal of generating meaningful summaries. The system was built and tested using travel reviews, but it can be applied to any domain with minimal changes. Our focus is a highly scalable framework: a system that can scale both horizontally and vertically for deployment on large-scale distributed systems. Hence, we present an architecture in which every component, including the database for storing reviews, was carefully examined. We compared various choices and chose state-of-the-art open-source technologies that use a distributed multi-node architecture. As a result, millions of reviews can be stored and indexed. We implemented a dynamic feature extraction engine that uses unsupervised learning to associate extracted features and opinions, starting with only one domain seed feature. For example, the seed feature word 'hotel' is all that is needed to extract a list of related hotel feature words such as 'room' and 'service'. Next, we extract the opinions expressed on the dynamically extracted features and perform sentence-level sentiment analysis. To present the results to the end user in an intuitive manner, we created a web interface and experimented with new visualization techniques. Experiments were conducted to evaluate the system and the proposed methods. From the analysis of the results, we discuss the drawbacks of our current approach and future directions for the research. Finally, a fully functioning prototype was created to demonstrate the end-to-end system. Master of Engineering (SCE) 2013-08-30T03:03:27Z 2013-08-30T03:03:27Z 2013 2013 Thesis http://hdl.handle.net/10356/54823 en 97 p. application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Computer science and engineering::Information systems::Information systems applications
DRNTU::Engineering::Computer science and engineering::Computing methodologies::Document and text processing
spellingShingle DRNTU::Engineering::Computer science and engineering::Information systems::Information systems applications
DRNTU::Engineering::Computer science and engineering::Computing methodologies::Document and text processing
Amit Kumar Saini
A framework for mining opinions from user generated content
description This thesis presents a scalable framework for mining features and opinions from online reviews. Large-scale opinion mining requires scalable components for data storage, along with unsupervised learning solutions for extracting features and opinions, with the ultimate goal of generating meaningful summaries. The system was built and tested using travel reviews, but it can be applied to any domain with minimal changes. Our focus is a highly scalable framework: a system that can scale both horizontally and vertically for deployment on large-scale distributed systems. Hence, we present an architecture in which every component, including the database for storing reviews, was carefully examined. We compared various choices and chose state-of-the-art open-source technologies that use a distributed multi-node architecture. As a result, millions of reviews can be stored and indexed. We implemented a dynamic feature extraction engine that uses unsupervised learning to associate extracted features and opinions, starting with only one domain seed feature. For example, the seed feature word 'hotel' is all that is needed to extract a list of related hotel feature words such as 'room' and 'service'. Next, we extract the opinions expressed on the dynamically extracted features and perform sentence-level sentiment analysis. To present the results to the end user in an intuitive manner, we created a web interface and experimented with new visualization techniques. Experiments were conducted to evaluate the system and the proposed methods. From the analysis of the results, we discuss the drawbacks of our current approach and future directions for the research. Finally, a fully functioning prototype was created to demonstrate the end-to-end system.
author2 Chang Kuiyu
author_facet Chang Kuiyu
Amit Kumar Saini
format Theses and Dissertations
author Amit Kumar Saini
author_sort Amit Kumar Saini
title A framework for mining opinions from user generated content
title_short A framework for mining opinions from user generated content
title_full A framework for mining opinions from user generated content
title_fullStr A framework for mining opinions from user generated content
title_full_unstemmed A framework for mining opinions from user generated content
title_sort framework for mining opinions from user generated content
publishDate 2013
url http://hdl.handle.net/10356/54823
_version_ 1759853350682099712