Sentiment analysis on the web


Bibliographic Details
Main Author: Chit, Lin Su.
Other Authors: Ong Yew Soon
Format: Final Year Project
Language: English
Published: 2013
Online Access: http://hdl.handle.net/10356/55010
Institution: Nanyang Technological University
Description
Summary: In the Information Age, Web usage has grown steadily with advances in hardware and software technology. As a result, the Web has become a valuable source of massive amounts of data, much of it created by Internet users. Among the different kinds of data available on the Web, a considerable amount comes from social media, where users express themselves freely on a wide variety of topics. Sentiment data has therefore gained increasing attention from both businesses and consumers seeking to discover valuable knowledge in it. Analyzing sentiment data, however, requires a sequence of processing steps. In this project, a software application was developed to support every step involved in sentiment analysis on the Web. The application was divided into separate software components for data collection, data preparation, sentiment analysis, and data visualization. Literature studies were carried out to gain a better understanding of these processes. The software design was modeled in the Unified Modeling Language (UML) before implementation in the Java object-oriented programming language using the NetBeans Integrated Development Environment (IDE). Each process was tested with real-world online review data from the Amazon web site. A web crawler and parser processed the real-world data, and a data pre-processor and text processor performed the data transformation. Several sentiment classification techniques, such as Naïve Bayes, Sequential Minimal Optimization (SMO), and k-Nearest Neighbor (kNN) learning, were applied to sentiment analysis on the Web, and the results were visualized for end users. Classification accuracy was observed and compared; SMO performed better than Naïve Bayes and kNN in different scenarios.
One research work on domain adaptation was also analyzed, and experiments were performed on it to explore future directions for sentiment analysis.