Vox pop: Automated opinion detection and classification with data clustering

Bibliographic Details
Main Authors: Bautista, Glecer Z., Garcia, Michael Adrian S., Tan, Richmond Jamal C.
Format: text
Language: English
Published: Animo Repository 2010
Online Access: https://animorepository.dlsu.edu.ph/etd_bachelors/10133
Institution: De La Salle University
Description
Summary: A large number of opinions, such as those found in blogs, forums and product reviews, are uploaded daily as internet technology progresses. However, these data bring more inconvenience than benefit because of their lack of organization. They are also difficult to find and are underutilized. With the use of Natural Language Processing, it is possible to organize these data so that they can aid decision or policy making. This paper will focus on the development of a system that uses text processing techniques to organize the sentiments of public commentaries. Current systems are able to differentiate facts from opinions, as well as classify these opinions based on their polarity. Clustering has also been done based on the words used. The system, Vox Pop, performs these three functions, namely opinion detection, polarity classification and clustering, using a rule-based approach. Opinions are classified by computing polarity from scores produced by SentiWordNet. Commentaries are clustered by computing the Euclidean distance of each word. SentiWordNet, MontyTagger and the K-Means clustering algorithm are some of the resources and tools used by the system. Expert and non-expert evaluations were done in order to test the system. The detection, classification and clustering modules have accuracy rates of 50.5% and 53.85% respectively.
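The two scoring steps named in the summary, lexicon-based polarity classification and K-Means clustering by Euclidean distance, can be illustrated with a minimal sketch. This is not the Vox Pop implementation: the lexicon `TOY_LEXICON`, its scores, the bag-of-words representation and every function name below are assumptions made for illustration, and the real SentiWordNet resource assigns scores per synset rather than per surface word.

```python
import math
from collections import Counter

# Hypothetical toy lexicon: word -> (positive score, negative score).
# These values are illustrative and are NOT taken from SentiWordNet.
TOY_LEXICON = {
    "good": (0.75, 0.0),
    "great": (0.75, 0.0),
    "bad": (0.0, 0.625),
    "poor": (0.0, 0.75),
}


def polarity(text):
    """Sum (positive - negative) over known words and map the sign to a label."""
    words = text.lower().split()
    score = sum(TOY_LEXICON[w][0] - TOY_LEXICON[w][1]
                for w in words if w in TOY_LEXICON)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def vectorize(texts):
    """Turn each text into a word-count vector over a shared vocabulary."""
    vocab = sorted({w for t in texts for w in t.lower().split()})
    vectors = []
    for t in texts:
        counts = Counter(t.lower().split())
        vectors.append([float(counts[w]) for w in vocab])
    return vectors


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(vectors, k, iterations=10):
    """Plain K-Means; the first k vectors serve as the initial centroids."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centroid.
        labels = [min(range(k), key=lambda c: euclidean(v, centroids[c]))
                  for v in vectors]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


comments = [
    "the roads are good",
    "the roads are bad",
    "taxes are too high",
    "taxes are high",
]
polarities = [polarity(c) for c in comments]
labels = kmeans(vectorize(comments), k=2)
```

In this sketch the two comments about roads end up in one cluster and the two about taxes in the other, while polarity is decided independently of the cluster assignment. A real system would also need tokenization, part-of-speech tagging and word-sense handling, which are omitted here.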