Vox pop: Automated opinion detection and classification with data clustering

A large number of opinions, such as those found in blogs, forums and product reviews, are uploaded daily as internet technology progresses. However, these data bring more inconvenience than benefit due to their lack of organization. They are also difficult to find and underutilized. With the u...

Bibliographic Details
Main Authors: Bautista, Glecer Z., Garcia, Michael Adrian S., Tan, Richmond Jamal C.
Format: text
Language: English
Published: Animo Repository 2010
Subjects: Cluster analysis; Cluster analysis--Data processing
Online Access: https://animorepository.dlsu.edu.ph/etd_bachelors/10133
Institution: De La Salle University
id oai:animorepository.dlsu.edu.ph:etd_bachelors-10778
record_format eprints
spelling oai:animorepository.dlsu.edu.ph:etd_bachelors-10778 2023-01-30T06:27:43Z Vox pop: Automated opinion detection and classification with data clustering Bautista, Glecer Z. Garcia, Michael Adrian S. Tan, Richmond Jamal C. A large number of opinions, such as those found in blogs, forums and product reviews, are uploaded daily as internet technology progresses. However, these data bring more inconvenience than benefit due to their lack of organization. They are also difficult to find and underutilized. With the use of Natural Language Processing, it is possible to organize these data, making them useful as an aid in decision or policy making. This paper focuses on the development of a system that uses text processing techniques to organize the sentiments of public commentaries. Current systems are able to differentiate facts from opinions, as well as classify these opinions based on their polarity. Clustering has also been done based on the words used. The system, Vox Pop, performs these three functions, namely opinion detection, polarity classification and clustering, using a rule-based approach. Opinions are classified by computing their polarity using scores produced by SentiWordNet. Commentaries are clustered by computing the Euclidean distance of each word. SentiWordNet, MontyTagger and the K-Means clustering algorithm are some of the resources and tools used by the system. Expert and non-expert evaluations were done in order to test the system. The detection, classification and clustering modules have accuracy rates of 50.5% and 53.85% respectively. 2010-01-01T08:00:00Z text https://animorepository.dlsu.edu.ph/etd_bachelors/10133 Bachelor's Theses English Animo Repository Cluster analysis Cluster analysis--Data processing.
institution De La Salle University
building De La Salle University Library
continent Asia
country Philippines
content_provider De La Salle University Library
collection DLSU Institutional Repository
language English
topic Cluster analysis
Cluster analysis--Data processing.
spellingShingle Cluster analysis
Cluster analysis--Data processing.
Bautista, Glecer Z.
Garcia, Michael Adrian S.
Tan, Richmond Jamal C.
Vox pop: Automated opinion detection and classification with data clustering
description A large number of opinions, such as those found in blogs, forums and product reviews, are uploaded daily as internet technology progresses. However, these data bring more inconvenience than benefit due to their lack of organization. They are also difficult to find and underutilized. With the use of Natural Language Processing, it is possible to organize these data, making them useful as an aid in decision or policy making. This paper focuses on the development of a system that uses text processing techniques to organize the sentiments of public commentaries. Current systems are able to differentiate facts from opinions, as well as classify these opinions based on their polarity. Clustering has also been done based on the words used. The system, Vox Pop, performs these three functions, namely opinion detection, polarity classification and clustering, using a rule-based approach. Opinions are classified by computing their polarity using scores produced by SentiWordNet. Commentaries are clustered by computing the Euclidean distance of each word. SentiWordNet, MontyTagger and the K-Means clustering algorithm are some of the resources and tools used by the system. Expert and non-expert evaluations were done in order to test the system. The detection, classification and clustering modules have accuracy rates of 50.5% and 53.85% respectively.
format text
author Bautista, Glecer Z.
Garcia, Michael Adrian S.
Tan, Richmond Jamal C.
author_facet Bautista, Glecer Z.
Garcia, Michael Adrian S.
Tan, Richmond Jamal C.
author_sort Bautista, Glecer Z.
title Vox pop: Automated opinion detection and classification with data clustering
publisher Animo Repository
publishDate 2010
url https://animorepository.dlsu.edu.ph/etd_bachelors/10133
_version_ 1756432681919315968
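
The description above names two computational steps: polarity classification from SentiWordNet scores, and K-Means clustering by Euclidean distance. The following is a minimal sketch of both ideas, not the thesis authors' implementation. Everything in it is an assumption made for illustration: `TOY_LEXICON` is a hypothetical stand-in for SentiWordNet, the rule of summing positive minus negative scores is one plausible scoring scheme, and the vectors passed to `kmeans` are assumed to be simple word-count vectors.

```python
import math

# Hypothetical mini-lexicon standing in for SentiWordNet:
# word -> (positive score, negative score). Values are illustrative only.
TOY_LEXICON = {
    "good": (0.75, 0.0),
    "great": (0.75, 0.0),
    "bad": (0.0, 0.625),
    "slow": (0.0, 0.5),
}


def polarity(comment, lexicon=TOY_LEXICON):
    """Label a comment by summing (positive - negative) over known words."""
    score = sum(
        lexicon[word][0] - lexicon[word][1]
        for word in comment.lower().split()
        if word in lexicon
    )
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(vectors, centroids, iterations=10):
    """Plain K-Means: assign each vector to its nearest centroid,
    then move each centroid to the mean of its members."""
    centroids = [list(c) for c in centroids]
    assignment = []
    for _ in range(iterations):
        assignment = [
            min(range(len(centroids)), key=lambda k: euclidean(v, centroids[k]))
            for v in vectors
        ]
        for k in range(len(centroids)):
            members = [v for v, a in zip(vectors, assignment) if a == k]
            if members:  # leave an empty cluster's centroid where it is
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids
```

For example, `polarity("The service was good")` labels the comment by its lexicon score, and `kmeans([[2, 0], [3, 0], [0, 2], [0, 3]], [[2, 0], [0, 2]])` groups the first two vectors apart from the last two. Initial centroids are passed in explicitly so the sketch stays deterministic; a real system would need an initialization strategy.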