FeLex builder: A semi-supervised lexical resource builder for opinion mining in product reviews

Bibliographic Details
Main Authors: Arcilla, Angelo Miguel E., Esquivel, Antonn Vittorio S., Quiros, Celina Franchesca G., Velasco, Karina Francheska O.
Format: text
Language: English
Published: Animo Repository 2012
Online Access: https://animorepository.dlsu.edu.ph/etd_bachelors/11825
Institution: De La Salle University
Description
Summary: As the Internet continues to expand into one of the largest sources of information in the world, more and more applications are being created to extract and make use of the valuable knowledge it contains. Opinion Mining (OM) and Sentiment Analysis (SA) are sub-fields of Natural Language Processing (NLP) that focus on the subjective elements of text and the inherent opinions of the writer, rather than on the contents of the message. The subjective and opinionated nature of the Internet makes it a prime source of data for OM and SA applications. Sentiment lexicons are vast databases that store polarity scores of words, and many sentiment analysis applications use them as a standard data source. However, one of the main limitations of modern sentiment lexicons is that they do not take the context of a word into account: the polarity of a word can change depending on what it is describing. Because certain words have different opinion orientations when they pertain to different objects, applications that use standard sentiment lexicons such as SentiWordNet cannot reliably identify the opinion content of a given text. This research addressed the issue by creating a lexicon builder that works with word pairs instead of individual words. Felex Builder is a feature-descriptor based lexical resource builder that combines automated extraction of the word pairs present in texts with a semi-supervised polarity scoring process, and it has been built and tested on online product reviews. To test the accuracy of the system, experiments used data from six different product categories on Amazon. The experimental evaluation shows that Felex Builder extracts word pairs with 75.02% accuracy.
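The difference between a word-level lexicon and a feature-descriptor lexicon can be illustrated with a minimal Python sketch. It is not the thesis implementation: the lexicon entries, the scores, and the function name `pair_polarity` are all hypothetical and chosen only to show how the same descriptor can carry opposite polarity for different product features.

```python
from typing import Dict, Optional, Tuple

# Hypothetical word-level lexicon: one score per descriptor, regardless of
# what the descriptor is applied to. The scores are made up for illustration.
WORD_LEXICON: Dict[str, float] = {"long": 0.0, "sharp": 0.2}

# Hypothetical feature-descriptor lexicon: the key is a (feature, descriptor)
# pair, so the same descriptor can score differently per feature.
PAIR_LEXICON: Dict[Tuple[str, str], float] = {
    ("battery life", "long"): 0.8,
    ("startup time", "long"): -0.7,
}


def pair_polarity(feature: str, descriptor: str) -> Optional[float]:
    """Look up the pair first; fall back to the word-level score if absent.

    Returns None when neither lexicon knows the descriptor.
    """
    key = (feature.lower(), descriptor.lower())
    if key in PAIR_LEXICON:
        return PAIR_LEXICON[key]
    return WORD_LEXICON.get(descriptor.lower())
```

In this sketch the word-level fallback is a design assumption, not something the abstract describes; it only keeps the lookup usable for pairs that have not yet been scored.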