Effectiveness of simple linguistic processing in automatic sentiment classification of product reviews
Main Authors: | , , , , |
---|---|
Other Authors: | |
Format: | Conference or Workshop Item |
Language: | English |
Published: | 2014 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/101094 http://hdl.handle.net/10220/20045 http://www.ergon-verlag.de/bibliotheks--informationswissenschaft/advances-in-knowledge-organization/band-9.php |
Institution: | Nanyang Technological University |
Summary: | This paper reports a study in automatic sentiment classification, i.e., automatically
classifying documents as expressing positive or negative sentiments/opinions. The study investigates
the effectiveness of using SVM (Support Vector Machine) on various text features to classify product
reviews into recommended (positive sentiment) and not recommended (negative sentiment). It was
hypothesized that syntactic and semantic processing of text would be more important for sentiment
classification than for traditional topical classification. In the first part of this study, several
approaches were investigated: unigrams (individual words), selected words (such as verbs, adjectives,
and adverbs), and words labeled with part-of-speech tags. A sample of 1,800 reviews of various
products was retrieved from Review Centre (www.reviewcentre.com) for the study; 1,200 reviews were
used for training and 600 for testing. Using SVM, the baseline unigram approach obtained an accuracy
rate of around 76%. The use of selected words obtained a marginally better result of 77.33%. Error
analysis suggests various approaches for improving classification accuracy: use of negation phrases,
making inferences from superficial words, and solving the problem of comments on parts. The second
part of the study, which is in progress, investigates the use of negation phrases through simple
linguistic processing to improve classification accuracy. This approach increased the accuracy rate
up to 79.33%. |
---|
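
The summary does not say how negation phrases were processed. As an illustration only, the sketch below shows one common way to fold negation into unigram features: every word that follows a negation cue is prefixed with `NOT_` until the next punctuation mark, so that "recommend" and "not recommend" become distinct features for a classifier such as an SVM. The cue list, the `NOT_` prefix, and the function names `tokenize` and `mark_negation` are assumptions made for this example and are not taken from the paper.

```python
import re

# Hypothetical negation cues; the paper's actual cue list is not given in the summary.
NEGATION_CUES = {"not", "no", "never", "cannot"}
# Punctuation marks treated here as the end of a negation's scope (an assumption).
CLAUSE_END = {".", ",", ";", ":", "!", "?"}


def tokenize(text):
    """Lowercase the text and split it into words (keeping contractions
    such as "don't" whole) and single punctuation marks."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?|[.,;:!?]", text.lower())


def mark_negation(tokens):
    """Prefix each word that follows a negation cue with NOT_,
    stopping at the next clause-ending punctuation mark."""
    marked = []
    negated = False
    for tok in tokens:
        if tok in CLAUSE_END:
            negated = False          # punctuation closes the negation scope
            marked.append(tok)
        elif tok in NEGATION_CUES or tok.endswith("n't"):
            negated = True           # the cue itself is left unmarked
            marked.append(tok)
        elif negated:
            marked.append("NOT_" + tok)
        else:
            marked.append(tok)
    return marked


if __name__ == "__main__":
    review = "I would not recommend this phone, but the screen is good."
    print(mark_negation(tokenize(review)))
```

The marked tokens can then be counted like ordinary unigrams. Ending the scope at punctuation is a crude heuristic chosen for simplicity; a syntactic parser would delimit the negated phrase more precisely.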