Effectiveness of simple linguistic processing in automatic sentiment classification of product reviews

This paper reports a study in automatic sentiment classification, i.e., automatically classifying documents as expressing positive or negative sentiments/opinions. The study investigates the effectiveness of using SVM (Support Vector Machine) on various text features to classify product reviews i...

Bibliographic Details
Main Authors: Zhou, Yunyun, Khoo, Christopher S. G., Na, Jin-Cheon, Sui, Haiyang, Chan, Syin
Other Authors: Wee Kim Wee School of Communication and Information
Format: Conference or Workshop Item
Language: English
Published: 2014
Subjects: Communication and Information
Online Access: https://hdl.handle.net/10356/101094
http://hdl.handle.net/10220/20045
http://www.ergon-verlag.de/bibliotheks--informationswissenschaft/advances-in-knowledge-organization/band-9.php
Institution: Nanyang Technological University
id sg-ntu-dr.10356-101094
record_format dspace
spelling sg-ntu-dr.10356-101094 2019-12-06T20:33:19Z
Effectiveness of simple linguistic processing in automatic sentiment classification of product reviews
Zhou, Yunyun
Khoo, Christopher S. G.
Na, Jin-Cheon
Sui, Haiyang
Chan, Syin
Wee Kim Wee School of Communication and Information
International ISKO Conference (8th : 2004 : London)
Communication and Information
This paper reports a study in automatic sentiment classification, i.e., automatically classifying documents as expressing positive or negative sentiments/opinions. The study investigates the effectiveness of using SVM (Support Vector Machine) on various text features to classify product reviews into recommended (positive sentiment) and not recommended (negative sentiment). Compared with traditional topical classification, it was hypothesized that syntactic and semantic processing of text would be more important for sentiment classification. In the first part of this study, several approaches were investigated: unigrams (individual words), selected words (such as verbs, adjectives, and adverbs), and words labeled with part-of-speech tags. A sample of 1,800 reviews of various products was retrieved from Review Centre (www.reviewcentre.com) for the study. Of these, 1,200 reviews were used for training and 600 for testing. Using SVM, the baseline unigram approach obtained an accuracy rate of around 76%. The use of selected words obtained a marginally better result of 77.33%. Error analysis suggests various approaches for improving classification accuracy: use of negation phrases, making inferences from superficial words, and solving the problem of comments on parts. The second part of the study, which is in progress, investigates the use of negation phrases through simple linguistic processing to improve classification accuracy. This approach raised the accuracy rate to as high as 79.33%.
Accepted version
2014-07-03T04:59:11Z 2019-12-06T20:33:19Z 2014-07-03T04:59:11Z 2019-12-06T20:33:19Z
2004 2004
Conference Paper
Na, J.-C., Sui, H., Khoo, C. S. G., Chan, S., & Zhou, Y. (2004). Effectiveness of simple linguistic processing in automatic sentiment classification of product reviews. In I.C. McIlwaine (Ed.), Knowledge Organization and the Global Information Society: Proceedings of the Eighth International ISKO Conference (pp. 49-54). Wurzburg, Germany: Ergon Verlag.
https://hdl.handle.net/10356/101094
http://hdl.handle.net/10220/20045
http://www.ergon-verlag.de/bibliotheks--informationswissenschaft/advances-in-knowledge-organization/band-9.php
en
© 2004 International ISKO Conference. This is the author created version of a work that has been peer reviewed and accepted for publication by Proceedings of the Eighth International ISKO Conference, International ISKO Conference. It incorporates referee’s comments but changes resulting from the publishing process, such as copyediting, structural formatting, may not be reflected in this document. The published version is available at: [URL: http://www.ergon-verlag.de/bibliotheks--informationswissenschaft/advances-in-knowledge-organization/band-9.php].
application/pdf
institution Nanyang Technological University
building NTU Library
country Singapore
collection DR-NTU
language English
topic Communication and Information
spellingShingle Communication and Information
Zhou, Yunyun
Khoo, Christopher S. G.
Na, Jin-Cheon
Sui, Haiyang
Chan, Syin
Effectiveness of simple linguistic processing in automatic sentiment classification of product reviews
description This paper reports a study in automatic sentiment classification, i.e., automatically classifying documents as expressing positive or negative sentiments/opinions. The study investigates the effectiveness of using SVM (Support Vector Machine) on various text features to classify product reviews into recommended (positive sentiment) and not recommended (negative sentiment). Compared with traditional topical classification, it was hypothesized that syntactic and semantic processing of text would be more important for sentiment classification. In the first part of this study, several approaches were investigated: unigrams (individual words), selected words (such as verbs, adjectives, and adverbs), and words labeled with part-of-speech tags. A sample of 1,800 reviews of various products was retrieved from Review Centre (www.reviewcentre.com) for the study. Of these, 1,200 reviews were used for training and 600 for testing. Using SVM, the baseline unigram approach obtained an accuracy rate of around 76%. The use of selected words obtained a marginally better result of 77.33%. Error analysis suggests various approaches for improving classification accuracy: use of negation phrases, making inferences from superficial words, and solving the problem of comments on parts. The second part of the study, which is in progress, investigates the use of negation phrases through simple linguistic processing to improve classification accuracy. This approach raised the accuracy rate to as high as 79.33%.
author2 Wee Kim Wee School of Communication and Information
author_facet Wee Kim Wee School of Communication and Information
Zhou, Yunyun
Khoo, Christopher S. G.
Na, Jin-Cheon
Sui, Haiyang
Chan, Syin
format Conference or Workshop Item
author Zhou, Yunyun
Khoo, Christopher S. G.
Na, Jin-Cheon
Sui, Haiyang
Chan, Syin
author_sort Zhou, Yunyun
title Effectiveness of simple linguistic processing in automatic sentiment classification of product reviews
title_short Effectiveness of simple linguistic processing in automatic sentiment classification of product reviews
title_full Effectiveness of simple linguistic processing in automatic sentiment classification of product reviews
title_fullStr Effectiveness of simple linguistic processing in automatic sentiment classification of product reviews
title_full_unstemmed Effectiveness of simple linguistic processing in automatic sentiment classification of product reviews
title_sort effectiveness of simple linguistic processing in automatic sentiment classification of product reviews
publishDate 2014
url https://hdl.handle.net/10356/101094
http://hdl.handle.net/10220/20045
http://www.ergon-verlag.de/bibliotheks--informationswissenschaft/advances-in-knowledge-organization/band-9.php
_version_ 1681038510628274176
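
Illustrative note: the abstract mentions using negation phrases as features but does not say how they were built. The sketch below shows one common way to do it, assuming a simple rule: every word after a negation cue is prefixed with NOT_ until the next punctuation mark. The cue list, the NOT_ prefix, and the function names tag_negation and unigram_features are assumptions made for illustration and are not taken from the paper.

```python
import re
from collections import Counter

# Assumed negation cues; the abstract does not list the ones actually used.
NEGATION_WORDS = {"not", "no", "never", "cannot"}
# Punctuation that ends the scope of a negation in this sketch.
CLAUSE_END = {".", ",", ";", "!", "?", ":"}


def tag_negation(text):
    """Return lowercase tokens, prefixing words inside a negation scope with NOT_."""
    tokens = re.findall(r"[a-z']+|[.,;!?:]", text.lower())
    tagged = []
    negated = False
    for tok in tokens:
        if tok in CLAUSE_END:
            # Punctuation closes the negation scope and is not kept as a feature.
            negated = False
            continue
        if tok in NEGATION_WORDS or tok.endswith("n't"):
            # The cue itself is kept unchanged and opens a new scope.
            negated = True
            tagged.append(tok)
            continue
        tagged.append("NOT_" + tok if negated else tok)
    return tagged


def unigram_features(text):
    """Count negation-aware unigrams; the counts could serve as classifier input."""
    return Counter(tag_negation(text))


if __name__ == "__main__":
    print(tag_negation("I do not like this phone, but the camera is good."))
```

With this rule, "like" in "do not like" becomes the distinct feature NOT_like, so a classifier such as an SVM can weight it separately from the plain unigram "like".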