Lexicon-based sentiment analysis: Comparative evaluation of six sentiment lexicons
This article introduces a new general-purpose sentiment lexicon called WKWSCI Sentiment Lexicon and compares it with five existing lexicons: Hu & Liu Opinion Lexicon, Multi-perspective Question Answering (MPQA) Subjectivity Lexicon, General Inquirer, National Research Council Canada (NRC) Word-S...
Main Authors: | Khoo, Christopher S. G., Johnkhan, Sathik Basha |
---|---|
Other Authors: | Wee Kim Wee School of Communication and Information |
Format: | Article |
Language: | English |
Published: | 2017 |
Subjects: | Sentiment analysis, Sentiment categorisation |
Online Access: | https://hdl.handle.net/10356/83570 http://hdl.handle.net/10220/42704 https://doi.org/10.21979/N9/DWWEBV |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-83570 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-83570; 2021-01-18T04:50:20Z; Lexicon-based sentiment analysis: Comparative evaluation of six sentiment lexicons; Khoo, Christopher S. G.; Johnkhan, Sathik Basha; Wee Kim Wee School of Communication and Information; Sentiment analysis; Sentiment categorisation; This article introduces a new general-purpose sentiment lexicon called WKWSCI Sentiment Lexicon and compares it with five existing lexicons: Hu & Liu Opinion Lexicon, Multi-perspective Question Answering (MPQA) Subjectivity Lexicon, General Inquirer, National Research Council Canada (NRC) Word-Sentiment Association Lexicon and Semantic Orientation Calculator (SO-CAL) lexicon. The effectiveness of the sentiment lexicons for sentiment categorisation at the document level and sentence level was evaluated using an Amazon product review data set and a news headlines data set. WKWSCI, MPQA, Hu & Liu and SO-CAL lexicons are equally good for product review sentiment categorisation, obtaining accuracy rates of 75%–77% when appropriate weights are used for different categories of sentiment words. However, when a training corpus is not available, Hu & Liu obtained the best accuracy with a simple-minded approach of counting positive and negative words for both document-level and sentence-level sentiment categorisation. The WKWSCI lexicon obtained the best accuracy of 69% on the news headlines sentiment categorisation task, and the sentiment strength values obtained a Pearson correlation of 0.57 with human-assigned sentiment values. It is recommended that the Hu & Liu lexicon be used for product review texts and the WKWSCI lexicon for non-review texts.; Accepted version; 2017-06-14T07:38:22Z; 2019-12-06T15:25:51Z; 2017-06-14T07:38:22Z; 2019-12-06T15:25:51Z; 2017; Journal Article; Khoo, C. S. G., & Johnkhan, S. B. (2017). Lexicon-based sentiment analysis: Comparative evaluation of six sentiment lexicons. Journal of Information Science, in press.;
0165-5515; https://hdl.handle.net/10356/83570; http://hdl.handle.net/10220/42704; 10.1177/0165551517703514; en; Journal of Information Science; https://doi.org/10.21979/N9/DWWEBV; © 2017 The Author(s) (published by SAGE Publications). This is the author created version of a work that has been peer reviewed and accepted for publication in Journal of Information Science, published by SAGE Publications on behalf of the author(s). It incorporates referee’s comments but changes resulting from the publishing process, such as copyediting, structural formatting, may not be reflected in this document. The published version is available at: [http://dx.doi.org/10.1177/0165551517703514].; 21 p.; application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Sentiment analysis; Sentiment categorisation |
description |
This article introduces a new general-purpose sentiment lexicon called WKWSCI Sentiment Lexicon and compares it with five existing lexicons: Hu & Liu Opinion Lexicon, Multi-perspective Question Answering (MPQA) Subjectivity Lexicon, General Inquirer, National Research Council Canada (NRC) Word-Sentiment Association Lexicon and Semantic Orientation Calculator (SO-CAL) lexicon. The effectiveness of the sentiment lexicons for sentiment categorisation at the document level and sentence level was evaluated using an Amazon product review data set and a news headlines data set. WKWSCI, MPQA, Hu & Liu and SO-CAL lexicons are equally good for product review sentiment categorisation, obtaining accuracy rates of 75%–77% when appropriate weights are used for different categories of sentiment words. However, when a training corpus is not available, Hu & Liu obtained the best accuracy with a simple-minded approach of counting positive and negative words for both document-level and sentence-level sentiment categorisation. The WKWSCI lexicon obtained the best accuracy of 69% on the news headlines sentiment categorisation task, and the sentiment strength values obtained a Pearson correlation of 0.57 with human-assigned sentiment values. It is recommended that the Hu & Liu lexicon be used for product review texts and the WKWSCI lexicon for non-review texts. |
author2 |
Wee Kim Wee School of Communication and Information |
author_facet |
Wee Kim Wee School of Communication and Information; Khoo, Christopher S. G.; Johnkhan, Sathik Basha |
format |
Article |
author |
Khoo, Christopher S. G.; Johnkhan, Sathik Basha |
author_sort |
Khoo, Christopher S. G. |
title |
Lexicon-based sentiment analysis: Comparative evaluation of six sentiment lexicons |
publishDate |
2017 |
url |
https://hdl.handle.net/10356/83570 http://hdl.handle.net/10220/42704 https://doi.org/10.21979/N9/DWWEBV |
_version_ |
1690658453387739136 |
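
The abstract mentions a simple approach of counting positive and negative words, optionally with weights for different categories of sentiment words. The record does not give the authors' implementation, so the following is only a minimal sketch of that general idea. The word lists, the `classify` function, the `pos_weight` and `neg_weight` parameters, and the "neutral" fallback for ties are all assumptions made for illustration; none of them come from the article or from any of the six lexicons.

```python
import re

# Toy word lists invented for this sketch; NOT drawn from any real lexicon.
POSITIVE = {"good", "great", "excellent", "love", "reliable"}
NEGATIVE = {"bad", "poor", "terrible", "hate", "broken"}


def classify(text, pos_weight=1.0, neg_weight=1.0):
    """Label text by comparing weighted counts of positive and negative words.

    The weights are hypothetical knobs; with both left at 1.0 this is a
    plain count of positive versus negative words.
    """
    # Lowercase and split into alphabetic tokens (apostrophes kept).
    tokens = re.findall(r"[a-z']+", text.lower())
    pos = sum(1 for t in tokens if t in POSITIVE)
    neg = sum(1 for t in tokens if t in NEGATIVE)
    score = pos_weight * pos - neg_weight * neg
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"  # tie or no sentiment words found (an assumption)


if __name__ == "__main__":
    review = "Great phone, excellent battery, but poor camera."
    print(classify(review))                  # unweighted count
    print(classify(review, neg_weight=3.0))  # negative words weighted more
```

In this sketch the unweighted call needs no training data, while the weights stand in for values that would have to be tuned on a labelled corpus.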