Automatic sentiment classification of movie reviews.
The increasing number of online reviews of goods and services has led to the development of many approaches to sentiment classification and analysis. This study presents a framework for sentiment classification of movie reviews. There are several existing approaches for sentiment classificati...
Main Author: | Chan, Kok Hong |
---|---|
Other Authors: | Wee Kim Wee School of Communication and Information |
Format: | Research Report |
Language: | English |
Published: | 2009 |
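Subjects: | DRNTU::Engineering::Computer science and engineering::Computing methodologies::Document and text processing |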
Online Access: | http://hdl.handle.net/10356/17252 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-17252 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-17252 2019-12-10T14:32:10Z Automatic sentiment classification of movie reviews. Chan, Kok Hong. Wee Kim Wee School of Communication and Information DRNTU::Engineering::Computer science and engineering::Computing methodologies::Document and text processing The increasing number of online reviews of goods and services has led to the development of many approaches to sentiment classification and analysis. This study presents a framework for sentiment classification of movie reviews. There are several existing approaches to sentiment classification. Sentiment classification using unigrams has been the most successful in most previous studies. However, results generated by unigrams can be degraded by negation terms and by terms that require users to make inferences. Several studies indicate that higher-order n-grams have good potential to address this problem and produce better classification. Problems that unigrams encounter with negation terms can be solved by higher-order n-grams such as bigrams, because terms like “not good” are extracted as a single term. In addition, higher-order n-grams combined with feature reduction methods, such as chi-square (χ²) feature reduction, are explored to see whether they produce better results. Movie review datasets are selected because movie reviews are considered one of the most difficult domains to classify. Producing good classification results in the movie review domain will ensure that good results are achieved when the approach is applied to other datasets. The research method consists of three parts. Firstly, the results from the simple unigram approach in this study are compared with the results presented by Pang, Lee & Vaithyanathan (2002). Secondly, the classification results generated by higher-order n-grams and adjectives are compared with those presented by Pang et al. (2002).
Lastly, the classification results obtained after applying feature reduction methods, such as chi-square (χ²) feature reduction, are compared. An application has also been developed for non-technical users so that they are spared the tedious process of creating a training set and running sentiment classification. Additionally, this application has been bundled with additional feature selection options. 2009-06-03T06:49:02Z 2009-06-03T06:49:02Z 2008 2008 Research Report http://hdl.handle.net/10356/17252 en 97 p. application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
country |
Singapore |
collection |
DR-NTU |
language |
English |
topic |
DRNTU::Engineering::Computer science and engineering::Computing methodologies::Document and text processing |
spellingShingle |
DRNTU::Engineering::Computer science and engineering::Computing methodologies::Document and text processing Chan, Kok Hong. Automatic sentiment classification of movie reviews. |
description |
The increasing number of online reviews of goods and services has led to the development of many approaches to sentiment classification and analysis. This study presents a framework for sentiment classification of movie reviews.
There are several existing approaches to sentiment classification. Sentiment classification using unigrams has been the most successful in most previous studies. However, results generated by unigrams can be degraded by negation terms and by terms that require users to make inferences. Several studies indicate that higher-order n-grams have good potential to address this problem and produce better classification. Problems that unigrams encounter with negation terms can be solved by higher-order n-grams such as bigrams, because terms like “not good” are extracted as a single term. In addition, higher-order n-grams combined with feature reduction methods, such as chi-square (χ²) feature reduction, are explored to see whether they produce better results. Movie review datasets are selected because movie reviews are considered one of the most difficult domains to classify. Producing good classification results in the movie review domain will ensure that good results are achieved when the approach is applied to other datasets.
The research method consists of three parts. Firstly, the results from the simple unigram approach in this study are compared with the results presented by Pang, Lee & Vaithyanathan (2002). Secondly, the classification results generated by higher-order n-grams and adjectives are compared with those presented by Pang et al. (2002). Lastly, the classification results obtained after applying feature reduction methods, such as chi-square (χ²) feature reduction, are compared.
An application has also been developed for non-technical users so that they are spared the tedious process of creating a training set and running sentiment classification. Additionally, this application has been bundled with additional feature selection options. |
author2 |
Wee Kim Wee School of Communication and Information |
author_facet |
Wee Kim Wee School of Communication and Information Chan, Kok Hong. |
format |
Research Report |
author |
Chan, Kok Hong. |
author_sort |
Chan, Kok Hong. |
title |
Automatic sentiment classification of movie reviews. |
title_short |
Automatic sentiment classification of movie reviews. |
title_full |
Automatic sentiment classification of movie reviews. |
title_fullStr |
Automatic sentiment classification of movie reviews. |
title_full_unstemmed |
Automatic sentiment classification of movie reviews. |
title_sort |
automatic sentiment classification of movie reviews. |
publishDate |
2009 |
url |
http://hdl.handle.net/10356/17252 |
_version_ |
1681045436525182976 |
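
The description in this record mentions two techniques: extracting bigrams so that a negated phrase such as “not good” becomes a single feature, and reducing the feature set with a chi-square score. The sketch below is only an illustration of those two ideas, not the report's implementation. The function names (`ngrams`, `chi_square_scores`), the whitespace tokenisation, the presence/absence 2x2 contingency table, and the four toy reviews are all assumptions made for this example.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the contiguous n-grams of a token list, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def chi_square_scores(docs, labels, n):
    """Score every n-gram by the chi-square statistic of a 2x2 table:
    (feature present / absent) x (label 'pos' / 'neg').

    docs   -- list of token lists (hypothetical pre-tokenised reviews)
    labels -- list of 'pos' or 'neg', one per document
    """
    total = len(docs)
    n_pos = sum(1 for label in labels if label == "pos")
    n_neg = total - n_pos

    # Document frequency of each feature, counted separately per class.
    in_pos, in_neg = Counter(), Counter()
    for tokens, label in zip(docs, labels):
        features = set(ngrams(tokens, n))
        (in_pos if label == "pos" else in_neg).update(features)

    scores = {}
    for feat in set(in_pos) | set(in_neg):
        a = in_pos[feat]   # positive docs containing the feature
        b = in_neg[feat]   # negative docs containing the feature
        c = n_pos - a      # positive docs without it
        d = n_neg - b      # negative docs without it
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        # A zero denominator means the feature (or a class) never varies,
        # so the feature carries no information; score it as 0.
        scores[feat] = total * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


# Toy data, invented for illustration only.
reviews = [
    ("the plot was not good", "neg"),
    ("the acting was not good", "neg"),
    ("the plot was very good", "pos"),
    ("the acting was very good", "pos"),
]
docs = [text.split() for text, _ in reviews]
labels = [label for _, label in reviews]

unigram_scores = chi_square_scores(docs, labels, 1)
bigram_scores = chi_square_scores(docs, labels, 2)
```

On this toy data the unigram "good" appears in every review and so cannot separate the classes, while the bigram "not good" appears only in the negative ones. Keeping only the highest-scoring features (for example `sorted(bigram_scores, key=bigram_scores.get, reverse=True)[:k]` for some chosen `k`) is one simple way to perform the reduction step.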