Sentiment analysis and summarization of movie reviews across genres.

Bibliographic Details
Main Author: Tun, Thura Thet.
Other Authors: Jin Cheon Na
Format: Theses and Dissertations
Language: English
Published: 2012
Online Access: http://hdl.handle.net/10356/48207
Institution: Nanyang Technological University
Description
Summary: With the exponential growth of user-generated content on the Web, there is an increasing need to analyze and summarize online opinions effectively and efficiently. This study belongs to the area of opinion mining, within the field of social media monitoring. It aims to develop an automatic approach for sentiment analysis and summarization of online opinions across genres (such as critic reviews, user reviews, discussion boards and blogs), focusing on the movie review domain. The objectives of this study are to: 1) investigate the characteristics and implications of different online genres in expressing opinions; 2) develop specific methods for sentiment analysis of review documents in different online genres; and 3) develop an effective method to combine and summarize the results of sentiment analysis across multiple genres. To develop an effective sentiment analysis method, the characteristics and implications of each review genre were analysed carefully. The results indicate that review texts in different genres have different characteristics and cover the various aspects of a movie differently. Two methods for sentiment analysis, one using a machine learning technique and the other using a natural language processing and rule-based approach, were developed and evaluated in this study. A supervised machine learning approach was developed for document-level sentiment classification. The approach provides detailed sentiment analysis results (e.g., sentiment orientation towards director and cast aspects). However, it can only classify the sentiment orientation (i.e., positive or negative) of relatively long review documents. Therefore, a more sophisticated method for sentence-level sentiment analysis using a natural language processing and rule-based approach was developed.
The method adopts a linguistic approach, computing the sentiment of a clause from the prior sentiment scores assigned to individual words while taking into account the grammatical dependency structure of the clause. The prior sentiment scores of about 32,000 individual words were derived from SentiWordNet with the help of a subjectivity lexicon. Negation of sentiment terms is also handled carefully. The experimental results demonstrate the effectiveness of the approach: the accuracies of clause-level sentiment classification for the overall movie, director, cast, story, scene, and music aspects are 75%, 86%, 83%, 80%, 90%, and 81%, respectively. A prototype for sentiment summarization of movie reviews, which utilizes the fine-grained sentiment analysis results, was implemented. Graphical representation methods such as sentiment bar, sentiment meter, thumbs up/down, sentiment term cloud, sentiment treemap, sentiment time series, aspect-based sentiment summarization and sentiment link analysis graph (named SentiGraph) were implemented and evaluated. The evaluation findings indicate that the participants were generally satisfied with the results of sentiment summarization and found the proposed approach novel and useful. The contributions of the study are the results of the sentiment content analysis of how online genres express opinions, and an automatic approach for sentiment analysis and summarization of online opinions across genres. The proposed approach is capable of providing better performance and more refined sentiment analysis and summarization results than previous studies, which focus mainly on the sentiment orientation of textual units (i.e., terms, phrases, sentences, snippets or documents) or on the extraction of feature-opinion pairs for sentiment summarization.
This study is motivated by a variety of information needs, and the ability to analyse and summarize online opinions from multiple sources across genres will contribute not only to academic communities but also to many real-world applications.
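
As an illustration of the clause-level scoring described in the summary, the following minimal Python sketch sums the prior scores of the words in a clause and flips the sign of a sentiment word that follows a negator. It is not the thesis's implementation: the lexicon entries and their scores, the negator list, the `window` parameter and the function names are all invented here for illustration, and the fixed-size negation window is only a crude substitute for the grammatical dependency analysis the thesis describes.

```python
# Toy prior-polarity lexicon. The words and scores are invented for this
# sketch; they are NOT taken from SentiWordNet or from the thesis.
PRIOR_SCORES = {
    "brilliant": 0.8,
    "good": 0.6,
    "dull": -0.5,
    "terrible": -0.8,
}

# Hypothetical negator list, also an assumption of this sketch.
NEGATORS = {"not", "never", "no"}


def clause_sentiment(tokens, window=2):
    """Sum the prior scores of the tokens in one clause.

    A sentiment word has its sign flipped when a negator appears within
    `window` tokens before it. This fixed window stands in for the
    dependency-based negation handling mentioned in the summary.
    """
    total = 0.0
    for i, token in enumerate(tokens):
        score = PRIOR_SCORES.get(token.lower())
        if score is None:
            continue  # word carries no prior sentiment in the toy lexicon
        preceding = tokens[max(0, i - window):i]
        if any(t.lower() in NEGATORS for t in preceding):
            score = -score
        total += score
    return total


def orientation(score):
    """Map a numeric clause score to a sentiment orientation label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

For example, `orientation(clause_sentiment("the story was not good".split()))` returns `"negative"`, because "not" falls inside the two-token window before "good". A dependency parser would determine the scope of negation more reliably than this window does.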