User-level Twitter polarity classification with a hybrid approach

Bibliographic Details
Main Author: Liu, Fan
Other Authors: Er Meng Joo
Format: Final Year Project
Language: English
Published: 2016
Online Access: http://hdl.handle.net/10356/67830
Institution: Nanyang Technological University
Description
Summary: With the objective of extracting useful information from the vast amount of opinion-rich data on Twitter, both supervised learning-based and unsupervised lexicon-based methods for sentiment analysis of Twitter corpora have been studied in recent years. However, the unique characteristics of tweets, such as the lack of labels and the frequent use of emoticons, pose challenges to most existing learning-based and lexicon-based methods. In addition, current studies on Twitter sentiment analysis mainly focus on domain-specific tweets, while a larger number of tweets concern personal feelings and comments on daily life events. Therefore, in this project, a hybrid approach combining an augmented lexicon-based method with a learning-based method is designed to handle the distinctive characteristics of tweets and perform sentiment analysis at the user level, providing information about specific Twitter users’ typing habits and their online sentiment fluctuations. Our model achieves an overall accuracy of 83.3%, substantially outperforming the baseline lexicon-based and learning-based models on user-level tweet classification.