Fine-grained sentiment classification of social media

Bibliographic Details
Main Author: Le Thi, Nhu Y
Other Authors: Lin Zhiping
Format: Final Year Project
Language: English
Published: 2017
Online Access: http://hdl.handle.net/10356/71001
Institution: Nanyang Technological University
Description
Summary: This research aims to improve sentiment analysis for classifying data collected from Twitter into two classes, positive and negative. The project begins by implementing valence-based and rule-based methods to improve on the simple polarity-based method currently in use. The implementation includes a tuning method for determining the threshold value that gives the best classification results. The results are then compared and discussed, leading to the conclusion that the valence-based method performs better than the polarity-based method on various datasets. In addition, a Random Forest classifier with word frequency as the feature is implemented and evaluated against other machine learning classifiers: Support Vector Machine, Naïve Bayes, Maximum Entropy and Extreme Learning Machine. The method for tuning the Random Forest hyperparameters on different datasets is also explained, and an idea is introduced about the impact of the parameters on its performance as well as its prospective application. The results show that, with proper tuning of its hyperparameters, including the number of decision trees and the maximum number of random features, Random Forest gives the highest accuracy on the larger datasets among the five classifiers discussed in this report.
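The threshold tuning mentioned in the summary can be illustrated with a minimal sketch. This is not the author's implementation: the lexicon `VALENCE`, its scores, the labelled tweets and the candidate thresholds are all hypothetical, chosen only to show how a valence sum can be compared against a tuned threshold instead of a fixed zero cut-off.

```python
# Hypothetical valence lexicon: word -> signed intensity (illustrative values only).
VALENCE = {
    "good": 1.9, "great": 3.1, "love": 3.2, "okay": 0.5,
    "bad": -2.5, "awful": -3.0, "hate": -2.7,
}

def valence_score(text):
    """Sum the valence of known words; unknown words contribute 0."""
    return sum(VALENCE.get(word, 0.0) for word in text.lower().split())

def classify(text, threshold):
    """Label a text positive when its valence sum reaches the threshold."""
    return "positive" if valence_score(text) >= threshold else "negative"

def accuracy(samples, threshold):
    """Fraction of (text, label) pairs classified correctly at this threshold."""
    correct = sum(classify(text, threshold) == label for text, label in samples)
    return correct / len(samples)

def tune_threshold(samples, candidates):
    """Return the candidate threshold with the highest accuracy (first wins ties)."""
    return max(candidates, key=lambda th: accuracy(samples, th))

# Hypothetical labelled tweets, standing in for a real tuning set.
LABELLED = [
    ("love this great phone", "positive"),
    ("good service", "positive"),
    ("awful bad day", "negative"),
    ("i hate waiting", "negative"),
    ("good but awful battery", "negative"),
    ("okay i guess", "negative"),
]
CANDIDATES = [-2.0, -1.0, 0.0, 1.0, 2.0]

best = tune_threshold(LABELLED, CANDIDATES)
print(best, accuracy(LABELLED, best))  # prints: 1.0 1.0
```

On this toy data a threshold of 0.0 misclassifies the weakly positive "okay i guess", while 1.0 classifies every sample correctly. In practice the threshold would be selected on a held-out tuning set rather than on the data used to report accuracy.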