Sentiment analysis in Twitter

Nowadays, social media platforms, such as Facebook, Twitter and Instagram, have gained tremendous popularity. These platforms allow people to post real-time messages about their opinions on a variety of topics, discuss current issues, complain, and express positive feelings. A rising trend for compa...

Bibliographic Details
Main Author: Zhang, Chen
Other Authors: Lin Weisi
Format: Final Year Project
Language: English
Published: 2016
Subjects:
Online Access: http://hdl.handle.net/10356/66784
Institution: Nanyang Technological University
id sg-ntu-dr.10356-66784
record_format dspace
spelling sg-ntu-dr.10356-66784 2023-03-03T20:54:30Z Sentiment analysis in Twitter Zhang, Chen Lin Weisi School of Computer Engineering A*STAR Institute for Infocomm Research (I2R) DRNTU::Engineering Bachelor of Engineering (Computer Engineering) 2016-04-26T03:48:25Z 2016-04-26T03:48:25Z 2016 Final Year Project (FYP) http://hdl.handle.net/10356/66784 en Nanyang Technological University 45 p. application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering
spellingShingle DRNTU::Engineering
Zhang, Chen
Sentiment analysis in Twitter
description Social media platforms such as Facebook, Twitter and Instagram have gained tremendous popularity. They allow people to post real-time messages expressing opinions on a variety of topics, discussing current issues, complaining, and sharing positive feelings. In recent years, companies and research institutions have increasingly analysed the opinions and feelings expressed in messages on these platforms. This thesis narrows the scope of analysis to opinion mining on Twitter messages, known as tweets. The specific task is to develop a sentiment analysis system for the three-point scale message polarity subtask of the Twitter sentiment analysis task in Semantic Evaluation Exercises 2016 (SemEval-2016). A baseline system was developed first and then improved: three other systems were integrated into it through classifier fusion. One of the three systems used a new asymmetric SIMPLS (ASIMPLS) based classifier, whereas the other two used L2-regularized linear regression. ASIMPLS was shown to identify the minority class well in imbalanced classification problems, and L2-regularized linear regression was shown to be efficient with relatively good performance. In addition to the features used in most existing systems, word embeddings were introduced. For each word, three embedding vectors were obtained, derived from the positive, neutral, and negative tweet sets respectively, and these vectors were used as features in the ASIMPLS system. The final fusion system scored 59.63% on the without-neutral F1-score on the SemEval-2016 test set and ranked 7th among 34 systems in the competition.
author2 Lin Weisi
author_facet Lin Weisi
Zhang, Chen
format Final Year Project
author Zhang, Chen
author_sort Zhang, Chen
title Sentiment analysis in Twitter
title_sort sentiment analysis in twitter
publishDate 2016
url http://hdl.handle.net/10356/66784
_version_ 1759856037732548608
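
Note on the evaluation measure named in the description: the record does not spell out how the without-neutral F1-score is computed. The sketch below assumes the usual reading for the SemEval message polarity subtask, namely the mean of the F1-scores of the positive and negative classes, with neutral tweets still counting as errors when they are confused with either class. The function names and the label strings are hypothetical and are not taken from the thesis.

```python
from typing import Sequence


def f1_for_label(gold: Sequence[str], pred: Sequence[str], label: str) -> float:
    """F1-score of one class, computed from true/false positives and negatives."""
    tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
    fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
    fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
    if tp == 0:
        # No correct prediction for this class: define F1 as 0 to avoid 0/0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def f1_without_neutral(gold: Sequence[str], pred: Sequence[str]) -> float:
    """Assumed definition: average of the positive-class and negative-class F1."""
    return (f1_for_label(gold, pred, "positive")
            + f1_for_label(gold, pred, "negative")) / 2


if __name__ == "__main__":
    # Toy labels for illustration only; these are not data from the thesis.
    gold = ["positive", "negative", "neutral", "positive"]
    pred = ["positive", "neutral", "neutral", "negative"]
    print(f"without-neutral F1 = {f1_without_neutral(gold, pred):.4f}")
```

Under this assumed definition, the neutral class has no F1 term of its own, which is why a score such as the 59.63% reported above is not an accuracy figure.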