Sentiment analysis in Twitter
Nowadays, social media platforms, such as Facebook, Twitter and Instagram, have gained tremendous popularity. These platforms allow people to post real-time messages about their opinions on a variety of topics, discuss current issues, complain, and express positive feelings. A rising trend for compa...
Main Author: | Zhang, Chen |
---|---|
Other Authors: | Lin Weisi |
Format: | Final Year Project |
Language: | English |
Published: | 2016 |
Subjects: | DRNTU::Engineering |
Online Access: | http://hdl.handle.net/10356/66784 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-66784 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-66784 2023-03-03T20:54:30Z Sentiment analysis in Twitter Zhang, Chen Lin Weisi School of Computer Engineering A*STAR Institute for Infocomm Research (I2R) DRNTU::Engineering
Bachelor of Engineering (Computer Engineering) 2016-04-26T03:48:25Z 2016-04-26T03:48:25Z 2016 Final Year Project (FYP) http://hdl.handle.net/10356/66784 en Nanyang Technological University 45 p. application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
DRNTU::Engineering |
spellingShingle |
DRNTU::Engineering Zhang, Chen Sentiment analysis in Twitter |
description |
Social media platforms such as Facebook, Twitter and Instagram have gained tremendous popularity. These platforms allow people to post real-time messages about their opinions on a variety of topics, discuss current issues, complain, and express positive feelings. In recent years, companies and research institutions have increasingly sought to analyse the opinions and feelings hidden in messages on social media platforms. In this thesis, the scope of analysis is narrowed down to opinion mining on Twitter messages, known as tweets. The specific task is to develop a sentiment analysis system for the three-point scale message polarity subtask of the Twitter sentiment analysis task in Semantic Evaluation Exercises 2016 (SemEval-2016).
A baseline system was developed first, and improvements were then made upon it. Three other systems were integrated with the baseline system through classifier fusion. One of the three systems used a new asymmetric SIMPLS (ASIMPLS) based classifier, whereas the other two used L2-regularized linear regression. ASIMPLS was shown to identify the minority class well in imbalanced classification problems, and L2-regularized linear regression was shown to be efficient with relatively good performance.
In addition to the features used in most existing systems, word embeddings were introduced. For each word, three word embedding vectors were obtained, derived from the positive, neutral, and negative tweet sets respectively. These vectors are used as features in the ASIMPLS system.
The final fusion system scored 59.63% on the without-neutral F1-score on the SemEval-2016 test set and ranked 7th among 34 systems in the competition. |
author2 |
Lin Weisi |
author_facet |
Lin Weisi Zhang, Chen |
format |
Final Year Project |
author |
Zhang, Chen |
author_sort |
Zhang, Chen |
title |
Sentiment analysis in Twitter |
title_short |
Sentiment analysis in Twitter |
title_full |
Sentiment analysis in Twitter |
title_fullStr |
Sentiment analysis in Twitter |
title_full_unstemmed |
Sentiment analysis in Twitter |
title_sort |
sentiment analysis in twitter |
publishDate |
2016 |
url |
http://hdl.handle.net/10356/66784 |
_version_ |
1759856037732548608 |