Modelling public sentiment in Twitter
People often use social media as an outlet for their emotions and opinions. Analysing social media text to extract sentiment can help reveal the thoughts and opinions people have about the world they live in. This thesis contributes to the field of Sentiment Analysis, which aims to understand how pe...
Main Author: | Chikersal, Prerna |
---|---|
Other Authors: | Chng Eng Siong |
Format: | Final Year Project |
Language: | English |
Published: | 2015 |
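Subjects: | DRNTU::Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence; DRNTU::Engineering::Computer science and engineering::Computer applications::Social and behavioral sciences |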
Online Access: | http://hdl.handle.net/10356/62812 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-62812 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-62812 2023-03-03T20:52:28Z Modelling public sentiment in Twitter Chikersal, Prerna Chng Eng Siong Erik Cambria School of Computer Engineering DRNTU::Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence DRNTU::Engineering::Computer science and engineering::Computer applications::Social and behavioral sciences Bachelor of Engineering (Computer Science) 2015-04-29T06:50:31Z 2015-04-29T06:50:31Z 2015 2015 Final Year Project (FYP) http://hdl.handle.net/10356/62812 en Nanyang Technological University 57 p. application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
DRNTU::Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence DRNTU::Engineering::Computer science and engineering::Computer applications::Social and behavioral sciences |
spellingShingle |
DRNTU::Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence DRNTU::Engineering::Computer science and engineering::Computer applications::Social and behavioral sciences Chikersal, Prerna Modelling public sentiment in Twitter |
description |
People often use social media as an outlet for their emotions and opinions. Analysing social media text to extract sentiment can help reveal the thoughts and opinions people have about the world they live in. This thesis contributes to the field of Sentiment Analysis, which aims to understand how people convey sentiment in order to ultimately deduce their emotions and opinions. While several sentiment classification methods have been devised, the increasing magnitude and complexity of social data call for scrutiny and advancement of these methods. The scope of this project is to improve traditional supervised learning methods for Twitter polarity detection by using rule-based classifiers, linguistic patterns, and common-sense knowledge-based information. This thesis begins by introducing terminology and challenges pertaining to sentiment analysis, followed by the sub-tasks or goals of sentiment analysis and a survey of commonly used approaches. In the first phase of the project, we propose a sentiment analysis system that combines a rule-based classifier with supervised learning to classify tweets as positive, negative or neutral, using the training set provided by SemEval 2015 and test sets provided by SemEval 2013. We find that the average positive and negative f-measure improves by 0.5 units when we add a rule-based classification layer to the supervised learning classifier. This demonstrates that combining high-confidence linguistic rules with supervised learning can improve classification. In the second phase of this project, we extend our work by proposing a sentiment analysis system that leverages complex linguistic rules and common-sense-based sentic computing resources to enhance supervised learning and classify tweets as positive or negative. We train our classifier on the training set provided by Sentiment140 and test it on positive and negative tweets from the test sets provided by SemEval 2013 and SemEval 2014. We find that our system achieves an average positive and negative f-measure that is 4.47 units and 3.32 units higher than the standard n-grams model on the two datasets, respectively. Supervised learning classifiers often misclassify tweets containing conjunctions like "but" and conditionals like "if", due to their special linguistic characteristics. These classifiers also assign a decision score very close to the decision boundary for a large number of tweets, which suggests that they are simply unsure rather than completely wrong about these tweets. The second system proposed in this thesis attempts to enhance supervised classification by countering these two challenges. An online real-time system (http://www.twitter.gelbukh.com/) is also implemented to demonstrate the results obtained; however, it is still primitive and a work in progress. |
author2 |
Chng Eng Siong |
author_facet |
Chng Eng Siong Chikersal, Prerna |
format |
Final Year Project |
author |
Chikersal, Prerna |
author_sort |
Chikersal, Prerna |
title |
Modelling public sentiment in Twitter |
title_sort |
modelling public sentiment in twitter |
publishDate |
2015 |
url |
http://hdl.handle.net/10356/62812 |
_version_ |
1759853021389389824 |
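
The description above notes that supervised classifiers place many tweets very close to the decision boundary, and that a layer of high-confidence rules can help in those cases. The following is a minimal sketch of that general idea, not the thesis's actual system: the emoticon rule in `rule_polarity`, the `margin` threshold of 0.5, and all function names are hypothetical and chosen only for illustration. The decision score is assumed to be a signed value such as a linear classifier's distance from the boundary.

```python
from typing import Optional

POSITIVE, NEGATIVE = "positive", "negative"


def rule_polarity(tweet: str) -> Optional[str]:
    """Hypothetical high-confidence rule: decide only on unambiguous emoticons."""
    has_pos = any(e in tweet for e in (":)", ":-)", ":D"))
    has_neg = any(e in tweet for e in (":(", ":-(", ":'("))
    if has_pos and not has_neg:
        return POSITIVE
    if has_neg and not has_pos:
        return NEGATIVE
    return None  # the rule abstains when evidence is missing or mixed


def combine(decision_score: float, tweet: str, margin: float = 0.5) -> str:
    """Use the supervised label unless the score is near the boundary
    (|score| < margin, an assumed threshold) and the rule has an opinion."""
    supervised = POSITIVE if decision_score >= 0 else NEGATIVE
    if abs(decision_score) < margin:
        ruled = rule_polarity(tweet)
        if ruled is not None:
            return ruled
    return supervised


if __name__ == "__main__":
    # Score is barely negative, so the emoticon rule is allowed to override it.
    print(combine(-0.1, "great day :)"))
    # Score is far from the boundary, so the supervised label is kept.
    print(combine(-2.0, "great day :)"))
```

In this sketch the rule layer only acts when the classifier is unsure, so a confident supervised prediction is never overturned by a single surface cue.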