Aggregating and analyzing Indian political tweets

The author’s final year project is part of the Twitter Data Analysis project, which aims to gain insight into Indian politics by applying NLP and data mining techniques to data from the Twitter stream. To develop an analytical engine for this purpose, historical as well as cur...

Bibliographic Details
Main Author: Chirag Ruhela
Other Authors: Anwitaman Datta
Format: Final Year Project
Language: English
Published: 2014
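Subjects: DRNTU::Engineering::Computer science and engineering::Computing methodologies::Document and text processing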
Online Access: http://hdl.handle.net/10356/59920
Institution: Nanyang Technological University
id sg-ntu-dr.10356-59920
record_format dspace
spelling sg-ntu-dr.10356-59920 2023-03-03T20:53:25Z Aggregating and analyzing Indian political tweets Chirag Ruhela Anwitaman Datta School of Computer Engineering Parallel and Distributed Computing Centre DRNTU::Engineering::Computer science and engineering::Computing methodologies::Document and text processing
Bachelor of Engineering (Computer Science) 2014-05-19T06:53:10Z 2014-05-19T06:53:10Z 2014 2014 Final Year Project (FYP) http://hdl.handle.net/10356/59920 en Nanyang Technological University 54 p. application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Computer science and engineering::Computing methodologies::Document and text processing
description The author’s final year project is part of the Twitter Data Analysis project, which aims to gain insight into Indian politics by applying NLP and data mining techniques to data from the Twitter stream. To develop an analytical engine for this purpose, historical as well as current data about Indian politics has to be analysed by building mathematical models that uncover patterns and correlations and help explain political events and upheavals. The historical data can be very large if accurate mathematical models are to be built, and its sheer dimensionality makes it impossible to understand in raw form. Dimensionality reduction and insightful visualizations are therefore needed to make the data consumable by the general public. As part of this project, the author designed and implemented a sentiment analysis engine that uses the Affective Norms for English Words (ANEW) framework for NLP-model-based sentiment detection of Twitter data. A topic identification module was also implemented using the tf-idf algorithm. Dimensionality reduction of the data set was done through scatterplot visualization of tweet sentiments alongside topic clusters. Heat maps and word clouds were used to simplify data consumption. An affinity graph was implemented to show diffusion networks for various topics and people. Lastly, the raw tweets are presented in tabular form for those interested in the raw data.
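The description names two techniques, ANEW-based sentiment detection and tf-idf topic identification, without showing how either works. The following is a minimal sketch of both, not the project's implementation. It assumes a simple regex tokenizer, a hypothetical four-word lexicon (`TOY_VALENCE`) whose scores are placeholders rather than real ANEW ratings, mean valence as the tweet score, and the plain tf-idf variant with length-normalized term frequency and log(N/df).

```python
import math
import re
from collections import Counter

# Hypothetical lexicon: word -> valence on ANEW's 1 (unpleasant) to 9 (pleasant) scale.
# The numbers are placeholders for illustration, not real ANEW ratings.
TOY_VALENCE = {"win": 8.0, "hope": 7.0, "scam": 2.0, "riot": 2.0}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens (an assumed tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def valence(text, lexicon=TOY_VALENCE):
    """Mean valence of the lexicon words found in the text, or None if there are none."""
    scores = [lexicon[t] for t in tokenize(text) if t in lexicon]
    return sum(scores) / len(scores) if scores else None


def tf_idf(docs):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(t for toks in tokenized for t in set(toks))
    weights = []
    for toks in tokenized:
        counts = Counter(toks)
        weights.append(
            {t: (c / len(toks)) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weights
```

For a tweet, `valence(tweet)` gives a sentiment score and the highest-weighted terms in the matching `tf_idf` dict serve as candidate topic words. A term that occurs in every document gets weight 0 under this variant, which is why ubiquitous words drop out of the topic list.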
author2 Anwitaman Datta
author_facet Anwitaman Datta
Chirag Ruhela
format Final Year Project
author Chirag Ruhela
author_sort Chirag Ruhela
title Aggregating and analyzing Indian political tweets
publishDate 2014
url http://hdl.handle.net/10356/59920
_version_ 1759857774600126464