A study on rumour detection on online social networks

This study seeks to identify the key traits of rumours on online social networks such as Twitter and Facebook. Automating the identification of rumours is increasingly important, given the rise of the internet’s popularity as a source of news, and the ever-growing amou...

Bibliographic Details
Main Author:	Cheng, Gibson
Other Authors:	Yeo Chai Kiat
Format:	Final Year Project
Language:	English
Published:	2017
Subjects:	DRNTU::Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
Online Access:	http://hdl.handle.net/10356/70315
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-70315
record_format	dspace
spelling	sg-ntu-dr.10356-70315 2023-03-03T20:42:46Z A study on rumour detection on online social networks Cheng, Gibson Yeo Chai Kiat School of Computer Science and Engineering DRNTU::Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence Bachelor of Engineering (Computer Science) 2017-04-19T04:42:26Z 2017-04-19T04:42:26Z 2017 Final Year Project (FYP) http://hdl.handle.net/10356/70315 en Nanyang Technological University 72 p. application/pdf
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	DRNTU::Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
spellingShingle	DRNTU::Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence Cheng, Gibson A study on rumour detection on online social networks
description	This study seeks to identify the key traits of rumours on online social networks such as Twitter and Facebook. Automating the identification of rumours is increasingly important, given the rise of the internet’s popularity as a source of news and the ever-growing amount of information on the internet. A set of qualitative and quantitative metrics was developed to better understand the characteristics of each search query and the resultant dataset that it generates. The quantitative metrics indicate the size of the dataset, and the qualitative metrics evaluate the News/Rumour Purity and Contextual Purity of a dataset. Together, the metrics indicate how much preprocessing effort a dataset requires to separate out its different contexts and make it more useful for further analysis. Drawing on existing literature from both Computer Science and the Social Sciences, three experiments were formulated around the following questions: 1. What are the general sentiment profiles of the datasets? 2. How well can rumours and non-rumours be separated in rumour-centric datasets? 3. How well can rumours and non-rumours be separated using all datasets? The findings from the experiments indicate the following trends: 1. Features generated from sentiment analysis libraries such as SentiWordNet and AFINN can be as reliable as features generated from a tf-idf model, in terms of resultant classifier performance. 2. Tweets with a high proportion of neutral-sentiment words and a high proportion of punctuation are more likely to be related to the key contexts of their respective datasets. 3. Rumours and non-rumours can be separated with a high degree of accuracy when there are two predefined but significantly different types of datasets (i.e. one is rumour-centric, the other is news-centric). Over the course of this project, several pieces of custom software were developed. An Android information harvester was developed to automate the task of collecting tweets. Tweet processing and analysis software was developed to automate the testing for the experiments. Lastly, a web user interface for data visualisation was developed to make it easier to gain insights from the experiment results.
author2	Yeo Chai Kiat
author_facet	Yeo Chai Kiat Cheng, Gibson
format	Final Year Project
author	Cheng, Gibson
author_sort	Cheng, Gibson
title	A study on rumour detection on online social networks
title_sort	study on rumour detection on online social networks
publishDate	2017
url	http://hdl.handle.net/10356/70315
_version_	1759854590151360512
