HOAX DETECTION ON TWITTER SOCIAL MEDIA BASED ON TEXTUAL ANALYSIS AND SOCIAL NETWORK CHARACTERISTICS

Spreading information has become easy in an era of growing social media use, which brings benefits but also gives rise to various threats. One of the biggest is the spread of hoaxes, or fake news, by irresponsible parties. This final project aims to build a hoax detection model for...


Bibliographic Details
Main Author:	Fadhlurohman, Aufa
Format:	Final Project
Language:	Indonesia
Online Access:	https://digilib.itb.ac.id/gdl/view/67258
Institution:	Institut Teknologi Bandung

id	id-itb.:67258
spelling	id-itb.:67258 | 2022-08-19T04:10:25Z | HOAX DETECTION ON TWITTER SOCIAL MEDIA BASED ON TEXTUAL ANALYSIS AND SOCIAL NETWORK CHARACTERISTICS | Fadhlurohman, Aufa | Indonesia | Final Project | classification, hoax, fake news, deep learning, siamese network, social network | INSTITUT TEKNOLOGI BANDUNG | https://digilib.itb.ac.id/gdl/view/67258 | text
institution	Institut Teknologi Bandung
building	Institut Teknologi Bandung Library
continent	Asia
country	Indonesia
content_provider	Institut Teknologi Bandung
collection	Digital ITB
language	Indonesia
description	Spreading information has become easy in an era of growing social media use, which brings benefits but also gives rise to various threats. One of the biggest is the spread of hoaxes, or fake news, by irresponsible parties. This final project aims to build a hoax detection model for Twitter by identifying textual and social network characteristics that can assist in the task. Currently, social media hoax detection in social network analysis applications still relies on a text similarity search against known hoax data with a set threshold. This final project proposes combining textual patterns, textual similarity, user information, and network information to improve detection ability. The baseline model uses shallow learning algorithms with text similarity information as input. The main model was developed using deep learning methods with various word embeddings, feature combinations, and several architectures. One of the architectures tried is the siamese architecture, intended to better identify textual similarity based on context. The study focuses on experiments to find the best configuration for the hoax classification model. The experimental dataset contains 6,983 records in three classes: counter-hoax, non-hoax, and hoax. In the experiments, the baseline model achieved an F1-score of 0.6521. The best model uses the siamese similarity architecture with additional user information features; it was built using BERT and achieved an F1-score of 0.8086.
format	Final Project
author	Fadhlurohman, Aufa
title	HOAX DETECTION ON TWITTER SOCIAL MEDIA BASED ON TEXTUAL ANALYSIS AND SOCIAL NETWORK CHARACTERISTICS
url	https://digilib.itb.ac.id/gdl/view/67258
_version_	1822933296704651264
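
The description mentions that existing hoax detection relies on a text similarity search against known hoax data with a threshold. The record does not say which similarity measure or threshold is used, so the following is only a minimal sketch under assumptions: bag-of-words cosine similarity, a hypothetical threshold of 0.6, and illustrative function names (`tokenize`, `best_match`, `is_probable_hoax`) that do not come from the thesis.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(tweet, known_hoaxes):
    """Return (score, hoax_text) for the most similar known hoax.

    Returns (0.0, None) when there is nothing to compare against.
    """
    tweet_vec = Counter(tokenize(tweet))
    best_score, best_text = 0.0, None
    for hoax in known_hoaxes:
        score = cosine_similarity(tweet_vec, Counter(tokenize(hoax)))
        if score > best_score:
            best_score, best_text = score, hoax
    return best_score, best_text


def is_probable_hoax(tweet, known_hoaxes, threshold=0.6):
    """Flag a tweet when its best similarity score reaches the threshold.

    The default threshold is an assumption chosen for illustration only.
    """
    score, _ = best_match(tweet, known_hoaxes)
    return score >= threshold


if __name__ == "__main__":
    hoaxes = ["drinking hot water cures the virus"]
    print(is_probable_hoax("Drinking hot water cures the virus!", hoaxes))
    print(is_probable_hoax("The weather is nice today", hoaxes))
```

A purely lexical match like this cannot tell a hoax from a counter-hoax that quotes it, which is consistent with the description's motivation for adding context-aware similarity, user information, and network information.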

Similar Items