HOAX DETECTION ON TWITTER SOCIAL MEDIA BASED ON TEXTUAL ANALYSIS AND SOCIAL NETWORK CHARACTERISTICS

The ease of spreading information in an era of social media growth brings benefits but also gives rise to various threats. One of the biggest is the spread of hoaxes, or fake news, by irresponsible parties. This final project intends to build a hoax detection model for...

Bibliographic Details
Main Author: Fadhlurohman, Aufa
Format: Final Project
Language: Indonesian
Online Access: https://digilib.itb.ac.id/gdl/view/67258
Institution: Institut Teknologi Bandung
id id-itb.:67258
keywords classification, hoax, fake news, deep learning, siamese network, social network
timestamp 2022-08-19T04:10:25Z
building Institut Teknologi Bandung Library
continent Asia
country Indonesia
content_provider Institut Teknologi Bandung
collection Digital ITB
description The ease of spreading information in an era of social media growth brings benefits but also gives rise to various threats. One of the biggest is the spread of hoaxes, or fake news, by irresponsible parties. This final project builds a hoax detection model for Twitter by identifying the textual and social network characteristics that can assist in this task. Currently, hoax detection in the social network analysis application still relies on searching for text similar to known hoax data, using a fixed similarity threshold. This final project proposes combining textual patterns, textual similarity, user information, and network information to improve detection. The baseline model uses shallow learning algorithms with textual-similarity-level information as input. The main model was developed using deep learning methods with various word embeddings, feature combinations, and architectures. One of the architectures tried is a siamese architecture, intended to better identify textual similarity based on context. The study focuses on experiments to find the best configuration for the hoax classification model. The experimental data comprised 6983 items in three classes: counter-hoaxes, non-hoaxes, and hoaxes. In the experiments, the baseline model achieved an F1-score of 0.6521. The best model used the siamese similarity architecture with additional user information features; it was built using BERT and achieved an F1-score of 0.8086.
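The threshold-based similarity search that the description names as the current approach can be illustrated with a minimal sketch. The record does not say which similarity measure or threshold the application uses, so everything here is an assumption made for illustration: bag-of-words cosine similarity, a threshold of 0.6, and the hypothetical helper names `tokenize`, `cosine_similarity`, and `flag_hoax`.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between the bag-of-words vectors of two texts."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    norm = norm_a * norm_b
    # An empty text has a zero vector; treat it as having no similarity.
    return dot / norm if norm else 0.0


def flag_hoax(tweet, known_hoaxes, threshold=0.6):
    """Flag a tweet when it is close enough to any known hoax text.

    The threshold of 0.6 is an illustrative assumption, not a value
    taken from the record. Returns (flagged, best_similarity).
    """
    best = max((cosine_similarity(tweet, h) for h in known_hoaxes), default=0.0)
    return best >= threshold, best


known_hoaxes = ["vaccine contains microchips to track people"]
flagged, score = flag_hoax("Vaccine contains microchips to track people!", known_hoaxes)
print(flagged, round(score, 3))
```

A scheme like this only catches tweets whose wording overlaps a known hoax, which is consistent with the description's motivation for adding context-aware similarity and user and network features.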