ANALYSIS OF THE USAGE OF DEEP LEARNING FOR CYBERBULLYING DETECTION (TEXTUAL) IN BAHASA INDONESIA

Bibliographic Details
Main Author: Anindyati, Laksmi
Format: Theses
Language: Indonesian
Online Access: https://digilib.itb.ac.id/gdl/view/40009
Institution: Institut Teknologi Bandung
Description
Summary: Cyberbullying in Indonesia has become a concern due to the increasing use of social media. Cyberbullying detection is an important step toward a healthy environment for social media interaction. This research belongs to computational linguistics and focuses on using deep learning to detect bullying sentences on Twitter. The study involves two main processes: first, building a word representation; second, classifying sentences to detect bullying. The pre-training process that builds the new term/word representation is performed independently, with Word2vec as the tool. Two types of data are used in pre-training. The first uses only the test data and training data, while the second uses the overall data, a total of 26,800 unique Twitter sentences including the test and training data. The classification process is built using three algorithms that are popular for text classification: LSTM, bi-LSTM, and CNN. The dataset comprises 9,854 labeled sentences extracted from 2,584 Twitter conversations. In the dataset, 1,680 sentences are labeled as bully and 6,343 as neutral. A total of 252 experiments are conducted by varying the preprocessing steps that determine the machine learning features and the deep learning algorithms. The experiments show that accuracy reaches 92.28%, while recall for the bully class reaches 81.65%.
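
The summary reports recall for the bully class alongside overall accuracy. A minimal sketch of why both figures matter under the stated class counts (1,680 bully, 6,343 neutral) is given below. It is an illustration only and not the thesis's evaluation code: the helper names and the "always predict neutral" baseline are assumptions introduced here, and only the two class counts come from the summary.

```python
# Illustrative only: helper names and the majority-class baseline are
# assumptions; just the two class counts are taken from the summary.

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def recall(y_true, y_pred, positive):
    """Fraction of true `positive` items that were predicted as `positive`."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


N_BULLY, N_NEUTRAL = 1680, 6343  # class counts as stated in the summary

y_true = ["bully"] * N_BULLY + ["neutral"] * N_NEUTRAL
# Hypothetical baseline that labels every sentence as neutral.
y_baseline = ["neutral"] * len(y_true)

baseline_accuracy = accuracy(y_true, y_baseline)
baseline_bully_recall = recall(y_true, y_baseline, "bully")

print(f"baseline accuracy:     {baseline_accuracy:.4f}")
print(f"baseline bully recall: {baseline_bully_recall:.4f}")
```

With these class counts, a classifier that never flags bullying still scores roughly 79% accuracy while detecting no bullying sentences at all, which is why the per-class recall is the more informative number for the bully class.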