ANALYSIS OF THE USAGE OF DEEP LEARNING FOR CYBERBULLYING DETECTION (TEXTUAL) IN BAHASA INDONESIA
Main Author: | |
---|---|
Format: | Theses |
Language: | Indonesian |
Online Access: | https://digilib.itb.ac.id/gdl/view/40009 |
Institution: | Institut Teknologi Bandung |
Summary: | Cyberbullying has become a growing concern in Indonesia as social media
use increases. Detecting cyberbullying is an important step toward a healthy
environment for social media interaction. This research belongs to computational
linguistics and focuses on using deep learning to detect bullying sentences on
Twitter.
The study involves two main processes: first, building a word representation;
second, classifying sentences to detect bullying. The pre-training step that builds
the new term/word representation is performed as a separate step, using Word2vec.
Two sets of data are used for pre-training. The first consists only of the training
and test data; the second is the complete collection of 26,800 unique Twitter
sentences, which includes the training and test data.
The classifier is built with three algorithms that are popular for text
classification: LSTM, bi-LSTM, and CNN.
The dataset consists of 9,854 labeled sentences extracted from 2,584 Twitter
conversations. In the dataset, 1,680 sentences are labeled as bully and 6,343
sentences are labeled as neutral. A total of 252 experiments were conducted,
varying the preprocessing stage that determines the machine learning features
and the deep learning algorithm used. The experiments show that the accuracy
score reaches 92.28%, while the recall score for the bully class reaches
81.65%. |
---|
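
The summary reports both overall accuracy and recall for the bully class. The sketch below illustrates, under stated assumptions, why the two figures can differ when the neutral class is much larger than the bully class. The function names, the label strings "bully" and "neutral", and the ten toy labels are hypothetical and chosen only for illustration; they are not taken from the thesis or its data.

```python
from typing import Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of sentences whose predicted label equals the true label."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and equally long")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def recall(y_true: Sequence[str], y_pred: Sequence[str], positive: str) -> float:
    """Fraction of truly positive sentences that the classifier recovered."""
    true_pos = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    false_neg = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if true_pos + false_neg == 0:
        raise ValueError("no sentences of the positive class in y_true")
    return true_pos / (true_pos + false_neg)


# Hypothetical toy labels (not thesis data): 2 bully, 8 neutral sentences.
y_true = ["bully", "bully"] + ["neutral"] * 8
# The classifier misses one of the two bully sentences.
y_pred = ["bully", "neutral"] + ["neutral"] * 8

toy_accuracy = accuracy(y_true, y_pred)
toy_bully_recall = recall(y_true, y_pred, positive="bully")
print(f"accuracy={toy_accuracy:.2f}, bully recall={toy_bully_recall:.2f}")
```

In this toy case a single missed bully sentence barely lowers accuracy but halves the bully-class recall, which is one reason to report the per-class figure alongside accuracy on an imbalanced dataset.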