PERFORMANCE IMPROVEMENT OF HATE SPEECH DETECTION FOR HATEFUL STATEMENTS USING CONTEXTUAL PREPROCESSING AND FINE-TUNING STRATEGY ON PRE-TRAINED LANGUAGE MODEL (BERT)

The growing volume of social media content coincides with an increasing prevalence of hate speech. This poses the challenge of distinguishing between protected freedom of speech and expressions that incite hatred, so an accurate system for identifying hate-speech content is required. However, bias introduced during system development leads to inaccuracies in identifying content that should be classified as hate speech. To overcome this, clear standards for the criteria of hate speech must be established, reducing the risk of bias in the detection process. This research formulates hate-speech criteria based on the concepts of speech, hatred, and hate speech itself. In the initial stage, the criteria are translated into linguistic context and implemented as an algorithm in the natural language processing preprocessing step, which labels data automatically according to the formulated criteria. The next stage trains a BERT-based pre-trained language model and fine-tunes it to adapt the model to domains relevant to hate speech. The evaluation examines the accuracy, precision, recall, and F1-score of the developed model while analyzing the bias reduction the model may achieve. This research produced a model with higher accuracy, precision, and recall than previous work.


Bibliographic Details
Main Author: Donny Ericson, Muhammad
Format: Theses
Language: Indonesia
Online Access:https://digilib.itb.ac.id/gdl/view/80974
Institution: Institut Teknologi Bandung
Language: Indonesia
id id-itb.:80974
spelling id-itb.:80974 2024-03-17T04:39:42Z
title PERFORMANCE IMPROVEMENT OF HATE SPEECH DETECTION FOR HATEFUL STATEMENTS USING CONTEXTUAL PREPROCESSING AND FINE-TUNING STRATEGY ON PRE-TRAINED LANGUAGE MODEL (BERT)
author Donny Ericson, Muhammad
language Indonesia
format Theses
topic Preprocessing, Contextual Preprocessing, Hate Speech, BERT, Fine-Tuning
publisher INSTITUT TEKNOLOGI BANDUNG
url https://digilib.itb.ac.id/gdl/view/80974
description The growing volume of social media content coincides with an increasing prevalence of hate speech. This poses the challenge of distinguishing between protected freedom of speech and expressions that incite hatred, so an accurate system for identifying hate-speech content is required. However, bias introduced during system development leads to inaccuracies in identifying content that should be classified as hate speech. To overcome this, clear standards for the criteria of hate speech must be established, reducing the risk of bias in the detection process. This research formulates hate-speech criteria based on the concepts of speech, hatred, and hate speech itself. In the initial stage, the criteria are translated into linguistic context and implemented as an algorithm in the natural language processing preprocessing step, which labels data automatically according to the formulated criteria. The next stage trains a BERT-based pre-trained language model and fine-tunes it to adapt the model to domains relevant to hate speech. The evaluation examines the accuracy, precision, recall, and F1-score of the developed model while analyzing the bias reduction the model may achieve. This research produced a model with higher accuracy, precision, and recall than previous work. The model's success stems from more assertive and linguistically specific hate-speech criteria, which allow it to identify hateful content more precisely and significantly improve detection performance.
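The criteria-based automatic labeling described in this record can be pictured as a rule-driven preprocessor that marks a text as hate speech when a hostile expression targets a listed group. This is a minimal illustrative sketch, not the thesis's actual algorithm: `TARGET_GROUPS` and `HOSTILE_PATTERNS` are hypothetical placeholders standing in for the formulated linguistic criteria.

```python
import re

# Hypothetical placeholder criteria -- the thesis formulates its own
# linguistically grounded criteria; these lists are for illustration only.
TARGET_GROUPS = ["group_a", "group_b"]            # protected-group mentions
HOSTILE_PATTERNS = [r"\bhate\b", r"\bdestroy\b", r"\bget rid of\b"]

def normalize(text: str) -> str:
    """Basic preprocessing: lowercase and collapse whitespace."""
    return re.sub(r"\s+", " ", text.lower()).strip()

def auto_label(text: str) -> int:
    """Return 1 (hate speech) when a hostile pattern co-occurs with a
    target-group mention, else 0 -- a stand-in for criteria-based labeling."""
    t = normalize(text)
    has_target = any(group in t for group in TARGET_GROUPS)
    has_hostility = any(re.search(p, t) for p in HOSTILE_PATTERNS)
    return int(has_target and has_hostility)
```

Labels produced this way would then serve as training data for the fine-tuned BERT classifier; the key design point is that the labeling decision is traceable to explicit criteria rather than to annotator judgment alone.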
institution Institut Teknologi Bandung
building Institut Teknologi Bandung Library
continent Asia
country Indonesia
content_provider Institut Teknologi Bandung
collection Digital ITB
language Indonesia
description The growing volume of social media content coincides with an increasing prevalence of hate speech. This poses the challenge of distinguishing between protected freedom of speech and expressions that incite hatred, so an accurate system for identifying hate-speech content is required. However, bias introduced during system development leads to inaccuracies in identifying content that should be classified as hate speech. To overcome this, clear standards for the criteria of hate speech must be established, reducing the risk of bias in the detection process. This research formulates hate-speech criteria based on the concepts of speech, hatred, and hate speech itself. In the initial stage, the criteria are translated into linguistic context and implemented as an algorithm in the natural language processing preprocessing step, which labels data automatically according to the formulated criteria. The next stage trains a BERT-based pre-trained language model and fine-tunes it to adapt the model to domains relevant to hate speech. The evaluation examines the accuracy, precision, recall, and F1-score of the developed model while analyzing the bias reduction the model may achieve. This research produced a model with higher accuracy, precision, and recall than previous work. The model's success stems from more assertive and linguistically specific hate-speech criteria, which allow it to identify hateful content more precisely and significantly improve detection performance.
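The evaluation metrics named in the description (accuracy, precision, recall, F1-score) can be computed for a binary hate-speech classifier as below; a minimal sketch assuming label 1 marks hate speech and label 0 marks non-hate speech.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for the positive
    (hate-speech) class from parallel lists of binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

Precision and recall matter separately here: precision penalizes over-flagging legitimate speech, while recall penalizes missed hateful content, which is why the record reports both alongside accuracy and F1.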
format Theses
author Donny Ericson, Muhammad
spellingShingle Donny Ericson, Muhammad
PERFORMANCE IMPROVEMENT OF HATE SPEECH DETECTION FOR HATEFUL STATEMENTS USING CONTEXTUAL PREPROCESSING AND FINE-TUNING STRATEGY ON PRE-TRAINED LANGUAGE MODEL (BERT)
author_facet Donny Ericson, Muhammad
author_sort Donny Ericson, Muhammad
title PERFORMANCE IMPROVEMENT OF HATE SPEECH DETECTION FOR HATEFUL STATEMENTS USING CONTEXTUAL PREPROCESSING AND FINE-TUNING STRATEGY ON PRE-TRAINED LANGUAGE MODEL (BERT)
title_short PERFORMANCE IMPROVEMENT OF HATE SPEECH DETECTION FOR HATEFUL STATEMENTS USING CONTEXTUAL PREPROCESSING AND FINE-TUNING STRATEGY ON PRE-TRAINED LANGUAGE MODEL (BERT)
title_full PERFORMANCE IMPROVEMENT OF HATE SPEECH DETECTION FOR HATEFUL STATEMENTS USING CONTEXTUAL PREPROCESSING AND FINE-TUNING STRATEGY ON PRE-TRAINED LANGUAGE MODEL (BERT)
title_fullStr PERFORMANCE IMPROVEMENT OF HATE SPEECH DETECTION FOR HATEFUL STATEMENTS USING CONTEXTUAL PREPROCESSING AND FINE-TUNING STRATEGY ON PRE-TRAINED LANGUAGE MODEL (BERT)
title_full_unstemmed PERFORMANCE IMPROVEMENT OF HATE SPEECH DETECTION FOR HATEFUL STATEMENTS USING CONTEXTUAL PREPROCESSING AND FINE-TUNING STRATEGY ON PRE-TRAINED LANGUAGE MODEL (BERT)
title_sort performance improvement of hate speech detection for hateful statements using contextual preprocessing and fine-tuning strategy on pre-trained language model (bert)
url https://digilib.itb.ac.id/gdl/view/80974
_version_ 1822009339779481600