Deep learning techniques for hate speech detection

Considering the prevalence of hate speech on social media platforms, automatic hate speech detection is a crucial tool in the fight against hate speech proliferation. Several techniques, including a recent surge of deep learning-based methods, have been developed for the task. Different datasets tha...

Bibliographic Details
Main Author:	Sam, Jared Mun Kit
Other Authors:	Luu Anh Tuan
Format:	Final Year Project
Language:	English
Published:	Nanyang Technological University 2023
Subjects:	Engineering::Computer science and engineering::Computing methodologies::Document and text processing
Online Access:	https://hdl.handle.net/10356/172646
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-172646
record_format	dspace
spelling	sg-ntu-dr.10356-172646; 2023-12-22T15:38:13Z; Deep learning techniques for hate speech detection; Sam, Jared Mun Kit; Luu Anh Tuan; School of Computer Science and Engineering; anhtuan.luu@ntu.edu.sg; Engineering::Computer science and engineering::Computing methodologies::Document and text processing; Bachelor of Engineering (Computer Science); 2023-12-19T11:18:52Z; 2023-12-19T11:18:52Z; 2023; Final Year Project (FYP); Sam, J. M. K. (2023). Deep learning techniques for hate speech detection. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/172646; en; SCSE22-1111; application/pdf; Nanyang Technological University
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	Engineering::Computer science and engineering::Computing methodologies::Document and text processing
spellingShingle	Engineering::Computer science and engineering::Computing methodologies::Document and text processing Sam, Jared Mun Kit Deep learning techniques for hate speech detection
description	Given the prevalence of hate speech on social media platforms, automatic hate speech detection is a crucial tool for curbing its proliferation. Several techniques have been developed for the task, including a recent surge of deep learning-based methods, and different datasets representing different facets of the hate speech detection problem have been created. This study presents a comprehensive empirical analysis of hate speech detection techniques using three prominent public datasets. Implementing and comparing current models offered insights into the efficacy of machine learning models and word representation models, and into how their performance varies across datasets. Convolutional Neural Networks (CNN) emerged as a consistent performer, especially when coupled with Bidirectional Encoder Representations from Transformers (BERT) embeddings. The performance of the Multi-Layer Perceptron (MLP) was notably affected by the chosen word representation method, with the BERT combination performing best. The word representation evaluation underscored BERT’s superior capability, attributable to its pre-training on extensive corpora and its contextual word representations, which outperformed fixed representations such as Global Vectors for Word Representation (GloVe) and Term Frequency-Inverse Document Frequency (TF-IDF). Despite BERT’s strengths, its low macro-average scores highlight the difficulty of accurately identifying the minority of hateful tweets among vast tweet volumes.
author2	Luu Anh Tuan
author_facet	Luu Anh Tuan Sam, Jared Mun Kit
format	Final Year Project
author	Sam, Jared Mun Kit
author_sort	Sam, Jared Mun Kit
title	Deep learning techniques for hate speech detection
title_short	Deep learning techniques for hate speech detection
title_full	Deep learning techniques for hate speech detection
title_fullStr	Deep learning techniques for hate speech detection
title_full_unstemmed	Deep learning techniques for hate speech detection
title_sort	deep learning techniques for hate speech detection
publisher	Nanyang Technological University
publishDate	2023
url	https://hdl.handle.net/10356/172646
_version_	1787136653325762560

Similar Items