Deep learning techniques for hate speech detection

With the rapid growth of the Internet and the continuous expansion of online content, hate speech has also proliferated. Hate speech has severe implications for social polarization, as well as for physical and mental safety, creating an urgent need for effective automated detection. In this s...

Bibliographic Details
Main Author:	Chang, Timothy Zu'En
Other Authors:	Luu Anh Tuan
Format:	Final Year Project
Language:	English
Published:	Nanyang Technological University 2024
Subjects:	Computer and Information Science
Online Access:	https://hdl.handle.net/10356/175272

id	sg-ntu-dr.10356-175272
record_format	dspace
spelling	sg-ntu-dr.10356-175272 2024-04-26T15:44:03Z Deep learning techniques for hate speech detection Chang, Timothy Zu'En Luu Anh Tuan School of Computer Science and Engineering anhtuan.luu@ntu.edu.sg Computer and Information Science With the rapid growth of the Internet and the continuous expansion of online content, hate speech has also proliferated. Hate speech has severe implications for social polarization, as well as for physical and mental safety, creating an urgent need for effective automated detection. In this study, we analyze current efforts by the scientific community to develop automated methods for detecting online hate speech. This led us to machine learning-based approaches for automatic hate speech detection, in particular deep learning approaches, which were popularized for their robustness and ability to learn newly evolving slang. In doing so, we also investigate the challenges faced, including language nuances, varying definitions of hate speech, and data constraints. This study aims to answer the following questions: How do we define and distinguish hate speech from other classes of speech? What is currently being done in the scientific community for its detection? Lastly, how effective are these methods in classifying hate speech? In this study, we conducted literature research on the following topics. Firstly, we researched the various definitions of hate speech. This provided insight into how hate speech can be distinguished from normal speech, as well as the evolutions in language that make the identification of hate speech challenging. Secondly, we researched natural language processing (NLP) methodology. We examined existing studies on the use of NLP in hate speech detection to learn about the popular and state-of-the-art methods used. Lastly, we studied benchmark datasets collated by the research community that could be used in our own experiments. We then followed the NLP pipeline in introducing popular machine learning techniques for hate speech classification. This included the text processing and representation techniques used to make text data understandable by models. Experiments were then conducted using a mix of feature engineering and traditional machine learning classifiers to obtain a set of baseline classification metrics. Finally, deep learning frameworks such as neural networks and transformer models were introduced in an attempt to outperform this baseline. Additionally, we evaluated the performance of neural network-based models and pre-trained models and attempted to reach a conclusion on the most suitable strategies for detecting hate speech. Ultimately, this study provides a benchmarked assessment of existing deep learning techniques and their use in hate speech detection, in the hope of contributing valuable knowledge to the research community. By enhancing our understanding of the strengths and limitations of various models and pre-trained language models, it will contribute to the creation of a more inclusive and secure online community and serve as a significant step towards mitigating the adverse impact of hate speech on digital platforms. Bachelor's degree 2024-04-23T05:17:27Z 2024-04-23T05:17:27Z 2024 Final Year Project (FYP) Chang, T. Z. (2024). Deep learning techniques for hate speech detection. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/175272 https://hdl.handle.net/10356/175272 en SCSE23-0732 application/pdf Nanyang Technological University
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	Computer and Information Science
spellingShingle	Computer and Information Science Chang, Timothy Zu'En Deep learning techniques for hate speech detection
description	With the rapid growth of the Internet and the continuous expansion of online content, hate speech has also proliferated. Hate speech has severe implications for social polarization, as well as for physical and mental safety, creating an urgent need for effective automated detection. In this study, we analyze current efforts by the scientific community to develop automated methods for detecting online hate speech. This led us to machine learning-based approaches for automatic hate speech detection, in particular deep learning approaches, which were popularized for their robustness and ability to learn newly evolving slang. In doing so, we also investigate the challenges faced, including language nuances, varying definitions of hate speech, and data constraints. This study aims to answer the following questions: How do we define and distinguish hate speech from other classes of speech? What is currently being done in the scientific community for its detection? Lastly, how effective are these methods in classifying hate speech? In this study, we conducted literature research on the following topics. Firstly, we researched the various definitions of hate speech. This provided insight into how hate speech can be distinguished from normal speech, as well as the evolutions in language that make the identification of hate speech challenging. Secondly, we researched natural language processing (NLP) methodology. We examined existing studies on the use of NLP in hate speech detection to learn about the popular and state-of-the-art methods used. Lastly, we studied benchmark datasets collated by the research community that could be used in our own experiments. We then followed the NLP pipeline in introducing popular machine learning techniques for hate speech classification. This included the text processing and representation techniques used to make text data understandable by models. Experiments were then conducted using a mix of feature engineering and traditional machine learning classifiers to obtain a set of baseline classification metrics. Finally, deep learning frameworks such as neural networks and transformer models were introduced in an attempt to outperform this baseline. Additionally, we evaluated the performance of neural network-based models and pre-trained models and attempted to reach a conclusion on the most suitable strategies for detecting hate speech. Ultimately, this study provides a benchmarked assessment of existing deep learning techniques and their use in hate speech detection, in the hope of contributing valuable knowledge to the research community. By enhancing our understanding of the strengths and limitations of various models and pre-trained language models, it will contribute to the creation of a more inclusive and secure online community and serve as a significant step towards mitigating the adverse impact of hate speech on digital platforms.
author2	Luu Anh Tuan
author_facet	Luu Anh Tuan Chang, Timothy Zu'En
format	Final Year Project
author	Chang, Timothy Zu'En
author_sort	Chang, Timothy Zu'En
title	Deep learning techniques for hate speech detection
publisher	Nanyang Technological University
publishDate	2024
url	https://hdl.handle.net/10356/175272
_version_	1800916301079642112

Similar Items