Deep learning techniques for hate speech detection
With the rapid growth of the Internet and the continuous expansion of online content, hate speech is also proliferating. Hate speech has severe implications for social polarization, as well as for physical and mental safety, which creates an urgent need for effective automated detection. In this s...
Main Author: | Chang, Timothy Zu'En |
---|---|
Other Authors: | Luu Anh Tuan |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2024 |
Subjects: | Computer and Information Science |
Online Access: | https://hdl.handle.net/10356/175272 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-175272 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-175272 2024-04-26T15:44:03Z Deep learning techniques for hate speech detection Chang, Timothy Zu'En Luu Anh Tuan School of Computer Science and Engineering anhtuan.luu@ntu.edu.sg Computer and Information Science Bachelor's degree 2024-04-23T05:17:27Z 2024-04-23T05:17:27Z 2024 Final Year Project (FYP) Chang, T. Z. (2024). Deep learning techniques for hate speech detection. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/175272 https://hdl.handle.net/10356/175272 en SCSE23-0732 application/pdf Nanyang Technological University |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Computer and Information Science |
spellingShingle |
Computer and Information Science Chang, Timothy Zu'En Deep learning techniques for hate speech detection |
description |
With the rapid growth of the Internet and the continuous expansion of online content, hate speech is also proliferating. Hate speech has severe implications for social polarization, as well as for physical and mental safety, which creates an urgent need for effective automated detection. In this study, we analyze current efforts by the scientific community to develop automated methods for detecting online hate speech. This led us to machine learning-based approaches for automatic hate speech detection, in particular deep learning approaches, which have become popular for their robustness and their ability to learn newly evolving slang. In doing so, we also investigate the challenges faced, including language nuances, varying definitions of hate speech, and data constraints.
This study aims to answer the following questions: How do we define hate speech and distinguish it from other classes of speech? What is currently being done in the scientific community to detect it? Lastly, how effective are current methods at classifying hate speech?
In this study, we conducted a literature review on the following topics. Firstly, we researched the various definitions of hate speech. This provided insight into how hate speech can be distinguished from normal speech, as well as into the evolutions in language that make the identification of hate speech challenging. Secondly, we researched natural language processing (NLP) methodology. We examined existing studies on the use of NLP in hate speech detection to learn about the popular and state-of-the-art methods used. Lastly, we studied benchmark datasets collated by the research community that could be used in our own experiments.
Following this, we worked through the NLP pipeline to introduce popular machine learning techniques for hate speech classification. This included the text processing and representation techniques used to make text data understandable by models. Experiments were then conducted using a mix of feature engineering and traditional machine learning classifiers to obtain a set of baseline classification metrics. Finally, deep learning frameworks such as neural networks and transformer models were introduced in an attempt to outperform this baseline. Additionally, we evaluated the performance of neural network-based models and pre-trained models and attempted to reach a conclusion on the most suitable strategies for detecting hate speech.
Ultimately, this study provides a benchmarked assessment of existing deep learning techniques and their use in hate speech detection, in the hope of contributing valuable knowledge to the research community. By enhancing our understanding of the strengths and limitations of various models and pre-trained language models, it aims to contribute to the creation of a more inclusive and secure online community and to serve as a significant step towards mitigating the adverse impact of hate speech on digital platforms. |
author2 |
Luu Anh Tuan |
author_facet |
Luu Anh Tuan Chang, Timothy Zu'En |
format |
Final Year Project |
author |
Chang, Timothy Zu'En |
author_sort |
Chang, Timothy Zu'En |
title |
Deep learning techniques for hate speech detection |
title_sort |
deep learning techniques for hate speech detection |
publisher |
Nanyang Technological University |
publishDate |
2024 |
url |
https://hdl.handle.net/10356/175272 |
_version_ |
1800916301079642112 |
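
The abstract describes a baseline stage in which text is processed into a model-readable representation, passed to a traditional machine learning classifier, and scored with classification metrics. The record does not say which classifier, features, or datasets the project used, so the following is only a minimal sketch under assumptions: a bag-of-words representation, a multinomial Naive Bayes classifier with Laplace smoothing, and precision/recall/F1 as the metrics. The function names (`tokenize`, `train_naive_bayes`, `predict`, `precision_recall_f1`) and the handful of mild placeholder sentences are invented for illustration and do not come from the study.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Text processing: lowercase and keep runs of letters/apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def train_naive_bayes(texts, labels):
    """Bag-of-words representation: count tokens per class."""
    class_docs = Counter(labels)
    token_counts = {label: Counter() for label in class_docs}
    for text, label in zip(texts, labels):
        token_counts[label].update(tokenize(text))
    vocab = {tok for counts in token_counts.values() for tok in counts}
    return class_docs, token_counts, vocab


def predict(model, text):
    """Return the class with the highest log-posterior (Laplace smoothing)."""
    class_docs, token_counts, vocab = model
    total_docs = sum(class_docs.values())
    scores = {}
    for label, n_docs in class_docs.items():
        score = math.log(n_docs / total_docs)
        denom = sum(token_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # tokens never seen in training are ignored
                score += math.log((token_counts[label][tok] + 1) / denom)
        scores[label] = score
    return max(scores, key=scores.get)


def precision_recall_f1(y_true, y_pred, positive):
    """Classification metrics for one class treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Invented toy data (assumption): mild stand-ins, not taken from any dataset.
train_texts = [
    "you are worthless and everyone despises you",
    "people like you are disgusting and worthless",
    "have a wonderful day everyone",
    "thank you for the lovely photos",
]
train_labels = ["hate", "hate", "normal", "normal"]

test_texts = ["you are disgusting", "what a lovely day",
              "thank you everyone", "worthless people"]
test_labels = ["hate", "normal", "normal", "hate"]

model = train_naive_bayes(train_texts, train_labels)
predictions = [predict(model, text) for text in test_texts]
precision, recall, f1 = precision_recall_f1(test_labels, predictions, "hate")
print(predictions)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In a workflow like the one the abstract outlines, metrics produced this way would serve as the reference point that neural network and transformer models are then compared against. A real experiment would use a benchmark dataset and a held-out test split rather than a handful of hand-written sentences.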