Improving spam detection on Twitter using deep learning

The advancement of technology in a modern era has allowed Internet users to access social media easily. However, the number of content polluters also known as spammers have increased rapidly over the years. Spammers attract Internet users’ attention by broadcasting unsolicited content repetitively o...

Full description

Saved in:
Bibliographic Details
Main Author: Ng, Yi Rong
Other Authors: Ponnuthurai Nagaratnam Suganthan
Format: Final Year Project
Language:English
Published: Nanyang Technological University 2021
Subjects:
Online Access:https://hdl.handle.net/10356/148957
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Nanyang Technological University
Language: English
id sg-ntu-dr.10356-148957
record_format dspace
spelling sg-ntu-dr.10356-1489572023-07-07T16:35:27Z Improving spam detection on Twitter using deep learning Ng, Yi Rong Ponnuthurai Nagaratnam Suganthan School of Electrical and Electronic Engineering EPNSugan@ntu.edu.sg Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence Engineering::Computer science and engineering::Computing methodologies::Document and text processing The advancement of technology in a modern era has allowed Internet users to access social media easily. However, the number of content polluters also known as spammers have increased rapidly over the years. Spammers attract Internet users’ attention by broadcasting unsolicited content repetitively on social media platforms. Their actions have caused negative social experience for legitimate Internet users. As a result, spam detection models are required to deter social media spammers. The goal of spam detection is to automatically classify content such as tweets into spam or non-spam. Past studies have shown that the success of spam detection models was built by numerous types of machine learning and deep learning methods. In this project, deep learning models such as LSTM, CNN, and Transformer were experimented on publicly available Twitter dataset. Strategic text processing techniques were performed on original dataset to create 3 modified datasets for experiment. Word embedding techniques such as Word2Vec model, pre-trained GloVe vectors, and random embedding weight initialisation were evaluated. Lastly, classification performances of LSTM, CNN, and Transformer were compared with related works. Experimental results have showed that LSTM with random embedding weight initialisation achieved the best spam precision and specificity scores of 80% and 87%, respectively. Furthermore, my LSTM experimental results have shown comparable performance to other related works. Bachelor of Engineering (Electrical and Electronic Engineering) 2021-05-21T06:27:11Z 2021-05-21T06:27:11Z 2021 Final Year Project (FYP) Ng, Y. R. (2021). Improving spam detection on Twitter using deep learning. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/148957 https://hdl.handle.net/10356/148957 en A1111-201 application/pdf Nanyang Technological University
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
Engineering::Computer science and engineering::Computing methodologies::Document and text processing
spellingShingle Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
Engineering::Computer science and engineering::Computing methodologies::Document and text processing
Ng, Yi Rong
Improving spam detection on Twitter using deep learning
description The advancement of technology in a modern era has allowed Internet users to access social media easily. However, the number of content polluters also known as spammers have increased rapidly over the years. Spammers attract Internet users’ attention by broadcasting unsolicited content repetitively on social media platforms. Their actions have caused negative social experience for legitimate Internet users. As a result, spam detection models are required to deter social media spammers. The goal of spam detection is to automatically classify content such as tweets into spam or non-spam. Past studies have shown that the success of spam detection models was built by numerous types of machine learning and deep learning methods. In this project, deep learning models such as LSTM, CNN, and Transformer were experimented on publicly available Twitter dataset. Strategic text processing techniques were performed on original dataset to create 3 modified datasets for experiment. Word embedding techniques such as Word2Vec model, pre-trained GloVe vectors, and random embedding weight initialisation were evaluated. Lastly, classification performances of LSTM, CNN, and Transformer were compared with related works. Experimental results have showed that LSTM with random embedding weight initialisation achieved the best spam precision and specificity scores of 80% and 87%, respectively. Furthermore, my LSTM experimental results have shown comparable performance to other related works.
author2 Ponnuthurai Nagaratnam Suganthan
author_facet Ponnuthurai Nagaratnam Suganthan
Ng, Yi Rong
format Final Year Project
author Ng, Yi Rong
author_sort Ng, Yi Rong
title Improving spam detection on Twitter using deep learning
title_short Improving spam detection on Twitter using deep learning
title_full Improving spam detection on Twitter using deep learning
title_fullStr Improving spam detection on Twitter using deep learning
title_full_unstemmed Improving spam detection on Twitter using deep learning
title_sort improving spam detection on twitter using deep learning
publisher Nanyang Technological University
publishDate 2021
url https://hdl.handle.net/10356/148957
_version_ 1772828313517031424