A comparative analysis of machine learning techniques for cyberbullying detection on Twitter

The advent of social media, particularly Twitter, raises many issues due to a misunderstanding regarding the concept of freedom of speech. One of these issues is cyberbullying, which is a critical global issue that affects both individual victims and societies. Many attempts have been introduced in...

Bibliographic Details
Main Authors:	Muneer, A., Fati, S.M.
Format:	Article
Published:	MDPI AG 2020
Online Access:	https://www.scopus.com/inward/record.uri?eid=2-s2.0-85094647882&doi=10.3390%2ffi12110187&partnerID=40&md5=acfb2035f75c97ceb36c1ef6c292b2c8 http://eprints.utp.edu.my/29810/
Institution:	Universiti Teknologi Petronas

id	my.utp.eprints.29810
record_format	eprints
spelling	my.utp.eprints.29810 2022-03-25T02:56:45Z A comparative analysis of machine learning techniques for cyberbullying detection on twitter Muneer, A. Fati, S.M. The advent of social media, particularly Twitter, raises many issues due to a misunderstanding regarding the concept of freedom of speech. One of these issues is cyberbullying, which is a critical global issue that affects both individual victims and societies. Many attempts have been introduced in the literature to intervene in, prevent, or mitigate cyberbullying; however, because these attempts rely on the victims' interactions, they are not practical. Therefore, detection of cyberbullying without the involvement of the victims is necessary. In this study, we attempted to explore this issue by compiling a global dataset of 37,373 unique tweets from Twitter. Moreover, seven machine learning classifiers were used, namely, Logistic Regression (LR), Light Gradient Boosting Machine (LGBM), Stochastic Gradient Descent (SGD), Random Forest (RF), AdaBoost (ADB), Naive Bayes (NB), and Support Vector Machine (SVM). Each of these algorithms was evaluated using accuracy, precision, recall, and F1 score as the performance metrics to determine the classifiers' recognition rates applied to the global dataset. The experimental results show the superiority of LR, which achieved a median accuracy of around 90.57. Among the classifiers, logistic regression achieved the best F1 score (0.928), SGD achieved the best precision (0.968), and SVM achieved the best recall (1.00). © 2020 by the authors. Licensee MDPI, Basel, Switzerland. MDPI AG 2020 Article NonPeerReviewed https://www.scopus.com/inward/record.uri?eid=2-s2.0-85094647882&doi=10.3390%2ffi12110187&partnerID=40&md5=acfb2035f75c97ceb36c1ef6c292b2c8 Muneer, A. and Fati, S.M. (2020) A comparative analysis of machine learning techniques for cyberbullying detection on twitter. Future Internet, 12 (11). pp. 1-21. http://eprints.utp.edu.my/29810/
institution	Universiti Teknologi Petronas
building	UTP Resource Centre
collection	Institutional Repository
continent	Asia
country	Malaysia
content_provider	Universiti Teknologi Petronas
content_source	UTP Institutional Repository
url_provider	http://eprints.utp.edu.my/
description	The advent of social media, particularly Twitter, raises many issues due to a misunderstanding regarding the concept of freedom of speech. One of these issues is cyberbullying, which is a critical global issue that affects both individual victims and societies. Many attempts have been introduced in the literature to intervene in, prevent, or mitigate cyberbullying; however, because these attempts rely on the victims' interactions, they are not practical. Therefore, detection of cyberbullying without the involvement of the victims is necessary. In this study, we attempted to explore this issue by compiling a global dataset of 37,373 unique tweets from Twitter. Moreover, seven machine learning classifiers were used, namely, Logistic Regression (LR), Light Gradient Boosting Machine (LGBM), Stochastic Gradient Descent (SGD), Random Forest (RF), AdaBoost (ADB), Naive Bayes (NB), and Support Vector Machine (SVM). Each of these algorithms was evaluated using accuracy, precision, recall, and F1 score as the performance metrics to determine the classifiers' recognition rates applied to the global dataset. The experimental results show the superiority of LR, which achieved a median accuracy of around 90.57. Among the classifiers, logistic regression achieved the best F1 score (0.928), SGD achieved the best precision (0.968), and SVM achieved the best recall (1.00). © 2020 by the authors. Licensee MDPI, Basel, Switzerland.
format	Article
author	Muneer, A. Fati, S.M.
spellingShingle	Muneer, A. Fati, S.M. A comparative analysis of machine learning techniques for cyberbullying detection on twitter
author_facet	Muneer, A. Fati, S.M.
author_sort	Muneer, A.
title	A comparative analysis of machine learning techniques for cyberbullying detection on twitter
title_short	A comparative analysis of machine learning techniques for cyberbullying detection on twitter
title_full	A comparative analysis of machine learning techniques for cyberbullying detection on twitter
title_fullStr	A comparative analysis of machine learning techniques for cyberbullying detection on twitter
title_full_unstemmed	A comparative analysis of machine learning techniques for cyberbullying detection on twitter
title_sort	comparative analysis of machine learning techniques for cyberbullying detection on twitter
publisher	MDPI AG
publishDate	2020
url	https://www.scopus.com/inward/record.uri?eid=2-s2.0-85094647882&doi=10.3390%2ffi12110187&partnerID=40&md5=acfb2035f75c97ceb36c1ef6c292b2c8 http://eprints.utp.edu.my/29810/
_version_	1738657018676248576
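
The description in this record names accuracy, precision, recall, and F1 score as the evaluation metrics. As a minimal sketch of how those four numbers relate (not the authors' code), the Python snippet below computes them from a toy set of binary labels. The function name `binary_metrics`, the `Metrics` container, and the example labels are assumptions made for illustration; none of them come from the paper or its dataset.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Metrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> Metrics:
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return Metrics(accuracy, precision, recall, f1)


# Hypothetical toy labels: 1 = bullying tweet, 0 = non-bullying tweet.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 1, 1, 0, 1, 0, 1]

if __name__ == "__main__":
    print(binary_metrics(y_true, y_pred))
```

In this toy case every positive example is found, so recall is 1.0, while one false positive pulls precision below 1.0. This shows why a classifier can lead on recall without leading on precision or F1.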

Similar Items