Mechanism for sarcasm detection and classification in Malay social media

The classification of users’ sentiment from social media data can be used to learn public opinion on certain issues. The presence of sarcasm in sentences can hamper the performance of the classification as it tends to “fool” the system. In this paper, we investigate mechanisms for detecting sarcasm...

Bibliographic Details
Main Authors: Mohd Suhairi Md Suhaimin, Mohd Hanafi Ahmad Hijazi, Rayner Alfred, Frans Coenen
Format: Article
Language: English
Published: American Scientific Publishers 2018
Subjects: DS Asia
Online Access: https://eprints.ums.edu.my/id/eprint/22265/1/Mechanism%20for%20Sarcasm%20Detection%20and%20Classification%20in%20Malay%20Social%20Media.pdf
https://eprints.ums.edu.my/id/eprint/22265/
https://www.ingentaconnect.com/content/asp/asl/2018/00000024/00000002/art00129;jsessionid=bfgnc2q403kdk.x-ic-live-01
https://doi.org/10.1166/asl.2018.10755
Institution: Universiti Malaysia Sabah
id my.ums.eprints.22265
record_format eprints
spelling my.ums.eprints.22265, 2021-07-19T01:38:40Z. Mohd Suhairi Md Suhaimin, Mohd Hanafi Ahmad Hijazi, Rayner Alfred and Frans Coenen (2018) Mechanism for sarcasm detection and classification in Malay social media. Advanced Science Letters, 24 (2). pp. 1388-1392. ISSN 1936-6612. Article, peer reviewed, text, en.
institution Universiti Malaysia Sabah
building UMS Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Malaysia Sabah
content_source UMS Institutional Repository
url_provider http://eprints.ums.edu.my/
language English
topic DS Asia
description Classifying users’ sentiment from social media data can reveal public opinion on particular issues. Sarcasm in sentences can hamper classification performance because it tends to “fool” the system. In this paper, we investigate mechanisms for detecting sarcasm in Malay social media data that contain sarcastic content, specifically public comments on economy-related posts on Facebook. Two feature types were investigated: n-grams and punctuation marks. Feature selection based on Pearson’s correlation was then applied to reduce the size of the feature set. To measure the performance of the selected features, two supervised classification techniques were employed: k-Nearest Neighbors and a non-linear Support Vector Machine. Experiments on sarcasm detection and classification were conducted. The results show that the combination of n-gram and punctuation-mark features produced the best F-measure and Area Under Curve of 0.818 for sarcasm detection. An extended experiment on sarcasm classification recorded an F-measure of 0.991 with an Area Under Curve of 0.994 for sarcasm positivity, and an F-measure of 0.902 with an Area Under Curve of 0.846 for sarcasm negativity.
format Article
author Mohd Suhairi Md Suhaimin
Mohd Hanafi Ahmad Hijazi
Rayner Alfred
Frans Coenen
title Mechanism for sarcasm detection and classification in Malay social media
publisher American Scientific Publishers
publishDate 2018
_version_ 1760229946171588608
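
The pipeline named in the description field (n-gram and punctuation features, Pearson-correlation feature selection, a k-Nearest Neighbors classifier, F-measure scoring) can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the record does not state the n-gram order, the number of selected features, the value of k, or any preprocessing, so every function name, parameter default, and the toy English comments below are assumptions made for illustration. The non-linear Support Vector Machine and the Area Under Curve computation are omitted.

```python
import math
import string
from collections import Counter


def extract_features(text, n_max=2):
    """Word n-gram counts (n = 1..n_max) plus a count per punctuation mark."""
    # Strip punctuation from tokens so marks are counted only as punctuation features.
    tokens = [t.strip(string.punctuation) for t in text.lower().split()]
    tokens = [t for t in tokens if t]
    feats = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            feats["ng:" + " ".join(tokens[i:i + n])] += 1
    for ch in text:
        if ch in string.punctuation:
            feats["punct:" + ch] += 1
    return feats


def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 when either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def select_features(feature_dicts, labels, top_k):
    """Keep the top_k features ranked by |correlation| with the class label."""
    names = sorted({name for f in feature_dicts for name in f})
    scored = []
    for name in names:
        column = [f.get(name, 0) for f in feature_dicts]
        scored.append((abs(pearson(column, labels)), name))
    scored.sort(key=lambda s: (-s[0], s[1]))  # ties broken alphabetically
    return [name for _, name in scored[:top_k]]


def vectorize(feats, selected):
    """Project a feature dict onto the selected feature names."""
    return [feats.get(name, 0) for name in selected]


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k nearest training vectors (Euclidean distance)."""
    dists = sorted((math.dist(row, x), y) for row, y in zip(train_X, train_y))
    votes = Counter(y for _, y in dists[:k])
    return votes.most_common(1)[0][0]


def f_measure(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Invented toy comments (1 = sarcastic, 0 = not sarcastic); not data from the paper.
    train = [
        ("great, prices went up again!!!", 1),
        ("wow, another tax... thanks a lot!!", 1),
        ("prices went up this month", 0),
        ("the new tax starts in april", 0),
    ]
    dicts = [extract_features(text) for text, _ in train]
    labels = [label for _, label in train]
    selected = select_features(dicts, labels, top_k=5)
    train_X = [vectorize(d, selected) for d in dicts]
    query = vectorize(extract_features("oh great, thanks!!"), selected)
    print(selected, knn_predict(train_X, labels, query, k=3))
```

Ranking by the absolute value of the correlation keeps features that are strongly associated with either class; whether the original study ranked features this way or applied a threshold is not stated in the record.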