Analysis of the performance of the Psycholinguistic-based approach to automatically detect English satire in social media platforms

Satire is a mode of communication that combines irony, humor, and exaggeration to criticize and ridicule people's stupidity, contemporary politics, or other topical issues; as a result, satirical statements often carry meanings opposite to their literal interpretations. It...

Bibliographic Details
Main Author: Nguyen, Ha Quan
Other Authors: Tan Chee Wah Wesley
Format: Final Year Project
Language: English
Published: 2018
Subjects: DRNTU::Engineering
Online Access: http://hdl.handle.net/10356/74973
Institution: Nanyang Technological University
id sg-ntu-dr.10356-74973
record_format dspace
spelling sg-ntu-dr.10356-74973 2023-07-07T17:34:58Z Analysis of the performance of the Psycholinguistic-based approach to automatically detect English satire in social media platforms Nguyen, Ha Quan Tan Chee Wah Wesley School of Electrical and Electronic Engineering DRNTU::Engineering Bachelor of Engineering 2018-05-25T06:57:33Z 2018-05-25T06:57:33Z 2018 Final Year Project (FYP) http://hdl.handle.net/10356/74973 en Nanyang Technological University 66 p. application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering
spellingShingle DRNTU::Engineering
Nguyen, Ha Quan
Analysis of the performance of the Psycholinguistic-based approach to automatically detect English satire in social media platforms
description Satire is a mode of communication that combines irony, humor, and exaggeration to criticize and ridicule people's stupidity, contemporary politics, or other topical issues; as a result, satirical statements often carry meanings opposite to their literal interpretations. Even human readers find it difficult to detect satire in a statement, let alone computers, without knowledge of the topic being satirized. Satire detection has therefore become a challenging problem in Natural Language Processing (NLP). The objective of this project is to build a new satire detection model based on a psycholinguistic approach and to evaluate its performance on English social media content from Twitter. What most distinguishes this project from previous work on satire detection is its feature extraction. In the feature extraction phase, each tweet in the pre-processed, labeled data is analyzed with the Linguistic Inquiry & Word Count (LIWC) software to generate a feature vector whose dimension depends on the LIWC categories used. LIWC has more than 93 categories, classified into 5 main groups: (1) Linguistic Processes, (2) Psychological Processes, (3) Personal Concerns, (4) Spoken Categories and (5) Punctuation Marks. The predictive model is generated and validated with 10-fold cross-validation, and its performance is measured using the popular metrics Precision, Recall, F-measure, and Accuracy. The experiments are conducted as follows. The first experiment determines the importance of the Punctuation Marks group in building the satire detection model by training the model on a corpus from which all punctuation marks were removed in the text preprocessing phase. The second experiment identifies the combination of category groups that yields the best results on these metrics. Finally, the most contributing features and the classifier most suitable for English satire detection are determined.
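The evaluation procedure named in the description (10-fold cross-validation scored with Precision, Recall, F-measure, and Accuracy) can be illustrated with a minimal Python sketch. It is not the project's implementation: the record does not say which classifiers were used, whether folds were shuffled or stratified, or whether scores were pooled or averaged per fold. The function names (`k_fold_indices`, `binary_metrics`, `cross_validate`, `fit_threshold`), the single-feature threshold classifier, and the toy feature values are all hypothetical; a real run would use LIWC output as the feature vectors.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]
Predictor = Callable[[Vector], int]


def k_fold_indices(n_samples: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous (train, test) index pairs.

    Assumption: no shuffling or stratification; the record does not specify either.
    """
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Precision, recall, F-measure (F1) and accuracy, with satire as label 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall,
            "f_measure": f_measure, "accuracy": accuracy}


def fit_threshold(X_train: Sequence[Vector], y_train: Sequence[int]) -> Predictor:
    """Hypothetical stand-in classifier: threshold halfway between class means
    of the first feature. Assumes both classes occur in the training data."""
    pos = [x[0] for x, label in zip(X_train, y_train) if label == 1]
    neg = [x[0] for x, label in zip(X_train, y_train) if label == 0]
    threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: 1 if x[0] > threshold else 0


def cross_validate(X: Sequence[Vector], y: Sequence[int],
                   fit: Callable[[Sequence[Vector], Sequence[int]], Predictor],
                   k: int = 10) -> Dict[str, float]:
    """Collect out-of-fold predictions over k folds, then score them once (pooled)."""
    y_pred = [0] * len(y)
    for train, test in k_fold_indices(len(y), k):
        predict = fit([X[i] for i in train], [y[i] for i in train])
        for i in test:
            y_pred[i] = predict(X[i])
    return binary_metrics(y, y_pred)


if __name__ == "__main__":
    # Toy data: one made-up feature per tweet, labels alternate 0 / 1.
    labels = [i % 2 for i in range(20)]
    features = [[5.0 + 0.1 * i] if label else [1.0 + 0.1 * i]
                for i, label in enumerate(labels)]
    print(cross_validate(features, labels, fit_threshold, k=10))
```

Pooling the out-of-fold predictions before scoring is one common convention; averaging the per-fold scores is another, and the two can give slightly different figures on small or imbalanced corpora.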
author2 Tan Chee Wah Wesley
author_facet Tan Chee Wah Wesley
Nguyen, Ha Quan
format Final Year Project
author Nguyen, Ha Quan
author_sort Nguyen, Ha Quan
title Analysis of the performance of the Psycholinguistic-based approach to automatically detect English satire in social media platforms
publishDate 2018
url http://hdl.handle.net/10356/74973
_version_ 1772825410596241408