Analysis of the performance of the Psycholinguistic-based approach to automatically detect English satire in social media platforms
Satire is a mode of communication that integrates irony, humor and exaggeration to criticize and ridicule people's stupidity, contemporary politics, or other topical issues; therefore, these satirical sentiments often possess meanings that are in opposition to their literal interpretations. It...
Main Author: | Nguyen, Ha Quan |
---|---|
Other Authors: | Tan Chee Wah Wesley |
Format: | Final Year Project |
Language: | English |
Published: | 2018 |
Subjects: | DRNTU::Engineering |
Online Access: | http://hdl.handle.net/10356/74973 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-74973 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-74973 2023-07-07T17:34:58Z Analysis of the performance of the Psycholinguistic-based approach to automatically detect English satire in social media platforms Nguyen, Ha Quan Tan Chee Wah Wesley School of Electrical and Electronic Engineering DRNTU::Engineering Bachelor of Engineering 2018-05-25T06:57:33Z 2018-05-25T06:57:33Z 2018 Final Year Project (FYP) http://hdl.handle.net/10356/74973 en Nanyang Technological University 66 p. application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
DRNTU::Engineering |
spellingShingle |
DRNTU::Engineering Nguyen, Ha Quan Analysis of the performance of the Psycholinguistic-based approach to automatically detect English satire in social media platforms |
description |
Satire is a mode of communication that combines irony, humor and exaggeration to criticize and ridicule people's stupidity, contemporary politics, or other topical issues; as a result, satirical statements often carry meanings that are opposed to their literal interpretations. Detecting satire in a given statement is difficult even for humans, let alone computers, without knowledge of the topic being satirized. Satire detection has therefore become a challenging problem in Natural Language Processing (NLP). The objective of this project is to build a new satire detection model based on a psycholinguistic approach for English social media content on Twitter, and to evaluate its performance. What most distinguishes this project from previous work on satire detection is its feature extraction. In the Feature Extraction phase, each tweet in the pre-processed labeled data is analyzed with the Linguistic Inquiry & Word Count (LIWC) software to generate a feature vector whose dimension depends on the LIWC categories used. LIWC has more than 93 categories, which are classified into 5 main groups: (1) Linguistic Processes, (2) Psychological Processes, (3) Personal Concerns, (4) Spoken Categories and (5) Punctuation Marks. The predictive model is generated and validated by 10-fold cross-validation, and its performance is measured using four common metrics: Precision, Recall, F-measure, and Accuracy. The experiments are conducted as follows. The first experiment determines the importance of Punctuation Marks in building the satire detection model by training the model on a corpus from which all punctuation marks were removed in the text preprocessing phase. The second experiment identifies the combination of category groups that produces the best results on these metrics. Finally, the most contributing features and the classifier most suitable for English satire detection are determined. |
author2 |
Tan Chee Wah Wesley |
author_facet |
Tan Chee Wah Wesley Nguyen, Ha Quan |
format |
Final Year Project |
author |
Nguyen, Ha Quan |
author_sort |
Nguyen, Ha Quan |
title |
Analysis of the performance of the Psycholinguistic-based approach to automatically detect English satire in social media platforms |
title_short |
Analysis of the performance of the Psycholinguistic-based approach to automatically detect English satire in social media platforms |
title_full |
Analysis of the performance of the Psycholinguistic-based approach to automatically detect English satire in social media platforms |
title_fullStr |
Analysis of the performance of the Psycholinguistic-based approach to automatically detect English satire in social media platforms |
title_full_unstemmed |
Analysis of the performance of the Psycholinguistic-based approach to automatically detect English satire in social media platforms |
title_sort |
analysis of the performance of the psycholinguistic-based approach to automatically detect english satire in social media platforms |
publishDate |
2018 |
url |
http://hdl.handle.net/10356/74973 |
_version_ |
1772825410596241408 |
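
The description above names 10-fold cross-validation scored with Precision, Recall, F-measure, and Accuracy. The following is a minimal sketch of that evaluation loop, not the project's actual code. It assumes binary labels (1 = satirical, 0 = non-satirical) and pools predictions across folds before scoring. The helper names (`k_fold_indices`, `binary_metrics`, `cross_validate`, `train_threshold`), the single-feature threshold classifier, and the toy feature values are all hypothetical stand-ins; real feature vectors would come from LIWC output, which is not reproduced here.

```python
import random
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]
Predictor = Callable[[Vector], int]


def k_fold_indices(n: int, k: int = 10, seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Shuffle indices 0..n-1 and return k (train, test) index pairs."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k roughly equal folds
    splits = []
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Precision, recall, F-measure and accuracy, with label 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall) if precision + recall else 0.0
    )
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "accuracy": accuracy,
    }


def cross_validate(
    features: Sequence[Vector],
    labels: Sequence[int],
    train: Callable[[List[Vector], List[int]], Predictor],
    k: int = 10,
    seed: int = 0,
) -> Dict[str, float]:
    """Train on k-1 folds, predict the held-out fold, and score the pooled predictions."""
    y_true: List[int] = []
    y_pred: List[int] = []
    for train_idx, test_idx in k_fold_indices(len(labels), k, seed):
        predict = train([features[i] for i in train_idx], [labels[i] for i in train_idx])
        y_true.extend(labels[i] for i in test_idx)
        y_pred.extend(predict(features[i]) for i in test_idx)
    return binary_metrics(y_true, y_pred)


def train_threshold(features: List[Vector], labels: List[int]) -> Predictor:
    """Hypothetical stand-in classifier: threshold the first feature midway between class means."""
    pos = [f[0] for f, y in zip(features, labels) if y == 1]
    neg = [f[0] for f, y in zip(features, labels) if y == 0]
    pos_mean = sum(pos) / len(pos) if pos else 0.0
    neg_mean = sum(neg) / len(neg) if neg else 0.0
    threshold = (pos_mean + neg_mean) / 2
    return lambda vector: 1 if vector[0] > threshold else 0


# Toy, clearly separable data standing in for one LIWC-style feature per tweet.
features = [[0.1 * i] for i in range(10)] + [[10.0 + 0.1 * i] for i in range(10)]
labels = [0] * 10 + [1] * 10
scores = cross_validate(features, labels, train_threshold)
print(scores)
```

Under the same assumptions, the punctuation experiment could be approximated by running `cross_validate` twice, once with and once without the punctuation-derived feature columns, and comparing the two score dictionaries.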