Development of Bilingual Sentiment and Emotion Text Classification Models from COVID-19 Vaccination Tweets in the Philippines

Social media can be used to understand how the public is responding to the ongoing nationwide COVID-19 vaccination campaign, allowing policymakers to respond effectively through informed decisions. However, conducting social media analysis in the Philippine context presents a challenge because natur...

Bibliographic Details
Main Authors: Co, Nicole Allison S, Estuar, Ma. Regina Justina, Tan, Hans Calvin L, Tan, Austin Sebastien, Abao, Roland P, Aureus, Jelly P
Format: text
Published: Archīum Ateneo 2022
Subjects:
NLP
Online Access: https://archium.ateneo.edu/discs-faculty-pubs/339
https://doi.org/10.1007/978-3-031-05061-9_18
Institution: Ateneo De Manila University
id ph-ateneo-arc.discs-faculty-pubs-1339
record_format eprints
spelling ph-ateneo-arc.discs-faculty-pubs-1339 2022-12-02T05:41:51Z 2022-01-01T08:00:00Z text Department of Information Systems & Computer Science Faculty Publications Archīum Ateneo
institution Ateneo De Manila University
building Ateneo De Manila University Library
continent Asia
country Philippines
content_provider Ateneo De Manila University Library
collection archium.Ateneo Institutional Repository
topic Covid-19
Social computing
Social media
NLP
Sentiment analysis
Emotion analysis
Communication
Computer Engineering
Digital Communications and Networking
Engineering
Social and Behavioral Sciences
Social Media
description Social media can be used to understand how the public is responding to the ongoing nationwide COVID-19 vaccination campaign, allowing policymakers to respond effectively through informed decisions. However, conducting social media analysis in the Philippine context presents a challenge because natural informal conversations combine English and a local language. This study addresses this challenge by including part-of-speech tags, frequency of code-switching, and language dominance features to represent bilingualism when training machine learning models on COVID-19 vaccination-related Tweets for sentiment and emotion analysis. Results showed that the English-Tagalog Logistic Regression sentiment classification model performed better than TextBlob, VADER, and Polyglot, with an accuracy of 70.36%. Similarly, the English-Tagalog SVM emotion classification model performed better than Text2emotion, NRC Affect Intensity Lexicon, and EmoTFIDF, with an average mean-squared error of 0.049. The added bilingual features improved these performance metrics only by a small margin. Nevertheless, SHAP analysis revealed that sentiment and emotion classes exhibit varying levels of these bilingual features, which shows the potential of exploring similar linguistic features to better distinguish between classes in future text classification studies. Finally, Tweets from September 2021 to January 2022 show negative perceptions, mainly anger and sadness, towards COVID-19 vaccinations.
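The description names code-switching frequency and language dominance as bilingual features but does not give their formulas. The sketch below shows one plausible way to compute them from token-level language tags. The function names, the tag values ("en", "tl"), and both definitions are assumptions made for illustration, not the authors' method.

```python
from typing import List


def code_switch_frequency(lang_tags: List[str]) -> float:
    """Fraction of adjacent token pairs whose language tag differs.

    Assumed definition: number of switches / number of adjacent pairs.
    """
    if len(lang_tags) < 2:
        return 0.0
    switches = sum(1 for a, b in zip(lang_tags, lang_tags[1:]) if a != b)
    return switches / (len(lang_tags) - 1)


def language_dominance(lang_tags: List[str], lang: str = "tl") -> float:
    """Share of tokens tagged with `lang` (assumed definition)."""
    if not lang_tags:
        return 0.0
    return lang_tags.count(lang) / len(lang_tags)


# Hypothetical tags for a six-token Tweet: three Tagalog tokens
# followed by three English tokens (hand-assigned, not model output).
tags = ["tl", "tl", "tl", "en", "en", "en"]

features = {
    "code_switch_freq": code_switch_frequency(tags),
    "tagalog_dominance": language_dominance(tags, "tl"),
}
print(features)
```

Values like these could be appended to a Tweet's other features before training a classifier. How the study actually derived token-level language tags is not stated in this record.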
format text
author Co, Nicole Allison S
Estuar, Ma. Regina Justina
Tan, Hans Calvin L
Tan, Austin Sebastien
Abao, Roland P
Aureus, Jelly P
title Development of Bilingual Sentiment and Emotion Text Classification Models from COVID-19 Vaccination Tweets in the Philippines
publisher Archīum Ateneo
publishDate 2022
url https://archium.ateneo.edu/discs-faculty-pubs/339
https://doi.org/10.1007/978-3-031-05061-9_18
_version_ 1751550476841648128