Development of a Text Classification Model to Detect Disinformation About COVID-19 in Social Media: Understanding the Features and Narratives of Disinformation in the Philippines
Main Authors: | , , , , , |
---|---|
Format: | text |
Published: | Archīum Ateneo, 2022 |
Online Access: | https://archium.ateneo.edu/discs-faculty-pubs/338 https://doi.org/10.1007/978-3-031-05061-9_27 |
Institution: | Ateneo De Manila University |
Summary: | As of January 2, 2022, the Philippines is combating another surge in COVID-19 cases. With vaccinations still ongoing, the country remains vigilant, and the government continues to promote compliance with minimum health standards as a preventive measure to minimize the spread. Disinformation remains a challenge, especially when compliance with minimum health standards and adoption of health interventions are necessary to curb the spread of COVID-19. Incorrect and unverified information about the virus has also increased and continues to run rampant on social media, while few models exist to detect disinformation in a Philippine context. The study aimed to understand the features of COVID-19 disinformation in a Philippine context, with the goal of creating a text classification model that detects COVID-19 disinformation on social media in order to promote vaccine usage in the country. Social network analysis was performed to understand the narratives present in COVID-19 disinformation. Words related to vaccines, government corruption, and government mismanagement were prevalent in the disinformation categories “False” and “Mostly False”, while words related to health information, such as case or vaccine counts, were prevalent in the “Mostly True” and “True” categories. A linear SVM text classification model performed best in accuracy, precision, and recall when using TF-IDF alone as a feature, compared with using both TF-IDF and n-grams. Disinformation narratives revolved around COVID-19 cases and vaccines, government mismanagement, and regulations. Results showed that disinformation caused distrust of the government’s management of the pandemic. Moreover, the spread of disinformation was contained to the user who posted it and spread to at least one other user. |
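The summary compares a linear SVM trained on TF-IDF features with one trained on TF-IDF plus n-grams. The following is a minimal, standard-library-only sketch of that kind of comparison, not the authors' implementation. Everything in it is an assumption made for illustration: the four example posts are invented placeholders, the record's four rating categories are collapsed into two labels, the IDF uses one common smoothed variant, and the SVM is trained with a plain subgradient-descent loop on the regularised hinge loss with hypothetical hyperparameters.

```python
import math
from collections import Counter


def ngrams(tokens, n_max):
    """All contiguous n-grams of length 1..n_max, joined with spaces."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def fit_idf(docs, n_max):
    """Smoothed inverse document frequency for every term seen in docs."""
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(ngrams(doc.lower().split(), n_max)))
    n_docs = len(docs)
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}


def tfidf_vector(doc, idf, n_max):
    """Sparse, L2-normalised TF-IDF vector; terms unseen at fit time are dropped."""
    counts = Counter(t for t in ngrams(doc.lower().split(), n_max) if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {t: v / norm for t, v in vec.items()}


def score(w, b, x):
    """Signed distance-like score of sparse vector x under weights w and bias b."""
    return sum(w.get(t, 0.0) * v for t, v in x.items()) + b


def train_linear_svm(vectors, labels, epochs=200, lr=0.1, reg=0.01):
    """Subgradient descent on the L2-regularised hinge loss; labels are +1 or -1."""
    w, b = {}, 0.0
    for _ in range(epochs):
        for x, y in zip(vectors, labels):
            violated = y * score(w, b, x) < 1.0  # inside the margin or misclassified
            for t in w:                          # shrink weights (L2 penalty)
                w[t] *= 1.0 - lr * reg
            if violated:                         # hinge-loss subgradient step
                for t, v in x.items():
                    w[t] = w.get(t, 0.0) + lr * y * v
                b += lr * y
    return w, b


def predict(w, b, x):
    return 1 if score(w, b, x) >= 0 else -1


# Invented placeholder posts (NOT data from the study).
# +1 = "False"/"Mostly False", -1 = "Mostly True"/"True" (a simplifying assumption).
posts = [
    "government hiding truth about vaccine side effects",
    "secret plot government stole vaccine funds",
    "health department reports new cases today",
    "vaccine count updated by health department",
]
labels = [1, 1, -1, -1]

results = {}
for n_max in (1, 2):  # 1 = unigram TF-IDF only, 2 = unigrams plus bigrams
    idf = fit_idf(posts, n_max)
    X = [tfidf_vector(p, idf, n_max) for p in posts]
    w, b = train_linear_svm(X, labels)
    results[n_max] = (len(idf), [predict(w, b, x) for x in X])
    print(f"n_max={n_max}: {results[n_max][0]} features, "
          f"training predictions {results[n_max][1]}")
```

On four toy posts both feature sets separate the training data, so the sketch says nothing about which configuration is better; a real comparison of the kind the summary reports would need a labelled corpus, a held-out test split, and accuracy, precision, and recall computed on that split.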