Topic extraction and sentiment analysis of subreddit (r/Coronavirus)

Bibliographic Details
Main Author: Tan, Zachary Meng Jie
Other Authors: Anwitaman Datta
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2021
Online Access: https://hdl.handle.net/10356/148153
Institution: Nanyang Technological University
Description
Summary: As the COVID-19 pandemic reaches the one-year mark, this study examines the content of Reddit’s COVID-19 community, r/Coronavirus. Its aim was to gain insight into the public’s sentiments towards COVID-19, the topics that emerged, and how both changed over the course of the pandemic. A total of 356,690 submissions and 9,413,331 comments were collected between 20th January 2020 and 31st January 2021, and each dataset was analysed using lexical sentiment and topics generated by unsupervised topic modelling. The study found that negative sentiment accounted for a higher proportion of submissions, whereas in comments negative and positive sentiment appeared in equal proportions. Terms associated with more positive or more negative sentiment were identified. An assessment of upvotes and downvotes also uncovered contentious topics such as “fake news”. Topic modelling identified 9 distinct topics in submissions and 20 in comments. Overall, the study provides a clear overview of the dominant topics and prevailing sentiments, which could give governments and health authorities a deeper understanding of the public’s concerns throughout the year.