Topic extraction and sentiment analysis of subreddit (r/Coronavirus)

As the COVID-19 pandemic reaches the one-year mark, this study examines the content of Reddit’s COVID-19 community, r/Coronavirus. The aim was to gain insight into the public’s sentiments towards COVID-19, the topics that emerged, and how they changed during the course of...

Bibliographic Details
Main Author: Tan, Zachary Meng Jie
Other Authors: Anwitaman Datta
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2021
Subjects: Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
Online Access: https://hdl.handle.net/10356/148153
Institution: Nanyang Technological University
id sg-ntu-dr.10356-148153
record_format dspace
spelling sg-ntu-dr.10356-148153 2021-04-24T06:16:31Z Topic extraction and sentiment analysis of subreddit (r/Coronavirus) Tan, Zachary Meng Jie Anwitaman Datta School of Computer Science and Engineering Anwitaman@ntu.edu.sg Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence As the COVID-19 pandemic reaches the one-year mark, this study examines the content of Reddit’s COVID-19 community, r/Coronavirus. The aim was to gain insight into the public’s sentiments towards COVID-19, the topics that emerged, and how they changed during the course of the pandemic. Based on 356,690 submissions and 9,413,331 comments collected between 20 January 2020 and 31 January 2021, each dataset was analysed using lexical sentiment and topics generated by unsupervised topic modelling. The study found that negative sentiments showed a higher ratio in submissions, whereas negative and positive sentiments appeared in the same ratio in the comments. Terms associated more positively or more negatively were identified. By assessing upvotes and downvotes, the study also uncovered contentious topics such as “fake news”. Topic modelling identified 9 distinct topics in submissions and 20 in comments. Overall, the study provides a clear overview of the dominant topics and popular sentiments, which would give governments and health authorities a deeper understanding of the public’s concerns throughout the year. Bachelor of Engineering (Computer Science) 2021-04-24T06:16:30Z 2021-04-24T06:16:30Z 2021 Final Year Project (FYP) Tan, Z. M. J. (2021). Topic extraction and sentiment analysis of subreddit (r/Coronavirus). Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/148153 https://hdl.handle.net/10356/148153 en SCSE20-0564 application/pdf Nanyang Technological University
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
description As the COVID-19 pandemic reaches the one-year mark, this study examines the content of Reddit’s COVID-19 community, r/Coronavirus. The aim was to gain insight into the public’s sentiments towards COVID-19, the topics that emerged, and how they changed during the course of the pandemic. Based on 356,690 submissions and 9,413,331 comments collected between 20 January 2020 and 31 January 2021, each dataset was analysed using lexical sentiment and topics generated by unsupervised topic modelling. The study found that negative sentiments showed a higher ratio in submissions, whereas negative and positive sentiments appeared in the same ratio in the comments. Terms associated more positively or more negatively were identified. By assessing upvotes and downvotes, the study also uncovered contentious topics such as “fake news”. Topic modelling identified 9 distinct topics in submissions and 20 in comments. Overall, the study provides a clear overview of the dominant topics and popular sentiments, which would give governments and health authorities a deeper understanding of the public’s concerns throughout the year.
author2 Anwitaman Datta
author_facet Anwitaman Datta
Tan, Zachary Meng Jie
format Final Year Project
author Tan, Zachary Meng Jie
author_sort Tan, Zachary Meng Jie
title Topic extraction and sentiment analysis of subreddit (r/Coronavirus)
publisher Nanyang Technological University
publishDate 2021
url https://hdl.handle.net/10356/148153
_version_ 1698713709910687744