Combining automatic and manual approaches: Towards a framework for discovering themes in disaster-related tweets

In this paper, we present a framework that combines automatic and manual approaches to discover themes in disaster-related tweets. As a case study, we focus on tweets related to Typhoon Haiyan, which caused billions of dollars in damage. We collected tweets from November 2013 to March 2014...

Bibliographic Details
Main Authors: Syliongka, Leif Romeritch; Oco, Nathaniel; Lam, Alron Jan; Soriano, Cheryll Ruth; Roldan, Ma. Divina Gracia; Magno, Francisco; Cheng, Charibeth
Format: text
Published: Animo Repository 2015
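Subjects: Natural language processing (Computer science); Document clustering; Typhoon Haiyan, 2013; Microblogs; Computer Sciences; Emergency and Disaster Management; Models and Methods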
Online Access: https://animorepository.dlsu.edu.ph/faculty_research/2745
Institution: De La Salle University
id oai:animorepository.dlsu.edu.ph:faculty_research-3744
record_format eprints
spelling oai:animorepository.dlsu.edu.ph:faculty_research-3744 2021-10-29T03:26:37Z 2015-05-18T07:00:00Z text https://animorepository.dlsu.edu.ph/faculty_research/2745 Faculty Research Work Animo Repository
institution De La Salle University
building De La Salle University Library
continent Asia
country Philippines
content_provider De La Salle University Library
collection DLSU Institutional Repository
topic Natural language processing (Computer science)
Document clustering
Typhoon Haiyan, 2013
Microblogs
Computer Sciences
Emergency and Disaster Management
Models and Methods
description In this paper, we present a framework that combines automatic and manual approaches to discover themes in disaster-related tweets. As a case study, we focus on tweets related to Typhoon Haiyan, which caused billions of dollars in damage. We collected tweets from November 2013 to March 2014, using the local typhoon name "Yolanda" as the filter. Data association was used to expand the tweet set, and k-means clustering was then applied. Clusters with a high number of instances were subjected to open coding for labeling. The Silhouette indices ranged from 0.27 to 0.50. Analyses reveal that an automated Natural Language Processing (NLP) approach has the potential to handle huge volumes of tweets by clustering frequently occurring words and phrases. This complements the manual approach, which surfaces themes from a more manageable pool of tweets and allows a human expert to analyze them with more nuance. As an application, the themes identified during open coding were used as labels to train a classifier system. Future work could explore using topic models and focusing on specific content or issues, such as natural calamities and citizens' participation in addressing them.
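The description mentions k-means clustering evaluated with Silhouette indices but gives no implementation details. The following is a minimal sketch of that general technique, not the authors' pipeline. It assumes whitespace tokenization, raw term counts, and Euclidean distance; the function names (`vectorize`, `kmeans`, `silhouette`) and the sample strings are invented for illustration and are not drawn from the study's data.

```python
import math
import random
from collections import Counter


def vectorize(tweets):
    """Map each tweet to a term-count vector over a shared vocabulary."""
    tokenized = [t.lower().split() for t in tweets]
    vocab = sorted({w for toks in tokenized for w in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([float(counts[w]) for w in vocab])
    return vectors


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(vectors, k, iterations=20, seed=0):
    """Plain k-means: assign to nearest centroid, then recompute centroids."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        labels = [
            min(range(k), key=lambda c: distance(v, centroids[c]))
            for v in vectors
        ]
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def silhouette(vectors, labels):
    """Mean silhouette coefficient; singleton clusters score 0 by convention."""
    clusters = set(labels)
    scores = []
    for i, v in enumerate(vectors):
        own = [
            distance(v, w)
            for j, w in enumerate(vectors)
            if j != i and labels[j] == labels[i]
        ]
        if not own or len(clusters) < 2:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)  # mean distance within the point's own cluster
        b = min(                 # mean distance to the nearest other cluster
            sum(d) / len(d)
            for d in (
                [distance(v, w) for j, w in enumerate(vectors) if labels[j] == c]
                for c in clusters
                if c != labels[i]
            )
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Invented toy strings, for illustration only.
tweets = [
    "relief goods needed in tacloban",
    "send relief goods to tacloban",
    "donate to yolanda victims today",
    "please donate for yolanda victims",
]
vectors = vectorize(tweets)
labels = kmeans(vectors, k=2)
print(labels, round(silhouette(vectors, labels), 2))
```

Silhouette values fall between -1 and 1, with higher values indicating tighter, better-separated clusters. Because k-means depends on its initial centroids, a production setup would normally try several seeds and values of `k` before handing the largest clusters to human coders.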
format text
author Syliongka, Leif Romeritch
Oco, Nathaniel
Lam, Alron Jan
Soriano, Cheryll Ruth
Roldan, Ma. Divina Gracia
Magno, Francisco
Cheng, Charibeth
author_sort Syliongka, Leif Romeritch
title Combining automatic and manual approaches: Towards a framework for discovering themes in disaster-related tweets
publisher Animo Repository
publishDate 2015
url https://animorepository.dlsu.edu.ph/faculty_research/2745
_version_ 1715215744263782400