GENERATING OF AUTOMATIC DISASTER HASHTAG BASED ON OCHA STANDARD

Bibliographic Details
Main Author: WIATI GUSTI - NIM: 23515041 , KHARISMA
Format: Theses
Language: Indonesia
Online Access: https://digilib.itb.ac.id/gdl/view/28476
Institution: Institut Teknologi Bandung
Description
Summary: Twitter has been widely used as a communication tool for emergency response when disasters occur in various countries. Emergency response teams and researchers use hashtags to search for disaster-related information. However, disaster hashtags on Twitter do not follow a standard format, which makes it difficult to search for and collect data for emergency response. OCHA (Office for the Coordination of Humanitarian Affairs) has proposed standardizing hashtags for emergency response.

This study proposes automatically generating disaster hashtags in accordance with the OCHA standard. A total of 2,685 tweets were preprocessed, yielding a clean dataset of 1,309 tweets. The study uses word representations from the Skip-gram model and the SMOTE filter to handle the imbalanced dataset. Tweets are then classified into the categories emergency, non-emergency, and other using several classifiers: Naïve Bayes, Support Vector Machine, Instance-Based Learning, and Logistic Regression.

Tweets in the emergency and non-emergency categories are used for named entity recognition of disaster names and disaster locations. The 257 relevant tweets were tokenized and labeled using the BIO scheme, and the resulting 3,856 tokens serve as input to named entity recognition with a Conditional Random Field (CRF) model. Hashtags are then generated automatically from the results of the classification and the named entity recognition.

The results show that using the Skip-gram model can improve accuracy. The highest average accuracy, 83.9695%, was obtained with instance-based learning with k = 15. Named entity recognition achieved 70.3% recall, 89.4% precision, and a 77.1% F-measure. Automatic hashtag generation performed well, with an average of 61.2% recall, 87.4% precision, and a 66.9% F-measure.
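
As a rough illustration of the last step described in the summary, the sketch below groups BIO-labeled tokens into entity spans and turns each span into a hashtag. It is a minimal sketch under stated assumptions, not the thesis's implementation: the label names `DIS` and `LOC`, the helper names `bio_to_entities` and `to_hashtag`, the sample tweet, and the camel-case hashtag format are all hypothetical, and the format shown is not claimed to match the OCHA standard's exact rules. The labels are hard-coded here where the study would take them from the CRF model.

```python
from typing import List, Tuple


def bio_to_entities(tokens: List[str], labels: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-labeled tokens into (entity_type, text) spans."""
    entities: List[Tuple[str, str]] = []
    current_type = None
    current_tokens: List[str] = []
    for token, label in zip(tokens, labels):
        if label.startswith("B-"):
            # A B- label always starts a new span; flush any open one first.
            if current_tokens:
                entities.append((current_type, " ".join(current_tokens)))
            current_type = label[2:]
            current_tokens = [token]
        elif label.startswith("I-") and current_type == label[2:]:
            # An I- label continues the open span of the same type.
            current_tokens.append(token)
        else:
            # "O" (or an I- label with no matching open span) closes the span.
            if current_tokens:
                entities.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = None, []
    if current_tokens:
        entities.append((current_type, " ".join(current_tokens)))
    return entities


def to_hashtag(text: str) -> str:
    """Join the words of an entity into one camel-case hashtag (assumed format)."""
    words = [w for w in text.split() if w.isalnum()]
    return "#" + "".join(w.capitalize() for w in words)


# Hypothetical tweet ("Flood in South Jakarta today") with hand-written labels.
tokens = ["Banjir", "di", "Jakarta", "Selatan", "hari", "ini"]
labels = ["B-DIS", "O", "B-LOC", "I-LOC", "O", "O"]

entities = bio_to_entities(tokens, labels)
hashtags = [to_hashtag(text) for _, text in entities]
print(entities)
print(hashtags)
```

In a full pipeline, the tweet category from the classifier would presumably also influence which hashtag is emitted; that part is omitted here because the summary does not describe how the two results are combined.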