Information extraction of hazard events

Bibliographic Details
Main Author: Lim, Joanna Jia Yi
Other Authors: Mao Kezhi
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2021
Subjects:
Online Access: https://hdl.handle.net/10356/149108
Institution: Nanyang Technological University
Description
Summary: Information Extraction (IE) is the process of extracting structured information from unstructured text. Because news articles on hazard events contain useful information and are usually reported in real time, identifying and extracting that information would allow government and emergency response teams to better allocate resources and support to affected areas. In this project, several deep learning models were explored to identify occurrences of information such as Deaths, Injury, Location, Date and Time in sentences from hazard-event news. News articles on hazard events such as attacks, earthquakes, typhoons, hurricanes and road accidents were first identified and filtered using keywords. Next, sentences of interest from these articles were isolated and labelled to form the hazard events database. The labelled training data was then used to train deep neural network models. Two schemes were explored: in the first, a single model was trained to handle multi-class samples, while in the second, multiple binary classifiers were trained. The results of the two schemes were then discussed and compared. Finally, information such as location, date and time was extracted using spaCy’s named entity recognition.
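The data-preparation steps described in the summary (keyword filtering, sentence isolation, and labelling for the two classification schemes) might look roughly like the sketch below. It is an illustration only, not the project's code: the keyword list, the regex-based sentence splitter, the function names, and the assumption that each sentence carries exactly one of the five labels are all hypothetical, since the record does not specify them.

```python
import re
from typing import Dict, List

# Hypothetical keyword list based on the event types named in the summary;
# the project's actual keywords are not given in this record.
HAZARD_KEYWORDS = ("attack", "earthquake", "typhoon", "hurricane", "road accident")

# Match any keyword as a whole word, optionally pluralised, ignoring case.
_KEYWORD_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in HAZARD_KEYWORDS) + r")s?\b",
    re.IGNORECASE,
)

# Information categories named in the summary.
LABELS = ("Deaths", "Injury", "Location", "Date", "Time")


def is_hazard_article(text: str) -> bool:
    """Return True if the article text mentions at least one hazard keyword."""
    return _KEYWORD_RE.search(text) is not None


def split_sentences(text: str) -> List[str]:
    """Naive splitter (an assumption): break after '.', '!' or '?' plus whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def multiclass_target(label: str) -> int:
    """Scheme 1 (assumed encoding): one model, one integer class index per sentence."""
    return LABELS.index(label)


def binary_targets(label: str) -> Dict[str, int]:
    """Scheme 2 (assumed encoding): one 0/1 target per label, one classifier each."""
    return {name: int(name == label) for name in LABELS}
```

Under these assumptions, a sentence labelled Injury would be encoded as a single class index for the first scheme and as five separate 0/1 targets for the second, one per binary classifier.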