Information extraction of hazard events
Format: | Final Year Project |
---|---|
Language: | English |
Published: | Nanyang Technological University, 2021 |
Online Access: | https://hdl.handle.net/10356/149108 |
Institution: | Nanyang Technological University |
Summary: | Information Extraction (IE) is the process of extracting structured information from unstructured text. Since news articles on hazard events contain useful information and are usually reported in real time, identifying and extracting such information would allow the government and emergency response teams to better allocate resources and support to affected areas.
In this project, several deep learning models were explored to identify occurrences of information such as Deaths, Injury, Location, Date and Time in sentences from hazard-event news. News articles on hazard events such as attacks, earthquakes, typhoons, hurricanes and road accidents were first identified and filtered using keywords. Next, sentences of interest from these articles were isolated and labelled to form the hazard events database. The labelled training data was then used to train deep neural network models. Two schemes were explored in this project.
In the first scheme, a single model was trained to handle multi-class samples, while in the second scheme, multiple binary classifiers were trained. The results of the two schemes were discussed and compared. Finally, information such as location, date and time was extracted using spaCy’s named entity recognition. |
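The keyword filtering step and the two labelling schemes described in the summary can be illustrated with a minimal sketch. This is not the project's code: the keyword list, the function names, and the assumptions that Date and Time are separate labels and that each sentence carries exactly one label are all hypothetical, chosen only to show how the two schemes frame the same labelled data differently.

```python
import re

# Hypothetical keyword list; the record does not give the project's actual keywords.
HAZARD_KEYWORDS = ["attack", "earthquake", "typhoon", "hurricane", "road accident"]

# Labels named in the summary; treating Date and Time as separate labels is an assumption.
LABELS = ["Deaths", "Injury", "Location", "Date", "Time"]

# Case-insensitive whole-word match, allowing a simple plural "s".
_KEYWORD_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in HAZARD_KEYWORDS) + r")s?\b",
    re.IGNORECASE,
)


def is_hazard_article(text):
    """Return True if the text mentions at least one hazard keyword."""
    return _KEYWORD_RE.search(text) is not None


def multiclass_target(label):
    """Scheme 1: a single model predicts one class id per sentence."""
    return LABELS.index(label)


def binary_targets(label):
    """Scheme 2: one 0/1 target per label, each for its own binary classifier."""
    return {name: int(name == label) for name in LABELS}
```

Under these assumptions, a sentence labelled `Location` becomes the class id `2` for the single multi-class model, and a set of five 0/1 targets (only `Location` set to 1) for the binary classifiers. The neural network models themselves and the spaCy extraction step are outside the scope of this sketch.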