Neural networks based pattern classification system for information extraction on disaster news

Bibliographic Details
Main Author:	Li, Qi
Other Authors:	Mao Kezhi
Format:	Thesis-Doctor of Philosophy
Language:	English
Published:	Nanyang Technological University 2022
Subjects:	Engineering::Computer science and engineering::Computing methodologies::Pattern recognition
Online Access:	https://hdl.handle.net/10356/156198
Institution:	Nanyang Technological University

Description
Summary:	As the volume of online disaster news grows, a common need when analyzing this data is to obtain disaster information from the news articles. This can be treated as an information extraction problem in Natural Language Processing (NLP). In recent years, the rapid growth of NLP has made it possible to address information extraction on disaster news with neural-network-based pattern classification systems. In this thesis, we decompose the information extraction system into three NLP pattern classification sub-tasks: sentence classification, named entity recognition and typing, and event argument identification. In our constructed system, we design several novel neural networks that solve these pattern classification tasks by exploiting the characteristics of disaster news. Firstly, for sentence classification, news articles are usually full of disaster-irrelevant words, which poses a great challenge to neural networks because of the over-fitting problem. Our investigation found that Convolutional Neural Networks (CNNs) may misfit to disaster-irrelevant words in the news, which leads to unsatisfactory performance. To alleviate this problem, an attention mechanism can be integrated into CNNs, but this takes up limited resources. We propose to address the misfitting problem from a novel angle: pruning disaster-irrelevant words from the news. The proposed method applies a Recursive Data-Pruning strategy to a CNN (ReDP-CNN), which evaluates each convolutional filter by the discriminative power of the feature it generates at the pooling layer, and prunes the words captured by poorly performing convolutional filters. Experimental results show that our proposed model significantly outperforms the CNN baseline model.
Moreover, our proposed model achieves performance similar to or better than the benchmark models (attention-integrated CNNs) while requiring fewer parameters and FLOPs, and is therefore a suitable choice for resource-limited scenarios such as mobile applications. Secondly, for named entity typing, news articles commonly convey their rich content through large numbers of named entities. Among these entities, only a small portion is related to the disaster of interest, while the large remainder conveys disaster-unrelated information. This gives rise to a novel named entity typing problem: the fine-grained classification of disaster-related entities that co-exist with disaster-unrelated entities in a news report. The traditional pipeline framework decomposes this problem into two sub-tasks, filtering of disaster-unrelated entities and fine-grained classification of disaster-related entities, and addresses them sequentially. In our system, we propose an end-to-end framework that solves the two sub-tasks simultaneously. To build the end-to-end framework, an Improved Radial Basis Function (ImRBF) classifier with a novel training scheme is developed to jointly solve disaster-unrelated entity filtering and fine-grained classification of disaster-related entities. Because the framework is end-to-end, the interaction between disaster-related and disaster-unrelated entities can be captured in Mention-Mention (MM) relationship learning to produce more discriminative features. With these two novel network structures, the proposed end-to-end model is named MM-ImRBF. Experimental evaluation shows that our proposed end-to-end model significantly outperforms the pipeline methods. Thirdly, for event argument identification, news articles have the characteristic that several documents often report on one specific event, which results in strong cross-document interaction among the arguments of that event.
Traditional methods rely heavily on the context information within a single document or on the relationships between different events across documents, and so cannot make full use of the cross-document interaction of a single event. Therefore, we propose a novel cross-document feature called the Argument Frequency Vector (AFV) to solve this problem. We build the AFV feature on the high frequency with which the same event argument appears across documents. The argument frequency is encoded as a cross-document feature by our proposed frequency statistics over the documents that focus on one event. Experiments show that our proposed cross-document AFV feature is complementary to single-document features, and hence applying the two together yields better performance than using either feature alone. In summary, this thesis presents a pattern classification system that uses neural networks to solve information extraction on disaster news. Based on the unique characteristics of disaster news articles, we propose ReDP-CNN, MM-ImRBF, and the cross-document AFV feature for the sub-tasks of sentence classification, named entity typing, and event argument identification in our constructed system.
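
The summary does not say how the argument frequency is encoded, so the following is only a minimal illustrative sketch of the general idea behind a cross-document frequency feature, not the thesis's actual AFV construction. It assumes that the documents reporting one event have already been grouped, that candidate arguments are given as strings, and that a case-insensitive substring match is an acceptable stand-in for real argument matching. The function name argument_frequency_vector and its parameters are hypothetical.

```python
from collections import Counter


def argument_frequency_vector(candidates, event_documents):
    """Return, for each candidate argument, the fraction of the
    event's documents that mention it (a simple document frequency).

    Assumption: a case-insensitive substring match counts as a mention.
    This is a stand-in, not the frequency statistics used in the thesis.
    """
    n_docs = len(event_documents)
    if n_docs == 0:
        # No documents for this event: every frequency is zero.
        return [0.0 for _ in candidates]

    doc_counts = Counter()
    for doc in event_documents:
        text = doc.lower()
        # Count each candidate at most once per document.
        mentioned = {c for c in candidates if c.lower() in text}
        doc_counts.update(mentioned)

    return [doc_counts[c] / n_docs for c in candidates]


# Hypothetical documents that all report the same event.
docs = [
    "An earthquake struck Nepal on Saturday.",
    "Nepal earthquake kills hundreds in Kathmandu.",
    "Rescue teams reach Kathmandu after the quake in Nepal.",
]
candidates = ["Nepal", "Kathmandu", "India"]
afv = argument_frequency_vector(candidates, docs)
```

In a sketch like this, a candidate mentioned by most documents about the event receives a high value, and such a value could be concatenated with single-document features before classification, in line with the complementarity reported in the summary.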
