Deep learning techniques for information extraction
Main Author:
Other Authors:
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University, 2024
Subjects:
Online Access: https://hdl.handle.net/10356/178783
Institution: Nanyang Technological University
Summary: The explosive growth of unstructured textual information allows us to explore knowledge in depth and use it to serve our interests. Information extraction (IE) has emerged as an effective solution for obtaining accurate and timely information: the IE process extracts structured data from unstructured text (e.g., news articles, government documents, social media posts, medical alerts, and patient records). IE comprises several sub-tasks, such as named entity recognition (NER), relation extraction (RE), and coreference resolution (CR). With the rapid development of deep learning technologies, many supervised IE systems have emerged. However, current IE models require large volumes of labeled data and therefore substantial annotation effort. Despite the development of semi-supervised learning (SSL) methods, generalization remains a challenge. Furthermore, most current IE models disregard the interrelationships between sub-tasks, which leads to insufficient use of global context. This thesis presents a series of works aimed at solving the above-mentioned problems.
• We focus on building a graph-based IE framework that facilitates interaction between multiple IE tasks and captures both local and global information. Graphs are constructed by selecting the most confident entity spans and coupling them with confidence-weighted relation types and confidence-weighted coreference links. In addition, a span-graph approach is employed in which span updates are propagated across both the coreference graph and the relation graph, allowing useful information to be learned from a broader context by strengthening the interaction across different IE tasks. The input data are globally shared and the interaction between sub-tasks is fully exploited, avoiding cascading errors. Experiments demonstrate that the proposed multi-task IE framework outperforms the state of the art on multiple information extraction tasks spanning a variety of datasets (a minimal sketch of confidence-weighted span-graph propagation is given after this list).
• We alleviate the problem that current semi-supervised works handle the two tasks (i.e., Named Entity Recognition and Relation Extraction) separately, thereby ignoring the cross-correlations between entity and relation instances as well as the occurrence of similar instances in unlabeled data. We propose a Heterogeneous Graph-based Propagation framework for joint semi-supervised entity and relation extraction, which captures the global structural information between the individual tasks and exploits interactions within the unlabeled data. Specifically, we construct a unified span-based heterogeneous graph from entity and relation candidates and propagate class labels based on confidence scores. We then employ a propagation learning scheme to leverage the affinities between labeled and unlabeled samples. Experiments on benchmark datasets show that our framework outperforms state-of-the-art semi-supervised approaches on NER and RE tasks. We show that joint semi-supervised learning of the two tasks benefits from their codependency, and we validate the importance of utilizing the shared information in unlabeled data (a minimal label-propagation sketch follows this list).
• We present a method that considers the semantics of relation types as well as the information contained within different relation groups. In the proposed method, we limit the model's attention to the semantic information contained within a relation. To achieve a more coherent semantic representation, we employ a contrastive learning strategy based on group-wise and instance-wise perspectives. Under limited-annotation settings, the proposed group-wise contrastive learning minimizes discrepancies between the template and original sentences in the same label group and maximizes differences between those from separate label groups. Experiments on two public datasets demonstrate that our model achieves state-of-the-art results for low-resource relation extraction (a minimal group-wise contrastive loss sketch appears after this list).
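The first contribution describes span updates propagated over confidence-weighted relation and coreference graphs. The following is a minimal illustrative sketch of that general idea, not the thesis implementation; the class name SpanGraphLayer, the hidden size, the number of relation types, and the gated update are assumptions introduced here for illustration.

```python
# Hypothetical sketch: confidence-weighted span-graph propagation, where each
# span representation is updated with messages from a relation graph and a
# coreference graph. Shapes and layer choices are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpanGraphLayer(nn.Module):
    def __init__(self, hidden: int, n_rel: int):
        super().__init__()
        self.rel_scorer = nn.Bilinear(hidden, hidden, n_rel)  # relation-type confidences
        self.coref_scorer = nn.Bilinear(hidden, hidden, 1)    # coreference confidences
        self.rel_emb = nn.Embedding(n_rel, hidden)            # one vector per relation type
        self.gate = nn.Linear(2 * hidden, hidden)

    def forward(self, spans: torch.Tensor) -> torch.Tensor:
        # spans: (k, hidden) representations of the k most confident entity spans
        k, h = spans.shape
        a = spans.unsqueeze(1).expand(k, k, h)
        b = spans.unsqueeze(0).expand(k, k, h)

        # Relation graph: confidence-weighted mixture of relation-type embeddings.
        rel_conf = F.softmax(self.rel_scorer(a.reshape(-1, h), b.reshape(-1, h)), dim=-1)
        rel_msg = (rel_conf @ self.rel_emb.weight).view(k, k, h).mean(dim=1)

        # Coreference graph: confidence-weighted average over candidate antecedents.
        coref_conf = F.softmax(self.coref_scorer(a.reshape(-1, h), b.reshape(-1, h)).view(k, k), dim=-1)
        coref_msg = coref_conf @ spans

        # Gated residual update lets each span absorb context from both graphs.
        update = torch.tanh(self.gate(torch.cat([rel_msg, coref_msg], dim=-1)))
        return spans + update

spans = torch.randn(8, 256)                 # 8 candidate spans, hidden size 256
layer = SpanGraphLayer(hidden=256, n_rel=5)
print(layer(spans).shape)                   # torch.Size([8, 256])
```

Stacking such a layer would let information flow between the coreference and relation views before the final task-specific classifiers, which is the interaction the abstract describes.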
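The second contribution propagates class labels over a graph of labeled and unlabeled candidates using confidence scores. Below is a minimal sketch of generic graph-based label propagation under those assumptions; the cosine kNN affinity graph, the alpha parameter, and the toy features are illustrative, not the thesis framework.

```python
# Hypothetical sketch: confidence-based label propagation over a unified graph
# of entity/relation candidates (labeled nodes seed the propagation).
import torch
import torch.nn.functional as F

def propagate_labels(feats, labels, n_classes, alpha=0.9, k=5, iters=20):
    """feats: (n, d) candidate features; labels: (n,) class ids, -1 = unlabeled."""
    n = feats.size(0)
    z = F.normalize(feats, dim=-1)
    sim = z @ z.T                                    # cosine affinities
    sim.fill_diagonal_(0.0)

    # Sparsify: keep the k strongest edges per node, then symmetrically normalize.
    topk = sim.topk(k, dim=-1)
    adj = torch.zeros_like(sim).scatter_(1, topk.indices, topk.values.clamp(min=0))
    adj = 0.5 * (adj + adj.T)
    deg = adj.sum(-1, keepdim=True).clamp(min=1e-8)
    s = adj / deg.sqrt() / deg.sqrt().T

    # Seed matrix: one-hot rows for labeled nodes, zeros for unlabeled ones.
    y = torch.zeros(n, n_classes)
    mask = labels >= 0
    y[mask] = F.one_hot(labels[mask], n_classes).float()

    # Iterative propagation: F <- alpha * S @ F + (1 - alpha) * Y.
    f = y.clone()
    for _ in range(iters):
        f = alpha * (s @ f) + (1 - alpha) * y
    conf, pseudo = f.softmax(-1).max(-1)             # confidence-scored pseudo-labels
    return pseudo, conf

feats = torch.randn(12, 64)                          # 12 candidates (spans or span pairs)
labels = torch.tensor([0, 1, 2, -1, -1, -1, -1, -1, -1, -1, -1, -1])
pseudo, conf = propagate_labels(feats, labels, n_classes=3)
print(pseudo.tolist())
```

In a joint setting, entity and relation candidates would share one heterogeneous graph so that confident predictions from one task can inform the other, which is the codependency the abstract emphasizes.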
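The third contribution uses group-wise contrastive learning over template and original sentences that share a relation label. The sketch below is a generic supervised-contrastive loss consistent with that description; the temperature, the paired encodings, and the function name are assumptions made for illustration.

```python
# Hypothetical sketch: a group-wise contrastive loss that pulls sentence and
# template encodings with the same relation label together and pushes apart
# encodings from different label groups.
import torch
import torch.nn.functional as F

def group_contrastive_loss(sent_emb, templ_emb, labels, tau=0.1):
    """sent_emb, templ_emb: (n, d) paired encodings; labels: (n,) relation ids."""
    z = F.normalize(torch.cat([sent_emb, templ_emb], dim=0), dim=-1)   # (2n, d)
    y = torch.cat([labels, labels], dim=0)                             # (2n,)
    sim = z @ z.T / tau
    n2 = z.size(0)
    self_mask = torch.eye(n2, dtype=torch.bool)
    sim = sim.masked_fill(self_mask, float('-inf'))                    # exclude self-pairs

    # Positives: every other view that carries the same relation label.
    pos = (y.unsqueeze(0) == y.unsqueeze(1)) & ~self_mask
    log_prob = sim - torch.logsumexp(sim, dim=-1, keepdim=True)
    loss = -(log_prob.masked_fill(~pos, 0.0)).sum(-1) / pos.sum(-1).clamp(min=1)
    return loss.mean()

sent = torch.randn(6, 128)                  # 6 sentences
templ = torch.randn(6, 128)                 # their label-templated counterparts
labels = torch.tensor([0, 0, 1, 1, 2, 2])
print(group_contrastive_loss(sent, templ, labels))
```

An instance-wise term, as mentioned in the abstract, could be added by restricting positives to each sentence-template pair; the group-wise term shown here is what enforces coherence within a label group under limited annotation.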