Deep learning techniques for information extraction

The explosive growth of unstructured textual information allows us to explore knowledge in depth and use it for our interests. Information extraction emerged as an effective solution to obtain accurate and timely information. The information extraction (IE) process involves extracting structured dat...

Bibliographic Details
Main Author: Zheng, Yandan
Other Authors: Jiang Xudong
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University 2024
Subjects: Computer and Information Science
Online Access: https://hdl.handle.net/10356/178783
Institution: Nanyang Technological University
id sg-ntu-dr.10356-178783
record_format dspace
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Computer and Information Science
description The explosive growth of unstructured text makes it possible to explore knowledge in depth and put it to use. Information extraction (IE) has emerged as an effective way to obtain accurate and timely information: it extracts structured data from unstructured text (e.g., news articles, government documents, social media posts, medical alerts, and patient records). IE comprises several subtasks, such as named entity recognition (NER), relation extraction (RE), and coreference resolution (CR). With the rapid development of deep learning, many supervised IE systems have emerged. However, current IE models require large volumes of labeled data, and producing those labels demands substantial manual effort. Semi-supervised learning (SSL) methods have been developed to reduce this cost, but generalization remains a challenge. Further, most current IE models disregard the interrelationships between subtasks and therefore capture global context in insufficient detail. This thesis presents a series of works aimed at solving these problems.
• We build a graph-based IE framework that facilitates interaction between multiple IE tasks and captures both local and global information. Graphs are constructed by selecting the most confident entity spans and linking them with confidence-weighted relation types and confidence-weighted coreference links. A span graph approach propagates span updates across both the coreference graph and the relation graph, so that useful information is learned from a broader context through stronger interaction across IE tasks. The input data are shared globally and the interaction between subtasks is fully exploited, which avoids cascading errors. Experiments demonstrate that the proposed multitask IE framework outperforms the state of the art on multiple information extraction tasks across a variety of datasets.
• We address the problem that current semi-supervised methods handle the two tasks (i.e., named entity recognition and relation extraction) separately and therefore ignore the cross-correlations between entity and relation instances, as well as the occurrence of similar instances in unlabeled data. We propose a Heterogeneous Graph-based Propagation framework for joint semi-supervised entity and relation extraction, which captures the global structure information between individual tasks and exploits interactions within unlabeled data. Specifically, we construct a unified span-based heterogeneous graph from entity and relation candidates and propagate class labels based on confidence scores. We then employ a propagation learning scheme to leverage the affinities between labeled and unlabeled samples. Experiments on benchmark datasets show that our framework outperforms state-of-the-art semi-supervised approaches on NER and RE tasks. We show that joint semi-supervised learning of the two tasks benefits from their codependency, and we validate the importance of utilizing the information shared among unlabeled data.
• We present a method that considers the semantics of relation types as well as the information contained within different relation groups. The method restricts the model’s attention to the semantic information contained within a relation. To achieve a more coherent semantic representation, we employ a contrastive learning strategy with group-wise and instance-wise perspectives. Under limited-annotation settings, the proposed group-wise contrastive learning minimizes discrepancies between the template and original sentences in the same label group and maximizes differences between those from separate label groups. Experiments on two public datasets demonstrate that our model achieves state-of-the-art results for low-resource relation extraction.
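The idea of propagating class labels over a span graph according to confidence scores, as described in the abstract, can be illustrated with a generic label propagation sketch. This is not the thesis's Heterogeneous Graph-based Propagation framework: the function name `propagate_labels`, the span names, the edge weights, and the fixed iteration count are all hypothetical and chosen only for illustration. Labeled spans keep their labels, and each unlabeled span repeatedly takes the confidence-weighted average of its neighbours' label distributions.

```python
from typing import Dict, List, Tuple


def propagate_labels(
    edges: Dict[Tuple[str, str], float],
    seeds: Dict[str, str],
    labels: List[str],
    iterations: int = 10,
) -> Dict[str, Dict[str, float]]:
    """Spread label scores from labeled (seed) nodes to unlabeled nodes.

    edges maps an undirected node pair to a confidence weight (hypothetical
    values); seeds maps labeled nodes to their known class.
    """
    # Build a symmetric weighted adjacency map.
    neighbors: Dict[str, Dict[str, float]] = {}
    for (u, v), weight in edges.items():
        neighbors.setdefault(u, {})[v] = weight
        neighbors.setdefault(v, {})[u] = weight
    for node in seeds:
        neighbors.setdefault(node, {})

    # Seeds start one-hot; unlabeled nodes start with a uniform distribution.
    scores: Dict[str, Dict[str, float]] = {}
    for node in neighbors:
        if node in seeds:
            scores[node] = {l: 1.0 if l == seeds[node] else 0.0 for l in labels}
        else:
            scores[node] = {l: 1.0 / len(labels) for l in labels}

    for _ in range(iterations):
        updated: Dict[str, Dict[str, float]] = {}
        for node, nbrs in neighbors.items():
            total = sum(nbrs.values())
            if node in seeds or total == 0:
                # Labeled nodes are clamped; isolated nodes cannot be updated.
                updated[node] = scores[node]
                continue
            # Confidence-weighted average of the neighbours' distributions.
            updated[node] = {
                l: sum(w * scores[n][l] for n, w in nbrs.items()) / total
                for l in labels
            }
        scores = updated
    return scores


if __name__ == "__main__":
    # Toy chain of four candidate spans; only the two ends are labeled.
    toy_edges = {
        ("span_a", "span_b"): 0.9,
        ("span_b", "span_c"): 0.8,
        ("span_c", "span_d"): 0.9,
    }
    toy_seeds = {"span_a": "PERSON", "span_d": "ORG"}
    result = propagate_labels(toy_edges, toy_seeds, ["PERSON", "ORG"])
    for span, dist in sorted(result.items()):
        print(span, max(dist, key=dist.get))
```

In this toy chain each unlabeled span ends up with the label of the seed it is most strongly connected to. A learned model would derive the edge weights from span representations instead of fixing them by hand.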
author2 Jiang Xudong
author_facet Jiang Xudong
Zheng, Yandan
format Thesis-Doctor of Philosophy
author Zheng, Yandan
author_sort Zheng, Yandan
title Deep learning techniques for information extraction
publisher Nanyang Technological University
publishDate 2024
url https://hdl.handle.net/10356/178783
_version_ 1806059928948834304
spelling sg-ntu-dr.10356-178783 2024-07-07T15:36:36Z Deep learning techniques for information extraction Zheng, Yandan Jiang Xudong Luu Anh Tuan Interdisciplinary Graduate School (IGS) HealthTech EXDJiang@ntu.edu.sg, anhtuan.luu@ntu.edu.sg Computer and Information Science Doctor of Philosophy 2024-07-04T06:38:57Z 2024-07-04T06:38:57Z 2024 Thesis-Doctor of Philosophy
Zheng, Y. (2024). Deep learning techniques for information extraction. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/178783
en This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). application/pdf Nanyang Technological University