Event extraction and beyond: from conventional NLP to large language models
Main Author: Zhou, Hanzhang
Other Authors: Mao Kezhi (supervisor)
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University, 2025
Subjects: Computer and Information Science; Event extraction; Large language models; Natural language processing; In-context learning
Online Access: https://hdl.handle.net/10356/182132
DOI: 10.32657/10356/182132
Institution: Nanyang Technological University
School: Interdisciplinary Graduate School (IGS); Institute of Catastrophe Risk Management (ICRM)
Citation: Zhou, H. (2024). Event extraction and beyond: from conventional NLP to large language models. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/182132
License: Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
Description:
The digital age has ushered in an era of unprecedented information explosion, characterized by the generation of vast volumes of text. This development has significantly heightened the demands for processing and analyzing large-scale text data. As a result, the technology of event extraction (EE) was developed to transform unstructured event information from text data into structured formats, facilitating the interpretation and application of event information in various domains.
However, extracting structured information from noisy and unstructured data presents several crucial challenges. Firstly, conventional event extraction research generally focuses on the sentence level, whereas real-world text data are typically at the document level. Thus, effectively and accurately extracting event information at the document level is both crucial and challenging. Secondly, conventional EE research necessitates a substantial amount of training data, which is particularly burdensome and costly due to the complexity inherent in EE data annotation.
Additionally, because EE is a typical real-world task, investigating it has led to the identification and resolution of common problems that exist across many natural language processing (NLP) domains. For example, the Universum class, often referred to as the Other or Miscellaneous class, is widespread in EE and in many other classification-based NLP tasks. The Universum class exhibits distinct properties; however, existing works often treat it equivalently to the classes of interest. We find that this treatment leads to issues such as overfitting, misclassification, and diminished model robustness. Furthermore, when applying large language models (LLMs) to EE tasks, we find that their effectiveness is often compromised by inherent biases, leading to prompt brittleness: sensitivity to design settings such as example selection, example order, and prompt formatting.
Moreover, given the revolutionary impact of LLMs since the release of ChatGPT in November 2022, our research has undergone a paradigm shift towards exploring their capabilities. Consequently, this thesis incorporates both traditional NLP methods based on supervised learning and the latest paradigms utilizing LLMs.
In this thesis, several works are presented to address the aforementioned challenges:
A document-level event argument extraction method utilizing graph neural networks. This method leverages redundant event information within documents along with coreference information to enhance accuracy in document-level EE.
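To make this concrete, below is a minimal PyTorch sketch of coreference-aware message passing over entity mentions, where coreferent mentions pool their evidence before argument-role classification. The layer design, dimensions, and toy graph are illustrative assumptions, not the architecture proposed in the thesis.

```python
import torch
import torch.nn as nn

class CorefGraphLayer(nn.Module):
    """One round of mean-aggregation message passing over a mention graph.

    Sketch only: coreferent mentions of the same entity exchange information
    so that argument evidence scattered across a document is pooled before
    role classification.
    """
    def __init__(self, dim):
        super().__init__()
        self.linear = nn.Linear(2 * dim, dim)

    def forward(self, h, adj):
        # h:   (num_mentions, dim) mention embeddings from any encoder
        # adj: (num_mentions, num_mentions) 0/1 coreference adjacency
        deg = adj.sum(-1, keepdim=True).clamp(min=1)
        neighbours = adj @ h / deg  # mean over coreferent mentions
        return torch.relu(self.linear(torch.cat([h, neighbours], dim=-1)))

class ArgumentRoleClassifier(nn.Module):
    """Scores each mention against a set of event argument roles."""
    def __init__(self, dim, num_roles, num_layers=2):
        super().__init__()
        self.layers = nn.ModuleList([CorefGraphLayer(dim) for _ in range(num_layers)])
        self.head = nn.Linear(dim, num_roles)

    def forward(self, h, adj):
        for layer in self.layers:
            h = layer(h, adj)
        return self.head(h)  # (num_mentions, num_roles)

# Toy usage: four mentions, mentions 0 and 2 corefer, five candidate roles.
h = torch.randn(4, 64)
adj = torch.tensor([[0., 0., 1., 0.],
                    [0., 0., 0., 0.],
                    [1., 0., 0., 0.],
                    [0., 0., 0., 0.]])
logits = ArgumentRoleClassifier(64, num_roles=5)(h, adj)
print(logits.shape)  # torch.Size([4, 5])
```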
A prompting strategy tailored for EE to alleviate the need for large-scale labeled data. We explore what LLMs learn from in-context learning (ICL), finding that LLMs learn task heuristics from demonstrations via ICL. Based on this insight, we propose a novel heuristic-driven prompting strategy.
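As a rough illustration of heuristic-driven prompting, the sketch below prepends an explicit task heuristic to ordinary in-context demonstrations before the query. The heuristic wording, example sentences, and event labels are hypothetical placeholders, not the prompts used in the thesis.

```python
# Hypothetical heuristic for an event-detection prompt; illustration only.
HEURISTIC = (
    "Heuristic: an event trigger is usually the single verb or noun that "
    "most directly expresses the occurrence; prefer concrete actions over "
    "auxiliary verbs."
)

# Hypothetical in-context demonstrations (text, trigger -> event type).
DEMONSTRATIONS = [
    ("The rebels attacked the convoy at dawn.", "attacked -> Conflict.Attack"),
    ("She was hired by the ministry last year.", "hired -> Personnel.Start-Position"),
]

def build_prompt(query: str) -> str:
    """Prepend an explicit task heuristic to the usual ICL demonstrations."""
    lines = [HEURISTIC, ""]
    for text, label in DEMONSTRATIONS:
        lines += [f"Text: {text}", f"Trigger: {label}", ""]
    lines += [f"Text: {query}", "Trigger:"]
    return "\n".join(lines)

print(build_prompt("Protesters marched through the capital."))
```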
A closed boundary learning framework designed to address the unique properties of the Universum class in classification tasks. We highlight an understudied problem regarding the Universum class. Then, we propose a method that applies closed decision boundaries to classes of interest and designates the area outside all closed boundaries in the feature space as the space of the Universum class.
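One simple way to realise closed boundaries, shown in the sketch below, is a learnable hypersphere per class of interest: a point inside some sphere takes that class, while a point outside every sphere falls to the Universum class. The hyperspherical form and all names here are assumptions for illustration and may differ from the thesis's formulation.

```python
import torch
import torch.nn as nn

class ClosedBoundaryHead(nn.Module):
    """Hyperspherical closed boundaries, one per class of interest (sketch).

    Each class of interest gets a centre and a learnable radius; a point
    falling outside every sphere is assigned to the Universum class rather
    than competing with the classes of interest directly.
    """
    def __init__(self, dim, num_classes):
        super().__init__()
        self.centres = nn.Parameter(torch.randn(num_classes, dim))
        self.log_radii = nn.Parameter(torch.zeros(num_classes))

    def forward(self, z):
        # z: (batch, dim) features; margin > 0 means inside that boundary
        dist = torch.cdist(z, self.centres)  # (batch, num_classes)
        return self.log_radii.exp() - dist

    def predict(self, z, universum_label=-1):
        margin = self(z)
        best_margin, best_class = margin.max(dim=-1)
        return torch.where(best_margin > 0, best_class,
                           torch.full_like(best_class, universum_label))

head = ClosedBoundaryHead(dim=32, num_classes=3)
z = torch.randn(8, 32)
print(head.predict(z))  # -1 marks points outside all closed boundaries
```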
An inference-only method that addresses the inherent bias of LLMs. We investigate how feedforward neural networks (FFNs) and attention heads give rise to bias in LLMs. To mitigate these effects, we introduce the UniBias method, which identifies and eliminates biased FFN vectors and attention heads.
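The toy sketch below illustrates the kind of projection-and-masking logic described above: random stand-in FFN value vectors are scored by the label distribution they induce through the unembedding matrix, and vectors with a strongly skewed label preference are flagged for elimination. All shapes, token ids, and the skew threshold are invented for illustration; this is not the actual UniBias procedure.

```python
import torch

hidden, vocab = 64, 1000
label_token_ids = [11, 42, 77]         # hypothetical label-word token ids

unembed = torch.randn(vocab, hidden)   # stand-in LM head (vocab, hidden)
ffn_values = torch.randn(128, hidden)  # stand-in FFN value vectors

# Score each FFN vector by the label distribution it induces on its own.
label_logits = ffn_values @ unembed.T[:, label_token_ids]  # (128, 3)
label_probs = label_logits.softmax(dim=-1)

# A vector is "biased" if its induced label distribution is far from uniform.
uniform = torch.full((len(label_token_ids),), 1.0 / len(label_token_ids))
skew = (label_probs - uniform).abs().sum(dim=-1)
biased = skew > skew.mean() + skew.std()  # illustrative threshold

print(f"flagging {int(biased.sum())} of {len(ffn_values)} FFN vectors")
# At inference time, flagged vectors would be zeroed out of the FFN update.
```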