Event extraction and beyond: from conventional NLP to large language models

Bibliographic Details
Main Author: Zhou, Hanzhang
Other Authors: Mao Kezhi
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University 2025
Subjects: Computer and Information Science; Event extraction; Large language models; Natural language processing; In-context learning
Online Access:https://hdl.handle.net/10356/182132
Institution: Nanyang Technological University
Description: The digital age has ushered in an era of unprecedented information explosion, characterized by the generation of vast volumes of text. This development has significantly heightened the demand for processing and analyzing large-scale text data. Event extraction (EE) was developed to transform unstructured event information in text into structured formats, facilitating the interpretation and application of event information across domains. However, extracting structured information from noisy, unstructured data presents several crucial challenges. First, conventional event extraction research generally focuses on the sentence level, whereas real-world text data are typically at the document level; effectively and accurately extracting event information at the document level is therefore both crucial and challenging. Second, conventional EE research requires a substantial amount of training data, which is particularly burdensome and costly given the complexity of EE annotation. Additionally, as a typical real-world task, investigating EE has led to the identification and resolution of problems common to many natural language processing (NLP) domains. For example, the Universum class, often called the Other or Miscellaneous class, is widespread in EE and in many classification-based NLP tasks. The Universum class has distinct properties, yet existing works often treat it the same as the classes of interest; we find that this treatment leads to overfitting, misclassification, and diminished model robustness. Furthermore, when applying large language models (LLMs) to EE tasks, we find that their effectiveness is often compromised by inherent biases, leading to prompt brittleness: sensitivity to design settings such as example selection, example order, and prompt formatting. Following the revolutionary impact of LLMs since the release of ChatGPT in November 2022, our research has undergone a paradigm shift towards exploring the capabilities of LLMs; consequently, this thesis incorporates both traditional supervised NLP methods and the latest LLM-based paradigms.

To address these challenges, the thesis presents the following works; illustrative code sketches of each follow this description:

- A document-level event argument extraction method based on graph neural networks. This method leverages redundant event information within a document, together with coreference information, to improve accuracy in document-level EE.
- A prompting strategy tailored to EE that alleviates the need for large-scale labeled data. We explore what LLMs learn from in-context learning (ICL), finding that they learn task heuristics from demonstrations; based on this insight, we propose a novel heuristic-driven prompting strategy.
- A closed boundary learning framework designed for the unique properties of the Universum class in classification tasks. We highlight an understudied problem regarding the Universum class, then propose a method that applies closed decision boundaries to the classes of interest and designates the region of feature space outside all closed boundaries as the space of the Universum class.
- An inference-only method that addresses the inherent bias of LLMs. We investigate how feedforward neural networks (FFNs) and attention heads give rise to LLM bias and, to mitigate these effects, introduce the UniBias method, which effectively identifies and eliminates biased FFN vectors and attention heads.
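To make the first contribution concrete, below is a minimal Python sketch of the document-level aggregation idea: entity mentions across sentences become graph nodes, coreference links become edges, and one round of mean message passing pools redundant evidence about the same argument. The adjacency construction and single-round mean aggregation are simplifying assumptions for illustration; the thesis's actual graph neural network is more involved.

import numpy as np

# Toy sketch of document-level aggregation: mentions are nodes,
# coreference links are edges, and one round of mean message passing
# pools evidence about the same argument scattered across sentences.
def message_pass(node_feats, edges):
    """One round of mean aggregation over neighbors (incl. self-loop)."""
    n = node_feats.shape[0]
    adj = np.eye(n)
    for i, j in edges:                       # undirected coreference edges
        adj[i, j] = adj[j, i] = 1.0
    adj /= adj.sum(axis=1, keepdims=True)    # row-normalize
    return adj @ node_feats

# Example: mentions 0 and 2 corefer, so their features are pooled.
feats = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
print(message_pass(feats, [(0, 2)]))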
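For the heuristic-driven prompting strategy, the sketch below shows how a task heuristic and demonstrations might be assembled into an in-context-learning prompt for event extraction. The heuristic wording and the demonstration are invented placeholders, not the prompts used in the thesis.

# Minimal sketch of a heuristic-driven ICL prompt for event extraction.
# The task heuristic and demonstration below are illustrative only.
TASK_HEURISTIC = (
    "Identify the event trigger first, then extract each argument "
    "by asking who did what to whom, where, and when."
)

DEMONSTRATIONS = [
    {
        "text": "The company acquired the startup in 2021.",
        "extraction": "trigger=acquired; buyer=The company; "
                      "target=the startup; time=2021",
    },
]

def build_prompt(query_text: str) -> str:
    """Assemble heuristic, demonstrations, and the query into one prompt."""
    parts = ["Task heuristic: " + TASK_HEURISTIC, ""]
    for demo in DEMONSTRATIONS:
        parts.append("Text: " + demo["text"])
        parts.append("Extraction: " + demo["extraction"])
        parts.append("")
    parts.append("Text: " + query_text)
    parts.append("Extraction:")
    return "\n".join(parts)

print(build_prompt("Rebels attacked a convoy near the border on Friday."))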
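The closed boundary learning idea can be illustrated with a toy NumPy sketch: each class of interest is enclosed by a closed region (here a hypersphere around its centroid with a percentile-based radius), and any feature vector falling outside every region is assigned to the Universum class. The spherical boundary and the radius rule are assumptions made for brevity, not the thesis's exact formulation.

import numpy as np

# Toy closed-boundary classifier: each class of interest gets a
# hypersphere (centroid + radius); points outside every sphere are
# labeled as the Universum ("Other") class.
def fit_boundaries(features, labels, radius_percentile=95):
    """Per class: centroid and a radius covering most training points."""
    boundaries = {}
    for cls in np.unique(labels):
        pts = features[labels == cls]
        centroid = pts.mean(axis=0)
        dists = np.linalg.norm(pts - centroid, axis=1)
        boundaries[cls] = (centroid, np.percentile(dists, radius_percentile))
    return boundaries

def predict(x, boundaries, universum_label=-1):
    """Assign x to the nearest enclosing class, else to the Universum."""
    best, best_dist = universum_label, np.inf
    for cls, (centroid, radius) in boundaries.items():
        dist = np.linalg.norm(x - centroid)
        if dist <= radius and dist < best_dist:
            best, best_dist = cls, dist
    return best

# Example: two classes of interest; a point between them that lies
# outside both boundaries falls to the Universum class (-1).
rng = np.random.default_rng(0)
feats = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(6, 1, (50, 2))])
labels = np.array([0] * 50 + [1] * 50)
b = fit_boundaries(feats, labels)
print(predict(np.array([3.0, 3.0]), b))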
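Finally, a toy sketch of the debiasing idea behind UniBias: probe internal components (abstracted here as per-head contributions to the label logits) on content-free inputs, flag components whose label preference deviates strongly from uniform, and drop them at inference. The content-free probing and the skew threshold are illustrative assumptions; the actual method identifies and eliminates biased FFN vectors and attention heads inside the LLM.

import numpy as np

# Toy debiasing sketch: flag components whose label distribution on
# content-free inputs is skewed, then exclude them from the logit sum.
def find_biased_heads(head_logits_on_neutral, threshold=0.2):
    """head_logits_on_neutral: (num_heads, num_labels) mean logit each
    component contributes to each label on content-free inputs."""
    probs = np.exp(head_logits_on_neutral)
    probs /= probs.sum(axis=1, keepdims=True)
    uniform = 1.0 / probs.shape[1]
    skew = np.abs(probs - uniform).max(axis=1)   # worst-case deviation
    return skew > threshold                      # True = biased component

def debiased_logits(head_logits, biased_mask):
    """Sum label logits over components, dropping the flagged ones."""
    return head_logits[~biased_mask].sum(axis=0)

# Example: the first component strongly prefers label 0 and is removed.
heads = np.array([[2.0, 0.1, 0.1],
                  [0.5, 0.5, 0.5],
                  [0.4, 0.6, 0.5]])
mask = find_biased_heads(heads)
print(mask, debiased_logits(heads, mask))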
Degree: Doctor of Philosophy
Affiliations: Interdisciplinary Graduate School (IGS); Institute of Catastrophe Risk Management (ICRM); EKZMao@ntu.edu.sg
Issued: 2024
Citation: Zhou, H. (2024). Event extraction and beyond: from conventional NLP to large language models. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/182132
DOI: 10.32657/10356/182132
License: This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).
Format: application/pdf