Document level relationship extraction

Bibliographic Details
Main Author: Leong, Marcus Yu Zhen
Other Authors: Lihui Chen
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2023
Online Access: https://hdl.handle.net/10356/167545
Institution: Nanyang Technological University
Description
Summary: Document-level relationship extraction (RE) takes multiple sentences as input to an RE model and outputs a relationship between entities that cannot be determined from the context of a single sentence alone. This is a more challenging task because it requires analysing context from multiple sentences. The current baseline method uses a BiLSTM model to encode the entire document. However, the BiLSTM must be trained from scratch, and it cannot accurately capture the intricacies between entities when trained only on the given dataset. To better capture the context of the interaction, we propose incorporating a state-of-the-art RoBERTa-Large model, a variant of BERT that is pretrained on a corpus an order of magnitude larger than the original BERT corpus, and fine-tuning it further on the dataset. Additionally, we limit the input to the encoder to only three sentences rather than the whole document, since a recent study showed that most entity relationships (ER) can be inferred using context from only three sentences of a document. The proposed changes reduce the memory required to process the input, increase the accuracy of the predicted ER, and improve the transferability of the model to input from domains not found in the training corpus.
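The idea of restricting the encoder input to three sentences can be illustrated with a minimal Python sketch. The summary does not say how the sentences are chosen, so the selection heuristic here is an assumption: prefer sentences that mention both entities, then alternate between sentences that mention only the head and only the tail. The names `select_sentences` and `build_encoder_input` are hypothetical, and the mention positions are assumed to be given as sentence indices.

```python
from typing import List, Sequence


def select_sentences(
    head_sents: Sequence[int],
    tail_sents: Sequence[int],
    max_sents: int = 3,
) -> List[int]:
    """Pick up to `max_sents` sentence indices for one entity pair.

    Assumed heuristic (not taken from the project): sentences mentioning
    both entities come first, then head-only and tail-only sentences are
    taken alternately so that both entities stay represented.
    """
    head, tail = set(head_sents), set(tail_sents)
    both = sorted(head & tail)
    head_only = sorted(head - tail)
    tail_only = sorted(tail - head)

    chosen = both[:max_sents]
    i = 0
    while len(chosen) < max_sents and (i < len(head_only) or i < len(tail_only)):
        if i < len(head_only):
            chosen.append(head_only[i])
        if len(chosen) < max_sents and i < len(tail_only):
            chosen.append(tail_only[i])
        i += 1
    # Restore document order before handing the text to an encoder.
    return sorted(chosen)


def build_encoder_input(
    sentences: Sequence[str],
    head_sents: Sequence[int],
    tail_sents: Sequence[int],
    max_sents: int = 3,
) -> str:
    """Join the selected sentences into the reduced text for the encoder."""
    keep = select_sentences(head_sents, tail_sents, max_sents)
    return " ".join(sentences[k] for k in keep)


# Toy document: head entity "Alice" appears in sentences 0 and 4,
# tail entity "France" appears in sentences 2 and 4.
document = [
    "Alice was born in Paris.",
    "The weather was mild that year.",
    "Paris is the capital of France.",
    "Many tourists visit every summer.",
    "Alice later moved back to France.",
]
reduced = build_encoder_input(document, head_sents=[0, 4], tail_sents=[2, 4])
print(reduced)
```

In this toy case the two filler sentences are dropped, so the text passed to the encoder is shorter than the full document, which is the source of the memory reduction the summary describes. The reduced text would then be tokenized and passed to a pretrained encoder such as RoBERTa-Large; that step is omitted here.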