Document level relationship extraction
Main Author: | |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2023 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/167545 |
Institution: | Nanyang Technological University |
Summary: | Document-level relationship extraction (RE) takes multiple sentences as input to an RE model and outputs a relationship between entities that cannot be determined from the context of a single sentence. This is a more challenging task because it requires analysing context across multiple sentences.
The current baseline method uses a BiLSTM model to encode the entire document. However, the BiLSTM must be trained from scratch, and when trained only on the given dataset it cannot accurately capture the intricate interactions between entities. To capture the context of these interactions properly, we propose incorporating a state-of-the-art RoBERTa-Large model, a variant of BERT that is pretrained on a corpus an order of magnitude larger than the original BERT corpus, and fine-tuning it further on the dataset.
Additionally, we limit the encoder input to only three sentences rather than the whole document, as a recent study showed that most entity relationships (ER) can be inferred using context from just three sentences of a document.
Implementing the proposed changes reduces the memory required to process the input, increases the accuracy of the predicted ER, and improves the transferability of the model when it is given input from a domain not found in the training corpus. |