Document level relationship extraction

The process of Document Level Relationship Extraction (RE) consists of inputting multiple sentences into an RE model to output a relationship between entities that otherwise cannot be determined using context from only a single sentence. This is a more challenging task as it requires the analysis of...

Bibliographic Details
Main Author:	Leong, Marcus Yu Zhen
Other Authors:	Lihui Chen
Format:	Final Year Project
Language:	English
Published:	Nanyang Technological University 2023
Subjects:	Engineering::Electrical and electronic engineering
Online Access:	https://hdl.handle.net/10356/167545
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-167545
record_format	dspace
spelling	sg-ntu-dr.10356-167545 2023-07-07T15:44:53Z Document level relationship extraction Leong, Marcus Yu Zhen Lihui Chen School of Electrical and Electronic Engineering ELHCHEN@ntu.edu.sg Engineering::Electrical and electronic engineering Bachelor of Engineering (Information Engineering and Media) 2023-05-29T05:32:34Z 2023-05-29T05:32:34Z 2023 Final Year Project (FYP) Leong, M. Y. Z. (2023). Document level relationship extraction. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/167545 en A3062-221 application/pdf Nanyang Technological University
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	Engineering::Electrical and electronic engineering
spellingShingle	Engineering::Electrical and electronic engineering Leong, Marcus Yu Zhen Document level relationship extraction
description	The process of Document Level Relationship Extraction (RE) consists of inputting multiple sentences into an RE model to output a relationship between entities that cannot otherwise be determined using context from a single sentence alone. This task is more challenging than sentence-level RE because it requires the analysis of context across multiple sentences. The current baseline method uses a BiLSTM model to encode the entire document. However, the BiLSTM model must be trained from scratch and cannot accurately capture the intricacies between entities when trained only on the given dataset. To capture the context of the interaction properly, we propose incorporating a state-of-the-art RoBERTa-Large model, a variant of BERT that is pretrained on a corpus an order of magnitude larger than BERT's original corpus and is then further fine-tuned on the dataset. Additionally, we limit the encoder input to only three sentences rather than the whole document, as a recent study showed that most Entity Relationships (ER) can be inferred using context from only three sentences of a document. Implementing the proposed changes reduces the memory required to process the input, increases the accuracy of the predicted ER and improves the transferability of the model when it is given input from a domain not found in the training corpus.
author2	Lihui Chen
author_facet	Lihui Chen Leong, Marcus Yu Zhen
format	Final Year Project
author	Leong, Marcus Yu Zhen
author_sort	Leong, Marcus Yu Zhen
title	Document level relationship extraction
publisher	Nanyang Technological University
publishDate	2023
url	https://hdl.handle.net/10356/167545
_version_	1772828274998640640
