Knowledge graph construction from text

Open Information Extraction (OpenIE) has been the go-to tool for making sense of and structuring otherwise unstructured text documents. The goal of an OpenIE system is to extract semantic triples (Subject-Relation->Object) from text. The Subject and Object in a semantic triple are typically enti...

Bibliographic Details
Main Author: Yong, Shan Jie
Other Authors: Sun Aixin
Format: Final Year Project
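Language: English
Published: Nanyang Technological University 2021
Subjects: Engineering::Computer science and engineering::Computing methodologies::Document and text processing
Online Access: https://hdl.handle.net/10356/153229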
Institution: Nanyang Technological University
id sg-ntu-dr.10356-153229
record_format dspace
spelling sg-ntu-dr.10356-153229 2021-11-16T06:57:23Z Knowledge graph construction from text Yong, Shan Jie Sun Aixin School of Computer Science and Engineering AXSun@ntu.edu.sg Engineering::Computer science and engineering::Computing methodologies::Document and text processing Open Information Extraction (OpenIE) has been the go-to tool for making sense of and structuring otherwise unstructured text documents. The goal of an OpenIE system is to extract semantic triples (Subject-Relation->Object) from text. The Subject and Object in a semantic triple are typically entities with a Noun, Proper Noun, or Pronoun Part-of-Speech (POS) tag. In English text, a Proper Noun, for example, is often referred to with a Pronoun after its first mention. These substitutions ease written and verbal communication. However, in Information Extraction, they may cause ambiguity during semantic triple extraction: a Pronoun may be treated as an entity independent of its antecedent. This project aims to resolve this ambiguity by integrating OpenIE systems with Coreference Resolution, thereby allowing relations between entities to be extracted across the entire document. Additionally, among all coreferent mentions of an entity, one term best represents the entity. Existing methods for identifying this representative term include picking the longest term or picking the first term. This project experiments with methods that extract features of each coreferent term in order to select the likeliest representative term, allowing both anaphoric and cataphoric references to be resolved with a higher degree of certainty. Bachelor of Engineering (Computer Engineering) 2021-11-16T06:57:23Z 2021-11-16T06:57:23Z 2021 Final Year Project (FYP) Yong, S. J. (2021). Knowledge graph construction from text. Final Year Project (FYP), Nanyang Technological University, Singapore.
https://hdl.handle.net/10356/153229 https://hdl.handle.net/10356/153229 en SCSE20-0954 application/pdf Nanyang Technological University
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Engineering::Computer science and engineering::Computing methodologies::Document and text processing
description Open Information Extraction (OpenIE) has been the go-to tool for making sense of and structuring otherwise unstructured text documents. The goal of an OpenIE system is to extract semantic triples (Subject-Relation->Object) from text. The Subject and Object in a semantic triple are typically entities with a Noun, Proper Noun, or Pronoun Part-of-Speech (POS) tag. In English text, a Proper Noun, for example, is often referred to with a Pronoun after its first mention. These substitutions ease written and verbal communication. However, in Information Extraction, they may cause ambiguity during semantic triple extraction: a Pronoun may be treated as an entity independent of its antecedent. This project aims to resolve this ambiguity by integrating OpenIE systems with Coreference Resolution, thereby allowing relations between entities to be extracted across the entire document. Additionally, among all coreferent mentions of an entity, one term best represents the entity. Existing methods for identifying this representative term include picking the longest term or picking the first term. This project experiments with methods that extract features of each coreferent term in order to select the likeliest representative term, allowing both anaphoric and cataphoric references to be resolved with a higher degree of certainty.
author2 Sun Aixin
author_facet Sun Aixin
Yong, Shan Jie
format Final Year Project
author Yong, Shan Jie
author_sort Yong, Shan Jie
title Knowledge graph construction from text
publisher Nanyang Technological University
publishDate 2021
url https://hdl.handle.net/10356/153229