Knowledge graph construction from text
Open Information Extraction (OpenIE) has been the go-to tool for making sense of and structuring otherwise unstructured text documents. The goal of an OpenIE system is to extract semantic triples (Subject-Relation->Object) from texts. Subject and Object in a semantic triple are typically enti...
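The record's description names two existing heuristics for choosing a coreference cluster's representative term: the longest mention and the first mention. A minimal sketch of both, and of substituting the chosen term back into the text before triple extraction, might look like the following. This is an illustration only, not the project's implementation: the `Mention` type, the function names, and the hand-written cluster are all hypothetical, and no real coreference or OpenIE library is called.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    """Hypothetical mention span: token indices [start, end) plus surface text."""
    text: str
    start: int
    end: int


def pick_representative(cluster, strategy="longest"):
    """Pick one mention to stand for the whole cluster.

    'first'   -> the earliest mention in the document
    'longest' -> the mention with the most characters (earliest wins ties)
    """
    if strategy == "first":
        return min(cluster, key=lambda m: m.start)
    if strategy == "longest":
        return max(cluster, key=lambda m: (len(m.text), -m.start))
    raise ValueError(f"unknown strategy: {strategy}")


def resolve(tokens, clusters, strategy="longest"):
    """Replace every non-representative mention with the representative text."""
    replacements = {}
    for cluster in clusters:
        rep = pick_representative(cluster, strategy)
        for m in cluster:
            if m is not rep:
                replacements[m.start] = (m.end, rep.text)

    out, i = [], 0
    while i < len(tokens):
        if i in replacements:
            end, text = replacements[i]
            out.append(text)
            i = end  # skip the tokens covered by the replaced mention
        else:
            out.append(tokens[i])
            i += 1
    return " ".join(out)


# Hand-made example with a cataphoric pronoun ("she" precedes the name).
tokens = "When she arrived , Dr. Alice Tan opened the lab . She trained the model .".split()
clusters = [[Mention("she", 1, 2), Mention("Dr. Alice Tan", 4, 7), Mention("She", 11, 12)]]

print(resolve(tokens, clusters, "longest"))
print(resolve(tokens, clusters, "first"))
```

In this example the first-mention heuristic selects the cataphoric pronoun "she" as the representative, so every mention collapses to a pronoun, while the longest-mention heuristic recovers the name. Cases like this are where selecting the representative term from features of each mention, as the description proposes, could help.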
Main Author: | Yong, Shan Jie |
---|---|
Other Authors: | Sun Aixin |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2021 |
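Subjects: | Engineering::Computer science and engineering::Computing methodologies::Document and text processing |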
Online Access: | https://hdl.handle.net/10356/153229 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-153229 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-153229; 2021-11-16T06:57:23Z; Knowledge graph construction from text; Yong, Shan Jie; Sun Aixin; School of Computer Science and Engineering; AXSun@ntu.edu.sg; Engineering::Computer science and engineering::Computing methodologies::Document and text processing; Bachelor of Engineering (Computer Engineering); 2021; Final Year Project (FYP); Yong, S. J. (2021). Knowledge graph construction from text. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/153229; en; SCSE20-0954; application/pdf; Nanyang Technological University |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Engineering::Computer science and engineering::Computing methodologies::Document and text processing |
description |
Open Information Extraction (OpenIE) has been the go-to tool for making sense of and structuring otherwise unstructured text documents. The goal of an OpenIE system is to extract semantic triples (Subject-Relation->Object) from texts. The Subject and Object in a semantic triple are typically entities with a Noun, Proper Noun, or Pronoun Part-of-Speech (POS) tag. In English texts, Proper Nouns, for example, are often referred to with Pronouns after their first mention. These substitutions undoubtedly ease written and verbal communication. However, in Information Extraction, they may result in ambiguity during semantic triple extraction: a Pronoun may be treated as an entity independent of its antecedent. This project aims to resolve this ambiguity by integrating OpenIE systems with Coreference Resolution, thereby allowing the extraction of relations between entities across the entire document. Additionally, among all coreference mentions of an entity, there is one term that best represents the entity. Existing methods to identify this representative term include picking the longest term or picking the first term. This project will experiment with methods that extract features of each coreference term in order to select the likeliest representative term, allowing both anaphoric and cataphoric references to be resolved with a higher degree of certainty. |
author2 |
Sun Aixin |
author_facet |
Sun Aixin Yong, Shan Jie |
format |
Final Year Project |
author |
Yong, Shan Jie |
title |
Knowledge graph construction from text |
publisher |
Nanyang Technological University |
publishDate |
2021 |
url |
https://hdl.handle.net/10356/153229 |
_version_ |
1718368033054916608 |