Extracting vocabulary for ontology learning using text mining

Studies on ontologies are receiving growing attention because ontologies represent knowledge explicitly, support a shared understanding of how information is structured, and make domain knowledge reusable. However, constructing new ontologies manually is time-consuming and resour...

Bibliographic Details
Main Author: Kaythi Myo Naing
Other Authors: Erik Cambria
Format: Final Year Project
Language: English
Published: 2016
Subjects: DRNTU::Engineering
Online Access: http://hdl.handle.net/10356/66746
Institution: Nanyang Technological University
id sg-ntu-dr.10356-66746
record_format dspace
spelling sg-ntu-dr.10356-66746 2023-03-03T20:29:34Z
Extracting vocabulary for ontology learning using text mining
Kaythi Myo Naing
Erik Cambria
School of Computer Engineering
DRNTU::Engineering
Studies on ontologies are receiving growing attention because ontologies represent knowledge explicitly, support a shared understanding of how information is structured, and make domain knowledge reusable. However, constructing new ontologies manually is time-consuming and resource-intensive. This has motivated work on ontology learning, which aims to automate the construction of new ontologies and to maintain existing ones by extending them with newly available knowledge. Ontology learning for enriching existing ontologies comprises collecting domain-specific literature, selecting relevant documents, and text mining to refine the concept vocabulary. Because the World Wide Web is a rich repository of information that can feed ontology learning, the corpus for this project was built from information crawled from the web. However, the massive number of web pages, which vary in content quality, makes it difficult to filter domain-relevant information from the web. The main objective of this project is to develop a system that retrieves web pages from the internet and automatically classifies them according to their relevance to the domain. In this work, data was collected for the domain “Knowledge Management”. The project covers crawling web data, classifying web text documents by relevance, and evaluating experiments with different classifiers on two feature representations: bag-of-words TF-IDF weights and dependency-based word embeddings.
Bachelor of Engineering (Computer Science)
2016-04-25T03:05:56Z
2016-04-25T03:05:56Z
2016
Final Year Project (FYP)
http://hdl.handle.net/10356/66746
en
Nanyang Technological University
application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering
author2 Erik Cambria
format Final Year Project
author Kaythi Myo Naing
title Extracting vocabulary for ontology learning using text mining
publishDate 2016
url http://hdl.handle.net/10356/66746