CRCTOL: A semantic based domain ontology learning system

Domain ontologies play an important role in supporting knowledge‐based applications in the Semantic Web. To facilitate the building of ontologies, text mining techniques have been used to perform ontology learning from texts. However, traditional systems employ shallow natural language processing te...

Bibliographic Details
Main Authors: JIANG, Xing, TAN, Ah-hwee
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2010
Subjects: Computer and Systems Architecture; Computer Engineering; Databases and Information Systems
Online Access: https://ink.library.smu.edu.sg/sis_research/5223
https://ink.library.smu.edu.sg/context/sis_research/article/6226/viewcontent/CRCTOL.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-6226
record_format dspace
spelling sg-smu-ink.sis_research-6226 2020-07-23T18:32:38Z
2010-01-01T08:00:00Z
text
application/pdf
info:doi/10.1002/asi.21231
http://creativecommons.org/licenses/by-nc-nd/4.0/
Research Collection School Of Computing and Information Systems
eng
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Computer and Systems Architecture
Computer Engineering
Databases and Information Systems
description Domain ontologies play an important role in supporting knowledge‐based applications in the Semantic Web. To facilitate the building of ontologies, text mining techniques have been used to perform ontology learning from texts. However, traditional systems employ shallow natural language processing techniques and focus only on concept and taxonomic relation extraction. In this paper we present a system, known as Concept‐Relation‐Concept Tuple‐based Ontology Learning (CRCTOL), for mining ontologies automatically from domain‐specific documents. Specifically, CRCTOL adopts a full text parsing technique and employs a combination of statistical and lexico‐syntactic methods, including a statistical algorithm that extracts key concepts from a document collection, a word sense disambiguation algorithm that disambiguates words in the key concepts, a rule‐based algorithm that extracts relations between the key concepts, and a modified generalized association rule mining algorithm that prunes unimportant relations for ontology learning. As a result, the ontologies learned by CRCTOL are more concise and contain a richer semantics in terms of the range and number of semantic relations compared with alternative systems. We present two case studies where CRCTOL is used to build a terrorism domain ontology and a sport event domain ontology. At the component level, quantitative evaluation by comparing with Text‐To‐Onto and its successor Text2Onto has shown that CRCTOL is able to extract concepts and semantic relations with a significantly higher level of accuracy. At the ontology level, the quality of the learned ontologies is evaluated by either employing a set of quantitative and qualitative methods including analyzing the graph structural property, comparison to WordNet, and expert rating, or directly comparing with a human‐edited benchmark ontology, demonstrating the high quality of the ontologies learned.