CRCTOL: A semantic based domain ontology learning system
Domain ontologies play an important role in supporting knowledge‐based applications in the Semantic Web. To facilitate the building of ontologies, text mining techniques have been used to perform ontology learning from texts. However, traditional systems employ shallow natural language processing te...
Main Authors: | JIANG, Xing; TAN, Ah-hwee |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2010 |
Subjects: | Computer and Systems Architecture; Computer Engineering; Databases and Information Systems |
Online Access: | https://ink.library.smu.edu.sg/sis_research/5223 https://ink.library.smu.edu.sg/context/sis_research/article/6226/viewcontent/CRCTOL.pdf |
Institution: | Singapore Management University |
id |
sg-smu-ink.sis_research-6226 |
---|---|
record_format |
dspace |
spelling |
sg-smu-ink.sis_research-6226 2020-07-23T18:32:38Z CRCTOL: A semantic based domain ontology learning system JIANG, Xing TAN, Ah-hwee Domain ontologies play an important role in supporting knowledge‐based applications in the Semantic Web. To facilitate the building of ontologies, text mining techniques have been used to perform ontology learning from texts. However, traditional systems employ shallow natural language processing techniques and focus only on concept and taxonomic relation extraction. In this paper we present a system, known as Concept‐Relation‐Concept Tuple‐based Ontology Learning (CRCTOL), for mining ontologies automatically from domain‐specific documents. Specifically, CRCTOL adopts a full text parsing technique and employs a combination of statistical and lexico‐syntactic methods, including a statistical algorithm that extracts key concepts from a document collection, a word sense disambiguation algorithm that disambiguates words in the key concepts, a rule‐based algorithm that extracts relations between the key concepts, and a modified generalized association rule mining algorithm that prunes unimportant relations for ontology learning. As a result, the ontologies learned by CRCTOL are more concise and contain a richer semantics in terms of the range and number of semantic relations compared with alternative systems. We present two case studies where CRCTOL is used to build a terrorism domain ontology and a sport event domain ontology. At the component level, quantitative evaluation by comparing with Text‐To‐Onto and its successor Text2Onto has shown that CRCTOL is able to extract concepts and semantic relations with a significantly higher level of accuracy. At the ontology level, the quality of the learned ontologies is evaluated by either employing a set of quantitative and qualitative methods including analyzing the graph structural property, comparison to WordNet, and expert rating, or directly comparing with a human‐edited benchmark ontology, demonstrating the high quality of the ontologies learned. 2010-01-01T08:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/5223 info:doi/10.1002/asi.21231 https://ink.library.smu.edu.sg/context/sis_research/article/6226/viewcontent/CRCTOL.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Computer and Systems Architecture Computer Engineering Databases and Information Systems |
institution |
Singapore Management University |
building |
SMU Libraries |
continent |
Asia |
country |
Singapore |
content_provider |
SMU Libraries |
collection |
InK@SMU |
language |
English |
topic |
Computer and Systems Architecture Computer Engineering Databases and Information Systems |
spellingShingle |
Computer and Systems Architecture Computer Engineering Databases and Information Systems JIANG, Xing TAN, Ah-hwee CRCTOL: A semantic based domain ontology learning system |
description |
Domain ontologies play an important role in supporting knowledge‐based applications in the Semantic Web. To facilitate the building of ontologies, text mining techniques have been used to perform ontology learning from texts. However, traditional systems employ shallow natural language processing techniques and focus only on concept and taxonomic relation extraction. In this paper we present a system, known as Concept‐Relation‐Concept Tuple‐based Ontology Learning (CRCTOL), for mining ontologies automatically from domain‐specific documents. Specifically, CRCTOL adopts a full text parsing technique and employs a combination of statistical and lexico‐syntactic methods, including a statistical algorithm that extracts key concepts from a document collection, a word sense disambiguation algorithm that disambiguates words in the key concepts, a rule‐based algorithm that extracts relations between the key concepts, and a modified generalized association rule mining algorithm that prunes unimportant relations for ontology learning. As a result, the ontologies learned by CRCTOL are more concise and contain a richer semantics in terms of the range and number of semantic relations compared with alternative systems. We present two case studies where CRCTOL is used to build a terrorism domain ontology and a sport event domain ontology. At the component level, quantitative evaluation by comparing with Text‐To‐Onto and its successor Text2Onto has shown that CRCTOL is able to extract concepts and semantic relations with a significantly higher level of accuracy. At the ontology level, the quality of the learned ontologies is evaluated by either employing a set of quantitative and qualitative methods including analyzing the graph structural property, comparison to WordNet, and expert rating, or directly comparing with a human‐edited benchmark ontology, demonstrating the high quality of the ontologies learned. |
format |
text |
author |
JIANG, Xing TAN, Ah-hwee |
author_facet |
JIANG, Xing TAN, Ah-hwee |
author_sort |
JIANG, Xing |
title |
CRCTOL: A semantic based domain ontology learning system |
title_short |
CRCTOL: A semantic based domain ontology learning system |
title_full |
CRCTOL: A semantic based domain ontology learning system |
title_fullStr |
CRCTOL: A semantic based domain ontology learning system |
title_full_unstemmed |
CRCTOL: A semantic based domain ontology learning system |
title_sort |
crctol: a semantic based domain ontology learning system |
publisher |
Institutional Knowledge at Singapore Management University |
publishDate |
2010 |
url |
https://ink.library.smu.edu.sg/sis_research/5223 https://ink.library.smu.edu.sg/context/sis_research/article/6226/viewcontent/CRCTOL.pdf |
_version_ |
1770575338130112512 |