Semantic querying over knowledge in biomedical text corpora annotated with multiple ontologies
Main Authors: | , |
---|---|
Other Authors: | |
Format: | Conference or Workshop Item |
Language: | English |
Published: | 2013 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/96180 http://hdl.handle.net/10220/11920 |
Institution: | Nanyang Technological University |
Summary: | Existing ontology-based knowledge representation systems have been considerably more successful than keyword-based systems at semantic querying over large biomedical text corpora. However, their query expressivity is limited by the lack of cross-ontology integration and semantic relations. We present a System for Multiple-Ontology Knowledge Representation (SMOKR) to alleviate the problem. The system first annotates phrases, and the semantic relations between them, using different domain ontologies, and then instantiates the ontologies with the annotated phrases. It integrates the ontologies by matching their instances using simple NLP techniques, and by matching their concepts using the state-of-the-art Biomedical Ontology Alignment Tool (BOAT). SMOKR then performs inconsistency detection to remove conflicting axioms, producing a consistent ontology for querying. We evaluate the system on a set of semantic queries and compare the results with those of a keyword-based search engine, Lucene, and of a hybrid system, SSOKR_Luc, which combines a single-ontology knowledge representation system with Lucene. SMOKR achieves the best F-measures, 0.7 on the GRO Corpus and 0.87 on the GENIA Corpus, compared with 0.62 and 0.33 for SSOKR_Luc and 0.36 and 0.12 for Lucene. |
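The summary reports F-measures without saying how they were computed. As a minimal sketch, assuming the balanced F-measure (F1) over the set of items a query returns versus a gold-standard set of relevant items, the score could be calculated as below. The function name `f_measure` and the example identifiers are hypothetical and do not come from the record.

```python
def f_measure(retrieved, relevant):
    """Return (precision, recall, F1) for one query.

    Assumption: balanced F1 over sets of result identifiers; the record
    itself does not state which F-measure variant was used.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    if true_positives == 0:
        # No overlap (or an empty set): all three scores are zero.
        return 0.0, 0.0, 0.0
    precision = true_positives / len(retrieved)
    recall = true_positives / len(relevant)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical query: four items returned, three items truly relevant.
precision, recall, f1 = f_measure({"d1", "d2", "d3", "d4"}, {"d1", "d2", "d5"})
```

In this made-up case two of the four returned items are relevant, so precision is 0.5 and recall is 2/3. Corpus-level figures like those in the summary would additionally require an averaging scheme across queries, which the record does not specify.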