Hybrid pattern matching for complex ontology term recognition

Bibliographic Details
Main Authors: Kim, Jung-jae; Tuan, Luu Anh
Other Authors: School of Computer Engineering
Format: Conference or Workshop Item
Language: English
Published: 2013
Online Access: https://hdl.handle.net/10356/97427
http://hdl.handle.net/10220/11870
Institution: Nanyang Technological University
Description
Abstract: Ontology term recognition is a key task of ontology-based text mining. Previous approaches each have a limitation: statistical analysis does not consider relations between words, and syntactic pattern matching depends on handcrafted patterns that are expensive to build and show low coverage. These limitations are especially critical for long and complex ontology terms. We propose a hybrid approach that combines the two sequentially: it first applies syntactic pattern matching and, when the results are only partial because required patterns are missing, completes them with supplementary evidence from a statistical method. Additionally, we present a novel method that automatically learns syntactic patterns from an annotated corpus. We tested the proposed approach on two tasks: recognizing Gene Ontology (GO) terms in text and associating GO terms with proteins. Compared with existing systems based on statistical analysis and syntactic pattern matching, it significantly improves 'relative' recall by 11% to 13% and F-score by 7%.
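
The sequential combination described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the regex-based patterns, the word-association weights, and the threshold are all hypothetical stand-ins, and the choice to reject a term when no pattern matches at all is an assumption. It only shows the control flow of "pattern matching first, statistical completion for the parts that patterns miss".

```python
import re


def pattern_match(term_words, sentence, patterns):
    """Return the term words covered by any pattern that matches the sentence.

    `patterns` is a hypothetical list of (regex, covered_words) pairs.
    """
    covered = set()
    for regex, words in patterns:
        if re.search(regex, sentence, flags=re.IGNORECASE):
            covered |= set(words) & set(term_words)
    return covered


def statistical_evidence(word, sentence, weights):
    """Toy stand-in for a statistical method.

    Sums hypothetical association weights between `word` and sentence tokens.
    """
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return sum(weights.get((word, tok), 0.0) for tok in tokens)


def recognize(term_words, sentence, patterns, weights, threshold=0.5):
    """Hybrid recognition: patterns first, statistics for the missing words."""
    covered = pattern_match(term_words, sentence, patterns)
    if not covered:
        return False  # assumption: with no pattern match, the term is rejected
    missing = set(term_words) - covered
    # A partial match is completed only if every missing word has enough evidence.
    return all(
        statistical_evidence(w, sentence, weights) >= threshold for w in missing
    )


# Hypothetical data for one ontology term and one sentence.
term = ["regulation", "transcription"]
patterns = [(r"\bregulat\w*\b", ["regulation"])]
weights = {("transcription", "expression"): 0.4, ("transcription", "genes"): 0.3}
sentence = "The protein regulates expression of target genes."

result = recognize(term, sentence, patterns, weights)
```

Here the single pattern covers only "regulation", so the match is partial, and the remaining word "transcription" is accepted because its summed association weight in the sentence exceeds the assumed threshold.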