Converting a text mining system into a UIMA framework

Bibliographic Details
Main Author: Choo, Zhen Ying.
Other Authors: School of Computer Engineering
Format: Final Year Project
Language: English
Published: 2013
Online Access: http://hdl.handle.net/10356/55027
Institution: Nanyang Technological University
Description
Summary: The UIMA Framework aids in discovering knowledge from unstructured information by coordinating Analysis Engines. Analysis Engines are therefore independent modules that are shielded from integration details and can focus on their own functions. U-Compare, a well-known Natural Language Processing system, also uses the UIMA Framework. In this project, parts of an existing Ontology-driven Software Engineering Environment (OSEE) pattern matching system, which infers implied events from direct events in biomedical text, are converted into the UIMA structure and deployed into U-Compare to take advantage of the UIMA Framework. The converted parts comprise a Named Entity Recognition (NER) module, a step that parses the input sentences to obtain their semantic structures, and a Pattern Matching module that identifies the relations between the named entities. A separate parser that generates the dictionary file for chemical terms was also modified and used. Converting these parts into the UIMA structure makes the OSEE system easier to distribute to a larger community. Future enhancements could automate the creation of Annotation types from those listed in a user-specified settings file. A Graphical User Interface could also be developed to let users select the files required by the OSEE system, and parallel processing of mutually independent tasks in the OSEE system could be explored further.