Converting a text mining system into a UIMA framework
Main Author: | |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | English |
Published: | 2013 |
Online Access: | http://hdl.handle.net/10356/55027 |
Institution: | Nanyang Technological University |
Summary: | The UIMA Framework aids in discovering knowledge from unstructured information by coordinating Analysis Engines. Analysis Engines are therefore independent modules that are shielded from integration details and can focus on their own functions. U-Compare, a known Natural Language Processing system, also uses the UIMA Framework. In this project, parts of an existing Ontology-driven Software Engineering Environment (OSEE) pattern matching system, which infers implied events from direct events in biomedical text, are converted into the UIMA structure and deployed into the U-Compare system to gain the benefits of the UIMA Framework. The converted parts are a Named Entity Recognition (NER) module, a parsing step that obtains the semantic structure of the input sentences, and a Pattern Matching module that identifies the relations between the named entities. A separate parser that generates the dictionary file for Chemical terms was also modified and used. Converting these parts into the UIMA structure makes the OSEE system easier to distribute to a larger community. Future enhancements could automate the creation of Annotation types from those listed in a user-specified settings file. A Graphical User Interface could also be developed to let users select the files required by the OSEE system, and parallel processing of mutually independent tasks in the OSEE system could be explored further. |