Building a concept network for biomedical literature

Bibliographic Details
Main Author: Rohit Samarth.
Other Authors: School of Computer Engineering
Format: Final Year Project
Language: English
Published: 2012
Online Access: http://hdl.handle.net/10356/48592
Institution: Nanyang Technological University
Description
Summary: Text mining in the biomedical domain is still a nascent research field, and it has gained importance lately as the volume of scientific literature available through online repositories has grown. This growth provides an opportunity to analyze the unstructured data in biomedical literature and represent it in a structured, easily accessible form. One such structured representation is the “concept network”, in which biomedical concepts are linked to one another through relations extracted from the literature, forming a vast network. Such a network helps researchers visualize concepts that are similar or relevant to their domain, and it may also be used to retrieve previously unknown associations between concepts. This report details the research behind the development of a biomedical concept network that can potentially retrieve hidden relations between concepts. In the future, such a concept network could be used as part of a concept-based search engine that helps biomedical researchers retrieve papers relevant to their domain. For the purposes of this project, autism is selected as a case study within biomedicine. Autism is a neurological disorder that leads to varying degrees of reduced social interaction among patients. The project focuses largely on identifying and implementing efficient methodologies for creating and visualizing a biomedical concept network. The important steps involved, such as building a high-quality corpus, creating a control dictionary, developing algorithms to identify biomedical entities, extracting the relevant associations, and finally visualizing the network, are discussed in detail. The report also touches on the challenges faced and the knowledge gained over the course of the project, and it closes with recommendations for future work to improve the system.
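The steps named in the summary (a control dictionary, entity identification, association extraction) can be illustrated with a minimal sketch. This is not the report's implementation: the dictionary entries, the toy sentences, and the function names `find_concepts`, `build_network`, and `neighbours` are all hypothetical, and sentence-level co-occurrence is assumed here as a simple stand-in for whatever relation-extraction method the project actually used.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical control dictionary: surface form -> canonical concept.
# The entries are illustrative only and are not taken from the report.
CONTROL_DICTIONARY = {
    "autism": "autism",
    "autism spectrum disorder": "autism",
    "serotonin": "serotonin",
    "oxytocin": "oxytocin",
    "social interaction": "social interaction",
}

# Invented toy sentences standing in for a corpus; they are not real findings.
TOY_TEXT = (
    "Autism is associated with reduced social interaction. "
    "Oxytocin may influence social interaction. "
    "Serotonin levels were measured in children with autism spectrum disorder."
)


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def find_concepts(sentence, dictionary=CONTROL_DICTIONARY):
    """Return the canonical concepts whose surface forms occur in the sentence."""
    lowered = sentence.lower()
    found = set()
    for surface, concept in dictionary.items():
        # Word boundaries prevent matches inside longer words.
        if re.search(r"\b" + re.escape(surface) + r"\b", lowered):
            found.add(concept)
    return found


def build_network(text):
    """Count sentence-level co-occurrences; each edge key is a sorted concept pair."""
    edges = Counter()
    for sentence in split_sentences(text):
        concepts = sorted(find_concepts(sentence))
        for a, b in combinations(concepts, 2):
            edges[(a, b)] += 1
    return edges


def neighbours(edges, concept):
    """Return {neighbouring concept: edge weight} for the given concept."""
    result = {}
    for (a, b), weight in edges.items():
        if a == concept:
            result[b] = weight
        elif b == concept:
            result[a] = weight
    return result


if __name__ == "__main__":
    network = build_network(TOY_TEXT)
    for (a, b), weight in sorted(network.items()):
        print(f"{a} -- {b} (weight {weight})")
    print(neighbours(network, "autism"))
```

In this toy network, "autism" and "oxytocin" never appear in the same sentence, yet both are linked to "social interaction". An indirect path of this kind is one simple way to think about the hidden relations the summary mentions. A real system would need a proper sentence splitter, a much larger dictionary, and a relation extractor that goes beyond co-occurrence.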