Deriving links between clinical/laboratory tests and antibody/gene names of dengue by literature mining

Bibliographic Details
Main Author: Ng, Sue-Ean
Other Authors: Rajapakse Jagath Chandana
Format: Final Year Project
Language: English
Published: 2015
Online Access: http://hdl.handle.net/10356/62707
Institution: Nanyang Technological University
Description
Summary: Dengue virus (DENV) is one of the most prevalent and important arboviruses affecting humans and has become a leading cause of illness and death in many regions of the world, so its rapid spread needs to be controlled. The alarming emergence of the virus and its impact on human health have led to the development of clinical and laboratory tests for diagnosing dengue. Researchers have developed inexpensive and specific diagnostic tests that support early intervention to treat patients and to prevent or control epidemics. To date, there is no vaccine against dengue virus infection, and the most effective preventive measures are those that avoid mosquito bites. This report focuses on deriving events that relate clinical and laboratory tests to the genes or antibodies they target. A corpus covering test names and antibody/gene names was obtained from PubMed and manually labeled with events that express relationships between tests and antibodies/genes. Sentences were then parsed into dependency trees, which show the dependencies between the words in a sentence. Sentence simplification rules were implemented, and textual patterns were identified from the positive classes. Under 3-fold cross-validation, the proposed feature mining and textual pattern matching methods detected links between test names and antibody/gene names with a recall of 61.5%, a precision of 88.7%, and an F-measure of 72.6%. The resulting ontology summarizes the relationships between tests and antibody/gene names, together with the corresponding positive and negative textual patterns from the corpus. The ontology and patterns are shown to be effective in identifying clinical and laboratory tests that target specific dengue antibodies/genes, which demonstrates the benefit of summarizing textual patterns and biological knowledge when extracting relationships between tests and the antibodies/genes they target.
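
The reported scores can be cross-checked against one another. The short sketch below assumes that "F-measure" in the summary means the balanced F1 score (the harmonic mean of precision and recall); the record does not state which variant was used, and the function and variable names are illustrative only.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: the summary's "F-measure" is the balanced F1 variant.
    """
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both scores are zero
    return 2 * precision * recall / (precision + recall)


# Scores as reported in the summary (3-fold cross-validation).
reported_precision = 0.887
reported_recall = 0.615

f1 = f_measure(reported_precision, reported_recall)
print(f"F-measure: {f1:.1%}")
```

Under that assumption the computed value rounds to 72.6%, which agrees with the figure given in the summary.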