Deriving links between clinical/laboratory tests and antibody/gene names of dengue by literature mining
Main Author: | |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | English |
Published: | 2015 |
Subjects: | |
Online Access: | http://hdl.handle.net/10356/62707 |
Institution: | Nanyang Technological University |
Summary: | Dengue virus (DENV) is one of the most prevalent and important arboviruses of humans and has become a leading cause of illness and death in many regions of the world. The rapid spread of this virus therefore needs to be controlled. Its alarming emergence and its impact on human health have led to the development of clinical and laboratory tests for diagnosing dengue. Researchers have developed inexpensive and specific dengue diagnostic tests that support early intervention to treat patients and to prevent or control epidemics. To date, there is no vaccine to prevent infection with dengue virus, and the most effective preventive measures are those that avoid mosquito bites. In this report, I focus on deriving events that relate clinical and laboratory tests to the gene or antibody that each test targets. A corpus of test names and antibody/gene names was obtained from PubMed and manually labeled with events that show relationships between tests and genes/antibodies. Next, sentences were parsed and represented as dependency trees, which show the dependencies between the words in a sentence. Sentence simplification rules were then implemented, and textual patterns were identified from the positive class. In detecting links between test names and antibody/gene names, the proposed feature mining and textual pattern matching methods achieved a recall of 61.5%, a precision of 88.7% and an F-measure of 72.6% under 3-fold cross-validation. The resulting ontology summarizes the relationships between tests and antibody/gene names, together with the corresponding positive and negative textual patterns from the corpus. These patterns are shown to be effective in identifying clinical and laboratory tests that target specific dengue antibodies/genes, which demonstrates the benefit of summarizing textual patterns and biological knowledge when extracting relationships between tests and the antibodies/genes they target. |
---|
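As a quick consistency check on the figures quoted in the summary, the sketch below recomputes the F-measure from the reported precision and recall. It assumes the report uses the balanced F-measure (F1, the harmonic mean of precision and recall), which the record does not state explicitly; the function name `f_measure` and its `beta` parameter are illustrative and not taken from the project.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Return the F-beta score; beta=1 gives the harmonic mean (F1).

    Assumption: the report's "F-measure" is the balanced F1 score.
    """
    if precision == 0 and recall == 0:
        return 0.0  # avoid division by zero when both scores are zero
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Values quoted in the summary (3-fold cross-validation).
precision = 0.887
recall = 0.615

f1 = f_measure(precision, recall)
print(f"F-measure: {f1:.1%}")
```

Under that assumption the recomputed value rounds to 72.6%, which agrees with the figure given in the summary.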