Prediction of disease-disease associations based on relation extraction from biomedical journals using support vector machines

Predicting novel associations between biomedical entities, such as genes, drugs and diseases, can suggest new topics for experiments and new insights in drug design. Due to the massive amounts of relevant data available, a computational approach is well-suited for this task. Initial data can be take...

Bibliographic Details
Main Author: Laron, Andrew V.
Format: text
Language: English
Published: Animo Repository 2017
Subjects: Semantic computing; Vector processing (Computer science)
Online Access: https://animorepository.dlsu.edu.ph/etd_masteral/5765
Institution: De La Salle University
id oai:animorepository.dlsu.edu.ph:etd_masteral-12603
record_format eprints
spelling oai:animorepository.dlsu.edu.ph:etd_masteral-12603 2024-07-16T05:08:26Z Laron, Andrew V. 2017-01-01T08:00:00Z text https://animorepository.dlsu.edu.ph/etd_masteral/5765 Master's Theses English Animo Repository Semantic computing Vector processing (Computer science)
institution De La Salle University
building De La Salle University Library
continent Asia
country Philippines
content_provider De La Salle University Library
collection DLSU Institutional Repository
language English
topic Semantic computing
Vector processing (Computer science)
description Predicting novel associations between biomedical entities, such as genes, drugs and diseases, can suggest new topics for experiments and new insights in drug design. Due to the massive amounts of relevant data available, a computational approach is well-suited for this task. Initial data can be taken either from curated databases of biomedical terms and the relations between them, or directly from the text of research articles. Existing studies on predicting associations between diseases based on published articles generally use a co-occurrence-based approach, such as extracting the names of diseases and other entities from articles. The weighting scheme for such an approach is based on how many times entity pairs occur together in different documents. This paper describes a semantic analysis-based approach. It extracts biological events and relations between biochemical entities and diseases from texts, and only identifies general associations between entities if instances of a relation between them were extracted. The system had an overall accuracy of 84.35% when tested with five-fold cross-validation on 86 articles from PubMed Central Open Access. The effectiveness of several instance features on improving relation extraction was tested, and a 1-token-window bag of words around tokens indicating biomedical entities was found to improve accuracy, while entity distance, token distance, and syntactic dependency subtree had little effect on accuracy.
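The two ideas the description contrasts can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names, the toy documents, and the choice to leave entity tokens out of the window are all assumptions made for illustration. The first function counts how many documents each entity pair appears in together (the co-occurrence weighting); the second builds a 1-token-window bag of words around tokens marked as biomedical entities.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_weights(docs_entities):
    """Count the number of documents in which each unordered entity pair co-occurs."""
    counts = Counter()
    for entities in docs_entities:
        # Deduplicate within a document so each pair counts once per document.
        for pair in combinations(sorted(set(entities)), 2):
            counts[pair] += 1
    return counts


def window_bag_of_words(tokens, entity_indices, window=1):
    """Bag of words over tokens within `window` positions of each entity token.

    Assumption: the entity tokens themselves are excluded from the bag.
    """
    bag = Counter()
    entity_set = set(entity_indices)
    for i in entity_indices:
        start = max(0, i - window)
        stop = min(len(tokens), i + window + 1)
        for j in range(start, stop):
            if j not in entity_set:
                bag[tokens[j].lower()] += 1
    return bag


# Hypothetical toy data: entity mentions per document.
docs = [
    ["diabetes", "insulin", "obesity"],
    ["diabetes", "obesity"],
    ["insulin", "asthma"],
]
weights = cooccurrence_weights(docs)

# Hypothetical sentence; tokens at positions 1 and 3 are treated as entities.
tokens = ["Elevated", "insulin", "worsens", "obesity", "outcomes"]
features = window_bag_of_words(tokens, entity_indices=[1, 3])
```

In a pipeline like the one described, a bag such as `features` would be one part of the feature vector passed to a support vector machine classifier for each candidate relation instance.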
format text
author Laron, Andrew V.
title Prediction of disease-disease associations based on relation extraction from biomedical journals using support vector machines
publisher Animo Repository
publishDate 2017
url https://animorepository.dlsu.edu.ph/etd_masteral/5765
_version_ 1806061284670570496