Single Concatenated Input is Better than Indenpendent Multiple-input for CNNs to Predict Chemical-induced Disease Relation from Literature
Chemical compounds (drugs) and diseases are among the most searched keywords in the PubMed database of biomedical literature by biomedical researchers worldwide (according to a 2009 study). Working with PubMed is essential for researchers to gain insight into drugs’ side effects (chemical-ind...
Main Authors: | Pham, Thi Quynh Trang; Bui, Manh Thang; Dang, Thanh Hai |
---|---|
Format: | Article |
Language: | English |
Published: | H. : ĐHQGHN, 2020 |
Subjects: | Chemical disease relation prediction; Convolutional neural network; Biomedical text mining |
Online Access: | http://repository.vnu.edu.vn/handle/VNU_123/89096 https://doi.org/10.25073/2588-1086/vnucsce.237 |
Institution: | Vietnam National University, Hanoi |
id |
oai:112.137.131.14:VNU_123-89096 |
---|---|
record_format |
dspace |
spelling |
Pham, T. Q. T. (2020). Single Concatenated Input is Better than Indenpendent Multiple-input for CNNs to Predict Chemical-induced Disease Relation from Literature. VNU Journal of Science: Comp. Science & Com. Eng, Vol. 36, No. 1 (2020) 11-16. ISSN 2588-1086. Computer Science and Communication Engineering; application/pdf |
institution |
Vietnam National University, Hanoi |
building |
VNU Library & Information Center |
country |
Vietnam |
collection |
VNU Digital Repository |
language |
English |
topic |
Chemical disease relation prediction; Convolutional neural network; Biomedical text mining |
description |
Chemical compounds (drugs) and diseases are among the most searched keywords in the PubMed database of biomedical literature by biomedical researchers worldwide (according to a 2009 study). Working with PubMed is essential for researchers to gain insight into drugs’ side effects (chemical-induced disease relations, CDR), which are essential for drug safety and toxicity studies. It is, however, a heavy burden for them, as PubMed is a huge database of unstructured text that is growing very fast (~28 million scientific articles currently, with approximately two deposited per minute). As a result, biomedical text mining has empirically demonstrated its great value to biomedical research communities. Biomedical text has its own distinct and challenging properties, attracting much attention from natural language processing communities. A large-scale study in 2018 showed that, for a biLSTM, incorporating information into independent multiple-input layers outperforms concatenating it into a single input layer, yielding better performance than state-of-the-art CDR classification models. This paper demonstrates that for a CNN the opposite holds: concatenation is better for CDR classification. To this end, we develop a CNN-based model with multiple inputs concatenated for CDR classification. Experimental results on the benchmark dataset demonstrate that it outperforms other recent state-of-the-art CDR classification models. |
format |
Article |
author |
Pham, Thi Quynh Trang; Bui, Manh Thang; Dang, Thanh Hai |
title |
Single Concatenated Input is Better than Indenpendent Multiple-input for CNNs to Predict Chemical-induced Disease Relation from Literature |
publisher |
H. : ĐHQGHN |
publishDate |
2020 |
url |
http://repository.vnu.edu.vn/handle/VNU_123/89096 https://doi.org/10.25073/2588-1086/vnucsce.237 |
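
The contrast drawn in the description between a single concatenated input and independent multiple inputs can be illustrated with a minimal NumPy sketch. This is not the authors' model: the record does not say which inputs are combined, so the word and position features, all dimensions, the random filters, and the helper `conv1d_maxpool` are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1d_maxpool(x, filters):
    """Valid 1D convolution over tokens, ReLU, then max-over-time pooling.

    x:       (seq_len, feat_dim) per-token features
    filters: (n_filters, width, feat_dim)
    returns: (n_filters,) pooled feature vector
    """
    width = filters.shape[1]
    # Sliding windows over the token axis: (n_windows, width, feat_dim)
    windows = np.stack([x[i:i + width] for i in range(x.shape[0] - width + 1)])
    scores = np.einsum("nwf,kwf->nk", windows, filters)
    return np.maximum(scores, 0.0).max(axis=0)

# Hypothetical per-token inputs for one sentence (assumed, not from the paper).
seq_len, word_dim, pos_dim, n_filters, width = 12, 8, 3, 4, 3
words = rng.normal(size=(seq_len, word_dim))     # word embeddings
pos_chem = rng.normal(size=(seq_len, pos_dim))   # position relative to the chemical
pos_dis = rng.normal(size=(seq_len, pos_dim))    # position relative to the disease

# (a) Single concatenated input: one feature matrix, one set of filters,
#     so each filter sees all information sources at once.
single_input = np.concatenate([words, pos_chem, pos_dis], axis=1)
single_filters = rng.normal(size=(n_filters, width, single_input.shape[1]))
single_features = conv1d_maxpool(single_input, single_filters)

# (b) Independent multiple inputs: each source gets its own filters,
#     and the pooled vectors are merged only afterwards.
multi_features = np.concatenate([
    conv1d_maxpool(x, rng.normal(size=(n_filters, width, x.shape[1])))
    for x in (words, pos_chem, pos_dis)
])

print(single_input.shape, single_features.shape, multi_features.shape)
```

In variant (a) every convolution filter spans all feature types of a token window, whereas in variant (b) the sources interact only after pooling. The sketch shows the structural difference only and says nothing about which variant performs better.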