Single Concatenated Input is Better than Indenpendent Multiple-input for CNNs to Predict Chemical-induced Disease Relation from Literature

Chemical compounds (drugs) and diseases are among the most searched keywords in the PubMed database of biomedical literature by biomedical researchers all over the world (according to a study in 2009). Working with PubMed is essential for researchers to gain insight into drugs’ side effects (chemical-ind...

Bibliographic Details
Main Authors: Pham, Thi Quynh Trang, Bui, Manh Thang, Dang, Thanh Hai
Format: Article
Language: English
Published: H. : ĐHQGHN 2020
Subjects: Chemical disease relation prediction; Convolutional neural network; Biomedical text mining
Online Access: http://repository.vnu.edu.vn/handle/VNU_123/89096
https://doi.org/10.25073/2588-1086/vnucsce.237
Institution: Vietnam National University, Hanoi
id oai:112.137.131.14:VNU_123-89096
record_format dspace
spelling oai:112.137.131.14:VNU_123-89096 2020-06-23T03:27:18Z Pham, T. Q. T. (2020). Single Concatenated Input is Better than Indenpendent Multiple-input for CNNs to Predict Chemical-induced Disease Relation from Literature. VNU Journal of Science: Comp. Science & Com. Eng, Vol. 36, No. 1 (2020) 11-16. 2588-1086 en Computer Science and Communication Engineering; application/pdf
institution Vietnam National University, Hanoi
building VNU Library & Information Center
country Vietnam
collection VNU Digital Repository
language English
topic Chemical disease relation prediction
Convolutional neural network
Biomedical text mining
description Chemical compounds (drugs) and diseases are among the most searched keywords in the PubMed database of biomedical literature by biomedical researchers all over the world (according to a study in 2009). Working with PubMed is essential for researchers to gain insight into drugs’ side effects (chemical-induced disease relations, CDR), which is essential for drug safety and toxicity. It is, however, a heavy burden for them, as PubMed is a huge database of unstructured text that is growing very fast (~28 million scientific articles currently, with approximately two deposited per minute). As a result, biomedical text mining has been empirically demonstrated to have great implications for biomedical research communities. Biomedical text has its own distinct, challenging properties, attracting much attention from natural language processing communities. A large-scale study in 2018 showed that, for a biLSTM, feeding information into independent multiple-input layers outperforms concatenating it into a single input layer, producing better performance than state-of-the-art CDR classification models. This paper demonstrates that for a CNN the opposite holds: concatenation is better for CDR classification. To this end, we develop a CNN-based model with multiple inputs concatenated for CDR classification. Experimental results on the benchmark dataset demonstrate that it outperforms other recent state-of-the-art CDR classification models.
format Article
author Pham, Thi Quynh Trang
Bui, Manh Thang
Dang, Thanh Hai
title Single Concatenated Input is Better than Indenpendent Multiple-input for CNNs to Predict Chemical-induced Disease Relation from Literature
publisher H. : ĐHQGHN
publishDate 2020
url http://repository.vnu.edu.vn/handle/VNU_123/89096
https://doi.org/10.25073/2588-1086/vnucsce.237