Learning transferable deep convolutional neural networks for the classification of bacterial virulence factors

Motivation: Identification of virulence factors (VFs) is critical to the elucidation of bacterial pathogenesis and prevention of related infectious diseases. Current computational methods for VF prediction focus on binary classification or involve only several class(es) of VFs with sufficient sample...

Bibliographic Details
Main Authors: ZHENG, Dandan, PANG, Guansong, LIU, Bo, CHEN, Lihong, YANG, Jian
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2020
Subjects: Artificial Intelligence and Robotics; OS and Networks
Online Access: https://ink.library.smu.edu.sg/sis_research/7038
https://ink.library.smu.edu.sg/context/sis_research/article/8041/viewcontent/btaa230.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-8041
record_format dspace
spelling sg-smu-ink.sis_research-8041
2022-03-24T07:12:34Z
2020-06-01T07:00:00Z
text
application/pdf
info:doi/10.1093/bioinformatics/btaa230
http://creativecommons.org/licenses/by-nc-nd/4.0/
Research Collection School Of Computing and Information Systems
eng
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Artificial Intelligence and Robotics
OS and Networks
description Motivation: Identification of virulence factors (VFs) is critical to the elucidation of bacterial pathogenesis and prevention of related infectious diseases. Current computational methods for VF prediction focus on binary classification or involve only several class(es) of VFs with sufficient samples. However, thousands of VF classes are present in real-world scenarios, and many of them only have a very limited number of samples available. Results: We first construct a large VF dataset, covering 3446 VF classes with 160 495 sequences, and then propose deep convolutional neural network models for VF classification. We show that (i) for common VF classes with sufficient samples, our models can achieve state-of-the-art performance with an overall accuracy of 0.9831 and an F1-score of 0.9803; (ii) for uncommon VF classes with limited samples, our models can learn transferable features from auxiliary data and achieve good performance with accuracy ranging from 0.9277 to 0.9512 and F1-score ranging from 0.9168 to 0.9446 when combined with different predefined features, outperforming traditional classifiers by 1-13% in accuracy and by 1-16% in F1-score. Availability and implementation: All of our datasets are made publicly available at http://www.mgc.ac.cn/VFNet/, and the source code of our models is publicly available at https://github.com/zhengdd0422/VFNet. Supplementary information: Supplementary data are available at Bioinformatics online.
format text
author ZHENG, Dandan
PANG, Guansong
LIU, Bo
CHEN, Lihong
YANG, Jian
author_sort ZHENG, Dandan
title Learning transferable deep convolutional neural networks for the classification of bacterial virulence factors
publisher Institutional Knowledge at Singapore Management University
publishDate 2020
url https://ink.library.smu.edu.sg/sis_research/7038
https://ink.library.smu.edu.sg/context/sis_research/article/8041/viewcontent/btaa230.pdf
_version_ 1770576192822312960