A deep learning method to predict bacterial ADP-ribosyltransferase toxins

Motivation: ADP-ribosylation is a critical modification involved in regulating diverse cellular processes, including chromatin structure regulation, RNA transcription, and cell death. Bacterial ADP-ribosyltransferase toxins (bARTTs) serve as potent virulence factors that orchestrate the manipulation...

Bibliographic Details
Main Authors: ZHENG, Dandan, ZHOU, Siyu, CHEN, Lihong, PANG, Guansong, YANG, Jian
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2024
Subjects: Artificial Intelligence and Robotics; Bioinformatics; Databases and Information Systems
Online Access: https://ink.library.smu.edu.sg/sis_research/9033
https://ink.library.smu.edu.sg/context/sis_research/article/10036/viewcontent/btae378_pvoa_cc_by.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-10036
record_format dspace
spelling sg-smu-ink.sis_research-10036 2024-07-25T07:58:55Z 2024-07-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/9033 info:doi/10.1093/bioinformatics/btae378 https://ink.library.smu.edu.sg/context/sis_research/article/10036/viewcontent/btae378_pvoa_cc_by.pdf http://creativecommons.org/licenses/by/3.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Artificial Intelligence and Robotics
Bioinformatics
Databases and Information Systems
description Motivation: ADP-ribosylation is a critical modification involved in regulating diverse cellular processes, including chromatin structure regulation, RNA transcription, and cell death. Bacterial ADP-ribosyltransferase toxins (bARTTs) serve as potent virulence factors that orchestrate the manipulation of host cell functions to facilitate bacterial pathogenesis. Despite their pivotal role, the bioinformatic identification of novel bARTTs poses a formidable challenge due to limited verified data and the inherent sequence diversity among bARTT members. Results: We proposed a deep learning-based model, ARTNet, specifically engineered to predict bARTTs from bacterial genomes. Initially, we introduced an effective data augmentation method to address the issue of data scarcity in training ARTNet. Subsequently, we employed a data optimization strategy by utilizing ART-related domain subsequences instead of the primary full sequences, thereby significantly enhancing the performance of ARTNet. ARTNet achieved a Matthews correlation coefficient (MCC) of 0.9351 and an F1-score (macro) of 0.9666 on repeated independent test datasets, outperforming three other deep learning models and six traditional machine learning models in terms of time efficiency and accuracy. Furthermore, we empirically demonstrated the ability of ARTNet to predict novel bARTTs across domain superfamilies without sequence similarity. We anticipate that ARTNet will greatly facilitate the screening and identification of novel bARTTs from bacterial genomes.
format text
author ZHENG, Dandan
ZHOU, Siyu
CHEN, Lihong
PANG, Guansong
YANG, Jian
author_sort ZHENG, Dandan
title A deep learning method to predict bacterial ADP-ribosyltransferase toxins
title_sort deep learning method to predict bacterial adp-ribosyltransferase toxins
publisher Institutional Knowledge at Singapore Management University
publishDate 2024
url https://ink.library.smu.edu.sg/sis_research/9033
https://ink.library.smu.edu.sg/context/sis_research/article/10036/viewcontent/btae378_pvoa_cc_by.pdf
_version_ 1814047713198604288