DeepPSP: A global–local information-based deep neural network for the prediction of protein phosphorylation sites
Identification of phosphorylation sites is an important step in the functional study and drug design of proteins. In recent years, computational methods have been increasingly applied to the identification of phosphorylation sites because of their low cost and high speed. Most of the curr...
Main Authors: | Guo, Lei; Wang, Yongpei; Xu, Xiangnan; Cheng, Kian Kai; Long, Yichi; Xu, Jingjing; Li, Sanshu; Dong, Jiyang |
---|---|
Format: | Article |
Published: | American Chemical Society, 2021 |
Subjects: | QD Chemistry |
Online Access: | http://eprints.utm.my/id/eprint/91222/ http://dx.doi.org/10.1021/acs.jproteome.0c00431 |
Institution: | Universiti Teknologi Malaysia |
id | my.utm.91222 |
---|---|
record_format | eprints |
spelling | my.utm.91222 2021-06-21T08:41:11Z http://eprints.utm.my/id/eprint/91222/ QD Chemistry American Chemical Society 2021-01 Article PeerReviewed Guo, Lei and Wang, Yongpei and Xu, Xiangnan and Cheng, Kian Kai and Long, Yichi and Xu, Jingjing and Li, Sanshu and Dong, Jiyang (2021) DeepPSP: A global–local information-based deep neural network for the prediction of protein phosphorylation sites. Journal of Proteome Research, 20 (1). pp. 346-356. ISSN 1535-3893 http://dx.doi.org/10.1021/acs.jproteome.0c00431 |
institution | Universiti Teknologi Malaysia |
building | UTM Library |
collection | Institutional Repository |
continent | Asia |
country | Malaysia |
content_provider | Universiti Teknologi Malaysia |
content_source | UTM Institutional Repository |
url_provider | http://eprints.utm.my/ |
topic | QD Chemistry |
description | Identification of phosphorylation sites is an important step in the functional study and drug design of proteins. In recent years, computational methods have been increasingly applied to the identification of phosphorylation sites because of their low cost and high speed. Most of the currently available methods focus on using local information around potential phosphorylation sites for prediction and do not take the global information of the protein sequence into consideration. Here, we demonstrated that the global information of protein sequences may also be critical for phosphorylation site prediction. In this paper, a new deep neural network model, called DeepPSP, was proposed for the prediction of protein phosphorylation sites. In the DeepPSP model, two parallel modules were introduced to extract both local and global features from protein sequences. Two squeeze-and-excitation blocks and one bidirectional long short-term memory block were introduced into each module to capture effective representations of the sequences. Comparative studies were carried out to evaluate the performance of DeepPSP and four other prediction methods using public data sets. The F1-score, area under receiver operating characteristic curves (AUROC), and area under precision-recall curves (AUPRC) of DeepPSP were found to be 0.4819, 0.82, and 0.50, respectively, for S/T general site prediction and 0.4206, 0.73, and 0.39, respectively, for Y general site prediction. Compared with the MusiteDeep method, the F1-score, AUROC, and AUPRC of DeepPSP were found to increase by 8.6, 2.5, and 8.7%, respectively, for S/T general site prediction and by 20.6, 5.8, and 18.2%, respectively, for Y general site prediction. Among the tested methods, the developed DeepPSP method was also found to produce the best results for different kinase-specific site predictions, including CDK, mitogen-activated protein kinase, CAMK, AGC, and CMGC. Taken together, the developed DeepPSP method may offer more accurate phosphorylation site prediction by including global information. It may serve as an alternative model with better performance and interpretability for protein phosphorylation site prediction. |
format | Article |
author | Guo, Lei; Wang, Yongpei; Xu, Xiangnan; Cheng, Kian Kai; Long, Yichi; Xu, Jingjing; Li, Sanshu; Dong, Jiyang |
title | DeepPSP: A global–local information-based deep neural network for the prediction of protein phosphorylation sites |
publisher | American Chemical Society |
publishDate | 2021 |
url | http://eprints.utm.my/id/eprint/91222/ http://dx.doi.org/10.1021/acs.jproteome.0c00431 |
_version_ | 1703960438602989568 |
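
The description in this record contrasts local information around a candidate site with global information from the whole protein sequence. The sketch below is only an illustration of that distinction, not the authors' preprocessing: it pairs each candidate S/T (or Y) residue with a fixed-size window of neighbours and with the full sequence. The function names, the padding symbol `-`, and the flank size of 3 are all hypothetical choices made for this example; the record does not state the values used by DeepPSP.

```python
from typing import List, Tuple

PAD = "-"  # hypothetical padding symbol; not taken from the paper


def candidate_sites(seq: str, residues: str = "ST") -> List[int]:
    """Return 0-based positions of residues that could be phosphorylated."""
    return [i for i, aa in enumerate(seq) if aa in residues]


def local_window(seq: str, pos: int, flank: int) -> str:
    """Return the residue at `pos` with `flank` neighbours on each side.

    Positions that fall outside the sequence are filled with PAD so that
    every window has length 2 * flank + 1.
    """
    left = seq[max(0, pos - flank):pos]
    right = seq[pos + 1:pos + 1 + flank]
    left_pad = PAD * (flank - len(left))
    right_pad = PAD * (flank - len(right))
    return left_pad + left + seq[pos] + right + right_pad


def global_local_inputs(
    seq: str, residues: str = "ST", flank: int = 3
) -> List[Tuple[int, str, str]]:
    """Pair each candidate site with its local window and the whole sequence."""
    return [
        (pos, local_window(seq, pos, flank), seq)
        for pos in candidate_sites(seq, residues)
    ]


if __name__ == "__main__":
    # Toy sequence, made up for illustration only.
    for pos, window, full in global_local_inputs("MKSPTAYLG", flank=3):
        print(pos, window, len(full))
```

In a model of the kind the abstract describes, the window would presumably feed the local module and the full sequence the global module; how the sequences are actually encoded and combined is not specified in this record.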