Prediction of RNA-binding proteins from primary sequence by a support vector machine approach.

Elucidating how proteins interact with other molecules is important for understanding cellular processes. Computational methods have been developed for predicting protein-protein interactions, but insufficient attention has been paid to the prediction of protein-RNA i...

Bibliographic Details
Main Authors: HAN, Lian Yi, CAI, Cong Zhong, LO, Siaw Ling, CHUNG, Maxey, CHEN, Yu Zong
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2004
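Subjects: RNA-binding proteins; RNA-protein interactions; rRNA; mRNA; tRNA; snRNA; support vector machine; Bioinformatics; Computer Sciences; Life Sciences
Online Access: https://ink.library.smu.edu.sg/sis_research/4876
https://ink.library.smu.edu.sg/context/sis_research/article/5879/viewcontent/Prediction___PV.pdf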
Institution: Singapore Management University
id sg-smu-ink.sis_research-5879
record_format dspace
spelling sg-smu-ink.sis_research-5879
2022-05-21T08:08:26Z
2004-03-01T08:00:00Z
text
application/pdf
https://ink.library.smu.edu.sg/sis_research/4876
info:doi/10.1261/rna.5890304
https://ink.library.smu.edu.sg/context/sis_research/article/5879/viewcontent/Prediction___PV.pdf
http://creativecommons.org/licenses/by-nc-nd/4.0/
Research Collection School Of Computing and Information Systems
eng
Institutional Knowledge at Singapore Management University
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic RNA-binding proteins
RNA-protein interactions
rRNA
mRNA
tRNA
snRNA
support vector machine
Bioinformatics
Computer Sciences
Life Sciences
description Elucidating how proteins interact with other molecules is important for understanding cellular processes. Computational methods have been developed for predicting protein-protein interactions, but insufficient attention has been paid to the prediction of protein-RNA interactions, which play central roles in regulating gene expression and in certain RNA-mediated enzymatic processes. This work explored the use of a machine learning method, the support vector machine (SVM), for predicting RNA-binding proteins directly from their primary sequence. An SVM system was trained on known RNA-binding and non-RNA-binding proteins to recognize RNA-binding proteins. A total of 4011 RNA-binding and 9781 non-RNA-binding proteins were used to train and test the SVM classification system, and an independent set of 447 RNA-binding and 4881 non-RNA-binding proteins was used to evaluate classification accuracy. On this independent evaluation set, the prediction accuracy was 94.1%, 79.3%, and 94.1% for rRNA-, mRNA-, and tRNA-binding proteins, and 98.7%, 96.5%, and 99.9% for non-rRNA-, non-mRNA-, and non-tRNA-binding proteins, respectively. The SVM classification system was further tested on a small class of snRNA-binding proteins with only 60 available sequences. The prediction accuracy was 40.0% for snRNA-binding and 99.9% for non-snRNA-binding proteins, indicating that a sufficient number of proteins is needed to train an SVM. The SVM classification systems trained in this work were added to our Web-based protein functional classification software SVMProt, at http://jing.cz3.nus.edu.sg/cgi-bin/svmprot.cgi. Our study suggests that SVM is a potentially useful tool for facilitating the prediction of protein-RNA interactions.
format text
author HAN, Lian Yi
CAI, Cong Zhong
LO, Siaw Ling
CHUNG, Maxey
CHEN, Yu Zong
title Prediction of RNA-binding proteins from primary sequence by a support vector machine approach.
publisher Institutional Knowledge at Singapore Management University
publishDate 2004
url https://ink.library.smu.edu.sg/sis_research/4876
https://ink.library.smu.edu.sg/context/sis_research/article/5879/viewcontent/Prediction___PV.pdf
_version_ 1770575081630597120
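
The workflow summarized in the description (encode each protein sequence as a feature vector, train an SVM on binding and non-binding examples, then report accuracy separately for each class) can be sketched roughly as follows. This is an illustrative sketch only, not the SVMProt implementation. The record does not say which sequence features or kernel the authors used, so amino-acid composition and a linear SVM trained by batch subgradient descent are assumptions made here, and the toy sequences are invented rather than taken from the paper's data. All function names are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def composition(seq):
    """Fraction of each standard amino acid in a protein sequence.
    (Assumed feature encoding; the record does not state the real one.)"""
    seq = seq.upper()
    counts = [seq.count(aa) for aa in AMINO_ACIDS]
    total = sum(counts) or 1  # avoid division by zero on empty input
    return [c / total for c in counts]

def decision(w, b, x):
    """Signed score w.x + b; a positive score means 'predicted RNA-binding'."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=500):
    """Batch subgradient descent on the L2-regularised hinge loss.
    X: list of feature vectors, y: labels in {+1, -1}. Returns (w, b)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        # Only samples with margin y * score < 1 contribute to the subgradient.
        viol = [i for i in range(n) if y[i] * decision(w, b, X[i]) < 1]
        grad_w = [lam * w[j] - sum(y[i] * X[i][j] for i in viol) / n
                  for j in range(d)]
        grad_b = -sum(y[i] for i in viol) / n
        w = [w[j] - lr * grad_w[j] for j in range(d)]
        b -= lr * grad_b
    return w, b

def predict(w, b, x):
    """Class label: +1 (binding) if the score is positive, else -1."""
    return 1 if decision(w, b, x) > 0 else -1

def per_class_accuracy(y_true, y_pred):
    """Return (accuracy on the +1 class, accuracy on the -1 class).
    Assumes both classes occur at least once in y_true."""
    pos = [p for t, p in zip(y_true, y_pred) if t == 1]
    neg = [p for t, p in zip(y_true, y_pred) if t == -1]
    return pos.count(1) / len(pos), neg.count(-1) / len(neg)

# Invented toy sequences (NOT from the paper's data set): +1 = "binding".
train_seqs = ["MKRKGKRLAK", "MKRRGARKLR", "MDEDGDELAD", "MDEEGAEDLE"]
train_labels = [1, 1, -1, -1]
X_train = [composition(s) for s in train_seqs]
w, b = train_linear_svm(X_train, train_labels)

# Held-out toy sequences, scored per class as in the description.
test_seqs = ["GKRKAALRKM", "GDEDAALEDM"]
test_labels = [1, -1]
y_pred = [predict(w, b, composition(s)) for s in test_seqs]
print(per_class_accuracy(test_labels, y_pred))
```

Reporting the two accuracies separately matters when the classes are imbalanced, as in the evaluation set of 447 RNA-binding versus 4881 non-RNA-binding proteins: a classifier that labels everything as non-binding would score well overall while missing every binding protein. The paired figures in the description (for example 40.0% and 99.9% for the snRNA class) appear to be per-class values of this kind.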