Analysis and prediction of pair-wise contacts in protein tertiary structures
Protein structure prediction has been one of the greatest challenges in the field of computational biology and chemistry. This report presents findings on the statistical analysis of secondary structure-related patterns exhibited by non-local contact-pairs, followed by the investigation of a simple...
Main Author: | Yan, Eugene Wenhui |
---|---|
Other Authors: | Tan Ching Wai |
Format: | Final Year Project |
Language: | English |
Published: | 2009 |
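Subjects: | DRNTU::Engineering::Computer science and engineering::Computer applications::Life and medical sciences |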
Online Access: | http://hdl.handle.net/10356/16981 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-16981 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-16981 2023-03-03T20:43:10Z Analysis and prediction of pair-wise contacts in protein tertiary structures Yan, Eugene Wenhui Tan Ching Wai School of Computer Engineering BioSciences Research Centre DRNTU::Engineering::Computer science and engineering::Computer applications::Life and medical sciences Protein structure prediction has been one of the greatest challenges in the field of computational biology and chemistry. This report presents findings on the statistical analysis of secondary structure-related patterns exhibited by non-local contact-pairs, followed by the investigation of a simple predictive model using correlated mutational behavior arising from multiple sequence alignment of homologues, with PSSM as the scoring function. The initial Exploratory Data Analysis phase showed some unique patterns in the formation of contact-pairs observed in proteins of SCOP classes A, B, C and D. By studying the characteristics of contact-pairs in known PDB structures, mathematical functions could be devised to serve as general estimators for contact-pair occurrences within a given protein sequence. Measurement of the frequency of residue-pairings also shed light on the possibility of assigning probabilities in prediction models so that they favor energetically favorable pairings, which should have a higher likelihood of forming contacts. The implemented prediction model yielded a slight improvement of 2 to 14 percent over random assignment. The model was evaluated to be naïve because it lacks weighted parameters that could filter the signals of true contacts from the background noise in graphical plots. The model also highlighted the common problem faced by most prediction techniques in comparative modeling, namely the huge number of false positives that hampers accuracy.
Nevertheless, it has shown that PSSM is a viable late-stage scoring mechanism for the computation of correlation coefficient values, and is worthy of further research. Bachelor of Engineering (Computer Science) 2009-05-29T03:00:32Z 2009-05-29T03:00:32Z 2009 2009 Final Year Project (FYP) http://hdl.handle.net/10356/16981 en Nanyang Technological University 57 p. application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
DRNTU::Engineering::Computer science and engineering::Computer applications::Life and medical sciences |
spellingShingle |
DRNTU::Engineering::Computer science and engineering::Computer applications::Life and medical sciences Yan, Eugene Wenhui Analysis and prediction of pair-wise contacts in protein tertiary structures |
description |
Protein structure prediction has been one of the greatest challenges in the field of computational biology and chemistry. This report presents findings on the statistical analysis of secondary structure-related patterns exhibited by non-local contact-pairs, followed by the investigation of a simple predictive model using correlated mutational behavior arising from multiple sequence alignment of homologues, with PSSM as the scoring function.
The initial Exploratory Data Analysis phase showed some unique patterns in the formation of contact-pairs observed in proteins of SCOP classes A, B, C and D. By studying the characteristics of contact-pairs in known PDB structures, mathematical functions could be devised to serve as general estimators for contact-pair occurrences within a given protein sequence. Measurement of the frequency of residue-pairings also shed light on the possibility of assigning probabilities in prediction models so that they favor energetically favorable pairings, which should have a higher likelihood of forming contacts.
The implemented prediction model yielded a slight improvement of 2 to 14 percent over random assignment. The model was evaluated to be naïve because it lacks weighted parameters that could filter the signals of true contacts from the background noise in graphical plots. The model also highlighted the common problem faced by most prediction techniques in comparative modeling, namely the huge number of false positives that hampers accuracy. Nevertheless, it has shown that PSSM is a viable late-stage scoring mechanism for the computation of correlation coefficient values, and is worthy of further research. |
author2 |
Tan Ching Wai |
author_facet |
Tan Ching Wai Yan, Eugene Wenhui |
format |
Final Year Project |
author |
Yan, Eugene Wenhui |
title |
Analysis and prediction of pair-wise contacts in protein tertiary structures |
publishDate |
2009 |
url |
http://hdl.handle.net/10356/16981 |
_version_ |
1759853611083366400 |
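
The description above mentions correlation coefficients computed from correlated mutational behavior in a multiple sequence alignment. The record does not give the report's actual formulation, so the following is only a minimal sketch of the general idea: for two alignment columns, score every pair of sequences at each column, then take the Pearson correlation of the two score vectors. The function names and the toy alignment are hypothetical, and the identity score is a stand-in assumption for the PSSM-based scoring the report describes.

```python
from itertools import combinations
from math import sqrt


def identity_score(a, b):
    # Stand-in similarity (assumption): the report uses PSSM-based scores instead.
    return 1.0 if a == b else 0.0


def column_pair_scores(msa, col, score):
    # Similarity of the residues at `col` for every pair of aligned sequences.
    return [score(s[col], t[col]) for s, t in combinations(msa, 2)]


def pearson(x, y):
    # Plain Pearson correlation; returns 0.0 when either vector is constant.
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sqrt(sxx * syy)


def correlated_mutation(msa, i, j, score=identity_score):
    # Correlation between the substitution patterns of columns i and j.
    return pearson(column_pair_scores(msa, i, score),
                   column_pair_scores(msa, j, score))


# Toy alignment (hypothetical): columns 0 and 1 always change together.
toy_msa = ["AKLE", "AKLE", "GRLD", "GRID"]
r_coupled = correlated_mutation(toy_msa, 0, 1)
r_unrelated = correlated_mutation(toy_msa, 0, 2)
```

In this toy alignment, columns 0 and 1 mutate in lockstep and receive a high coefficient, while columns 0 and 2 do not. Column pairs with high coefficients would be the candidate contacts; the description notes that, without further weighting, such a ranking produces many false positives.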