Analysis and prediction of pair-wise contacts in protein tertiary structures

Protein structure prediction has been one of the greatest challenges in the field of computational biology and chemistry. This report presents findings on the statistical analysis of secondary structure-related patterns exhibited by non-local contact-pairs, followed by the investigation of a simple...
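
The full abstract in this record describes a predictive model built on correlated mutational behavior in a multiple sequence alignment, with PSSM as the scoring function. The record does not say how the PSSM scores enter the correlation, so the following is only a minimal sketch under stated assumptions: a toy log-odds PSSM is built from the alignment itself, each column is represented by the PSSM score of the residue each sequence carries there, and column pairs are compared with a Pearson correlation coefficient. The function names, the uniform background, the pseudocount, and `min_separation` are all hypothetical and are not taken from the report.

```python
import math
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"  # 20 amino acids plus gap (assumed alphabet)


def column_pssm(msa, pseudocount=1.0):
    """Toy PSSM: per-column log-odds of each symbol against a uniform background."""
    background = 1.0 / len(ALPHABET)
    total = len(msa) + pseudocount * len(ALPHABET)
    pssm = []
    for column in zip(*msa):
        counts = Counter(column)
        pssm.append({
            a: math.log(((counts[a] + pseudocount) / total) / background)
            for a in ALPHABET
        })
    return pssm


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either vector is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def correlated_columns(msa, min_separation=2):
    """Score every column pair at least `min_separation` apart (assumed cutoff)."""
    pssm = column_pssm(msa)
    length = len(msa[0])
    # One score per sequence for each column: the PSSM value of the observed residue.
    profiles = [[pssm[i][seq[i]] for seq in msa] for i in range(length)]
    return {
        (i, j): pearson(profiles[i], profiles[j])
        for i in range(length)
        for j in range(i + min_separation, length)
    }


# Hypothetical alignment: columns 0 and 2 mutate together (A with L, G with V).
msa = ["AKLE", "AKLE", "ARLD", "ARLD", "GKVE", "GRVD"]
scores = correlated_columns(msa)
ranked = sorted(scores, key=scores.get, reverse=True)
```

Ranking column pairs by their coefficient gives candidate contact-pairs. In this toy scoring, a column whose residues are equally frequent scores every sequence identically and therefore gets a coefficient of 0, which is one way an unweighted model of this kind can lose signal.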

Bibliographic Details
Main Author: Yan, Eugene Wenhui
Other Authors: Tan Ching Wai
Format: Final Year Project
Language: English
Published: 2009
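Subjects: DRNTU::Engineering::Computer science and engineering::Computer applications::Life and medical sciences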
Online Access: http://hdl.handle.net/10356/16981
Institution: Nanyang Technological University
id sg-ntu-dr.10356-16981
record_format dspace
spelling sg-ntu-dr.10356-16981 2023-03-03T20:43:10Z Analysis and prediction of pair-wise contacts in protein tertiary structures Yan, Eugene Wenhui Tan Ching Wai School of Computer Engineering BioSciences Research Centre DRNTU::Engineering::Computer science and engineering::Computer applications::Life and medical sciences Protein structure prediction has been one of the greatest challenges in the field of computational biology and chemistry. This report presents findings on the statistical analysis of secondary structure-related patterns exhibited by non-local contact-pairs, followed by the investigation of a simple predictive model using correlated mutational behavior arising from multiple sequence alignment of homologues, with PSSM as the scoring function. The initial Exploratory Data Analysis phase produced results that show some unique patterns in the formation of contact-pairs observed in proteins of SCOP classes A, B, C and D. By studying the characteristics of contact-pairs in known PDB structures, mathematical functions could be devised to serve as general estimators of contact-pair occurrences within a given protein sequence. Measurement of the frequency of residue-pairings also shed light on the possibility of assigning probabilities in prediction models so that they favor energetically favorable pairings, which should have a higher likelihood of forming contacts. The implemented prediction model yielded a very slight improvement of between 2 and 14 percent over random assignment. The model was judged to be naïve, owing to the absence of weighted parameters that could filter the signals of true contacts from the background noise in graphical plots. The model also highlighted the common problem faced by most prediction techniques in comparative modeling, namely the huge number of false positives that hamper accuracy. Nevertheless, it has shown that PSSM is a viable late-stage scoring mechanism for the computation of correlation coefficient values, and is worthy of further research. Bachelor of Engineering (Computer Science) 2009-05-29T03:00:32Z 2009-05-29T03:00:32Z 2009 2009 Final Year Project (FYP) http://hdl.handle.net/10356/16981 en Nanyang Technological University 57 p. application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Computer science and engineering::Computer applications::Life and medical sciences
spellingShingle DRNTU::Engineering::Computer science and engineering::Computer applications::Life and medical sciences
Yan, Eugene Wenhui
Analysis and prediction of pair-wise contacts in protein tertiary structures
author2 Tan Ching Wai
author_facet Tan Ching Wai
Yan, Eugene Wenhui
format Final Year Project
author Yan, Eugene Wenhui
author_sort Yan, Eugene Wenhui
title Analysis and prediction of pair-wise contacts in protein tertiary structures
publishDate 2009
url http://hdl.handle.net/10356/16981