Computational analysis of protein tertiary structures

Proteins are essential molecules that play important roles in virtually all the biological functions of a cell, one of which is that of catalysts in chemical reactions. These particular proteins, also known as enzymes, work by lowering the activation energy needed to carry out chemical reactions, t...

Bibliographic Details
Main Author: Theresia.
Other Authors: Tan Ching Wai
Format: Final Year Project
Language: English
Published: 2009
Subjects:
Online Access: http://hdl.handle.net/10356/17031
Institution: Nanyang Technological University
id sg-ntu-dr.10356-17031
record_format dspace
spelling sg-ntu-dr.10356-17031 2023-03-03T20:35:48Z Computational analysis of protein tertiary structures Theresia. Tan Ching Wai School of Computer Engineering Bioinformatics Research Centre DRNTU::Engineering::Computer science and engineering::Computer applications::Life and medical sciences
Proteins are essential molecules that play important roles in virtually all the biological functions of a cell, one of which is acting as catalysts in chemical reactions. These particular proteins, known as enzymes, work by lowering the activation energy needed to carry out a chemical reaction, thus speeding up the reaction significantly. Only 1% of the residues in a single protein chain contribute to the catalytic reaction; these are known as catalytic residues. It is therefore desirable to learn how to identify these residues and their characteristics. The objective of this research is to identify catalytic residues in protein sequences using protein structural information, as previous studies have shown that structural information yields more accurate predictions than sequence information alone. However, structural information for a protein is less readily available than sequence information. In this project, a novel method to obtain structural information from sequence information was introduced. The Structural Center of Mass (SCOM) and Linear Center of Mass (LCOM) were extracted. SCOM is defined as the centroid of the protein sequence, while LCOM is the midpoint of the protein sequence in terms of molecular weight. The correlation between the two features was analyzed to see whether the method was feasible and could be used to predict catalytic residues. In addition, the correlation between the Conservation Score of a protein and its SCOM was analyzed to investigate whether better prediction of catalytic residues could be obtained.
The findings show that there was no correlation between LCOM and SCOM. Thus it was not possible to predict the structural information of a protein from its sequence information alone. It was also observed that catalytic residues were not located close to the LCOM of the protein, while 70% of catalytic residues were among the 20% of residues closest to the SCOM. Furthermore, 76% of the catalytic residues were among the 20% of conserved residues closest to the SCOM. Hence, it is concluded that SCOM can be used to identify catalytic residues in a sequence, and that it should be used together with the conservation score to predict catalytic residues.
Bachelor of Engineering (Computer Science) 2009-05-29T04:11:56Z 2009-05-29T04:11:56Z 2009 2009 Final Year Project (FYP) http://hdl.handle.net/10356/17031 en Nanyang Technological University 70 p. application/pdf
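The SCOM and LCOM definitions in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation, and the record does not give the exact formulas. The sketch assumes one representative 3D coordinate per residue (for example the C-alpha atom), takes SCOM as the mass-weighted centroid of those coordinates, and takes LCOM as the residue index at which the cumulative residue mass first reaches half of the total. The function names and the per-residue `masses` input are hypothetical.

```python
import math


def linear_center_of_mass(masses):
    """Assumed LCOM: index of the residue at which the cumulative
    mass along the chain first reaches half of the total mass."""
    half = sum(masses) / 2.0
    running = 0.0
    for index, mass in enumerate(masses):
        running += mass
        if running >= half:
            return index
    raise ValueError("masses must be non-empty")


def structural_center_of_mass(coords, masses):
    """Assumed SCOM: mass-weighted centroid of per-residue (x, y, z)
    coordinates, one coordinate per residue."""
    total = sum(masses)
    return tuple(
        sum(m * c[axis] for m, c in zip(masses, coords)) / total
        for axis in range(3)
    )


def closest_to_center(coords, center, fraction=0.2):
    """Indices of the given fraction of residues nearest to `center`,
    ordered from nearest to farthest (at least one residue)."""
    count = max(1, math.ceil(fraction * len(coords)))
    order = sorted(
        range(len(coords)),
        key=lambda i: math.dist(coords[i], center),
    )
    return order[:count]
```

For instance, with five equal-mass residues placed on a line at x = 0, 1, 2, 3 and 10, this sketch puts the SCOM at x = 3.2, so the closest 20% is the single residue at x = 3, while the LCOM index is the middle residue of the chain. The toy case shows how the two measures can point to different residues.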
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Computer science and engineering::Computer applications::Life and medical sciences
spellingShingle DRNTU::Engineering::Computer science and engineering::Computer applications::Life and medical sciences
Theresia.
Computational analysis of protein tertiary structures
author2 Tan Ching Wai
author_facet Tan Ching Wai
Theresia.
format Final Year Project
author Theresia.
author_sort Theresia.
title Computational analysis of protein tertiary structures
title_short Computational analysis of protein tertiary structures
title_full Computational analysis of protein tertiary structures
title_fullStr Computational analysis of protein tertiary structures
title_full_unstemmed Computational analysis of protein tertiary structures
title_sort computational analysis of protein tertiary structures
publishDate 2009
url http://hdl.handle.net/10356/17031
_version_ 1759855326444650496