Computational analysis of protein tertiary structures
Proteins are essential molecules that play important roles in virtually all the biological functions of a cell, one of which is that of catalysts in chemical reactions. These particular proteins, also known as enzymes, work by lowering the activation energy needed to carry out chemical reactions, t...
Main Author: | Theresia. |
---|---|
Other Authors: | Tan Ching Wai |
Format: | Final Year Project |
Language: | English |
Published: | 2009 |
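Subjects: | DRNTU::Engineering::Computer science and engineering::Computer applications::Life and medical sciences |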
Online Access: | http://hdl.handle.net/10356/17031 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-17031 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-17031 2023-03-03T20:35:48Z Computational analysis of protein tertiary structures Theresia. Tan Ching Wai School of Computer Engineering Bioinformatics Research Centre DRNTU::Engineering::Computer science and engineering::Computer applications::Life and medical sciences Bachelor of Engineering (Computer Science) 2009-05-29T04:11:56Z 2009-05-29T04:11:56Z 2009 2009 Final Year Project (FYP) http://hdl.handle.net/10356/17031 en Nanyang Technological University 70 p. application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
DRNTU::Engineering::Computer science and engineering::Computer applications::Life and medical sciences |
spellingShingle |
DRNTU::Engineering::Computer science and engineering::Computer applications::Life and medical sciences Theresia. Computational analysis of protein tertiary structures |
description |
Proteins are essential molecules that play important roles in virtually all the biological functions of a cell, one of which is to act as catalysts in chemical reactions. These particular proteins, known as enzymes, work by lowering the activation energy needed to carry out a chemical reaction, thus speeding up the reaction significantly. In these reactions, only 1% of the residues in a single protein chain contribute to catalysis. These are known as catalytic residues. It is therefore desirable to learn how to identify these residues and their characteristics.
The objective of this research is to identify catalytic residues in protein sequences using protein structural information, as previous studies have shown that structural information yields more accurate predictions than sequence information alone. However, the structural information of proteins is less readily available than sequence information. In this project, a novel method to obtain structural information from sequence information was introduced. The Structural Center of Mass (SCOM) and Linear Center of Mass (LCOM) were extracted. SCOM is defined as the centroid of the protein sequence, while LCOM is the midpoint of the protein sequence in terms of molecular weight. The correlation between the two features was analyzed to see whether the method was feasible and could be used to predict catalytic residues. In addition, the correlation between the Conservation Score of a protein and its SCOM was analyzed to investigate whether better prediction of catalytic residues can be obtained.
The findings show that there was no correlation between LCOM and SCOM. Thus it was not possible to predict the structural information of a protein from its sequence information alone. It was also observed that catalytic residues were not located close to the LCOM of the protein, while 70% of catalytic residues were located among the 20% of residues closest to the SCOM. Furthermore, 76% of the catalytic residues were among the 20% of conserved residues closest to the SCOM. Hence, it is concluded that SCOM can be used to identify catalytic residues in a sequence, and that it should be used together with the conservation score to predict catalytic residues. |
author2 |
Tan Ching Wai |
author_facet |
Tan Ching Wai Theresia. |
format |
Final Year Project |
author |
Theresia. |
author_sort |
Theresia. |
title |
Computational analysis of protein tertiary structures |
title_sort |
computational analysis of protein tertiary structures |
publishDate |
2009 |
url |
http://hdl.handle.net/10356/17031 |
_version_ |
1759855326444650496 |
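
The description defines LCOM as the midpoint of the sequence in terms of molecular weight, and SCOM as a centroid, and reports results for the 20% of residues closest to the SCOM. The following is a minimal Python sketch of those three ideas, not the project's actual implementation. It assumes that LCOM is the first residue at which the cumulative mass reaches half of the total, that SCOM is the unweighted centroid of one 3D coordinate per residue (for example the C-alpha atom), and that closeness is Euclidean distance. The function names and the rounded residue masses are illustrative choices that do not come from the record.

```python
from math import dist

# Approximate average residue masses in daltons (rounded; illustrative values).
RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.11, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}


def linear_center_of_mass(sequence, masses=RESIDUE_MASS):
    """Return the 0-based index of the residue at which the cumulative
    mass first reaches half of the total mass (assumed LCOM definition)."""
    total = sum(masses[res] for res in sequence)
    running = 0.0
    for index, res in enumerate(sequence):
        running += masses[res]
        if running >= total / 2:
            return index
    raise ValueError("sequence must not be empty")


def structural_center_of_mass(coords):
    """Return the unweighted centroid of (x, y, z) residue coordinates
    (assumed SCOM definition)."""
    n = len(coords)
    return tuple(sum(point[axis] for point in coords) / n for axis in range(3))


def closest_to_scom(coords, fraction=0.2):
    """Return indices of the given fraction of residues nearest the centroid,
    ordered from nearest to farthest."""
    center = structural_center_of_mass(coords)
    order = sorted(range(len(coords)), key=lambda i: dist(coords[i], center))
    keep = max(1, round(fraction * len(coords)))
    return order[:keep]
```

With real data, `coords` would come from a structure file and the returned indices could be compared against annotated catalytic residues, which is the kind of check the reported 70% figure summarises. A mass-weighted centroid would be an equally plausible reading of "center of mass"; the unweighted form is used here only because the description calls SCOM a centroid.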