Characterization of physicochemical environments of proteins
Proteins are molecular machines in cells that perform a diverse set of essential biological functions. The function of a protein is determined by its 3D structure. The structure creates local microenvironments in which protein atoms interact with one another. A detailed understanding of these mi...
Main Author: | |
---|---|
Other Authors: | |
Format: | Theses and Dissertations |
Language: | English |
Published: | 2017 |
Subjects: | |
Online Access: | http://hdl.handle.net/10356/69607 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-69607 |
---|---|
record_format |
dspace |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
DRNTU::Engineering::Computer science and engineering |
spellingShingle |
DRNTU::Engineering::Computer science and engineering Tan, Kuan Pern Characterization of physicochemical environments of proteins |
description |
Proteins are molecular machines in cells that perform a diverse set of essential biological
functions. The function of a protein is determined by its 3D structure. The structure
creates local microenvironments in which protein atoms interact with one another. A
detailed understanding of these microenvironments would allow better characterization
and engineering of protein functions. For example, this knowledge forms the basis of
modern therapeutic innovations such as rational drug and vaccine design, and could
have implications for other industries, including bioprocessing, biomimetics and
biomaterials, among others. This thesis presents my results on the characterization of the
physicochemical properties of microenvironments in proteins. To investigate the complex
nature of protein microenvironments, the characterization effort is divided into three
interconnected topics, namely (i) residue depth, (ii) hydrogen bonding, and (iii) multibody
statistical potential.
The first topic aims to quantify the protein microenvironment using the biophysical
parameter of residue depth. The depth of an amino acid measures its degree of burial
in a protein. I have shown that the energetics of proteins, and the spatial distribution
and chemical properties of amino acids, depend on residue depth. To exemplify
and utilize these results, I have designed several computational methods for protein
engineering and functional characterization. First, a novel method to design
temperature-sensitive alleles of proteins was proposed by making point mutations at
residues selected by depth. Next, I have used residue depth to identify small-molecule
ligand binding sites on proteins by supplementing it with solvent accessibility and
evolutionary information. Benchmarks have shown that the method performs comparably to
or better than the best available methods, and can reveal unconventional sites that
other methods fail to identify. In addition, I have shown that residue depth can be
used to estimate protein cavity volume using a Monte Carlo sampling approach, and the
pKa of amino acid residues using a linear model.
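As an illustration only, the general idea of Monte Carlo volume estimation can be sketched as a hit-or-miss sampler: draw random points in a bounding box and scale the box volume by the fraction that fall inside the region of interest. This is not the depth-based procedure of the thesis, whose details are not given here. The function name `monte_carlo_volume`, the `inside` predicate and the unit-sphere stand-in for a cavity are all assumptions made for the example.

```python
import math
import random


def monte_carlo_volume(inside, lower, upper, n_samples=100_000, seed=0):
    """Hit-or-miss estimate of the volume of the region where inside(p) is True.

    inside: hypothetical predicate taking an (x, y, z) tuple.
    lower, upper: opposite corners of an axis-aligned box enclosing the region.
    """
    rng = random.Random(seed)
    box_volume = math.prod(hi - lo for lo, hi in zip(lower, upper))
    hits = 0
    for _ in range(n_samples):
        # Draw one point uniformly inside the bounding box.
        point = tuple(rng.uniform(lo, hi) for lo, hi in zip(lower, upper))
        if inside(point):
            hits += 1
    # The fraction of hits approximates region volume / box volume.
    return box_volume * hits / n_samples


def in_unit_sphere(point):
    """Toy stand-in for a cavity: the unit sphere centred at the origin."""
    return sum(c * c for c in point) <= 1.0


estimate = monte_carlo_volume(in_unit_sphere, (-1, -1, -1), (1, 1, 1))
print(f"estimated volume: {estimate:.3f} (exact: {4 * math.pi / 3:.3f})")
```

In a real application the predicate would encode whether a sampled point lies in the cavity, and the statistical error shrinks roughly as the inverse square root of the number of samples.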
The second topic studies the physicochemical properties of hydrogen bonding in different
protein environments. I have performed statistical analysis on databases, classified
hydrogen bonds into different types, and characterized the geometrical preferences
and variations of each type. By analyzing quantum simulations of the system, I
have shown that the geometrical preference of main-chain hydrogen bonds is due to
electron density arising from the planar nature of the peptide bond. I have also performed
empirical simulations that strongly suggest a causal link between this geometrical
preference and secondary structure formation. Next, I have discovered that low-resolution
protein models in databases are consistently missing hydrogen bonds. To improve
these models, I have designed a two-step refinement protocol. First, a simple algorithm
was used to predict the missing donor-acceptor pairs that should form hydrogen bonds,
based on their mutual preference and specificity. Second, Gaussian restraints were applied
to the geometry of the missing pairs, after which a standard modelling protocol
can be used to refine the protein model. The refinement protocol was shown to be
capable of re-introducing hydrogen bonds in the local environment as well as improving
overall model quality. The refinement has functional implications for the chemical
properties of the protein, as exemplified by more accurate pKa prediction.
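To make the notion of a Gaussian restraint concrete, the sketch below scores a donor-acceptor distance with the negative logarithm of an unnormalised Gaussian, which is zero at the preferred value and grows quadratically away from it. The function name, the coordinates, and the mean (2.9 Å) and standard deviation (0.2 Å) are illustrative assumptions, not parameters taken from the thesis.

```python
import math


def gaussian_restraint_penalty(value, mean, sigma):
    """Negative log of an unnormalised Gaussian centred on `mean`.

    Returns 0 when value == mean and increases quadratically with the deviation.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (value - mean) ** 2 / (2.0 * sigma ** 2)


# Hypothetical donor and acceptor coordinates in angstroms.
donor = (0.0, 0.0, 0.0)
acceptor = (3.5, 0.0, 0.0)

d = math.dist(donor, acceptor)
# Illustrative restraint parameters (assumed, not from the thesis).
penalty = gaussian_restraint_penalty(d, mean=2.9, sigma=0.2)
print(f"distance = {d:.2f}, penalty = {penalty:.2f}")
```

A modelling program would add such terms to its objective function, so that minimization pulls each predicted pair towards hydrogen-bonding geometry while still allowing deviations when other terms disagree.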
The third topic is the construction of an environment-dependent protein statistical
potential, Packpred. Here, I have explicitly defined protein microenvironments as sets of
tightly packed amino acids, dubbed “residue cliques”. Employing Sippl’s formulation,
the non-random occurrence of microenvironments is characterized. This non-random
occurrence is indicative of the strength of interaction among amino acids, and can be
interpreted as an energy potential. I have evaluated the capability of the potential to
describe protein energetics on a large set of mutagenesis data. The benchmark
has shown that, compared with all other competing methods, Packpred has the best
performance not only in binary classification of destabilizing mutants, but also in correctly
rank-ordering the degree of phenotypic change associated with different mutations.
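The step from non-random occurrence to an energy-like score is commonly written as an inverse Boltzmann relation, E = -kT ln(observed / expected). The sketch below applies that relation to toy frequencies; the environment labels and numbers are invented for illustration, and the actual clique definition and reference state used by Packpred are not described in this record.

```python
import math


def inverse_boltzmann_energy(observed, expected, kT=1.0):
    """Pseudo-energy -kT * ln(observed / expected) for one environment type."""
    if observed <= 0 or expected <= 0:
        raise ValueError("frequencies must be positive")
    return -kT * math.log(observed / expected)


# Toy (observed, expected) frequencies, invented for illustration only.
toy_frequencies = {
    "over-represented clique": (0.02, 0.01),
    "as expected": (0.01, 0.01),
    "under-represented clique": (0.005, 0.01),
}

energies = {
    name: inverse_boltzmann_energy(obs, exp)
    for name, (obs, exp) in toy_frequencies.items()
}
for name, energy in energies.items():
    print(f"{name}: {energy:+.3f} kT")
```

Environments seen more often than expected receive negative (favourable) scores, and those seen less often receive positive scores, which is what allows the statistics to be read as a potential.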
Lastly, I also present three biomolecular system modelling studies involving non-globular
proteins. These systems are (i) the cohesin ring protein with its coiled-coil structure,
(ii) the transmembrane transporters OCTN-1 and -2, and (iii) the interaction interface
between the oncogenic proteins VAV1 and EZH2. Modelling these systems is challenging
because the conventional tools and framework of comparative modelling are not applicable.
Instead, an integrative modelling approach tailored to each system was undertaken. In all
the modelling work I have proposed experimentally testable hypotheses to decipher the
biological mechanisms underlying the systems.
In conclusion, in this thesis I have presented an extensive characterization of the
physicochemical environments of proteins. The complex nature of the environment was
elucidated through three interdependent topics: residue depth, hydrogen bonding and amino
acid cliques. In addition to novel results, for every investigation I have also explored
its biological utility and built open-access tools. I hope that the
work presented here will facilitate future research into protein structures and their
functions. |
author2 |
Chandra Shekhar Verma |
author_facet |
Chandra Shekhar Verma Tan, Kuan Pern |
format |
Theses and Dissertations |
author |
Tan, Kuan Pern |
author_sort |
Tan, Kuan Pern |
title |
Characterization of physicochemical environments of proteins |
title_short |
Characterization of physicochemical environments of proteins |
title_full |
Characterization of physicochemical environments of proteins |
title_fullStr |
Characterization of physicochemical environments of proteins |
title_full_unstemmed |
Characterization of physicochemical environments of proteins |
title_sort |
characterization of physicochemical environments of proteins |
publishDate |
2017 |
url |
http://hdl.handle.net/10356/69607 |
_version_ |
1759855314151145472 |
spelling |
sg-ntu-dr.10356-69607 2023-03-04T00:49:03Z Characterization of physicochemical environments of proteins Tan, Kuan Pern Chandra Shekhar Verma Kwoh Chee Keong School of Computer Science and Engineering A*STAR Bioinformatics institute (BII) Mallur Srivatsan Madhusudhan DRNTU::Engineering::Computer science and engineering Doctor of Philosophy (SCE) 2017-03-01T04:27:45Z 2017-03-01T04:27:45Z 2017 Thesis Tan, K. P. (2017). Characterization of physicochemical environments of proteins. Doctoral thesis, Nanyang Technological University, Singapore. http://hdl.handle.net/10356/69607 10.32657/10356/69607 en 295 p. application/pdf |