Characterization of physicochemical environments of proteins

Proteins are molecular machines in cells that perform a diverse set of essential biological functions. The function of a protein is determined by its 3D structure. The structure creates local microenvironments for protein atoms to interact with one another. A detailed understanding of these mi...

Bibliographic Details
Main Author: Tan, Kuan Pern
Other Authors: Chandra Shekhar Verma
Format: Theses and Dissertations
Language: English
Published: 2017
Subjects:
Online Access: http://hdl.handle.net/10356/69607
Institution: Nanyang Technological University
id sg-ntu-dr.10356-69607
record_format dspace
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Computer science and engineering
spellingShingle DRNTU::Engineering::Computer science and engineering
Tan, Kuan Pern
Characterization of physicochemical environments of proteins
description Proteins are molecular machines in cells that perform a diverse set of essential biological functions. The function of a protein is determined by its 3D structure. The structure creates local microenvironments for protein atoms to interact with one another. A detailed understanding of these microenvironments would allow better characterization and engineering of protein functions. For example, this knowledge forms the basis of modern therapeutic innovations such as rational drug and vaccine design, and could have implications in other industries, including bioprocessing, biomimetics, and biomaterials, among others.
This thesis presents my results on the characterization of the physicochemical properties of microenvironments in proteins. To investigate the complex nature of protein microenvironments, the characterization effort is broadly divided into three interconnected topics, namely (i) residue depth, (ii) hydrogen bonding, and (iii) multibody statistical potential.
The first topic aims to quantify protein microenvironments using the biophysical parameter of residue depth. The depth of an amino acid measures its degree of burial in a protein. I have shown that the energetics of proteins and the spatial distribution and chemical properties of amino acids depend on residue depth. To exemplify and utilize the results, I have designed several computational methods for protein engineering and functional characterization. First, a novel method to design temperature-sensitive alleles of proteins was proposed by making point mutations of these residues. Next, I have used residue depth to identify small-molecule ligand binding sites on proteins by supplementing it with solvent accessibility and evolutionary information. Benchmarks have shown that the method performs comparably to or better than the best available methods, and can reveal unconventional sites that other methods do not identify. In addition, I have shown that residue depth can be used to estimate protein cavity volume using a Monte Carlo sampling approach, and the pKa of amino acid residues using a linear model.
The second topic studies the physicochemical properties of hydrogen bonding in different protein environments. I have performed statistical analysis on databases, classified hydrogen bonds into different types, and characterized the geometrical preferences and variations of each type. By analyzing quantum simulations of the system, I have shown that the geometrical preference of main-chain hydrogen bonds is due to the electron density arising from the planar nature of the peptide bond. I have also performed empirical simulations that strongly suggest a causal link between this geometrical preference and secondary structure formation. Next, I have discovered that low-resolution protein models in databases are consistently missing hydrogen bonds. To improve these models, I have designed a two-step refinement protocol. First, a simple algorithm was used to predict missing donor-acceptor pairs that could form hydrogen bonds, based on their mutual preference and specificity. Second, Gaussian restraints were applied to the geometry distribution of the missing pairs, after which a standard modelling protocol can be used to refine the protein model. The refinement protocol was shown to be capable of re-introducing hydrogen bonds in the local environment as well as improving overall model quality. The refinement has functional implications for the chemical properties of the protein, as exemplified by more accurate pKa prediction.
The third topic is the construction of an environment-dependent protein statistical potential, Packpred. Here, I have explicitly defined protein microenvironments as sets of tightly packed amino acids, dubbed "residue cliques". Using Sippl's formulation, the non-random occurrence of microenvironments is characterized. The non-random occurrence is indicative of the strength of interaction among amino acids, and can be interpreted as an energy potential. I have evaluated the capability of the potential to describe protein energetics on a large body of mutagenesis data. The benchmark has shown that, compared with all other competing methods, Packpred has the best performance not only in binary classification of destabilizing mutants, but also in correctly rank-ordering the degree of phenotypic change associated with different mutations.
Lastly, I also present three biomolecular system modelling studies involving non-globular proteins. These systems are (i) the cohesin ring protein with coiled-coil structure, (ii) the transmembrane transporters OCTN-1 and -2, and (iii) the interaction interface between the oncogenic proteins VAV1 and EZH2. Modelling of these systems is challenging because the conventional tools and frameworks of comparative modelling are not applicable. Instead, an integrative modelling approach tailored to each individual system was undertaken. In all the modelling work I have proposed experimentally testable hypotheses to decipher the biological mechanisms underlying the systems.
In conclusion, in this thesis I have presented an extensive characterization of the physicochemical environments of proteins. The complex nature of the environment was elucidated through three interdependent topics: residue depth, hydrogen bonding and amino acid cliques. In addition to novel results, for every investigation I have also explored its biological utility and have built open-access tools for it. I hope that the work presented here will facilitate future research into protein structures and their functions.
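The statistical potential mentioned in the abstract can be illustrated with a minimal inverse-Boltzmann sketch in the spirit of Sippl's formulation. This is not the Packpred implementation: the function name `clique_potential`, the treatment of a clique as an unordered set of residue types, the independence-based reference state, and the unit `kT` are all assumptions made for illustration only.

```python
import math
from collections import Counter


def clique_potential(cliques, kT=1.0):
    """Hypothetical inverse-Boltzmann score per clique composition.

    cliques: list of tuples of one-letter residue codes, all the same size.
    Returns {composition: -kT * ln(observed / expected)}.
    Negative scores mean a composition occurs more often than expected.
    """
    # Assumption: a clique is an unordered multiset of residue types.
    comps = [tuple(sorted(c)) for c in cliques]
    observed = Counter(comps)
    total = len(comps)

    # Background residue frequencies pooled over all cliques.
    residues = Counter(r for c in comps for r in c)
    n_res = sum(residues.values())
    freq = {r: n / n_res for r, n in residues.items()}

    potential = {}
    for comp, n_obs in observed.items():
        # Reference state (assumed): residues drawn independently, so the
        # expected probability is the multinomial coefficient times the
        # product of the residue-type frequencies.
        perms = math.factorial(len(comp))
        for k in Counter(comp).values():
            perms //= math.factorial(k)
        p_exp = perms * math.prod(freq[r] for r in comp)
        p_obs = n_obs / total
        potential[comp] = -kT * math.log(p_obs / p_exp)
    return potential


# Toy data: homogeneous cliques dominate, the mixed clique is rare.
toy = [("L", "L", "L")] * 4 + [("K", "K", "K")] * 4 + [("L", "K", "L")]
scores = clique_potential(toy)
```

In this toy input the homogeneous compositions receive negative (favourable) scores and the rare mixed composition receives a positive one; real use would need cliques extracted from a structure database and a carefully chosen reference state.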
author2 Chandra Shekhar Verma
author_facet Chandra Shekhar Verma
Tan, Kuan Pern
format Theses and Dissertations
author Tan, Kuan Pern
author_sort Tan, Kuan Pern
title Characterization of physicochemical environments of proteins
title_short Characterization of physicochemical environments of proteins
title_full Characterization of physicochemical environments of proteins
title_fullStr Characterization of physicochemical environments of proteins
title_full_unstemmed Characterization of physicochemical environments of proteins
title_sort characterization of physicochemical environments of proteins
publishDate 2017
url http://hdl.handle.net/10356/69607
_version_ 1759855314151145472
spelling sg-ntu-dr.10356-69607 2023-03-04T00:49:03Z Characterization of physicochemical environments of proteins Tan, Kuan Pern Chandra Shekhar Verma Kwoh Chee Keong School of Computer Science and Engineering A*STAR Bioinformatics institute (BII) Mallur Srivatsan Madhusudhan DRNTU::Engineering::Computer science and engineering Doctor of Philosophy (SCE) 2017-03-01T04:27:45Z 2017-03-01T04:27:45Z 2017 Thesis Tan, K. P. (2017). Characterization of physicochemical environments of proteins. Doctoral thesis, Nanyang Technological University, Singapore. http://hdl.handle.net/10356/69607 10.32657/10356/69607 en 295 p. application/pdf