Investigating energy-based pool structure selection in the structure ensemble modeling with experimental distance constraints : the example from a multidomain protein Pub1
Main Authors: | , , , , , , , |
---|---|
Other Authors: | |
Format: | Article |
Language: | English |
Published: | 2018 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/87035 http://hdl.handle.net/10220/45213 |
Institution: | Nanyang Technological University |
Summary: | The structural variations of multidomain proteins with flexible parts mediate many biological processes, and a structure ensemble can be determined by selecting a weighted combination of representative structures from a simulated structure pool, producing the best fit to experimental constraints such as interatomic distances. In this study, a hybrid structure‐based and physics‐based atomistic force field with an efficient sampling strategy is adopted to simulate a model di‐domain protein against experimental paramagnetic relaxation enhancement (PRE) data that correspond to distance constraints. The molecular dynamics simulations produce a wide range of conformations depicted on a protein energy landscape. Subsequently, a conformational ensemble recovered with low‐energy structures and the minimum‐size restraint is identified in good agreement with experimental PRE rates, and the result is also supported by chemical shift perturbations and small‐angle X‐ray scattering data. It is illustrated that the regularizations of energy and ensemble size prevent an arbitrary interpretation of protein conformations. Moreover, energy is found to serve as a critical control to refine the structure pool and prevent data overfitting, because the absence of energy regularization exposes ensemble construction to the noise from high‐energy structures and causes a more ambiguous representation of protein conformations. Finally, we perform structure‐ensemble optimizations with a topology‐based structure pool to enhance the understanding of the ensemble results from different sources of pool candidates. |