Use B-factor related features for accurate classification between protein binding interfaces and crystal packing contacts
Background: Distinguishing true protein interactions from crystal packing contacts is important for structural bioinformatics, given the need for accurate classification of the rapidly growing number of protein structures. There are many unannotated crystal contacts and there also exist f...
Main Authors: | Liu, Qian; Li, Zhenhua; Li, Jinyan |
---|---|
Other Authors: | School of Computer Science and Engineering |
Format: | Article |
Language: | English |
Published: | 2016 |
Subjects: | Computer Science and Engineering |
Online Access: | https://hdl.handle.net/10356/81494 http://hdl.handle.net/10220/40820 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-81494 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-81494 2022-02-16T16:29:03Z
Use B-factor related features for accurate classification between protein binding interfaces and crystal packing contacts
Liu, Qian; Li, Zhenhua; Li, Jinyan
School of Computer Science and Engineering
Computer Science and Engineering
Background: Distinguishing true protein interactions from crystal packing contacts is important for structural bioinformatics, given the need for accurate classification of the rapidly growing number of protein structures. There are many unannotated crystal contacts, and there also exist false annotations in this rapidly expanding volume of data. Previous tools have been proposed to address this problem. However, challenging issues remain, such as low performance when the training and test data contain mixed interfaces with diverse contact-area sizes.
Methods and results: The B factor quantifies the vibrational motion of an atom and is a more relevant feature than interface size for characterizing protein binding. We propose to use three features related to the B factor for the classification between biological interfaces and crystal packing contacts. The first feature is the sum of the normalized B factors of the interfacial atoms in the contact area, the second is the average of the interfacial B factor per residue in the chain, and the third is the average number of interfacial atoms with a negative normalized B factor per residue in the chain. We investigate the distribution properties of these basic features and a compound feature on four datasets of biological binding and crystal packing, and on a protein binding-only dataset with known binding affinity. We also compare the cross-dataset classification performance of these features with existing methods and with interface area, a widely used and the most effective feature. The results demonstrate that our features remarkably outperform the interface area approach and the existing prediction methods in many tests on all of these datasets.
Conclusions: The proposed B factor related features are more effective than interface area at distinguishing crystal packing from biological binding interfaces. Our computational methods have potential for large-scale and accurate identification of biological interactions from the experimentally determined structural data stored in the PDB, which may have diverse interface sizes.
Published version
2016-06-28T08:28:56Z 2019-12-06T14:32:14Z 2016-06-28T08:28:56Z 2019-12-06T14:32:14Z 2014
Journal Article
Liu, Q., Li, Z., & Li, J. (2014). Use B-factor related features for accurate classification between protein binding interfaces and crystal packing contacts. BMC Bioinformatics, 15(Suppl 16), S3.
1471-2105
https://hdl.handle.net/10356/81494 http://hdl.handle.net/10220/40820
10.1186/1471-2105-15-S16-S3
25522196
en
BMC Bioinformatics
© 2014 Liu et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
11 p.
application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Computer Science and Engineering |
spellingShingle |
Computer Science and Engineering; Liu, Qian; Li, Zhenhua; Li, Jinyan; Use B-factor related features for accurate classification between protein binding interfaces and crystal packing contacts |
description |
Background: Distinguishing true protein interactions from crystal packing contacts is important for structural bioinformatics, given the need for accurate classification of the rapidly growing number of protein structures. There are many unannotated crystal contacts, and there also exist false annotations in this rapidly expanding volume of data. Previous tools have been proposed to address this problem. However, challenging issues remain, such as low performance when the training and test data contain mixed interfaces with diverse contact-area sizes.
Methods and results: The B factor quantifies the vibrational motion of an atom and is a more relevant feature than interface size for characterizing protein binding. We propose to use three features related to the B factor for the classification between biological interfaces and crystal packing contacts. The first feature is the sum of the normalized B factors of the interfacial atoms in the contact area, the second is the average of the interfacial B factor per residue in the chain, and the third is the average number of interfacial atoms with a negative normalized B factor per residue in the chain. We investigate the distribution properties of these basic features and a compound feature on four datasets of biological binding and crystal packing, and on a protein binding-only dataset with known binding affinity. We also compare the cross-dataset classification performance of these features with existing methods and with interface area, a widely used and the most effective feature. The results demonstrate that our features remarkably outperform the interface area approach and the existing prediction methods in many tests on all of these datasets.
Conclusions: The proposed B factor related features are more effective than interface area at distinguishing crystal packing from biological binding interfaces. Our computational methods have potential for large-scale and accurate identification of biological interactions from the experimentally determined structural data stored in the PDB, which may have diverse interface sizes. |
author2 |
School of Computer Science and Engineering |
author_facet |
School of Computer Science and Engineering; Liu, Qian; Li, Zhenhua; Li, Jinyan |
format |
Article |
author |
Liu, Qian; Li, Zhenhua; Li, Jinyan |
author_sort |
Liu, Qian |
title |
Use B-factor related features for accurate classification between protein binding interfaces and crystal packing contacts |
title_short |
Use B-factor related features for accurate classification between protein binding interfaces and crystal packing contacts |
title_full |
Use B-factor related features for accurate classification between protein binding interfaces and crystal packing contacts |
title_fullStr |
Use B-factor related features for accurate classification between protein binding interfaces and crystal packing contacts |
title_full_unstemmed |
Use B-factor related features for accurate classification between protein binding interfaces and crystal packing contacts |
title_sort |
use b-factor related features for accurate classification between protein binding interfaces and crystal packing contacts |
publishDate |
2016 |
url |
https://hdl.handle.net/10356/81494 http://hdl.handle.net/10220/40820 |
_version_ |
1725985750678241280 |
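
The three B factor features named in the description field can be illustrated with a small sketch. This is not the authors' implementation: the record does not say how B factors are normalized or how "per residue in the chain" is computed, so the code below assumes a z-score over all atoms of the chain and division by the chain's residue count. The function names, parameters, and input layout are hypothetical.

```python
from statistics import mean, pstdev


def normalize_b_factors(b_factors):
    """Z-score the B factors of one chain (an ASSUMED normalization scheme)."""
    mu = mean(b_factors)
    sigma = pstdev(b_factors)
    if sigma == 0:
        # All atoms share one B factor, so every normalized value is zero.
        return [0.0 for _ in b_factors]
    return [(b - mu) / sigma for b in b_factors]


def b_factor_features(b_factors, interfacial, n_residues):
    """Return the three features for one chain.

    b_factors   -- raw B factor of every atom in the chain
    interfacial -- booleans, True where the atom lies in the contact area
    n_residues  -- number of residues in the chain (ASSUMED denominator)
    """
    if len(b_factors) != len(interfacial):
        raise ValueError("b_factors and interfacial must have equal length")
    if n_residues <= 0:
        raise ValueError("n_residues must be positive")

    z = normalize_b_factors(b_factors)
    iface = [value for value, flag in zip(z, interfacial) if flag]

    # Feature 1: sum of normalized B factors of the interfacial atoms.
    sum_b = sum(iface)
    # Feature 2: interfacial B factor averaged per residue in the chain.
    avg_b = sum_b / n_residues
    # Feature 3: interfacial atoms with a negative normalized B factor,
    # averaged per residue in the chain.
    avg_negative = sum(1 for value in iface if value < 0) / n_residues
    return sum_b, avg_b, avg_negative
```

For a toy chain, `b_factor_features([10, 20, 30, 40], [True, True, True, False], n_residues=2)` returns the three values as a tuple. In practice the per-atom B factors and the interface mask would come from a parsed PDB entry and a contact-area calculation, both of which are outside this sketch.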