Dowker complex based machine learning (DCML) models for protein-ligand binding affinity prediction
With the great advancements in experimental data, computational power and learning algorithms, artificial intelligence (AI) based drug design has begun to gain momentum recently. AI-based drug design has great promise to revolutionize pharmaceutical industries by significantly reducing the time and...
Main Authors: | Liu, Xiang; Feng, Huitao; Wu, Jie; Xia, Kelin |
---|---|
Other Authors: | School of Physical and Mathematical Sciences |
Format: | Article |
Language: | English |
Published: | 2022 |
Subjects: | Science::Mathematics; Artificial Intelligence; Ligands |
Online Access: | https://hdl.handle.net/10356/161049 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-161049 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-161049 2023-02-28T20:10:08Z Dowker complex based machine learning (DCML) models for protein-ligand binding affinity prediction Liu, Xiang Feng, Huitao Wu, Jie Xia, Kelin School of Physical and Mathematical Sciences Science::Mathematics Artificial Intelligence Ligands With the great advancements in experimental data, computational power and learning algorithms, artificial intelligence (AI) based drug design has recently begun to gain momentum. AI-based drug design has great promise to revolutionize pharmaceutical industries by significantly reducing the time and cost of drug discovery processes. However, a major issue remains for all AI-based learning models: efficient molecular representations. Here we propose, for the first time, Dowker complex (DC) based molecular interaction representations and Riemann zeta function based molecular featurization. Molecular interactions between proteins and ligands (or others) are modeled as Dowker complexes. A multiscale representation is generated by a filtration process, during which a series of DCs are generated at different scales. Combinatorial (Hodge) Laplacian matrices are constructed from these DCs, and the Riemann zeta functions of their spectral information can be used as molecular descriptors. To validate our models, we consider protein-ligand binding affinity prediction. Our DC-based machine learning (DCML) models, in particular the DC-based gradient boosting tree (DC-GBT), are tested on the three most commonly used datasets, i.e., PDBbind-2007, PDBbind-2013 and PDBbind-2016, and extensively compared with other existing state-of-the-art models. We find that our DC-based descriptors achieve state-of-the-art results and perform better than all machine learning models with traditional molecular descriptors. Our Dowker complex based machine learning models can be used in other tasks in AI-based drug design and molecular data analysis. 
Ministry of Education (MOE) Nanyang Technological University Published version This work was supported in part by Nanyang Technological University Startup Grant M4081842 and Singapore Ministry of Education Academic Research Fund Tier 1 RG109/19, MOE-T2EP20120-0013 and MOE-T2EP20220-0010. The first author (XL) was supported by Nankai Zhide foundation. The second author (HF) was supported by Natural Science Foundation of China (NSFC grant no. 11931007, 11221091, 11271062, 11571184). The third author (JW) was supported by Natural Science Foundation of China (NSFC grant no. 11971144) and High-level Scientific Research Foundation of Hebei Province. 2022-08-12T07:24:17Z 2022-08-12T07:24:17Z 2022 Journal Article Liu, X., Feng, H., Wu, J. & Xia, K. (2022). Dowker complex based machine learning (DCML) models for protein-ligand binding affinity prediction. PLOS Computational Biology, 18(4), e1009943. https://dx.doi.org/10.1371/journal.pcbi.1009943 1553-734X https://hdl.handle.net/10356/161049 10.1371/journal.pcbi.1009943 35385478 2-s2.0-85127634840 4 18 e1009943 en M4081842 RG109/19 MOE-T2EP20120-0013 MOE-T2EP20220-0010 PLOS Computational Biology © 2022 Liu et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Science::Mathematics Artificial Intelligence Ligands |
description |
With the great advancements in experimental data, computational power and learning algorithms, artificial intelligence (AI) based drug design has recently begun to gain momentum. AI-based drug design has great promise to revolutionize pharmaceutical industries by significantly reducing the time and cost of drug discovery processes. However, a major issue remains for all AI-based learning models: efficient molecular representations. Here we propose, for the first time, Dowker complex (DC) based molecular interaction representations and Riemann zeta function based molecular featurization. Molecular interactions between proteins and ligands (or others) are modeled as Dowker complexes. A multiscale representation is generated by a filtration process, during which a series of DCs are generated at different scales. Combinatorial (Hodge) Laplacian matrices are constructed from these DCs, and the Riemann zeta functions of their spectral information can be used as molecular descriptors. To validate our models, we consider protein-ligand binding affinity prediction. Our DC-based machine learning (DCML) models, in particular the DC-based gradient boosting tree (DC-GBT), are tested on the three most commonly used datasets, i.e., PDBbind-2007, PDBbind-2013 and PDBbind-2016, and extensively compared with other existing state-of-the-art models. We find that our DC-based descriptors achieve state-of-the-art results and perform better than all machine learning models with traditional molecular descriptors. Our Dowker complex based machine learning models can be used in other tasks in AI-based drug design and molecular data analysis. |
author2 |
School of Physical and Mathematical Sciences |
author_facet |
School of Physical and Mathematical Sciences Liu, Xiang Feng, Huitao Wu, Jie Xia, Kelin |
format |
Article |
author |
Liu, Xiang Feng, Huitao Wu, Jie Xia, Kelin |
author_sort |
Liu, Xiang |
title |
Dowker complex based machine learning (DCML) models for protein-ligand binding affinity prediction |
publishDate |
2022 |
url |
https://hdl.handle.net/10356/161049 |
_version_ |
1759856421227200512 |
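
The description above outlines a pipeline: Dowker complexes built from protein-ligand interactions, a filtration over scales, combinatorial (Hodge) Laplacians, and zeta functions of their spectra as descriptors. The following is a minimal sketch of that idea, not the authors' implementation. It assumes (none of this is stated in the record) that the relation is a distance cutoff between ligand and protein atoms, that only the 0-dimensional Laplacian is used, and that the zeta function takes the usual spectral form, the sum of λ^(-s) over nonzero eigenvalues λ. All function names are hypothetical.

```python
import itertools
import numpy as np

def dowker_edges(dist, cutoff):
    """Edges of a Dowker complex on ligand atoms (rows of `dist`).

    Assumption: ligand atom i is related to protein atom k when
    dist[i, k] <= cutoff. Two ligand atoms span an edge when they
    share at least one related protein atom (a common witness).
    """
    related = dist <= cutoff
    n = dist.shape[0]
    return [(i, j) for i, j in itertools.combinations(range(n), 2)
            if np.any(related[i] & related[j])]

def laplacian0(n, edges):
    """0-dimensional combinatorial (Hodge) Laplacian L0 = B1 @ B1.T."""
    boundary = np.zeros((n, len(edges)))
    for col, (i, j) in enumerate(edges):
        boundary[i, col] = -1.0
        boundary[j, col] = 1.0
    return boundary @ boundary.T

def spectral_zeta(laplacian, s, tol=1e-8):
    """Sum of eigenvalue**(-s) over eigenvalues larger than `tol`."""
    eigenvalues = np.linalg.eigvalsh(laplacian)
    positive = eigenvalues[eigenvalues > tol]
    return float(np.sum(positive ** (-s)))

def zeta_features(dist, cutoffs, s_values):
    """One descriptor per (cutoff, s) pair; the cutoffs play the role of the filtration."""
    n = dist.shape[0]
    return np.array([
        spectral_zeta(laplacian0(n, dowker_edges(dist, c)), s)
        for c in cutoffs for s in s_values
    ])

# Toy example: 3 ligand atoms (rows) and 2 protein atoms (columns).
toy_dist = np.array([[1.0, 5.0],
                     [1.0, 1.0],
                     [5.0, 1.0]])
features = zeta_features(toy_dist, cutoffs=[0.5, 2.0, 5.0], s_values=[1.0])
```

For simplicity the sketch keeps every ligand atom as a vertex, even one with no related protein atom; such isolated vertices only contribute zero eigenvalues, which the zeta sum skips. A feature vector of this kind could then be passed to any gradient boosting regressor, in the spirit of the DC-GBT model named in the description.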