Dowker complex based machine learning (DCML) models for protein-ligand binding affinity prediction

With the great advancements in experimental data, computational power and learning algorithms, artificial intelligence (AI) based drug design has begun to gain momentum recently. AI-based drug design has great promise to revolutionize pharmaceutical industries by significantly reducing the time and...

Bibliographic Details
Main Authors: Liu, Xiang; Feng, Huitao; Wu, Jie; Xia, Kelin
Other Authors: School of Physical and Mathematical Sciences
Format: Article
Language: English
Published: 2022
Subjects: Science::Mathematics; Artificial Intelligence; Ligands
Online Access: https://hdl.handle.net/10356/161049
Institution: Nanyang Technological University
id sg-ntu-dr.10356-161049
record_format dspace
Sponsors: Ministry of Education (MOE); Nanyang Technological University
Version: Published version
Funding: This work was supported in part by Nanyang Technological University Startup Grant M4081842 and Singapore Ministry of Education Academic Research Fund Tier 1 RG109/19, MOE-T2EP20120-0013 and MOE-T2EP20220-0010. The first author (XL) was supported by Nankai Zhide Foundation. The second author (HF) was supported by the Natural Science Foundation of China (NSFC grant no. 11931007, 11221091, 11271062, 11571184). The third author (JW) was supported by the Natural Science Foundation of China (NSFC grant no. 11971144) and the High-level Scientific Research Foundation of Hebei Province.
Record dates: 2022-08-12T07:24:17Z; 2022-08-12T07:24:17Z
Published: 2022
Type: Journal Article
Citation: Liu, X., Feng, H., Wu, J. & Xia, K. (2022). Dowker complex based machine learning (DCML) models for protein-ligand binding affinity prediction. PLOS Computational Biology, 18(4), e1009943. https://dx.doi.org/10.1371/journal.pcbi.1009943
ISSN: 1553-734X
Handle: https://hdl.handle.net/10356/161049
DOI: 10.1371/journal.pcbi.1009943
Other identifiers: 35385478; 2-s2.0-85127634840
Issue: 4
Volume: 18
Article number: e1009943
Language code: en
Grant numbers: M4081842; RG109/19; MOE-T2EP20120-0013; MOE-T2EP20220-0010
Journal: PLOS Computational Biology
Rights: © 2022 Liu et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
File format: application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Science::Mathematics
Artificial Intelligence
Ligands
description With great advances in experimental data, computational power and learning algorithms, artificial intelligence (AI) based drug design has recently begun to gain momentum. AI-based drug design holds great promise to revolutionize the pharmaceutical industry by significantly reducing the time and cost of drug discovery. However, a major challenge remains for all AI-based learning models: efficient molecular representation. Here we propose, for the first time, Dowker complex (DC) based molecular interaction representations and Riemann zeta function based molecular featurization. Molecular interactions between proteins and ligands (or other molecules) are modeled as Dowker complexes. A multiscale representation is generated through a filtration process, during which a series of DCs is generated at different scales. Combinatorial (Hodge) Laplacian matrices are constructed from these DCs, and the Riemann zeta functions of their spectral information are used as molecular descriptors. To validate our models, we consider protein-ligand binding affinity prediction. Our DC-based machine learning (DCML) models, in particular the DC-based gradient boosting tree (DC-GBT), are tested on three of the most commonly used datasets, namely PDBbind-2007, PDBbind-2013 and PDBbind-2016, and extensively compared with existing state-of-the-art models. We find that our DC-based descriptors achieve state-of-the-art results and outperform all machine learning models with traditional molecular descriptors. Our Dowker complex based machine learning models can also be used for other tasks in AI-based drug design and molecular data analysis.
author2 School of Physical and Mathematical Sciences
format Article
author Liu, Xiang
Feng, Huitao
Wu, Jie
Xia, Kelin
author_sort Liu, Xiang
title Dowker complex based machine learning (DCML) models for protein-ligand binding affinity prediction
publishDate 2022
url https://hdl.handle.net/10356/161049
_version_ 1759856421227200512
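
The pipeline summarized in the description (Dowker complex at several scales, combinatorial Laplacian, zeta function of the spectrum) can be illustrated with a small sketch. This is not the authors' implementation. It assumes that the relation between ligand and protein atoms is "distance at most a cutoff", it keeps only vertices and edges, and it uses only the 0-dimensional Laplacian. The function names, the cutoff values and the exponents are hypothetical and chosen purely for illustration.

```python
import itertools
import numpy as np


def dowker_edges(ligand, protein, cutoff):
    """Vertices and edges of a Dowker complex built on the ligand atoms.

    Assumption: ligand atom i is related to protein atom j when their
    distance is at most `cutoff`. A set of ligand atoms forms a simplex
    when all of them are related to one common protein atom.
    """
    dist = np.linalg.norm(ligand[:, None, :] - protein[None, :, :], axis=2)
    related = dist <= cutoff
    vertices = [i for i in range(len(ligand)) if related[i].any()]
    edges = [(i, j) for i, j in itertools.combinations(vertices, 2)
             if (related[i] & related[j]).any()]
    return vertices, edges


def laplacian0(vertices, edges):
    """0-dimensional combinatorial Laplacian L0 = B1 @ B1.T."""
    index = {v: k for k, v in enumerate(vertices)}
    boundary = np.zeros((len(vertices), len(edges)))
    for col, (i, j) in enumerate(edges):
        boundary[index[i], col] = -1.0
        boundary[index[j], col] = 1.0
    return boundary @ boundary.T


def spectral_zeta(laplacian, s, tol=1e-8):
    """Sum of eigenvalue ** (-s) over the strictly positive eigenvalues."""
    if laplacian.size == 0:
        return 0.0
    eigenvalues = np.linalg.eigvalsh(laplacian)
    positive = eigenvalues[eigenvalues > tol]
    return float(np.sum(positive ** (-s)))


def zeta_features(ligand, protein, cutoffs, exponents):
    """One descriptor per (cutoff, exponent) pair along the filtration."""
    features = []
    for cutoff in cutoffs:
        lap = laplacian0(*dowker_edges(ligand, protein, cutoff))
        features.extend(spectral_zeta(lap, s) for s in exponents)
    return features


# Toy coordinates (hypothetical): three ligand atoms, one protein atom.
ligand = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
protein = np.array([[0.0, 0.0, 0.0]])
features = zeta_features(ligand, protein, cutoffs=[0.5, 1.5], exponents=[1.0, 2.0])
```

A feature vector of this kind could then be passed to any regressor, such as a gradient boosting tree, to predict binding affinity. Higher-dimensional simplices and Laplacians would follow the same pattern with additional boundary matrices.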