Forman persistent Ricci curvature (FPRC)-based machine learning models for protein-ligand binding affinity prediction
Artificial intelligence (AI) techniques have gradually been applied to the entire drug design process, from target discovery, lead discovery, lead optimization and preclinical development to the final three phases of clinical trials. Currently, one of the central challenges for AI-based drug...
Main Authors: | Wee, Junjie; Xia, Kelin |
---|---|
Other Authors: | School of Physical and Mathematical Sciences |
Format: | Article |
Language: | English |
Published: | 2023 |
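Subjects: | Science::Mathematics; Forman Ricci Curvature; Molecular Featurization; Machine Learning; Drug Design |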
Online Access: | https://hdl.handle.net/10356/168978 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-168978 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-168978 2023-06-27T01:48:27Z Forman persistent Ricci curvature (FPRC)-based machine learning models for protein-ligand binding affinity prediction Wee, Junjie Xia, Kelin School of Physical and Mathematical Sciences Science::Mathematics Forman Ricci Curvature Molecular Featurization Machine Learning Drug Design Artificial intelligence (AI) techniques have gradually been applied to the entire drug design process, from target discovery, lead discovery, lead optimization and preclinical development to the final three phases of clinical trials. Currently, one of the central challenges for AI-based drug design is molecular featurization, that is, identifying or designing appropriate molecular descriptors or fingerprints. Efficient and transferable molecular descriptors are key to the success of all AI-based drug design models. Here we propose, for the first time, Forman persistent Ricci curvature (FPRC)-based molecular featurization and feature engineering. Molecular structures and interactions are modeled as simplicial complexes, which are generalizations of graphs to higher dimensions. Further, a multiscale representation is achieved through a filtration process, during which a series of nested simplicial complexes at different scales is generated. Forman Ricci curvatures (FRCs) are calculated on the series of simplicial complexes, and the persistence and variation of FRCs during the filtration process are defined as FPRC. Moreover, persistent attributes, which are FPRC-based functions and properties, are employed as molecular descriptors and combined with machine learning models, in particular gradient boosting trees (GBT). Our FPRC-GBT models are extensively trained and tested on the three most commonly used datasets: PDBbind-2007, PDBbind-2013 and PDBbind-2016. Our results are better than those from machine learning models with traditional molecular descriptors. 
Ministry of Education (MOE) Nanyang Technological University Startup (supported in part through Grant M4081842.110); Singapore Ministry of Education Academic Research fund (Tier 1 RG109/19 and Tier 2 MOE2018-T2-1-033). 2023-06-27T01:48:27Z 2023-06-27T01:48:27Z 2021 Journal Article Wee, J. & Xia, K. (2021). Forman persistent Ricci curvature (FPRC)-based machine learning models for protein-ligand binding affinity prediction. Briefings in Bioinformatics, 22(6), bbab136-. https://dx.doi.org/10.1093/bib/bbab136 1467-5463 https://hdl.handle.net/10356/168978 10.1093/bib/bbab136 22 2-s2.0-85111173404 6 22 bbab136 en M4081842.110 RG109/19 MOE2018-T2-1-033 Briefings in Bioinformatics 10.21979/N9/ZTA5MN © 2021 The Author(s). Published by Oxford University Press. All rights reserved |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Science::Mathematics Forman Ricci Curvature Molecular Featurization Machine Learning Drug Design |
spellingShingle |
Science::Mathematics Forman Ricci Curvature Molecular Featurization Machine Learning Drug Design Wee, Junjie Xia, Kelin Forman persistent Ricci curvature (FPRC)-based machine learning models for protein-ligand binding affinity prediction |
description |
Artificial intelligence (AI) techniques have gradually been applied to the entire drug design process, from target discovery, lead discovery, lead optimization and preclinical development to the final three phases of clinical trials. Currently, one of the central challenges for AI-based drug design is molecular featurization, that is, identifying or designing appropriate molecular descriptors or fingerprints. Efficient and transferable molecular descriptors are key to the success of all AI-based drug design models. Here we propose, for the first time, Forman persistent Ricci curvature (FPRC)-based molecular featurization and feature engineering. Molecular structures and interactions are modeled as simplicial complexes, which are generalizations of graphs to higher dimensions. Further, a multiscale representation is achieved through a filtration process, during which a series of nested simplicial complexes at different scales is generated. Forman Ricci curvatures (FRCs) are calculated on the series of simplicial complexes, and the persistence and variation of FRCs during the filtration process are defined as FPRC. Moreover, persistent attributes, which are FPRC-based functions and properties, are employed as molecular descriptors and combined with machine learning models, in particular gradient boosting trees (GBT). Our FPRC-GBT models are extensively trained and tested on the three most commonly used datasets: PDBbind-2007, PDBbind-2013 and PDBbind-2016. Our results are better than those from machine learning models with traditional molecular descriptors. |
author2 |
School of Physical and Mathematical Sciences |
author_facet |
School of Physical and Mathematical Sciences Wee, Junjie Xia, Kelin |
format |
Article |
author |
Wee, Junjie Xia, Kelin |
author_sort |
Wee, Junjie |
title |
Forman persistent Ricci curvature (FPRC)-based machine learning models for protein-ligand binding affinity prediction |
title_sort |
forman persistent ricci curvature (fprc)-based machine learning models for protein-ligand binding affinity prediction |
publishDate |
2023 |
url |
https://hdl.handle.net/10356/168978 |
_version_ |
1772827618314289152 |
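
The description in this record outlines a filtration followed by Forman Ricci curvature calculations at each scale. The sketch below illustrates that idea in a deliberately simplified form and is not the authors' implementation: it assumes an unweighted distance graph (edges only, no higher-dimensional simplices), for which the Forman Ricci curvature of an edge (u, v) reduces to 4 - deg(u) - deg(v), and it summarizes each scale by the mean edge curvature. The function names, the toy coordinates and the choice of summary statistic are all hypothetical.

```python
from itertools import combinations
from math import dist


def edges_at_scale(points, cutoff):
    """Edges of the distance graph: index pairs of points at most `cutoff` apart."""
    return [(i, j) for i, j in combinations(range(len(points)), 2)
            if dist(points[i], points[j]) <= cutoff]


def forman_edge_curvature(edges):
    """Unweighted graph Forman Ricci curvature: F(u, v) = 4 - deg(u) - deg(v)."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return {(u, v): 4 - degree[u] - degree[v] for u, v in edges}


def curvature_profile(points, cutoffs):
    """Mean edge curvature at each filtration value (0.0 when no edges exist)."""
    profile = []
    for cutoff in cutoffs:
        curv = forman_edge_curvature(edges_at_scale(points, cutoff))
        profile.append(sum(curv.values()) / len(curv) if curv else 0.0)
    return profile


# Hypothetical toy coordinates (corners of a unit square), not data from the paper.
points = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
profile = curvature_profile(points, cutoffs=[0.5, 1.0, 1.5])
print(profile)
```

As the cutoff grows, the graph gains edges and the curvature values shift; a vector such as `profile` is the kind of scale-dependent quantity that could serve as input features for a regression model, under the assumptions stated above.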