Forman persistent Ricci curvature (FPRC)-based machine learning models for protein-ligand binding affinity prediction

Artificial intelligence (AI) techniques have already been gradually applied to the entire drug design process, from target discovery, lead discovery, lead optimization and preclinical development to the final three phases of clinical trials. Currently, one of the central challenges for AI-based drug...

Bibliographic Details
Main Authors: Wee, Junjie, Xia, Kelin
Other Authors: School of Physical and Mathematical Sciences
Format: Article
Language: English
Published: 2023
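Subjects: Science::Mathematics; Forman Ricci Curvature; Molecular Featurization; Machine Learning; Drug Design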
Online Access: https://hdl.handle.net/10356/168978
Institution: Nanyang Technological University
id sg-ntu-dr.10356-168978
record_format dspace
spelling sg-ntu-dr.10356-168978 2023-06-27T01:48:27Z Forman persistent Ricci curvature (FPRC)-based machine learning models for protein-ligand binding affinity prediction Wee, Junjie Xia, Kelin School of Physical and Mathematical Sciences Science::Mathematics Forman Ricci Curvature Molecular Featurization Machine Learning Drug Design Artificial intelligence (AI) techniques have gradually been applied to the entire drug design process, from target discovery, lead discovery, lead optimization and preclinical development to the final three phases of clinical trials. Currently, one of the central challenges for AI-based drug design is molecular featurization, that is, identifying or designing appropriate molecular descriptors or fingerprints. Efficient and transferable molecular descriptors are key to the success of all AI-based drug design models. Here we propose, for the first time, Forman persistent Ricci curvature (FPRC)-based molecular featurization and feature engineering. Molecular structures and interactions are modeled as simplicial complexes, which generalize graphs to higher dimensions. Further, a multiscale representation is achieved through a filtration process, during which a series of nested simplicial complexes at different scales is generated. Forman Ricci curvatures (FRCs) are calculated on this series of simplicial complexes, and the persistence and variation of FRCs during the filtration process are defined as FPRC. Moreover, persistent attributes, which are FPRC-based functions and properties, are employed as molecular descriptors and combined with machine learning models, in particular the gradient boosting tree (GBT). Our FPRC-GBT models are extensively trained and tested on the three most commonly used datasets: PDBbind-2007, PDBbind-2013 and PDBbind-2016. Our results are better than those from machine learning models with traditional molecular descriptors.
Ministry of Education (MOE) Nanyang Technological University Startup (supported in part through Grant M4081842.110); Singapore Ministry of Education Academic Research Fund (Tier 1 RG109/19 and Tier 2 MOE2018-T2-1-033). 2023-06-27T01:48:27Z 2023-06-27T01:48:27Z 2021 Journal Article Wee, J. & Xia, K. (2021). Forman persistent Ricci curvature (FPRC)-based machine learning models for protein-ligand binding affinity prediction. Briefings in Bioinformatics, 22(6), bbab136. https://dx.doi.org/10.1093/bib/bbab136 1467-5463 https://hdl.handle.net/10356/168978 10.1093/bib/bbab136 22 2-s2.0-85111173404 6 22 bbab136 en M4081842.110 RG109/19 MOE2018-T2-1-033 Briefings in Bioinformatics 10.21979/N9/ZTA5MN © 2021 The Author(s). Published by Oxford University Press. All rights reserved
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Science::Mathematics
Forman Ricci Curvature
Molecular Featurization
Machine Learning
Drug Design
description Artificial intelligence (AI) techniques have gradually been applied to the entire drug design process, from target discovery, lead discovery, lead optimization and preclinical development to the final three phases of clinical trials. Currently, one of the central challenges for AI-based drug design is molecular featurization, that is, identifying or designing appropriate molecular descriptors or fingerprints. Efficient and transferable molecular descriptors are key to the success of all AI-based drug design models. Here we propose, for the first time, Forman persistent Ricci curvature (FPRC)-based molecular featurization and feature engineering. Molecular structures and interactions are modeled as simplicial complexes, which generalize graphs to higher dimensions. Further, a multiscale representation is achieved through a filtration process, during which a series of nested simplicial complexes at different scales is generated. Forman Ricci curvatures (FRCs) are calculated on this series of simplicial complexes, and the persistence and variation of FRCs during the filtration process are defined as FPRC. Moreover, persistent attributes, which are FPRC-based functions and properties, are employed as molecular descriptors and combined with machine learning models, in particular the gradient boosting tree (GBT). Our FPRC-GBT models are extensively trained and tested on the three most commonly used datasets: PDBbind-2007, PDBbind-2013 and PDBbind-2016. Our results are better than those from machine learning models with traditional molecular descriptors.
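The filtration-and-curvature idea in the description can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes the simplest case, an unweighted distance-cutoff graph with no triangles or higher simplices, where the Forman Ricci curvature of an edge (u, v) reduces to 4 - deg(u) - deg(v). The function names, the toy point set, and the choice of summary statistics are all hypothetical.

```python
from itertools import combinations
from math import dist


def forman_edge_curvatures(points, cutoff):
    """Forman Ricci curvature of every edge in a distance-cutoff graph.

    Assumption: the complex is an unweighted graph (vertices and edges only),
    so the curvature of edge (u, v) is 4 - deg(u) - deg(v).
    """
    # Connect every pair of points that lies at most `cutoff` apart.
    edges = [(i, j) for i, j in combinations(range(len(points)), 2)
             if dist(points[i], points[j]) <= cutoff]
    degree = [0] * len(points)
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
    return {(i, j): 4 - degree[i] - degree[j] for i, j in edges}


def persistent_attributes(points, cutoffs):
    """Track simple curvature statistics across a filtration.

    Returns one tuple (cutoff, edge count, min, max, mean) per filtration
    value. These statistics are an illustrative choice, not the paper's.
    """
    features = []
    for r in cutoffs:
        curv = list(forman_edge_curvatures(points, r).values())
        if curv:
            features.append((r, len(curv), min(curv), max(curv),
                             sum(curv) / len(curv)))
        else:
            # No edges yet at this scale: report zeros.
            features.append((r, 0, 0, 0, 0.0))
    return features


# Toy example: the four corners of a unit square (hypothetical coordinates).
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
for row in persistent_attributes(square, [0.5, 1.0, 1.5]):
    print(row)
```

Increasing the cutoff produces a nested series of graphs, and the per-scale statistics could be concatenated into a feature vector for a regressor such as a gradient boosting tree. The record describes curvatures computed on higher-dimensional simplicial complexes, which this graph-only sketch does not cover.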
author2 School of Physical and Mathematical Sciences
author_facet School of Physical and Mathematical Sciences
Wee, Junjie
Xia, Kelin
format Article
author Wee, Junjie
Xia, Kelin
author_sort Wee, Junjie
title Forman persistent Ricci curvature (FPRC)-based machine learning models for protein-ligand binding affinity prediction
publishDate 2023
url https://hdl.handle.net/10356/168978
_version_ 1772827618314289152