OLB-AC: toward optimizing ligand bioactivities through deep graph learning and activity cliffs

Motivation: Deep graph learning (DGL) has been widely employed in the realm of ligand-based virtual screening. Within this field, a key hurdle is the existence of activity cliffs (ACs), where minor chemical alterations can lead to significant changes in bioactivity. In response, several DGL models h...

Full description

Saved in:
Bibliographic Details
Main Authors: Yin, Yueming, Hu, Haifeng, Yang, Jitao, Ye, Chun, Goh, Wilson Wen Bin, Kong, Adams Wai Kin, Wu, Jiansheng
Other Authors: College of Computing and Data Science
Format: Article
Language:English
Published: 2024
Subjects:
Online Access:https://hdl.handle.net/10356/181726
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Nanyang Technological University
Language: English
Description
Summary:Motivation: Deep graph learning (DGL) has been widely employed in the realm of ligand-based virtual screening. Within this field, a key hurdle is the existence of activity cliffs (ACs), where minor chemical alterations can lead to significant changes in bioactivity. In response, several DGL models have been developed to enhance ligand bioactivity prediction in the presence of ACs. Yet, there remains a largely unexplored opportunity within ACs for optimizing ligand bioactivity, making it an area ripe for further investigation. Results: We present a novel approach to simultaneously predict and optimize ligand bioactivities through DGL and ACs (OLB-AC). OLB-AC possesses the capability to optimize ligand molecules located near ACs, providing a direct reference for optimizing ligand bioactivities with the matching of original ligands. To accomplish this, a novel attentive graph reconstruction neural network and ligand optimization scheme are proposed. Attentive graph reconstruction neural network reconstructs original ligands and optimizes them through adversarial representations derived from their bioactivity prediction process. Experimental results on nine drug targets reveal that out of the 667 molecules generated through OLB-AC optimization on datasets comprising 974 low-activity, noninhibitor, or highly toxic ligands, 49 are recognized as known highly active, inhibitor, or nontoxic ligands beyond the datasets’ scope. The 27 out of 49 matched molecular pairs generated by OLB-AC reveal novel transformations not present in their training sets. The adversarial representations employed for ligand optimization originate from the gradients of bioactivity predictions. Therefore, we also assess OLB-AC’s prediction accuracy across 33 different bioactivity datasets. Results show that OLB-AC achieves the best Pearson correlation coefficient (r2) on 27/33 datasets, with an average improvement of 7.2%–22.9% against the state-of-the-art bioactivity prediction methods. Availability and implementation: The code and dataset developed in this work are available at github.com/Yueming-Yin/OLB-AC.