Generalised topological features and machine learning in drug design

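One of the key steps in drug design is predicting the binding affinity between a protein and a ligand. This task can be approached with supervised learning, where an algorithm is trained on a dataset of protein-ligand pairs and their binding affinities. Previous works have shown that first featurizing the protein-ligand data with persistent homology and then applying machine learning models to predict binding affinity can achieve very high accuracy on this task. This work continues this approach in search of models with even better predictive capabilities. First, two modern approaches to persistent-homology-based featurization are considered: persistent path embedding combined with signature methods, and persistent spectral models. These features are then used as inputs to several different machine learning models, including linear models, tree models, deep neural networks and echo state networks. The models are systematically tested on two commonly used datasets, PDBbind-2007 and PDBbind-2016. The combination of persistent spectral featurization and echo state networks is found to perform best and outperforms several existing models in the literature.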

Bibliographic Details
Main Author: Ti, Tze Hong
Other Authors: Xia Kelin
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2020
Subjects: Science::Mathematics
Online Access: https://hdl.handle.net/10356/139051
Institution: Nanyang Technological University
id sg-ntu-dr.10356-139051
record_format dspace
spelling sg-ntu-dr.10356-139051 2023-02-28T23:12:13Z Generalised topological features and machine learning in drug design Ti, Tze Hong Xia Kelin School of Physical and Mathematical Sciences xiakelin@ntu.edu.sg Science::Mathematics One of the key steps in drug design is predicting the binding affinity between a protein and a ligand. This task can be approached with supervised learning, where an algorithm is trained on a dataset of protein-ligand pairs and their binding affinities. Previous works have shown that first featurizing the protein-ligand data with persistent homology and then applying machine learning models to predict binding affinity can achieve very high accuracy on this task. This work continues this approach in search of models with even better predictive capabilities. First, two modern approaches to persistent-homology-based featurization are considered: persistent path embedding combined with signature methods, and persistent spectral models. These features are then used as inputs to several different machine learning models, including linear models, tree models, deep neural networks and echo state networks. The models are systematically tested on two commonly used datasets, PDBbind-2007 and PDBbind-2016. The combination of persistent spectral featurization and echo state networks is found to perform best and outperforms several existing models in the literature. Bachelor of Science in Mathematical Sciences and Economics 2020-05-15T03:13:24Z 2020-05-15T03:13:24Z 2020 Final Year Project (FYP) https://hdl.handle.net/10356/139051 en application/pdf Nanyang Technological University
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Science::Mathematics
spellingShingle Science::Mathematics
Ti, Tze Hong
Generalised topological features and machine learning in drug design
description One of the key steps in drug design is predicting the binding affinity between a protein and a ligand. This task can be approached with supervised learning, where an algorithm is trained on a dataset of protein-ligand pairs and their binding affinities. Previous works have shown that first featurizing the protein-ligand data with persistent homology and then applying machine learning models to predict binding affinity can achieve very high accuracy on this task. This work continues this approach in search of models with even better predictive capabilities. First, two modern approaches to persistent-homology-based featurization are considered: persistent path embedding combined with signature methods, and persistent spectral models. These features are then used as inputs to several different machine learning models, including linear models, tree models, deep neural networks and echo state networks. The models are systematically tested on two commonly used datasets, PDBbind-2007 and PDBbind-2016. The combination of persistent spectral featurization and echo state networks is found to perform best and outperforms several existing models in the literature.
author2 Xia Kelin
author_facet Xia Kelin
Ti, Tze Hong
format Final Year Project
author Ti, Tze Hong
author_sort Ti, Tze Hong
title Generalised topological features and machine learning in drug design
publisher Nanyang Technological University
publishDate 2020
url https://hdl.handle.net/10356/139051
_version_ 1759853733295947776