Generalised topological features and machine learning in drug design
One of the key steps of drug design is the prediction of binding affinity between a protein and a ligand. This is a task achievable using methods in supervised learning, where a supervised learning algorithm can be trained on a dataset of protein-ligand pairs and their binding affinity. Previous wor...
Main Author: | Ti, Tze Hong |
---|---|
Other Authors: | Xia Kelin |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2020 |
Subjects: | Science::Mathematics |
Online Access: | https://hdl.handle.net/10356/139051 |
Institution: | Nanyang Technological University |
id | sg-ntu-dr.10356-139051 |
---|---|
record_format | dspace |
spelling | sg-ntu-dr.10356-139051 2023-02-28T23:12:13Z Generalised topological features and machine learning in drug design Ti, Tze Hong Xia Kelin School of Physical and Mathematical Sciences xiakelin@ntu.edu.sg Science::Mathematics One of the key steps of drug design is the prediction of the binding affinity between a protein and a ligand. This task can be approached with supervised learning: an algorithm is trained on a dataset of protein-ligand pairs and their binding affinities. Previous works have shown that using persistent homology to featurize the protein-ligand data, and then machine learning models to generate binding affinity predictions, can achieve very high accuracy in the binding affinity prediction task. This work continues this approach to seek models with even better predictive capabilities. First, two modern approaches to persistent-homology-based featurization are considered: persistent path embedding combined with signature methods, and persistent spectral models. These features are then used as inputs to several different machine learning models, including linear models, tree models, deep neural networks and echo state networks. The models are systematically tested on two commonly used datasets, PDBbind-2007 and PDBbind-2016. It is found that the combination of persistent-spectral-based featurization and echo state networks performs the best and outperforms several existing models in the literature. Bachelor of Science in Mathematical Sciences and Economics 2020-05-15T03:13:24Z 2020-05-15T03:13:24Z 2020 Final Year Project (FYP) https://hdl.handle.net/10356/139051 en application/pdf Nanyang Technological University |
institution | Nanyang Technological University |
building | NTU Library |
continent | Asia |
country | Singapore |
content_provider | NTU Library |
collection | DR-NTU |
language | English |
topic | Science::Mathematics |
spellingShingle | Science::Mathematics Ti, Tze Hong Generalised topological features and machine learning in drug design |
description | One of the key steps of drug design is the prediction of the binding affinity between a protein and a ligand. This task can be approached with supervised learning: an algorithm is trained on a dataset of protein-ligand pairs and their binding affinities. Previous works have shown that using persistent homology to featurize the protein-ligand data, and then machine learning models to generate binding affinity predictions, can achieve very high accuracy in the binding affinity prediction task. This work continues this approach to seek models with even better predictive capabilities. First, two modern approaches to persistent-homology-based featurization are considered: persistent path embedding combined with signature methods, and persistent spectral models. These features are then used as inputs to several different machine learning models, including linear models, tree models, deep neural networks and echo state networks. The models are systematically tested on two commonly used datasets, PDBbind-2007 and PDBbind-2016. It is found that the combination of persistent-spectral-based featurization and echo state networks performs the best and outperforms several existing models in the literature. |
author2 | Xia Kelin |
author_facet | Xia Kelin Ti, Tze Hong |
format | Final Year Project |
author | Ti, Tze Hong |
author_sort | Ti, Tze Hong |
title | Generalised topological features and machine learning in drug design |
publisher | Nanyang Technological University |
publishDate | 2020 |
url | https://hdl.handle.net/10356/139051 |
_version_ | 1759853733295947776 |