Geometric deep learning for antibiotic discovery
Main Author: | |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2022 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/156881 |
Institution: | Nanyang Technological University |
Summary: | To reduce the high laboratory and time costs of drug discovery while improving the accuracy of new drug identification, Artificial Intelligence (AI) techniques have been widely applied in the pharmaceutical industry. In this article, we propose a geometric deep learning model that uses the Graph Attention Network (GAT) to identify potential new antibiotic candidates. The model was evaluated using several performance metrics: AUC-ROC, accuracy, and the weighted averages of precision, recall and F1-score. Its performance was then compared with that of other existing geometric deep learning models. To keep the experiment fair, undersampling was applied to reduce imbalance in the data, and 5-fold cross validation was applied to reduce the variance and bias of the resulting performance metrics. The proposed model outperformed all competing models on every performance metric. This suggests that the proposed model, which gives more weight to the neighboring messages most relevant to the atom being updated, is better suited to molecular property identification. An ablation study was also conducted to investigate the contribution of Morgan Fingerprint, molecular graph embeddings, and SMILES text embeddings to predicting the molecular property of interest. Morgan Fingerprint combined with molecular graph embeddings turned out to be the optimal combination of embeddings for our model. |
---|
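
The summary attributes the model's advantage to weighting neighboring messages by their relevance to the atom being updated. The record does not give the project's architecture or code, so the following is only a minimal sketch of a generic single-head graph attention update in plain Python. The function name `gat_layer`, the toy three-atom graph, the weight matrix `W`, and the attention vector `a` are all illustrative assumptions, and the final nonlinearity that a full layer would normally apply is omitted.

```python
import math


def leaky_relu(x, slope=0.2):
    """LeakyReLU, the activation commonly used for GAT attention scores."""
    return x if x > 0 else slope * x


def matvec(W, h):
    """Multiply matrix W (list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def gat_layer(features, neighbors, W, a):
    """One single-head graph attention update (illustrative sketch only).

    features:  dict mapping atom index -> feature vector
    neighbors: dict mapping atom index -> list of bonded atom indices
    W:         weight matrix (out_dim x in_dim), hypothetical values
    a:         attention vector of length 2 * out_dim, hypothetical values
    Returns (updated features, attention weights per atom).
    """
    # Project every atom's features with the shared weight matrix.
    z = {i: matvec(W, h) for i, h in features.items()}
    updated, attention = {}, {}
    for i in features:
        # Each atom attends to itself and to its bonded neighbors.
        nbrs = [i] + [j for j in neighbors[i] if j != i]
        # Unnormalized score for each (i, j) pair from the concatenated projections.
        scores = [
            leaky_relu(sum(w * x for w, x in zip(a, z[i] + z[j])))
            for j in nbrs
        ]
        # Numerically stable softmax over the neighborhood.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]
        attention[i] = dict(zip(nbrs, alphas))
        # Attention-weighted sum of the neighbors' projected features.
        updated[i] = [
            sum(al * z[j][k] for al, j in zip(alphas, nbrs))
            for k in range(len(z[i]))
        ]
    return updated, attention


# Toy three-atom chain (0 - 1 - 2) with made-up two-dimensional atom features.
features = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
neighbors = {0: [1], 1: [0, 2], 2: [1]}
W = [[1.0, 0.0], [0.0, 1.0]]
a = [0.1, 0.2, 0.3, 0.4]

updated, attention = gat_layer(features, neighbors, W, a)
```

Setting `a` to all zeros makes every attention weight equal, so the update reduces to a plain mean over each atom's neighborhood. Learned, unequal weights are what would let such a layer emphasize the more relevant neighbors.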