Challenges and solutions in drug-target interaction prediction

When a drug is developed, it is designed so that it interacts with a specific target of interest in order to achieve the desired therapeutic effect. However, it is quite common to later find that the developed drug also interacts with multiple other targets that were not intended during its developm...

Bibliographic Details
Main Author:	Ezzat, Ali
Other Authors:	Kwoh Chee Keong
Format:	Theses and Dissertations
Language:	English
Published:	2018
Subjects:	DRNTU::Engineering::Computer science and engineering::Computer applications::Life and medical sciences
Online Access:	http://hdl.handle.net/10356/75771
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-75771
record_format	dspace
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	DRNTU::Engineering::Computer science and engineering::Computer applications::Life and medical sciences
spellingShingle	DRNTU::Engineering::Computer science and engineering::Computer applications::Life and medical sciences Ezzat, Ali Challenges and solutions in drug-target interaction prediction
description	When a drug is developed, it is designed to interact with a specific target of interest in order to achieve the desired therapeutic effect. However, it is quite common to find later that the drug also interacts with multiple other targets that were not intended during its development. This is interesting because a drug that can interact with multiple targets may have more than one therapeutic effect, which provides a clear motivation for discovering new interactions for existing drugs. In drug discovery, an important task called drug-target interaction prediction detects such interactions on a large scale by screening many drugs and targets simultaneously. While there are wet-lab techniques for discovering these interactions, the focus of this thesis is on computational drug-target interaction prediction. Specifically, we investigate methods that discover new interactions based on prior knowledge of existing drugs and their experimentally confirmed targets (i.e., machine learning). In this thesis, we identified and addressed four outstanding problems in drug-target interaction (DTI) prediction. Having addressed these problems, we were able to enhance prediction performance and outperform relevant state-of-the-art methods.
Firstly, DTI prediction methods have difficulty predicting interactions involving new drugs or targets for which there are no known interactions. To predict such interactions, we developed two matrix factorization methods that utilize graph regularization. In addition, considering that many of the non-occurring edges in the bipartite DTI network are actually unknown or missing cases, we developed a preprocessing step that enhances predictions in the “new drug” and “new target” cases by adding edges with intermediate interaction likelihood scores. In our experiments, our methods performed better than the state-of-the-art methods and were found to predict interactions reasonably well.
Secondly, class imbalance is an issue that is prevalent across all DTI datasets. Class imbalance can be divided into two sub-problems, namely between-class and within-class imbalance. Between-class imbalance refers to the imbalance ratio between interacting and non-interacting drug-target pairs; it degrades prediction performance by biasing prediction results towards the majority class (i.e., the non-interacting pairs), leading to more prediction errors in the minority class (i.e., the interacting pairs). Within-class imbalance refers to the imbalance between the sizes of sub-groups (types) of interactions; it biases the predictions towards the bigger and better-represented sub-groups, leading to more errors in the smaller groups. Here, we developed an ensemble learning method that incorporates techniques to address both between-class imbalance and within-class imbalance. Experiments show that the proposed method improves results over four state-of-the-art methods.
Thirdly, there are DTI datasets where the feature sets representing the drugs and targets (and, by extension, the drug-target pairs) are of high dimensionality. High dimensionality of the data may lead to much longer running times for the prediction models. Furthermore, there may be redundancy in the features, which may also degrade prediction performance. In this work, we used dimensionality reduction to deal with both of these issues, and we additionally used ensemble learning to improve the prediction performance further. As base learners for the ensemble, we selected two classifiers, namely Decision Tree and Kernel Ridge Regression, resulting in two variants of ensemble models, EnsemDT and EnsemKRR, respectively. Experimental results show that our proposed methods are indeed successful.
Lastly, there is a concept called differential representation bias that has an impact on the prediction performance of DTI prediction methods. Specifically, differential representation bias refers to how often a drug (or target) appears in the positive training data as opposed to the negative data. Bearing this concept in mind, we experimented with the way that the negative training data is sampled prior to training the prediction model. We found that our modified sampling procedure produced significant improvements in DTI prediction performance.
author2	Kwoh Chee Keong
author_facet	Kwoh Chee Keong Ezzat, Ali
format	Theses and Dissertations
author	Ezzat, Ali
author_sort	Ezzat, Ali
title	Challenges and solutions in drug-target interaction prediction
title_short	Challenges and solutions in drug-target interaction prediction
title_full	Challenges and solutions in drug-target interaction prediction
title_fullStr	Challenges and solutions in drug-target interaction prediction
title_full_unstemmed	Challenges and solutions in drug-target interaction prediction
title_sort	challenges and solutions in drug-target interaction prediction
publishDate	2018
url	http://hdl.handle.net/10356/75771
_version_	1759855424176128000
spelling	sg-ntu-dr.10356-75771 2023-03-04T00:52:10Z Challenges and solutions in drug-target interaction prediction Ezzat, Ali Kwoh Chee Keong School of Computer Science and Engineering Bioinformatics Research Centre Li Xiaoli DRNTU::Engineering::Computer science and engineering::Computer applications::Life and medical sciences Doctor of Philosophy (SCE) 2018-06-14T04:02:59Z 2018-06-14T04:02:59Z 2018 Thesis Ezzat, A. (2018). Challenges and solutions in drug-target interaction prediction. Doctoral thesis, Nanyang Technological University, Singapore. http://hdl.handle.net/10356/75771 10.32657/10356/75771 en 165 p. application/pdf

Challenges and solutions in drug-target interaction prediction

Similar Items