BINDING AFFINITY PREDICTION OF DRUG CANDIDATES THAT POTENTIALLY BECOME MPRO SARS-COV-2 INHIBITORS USING RANDOM FOREST REGRESSION

Bibliographic Details
Main Author: Restreva Danestiara, Venia
Format: Theses
Language: Indonesia
Online Access: https://digilib.itb.ac.id/gdl/view/54824
Institution: Institut Teknologi Bandung
Description
Summary: The coronavirus disease (COVID-19) was first identified in December 2019 in Wuhan, Hubei Province, China. The disease has since spread worldwide, causing millions of deaths, and many studies are now searching for a drug to treat it. One computational drug discovery technique that can save cost and time is molecular docking. This method simulates the stability of receptor-ligand binding using a scoring function that produces a binding affinity. This study predicts the binding affinity of the data set using a machine learning scoring function. The data set contains 1138 drug candidates that were docked with Mpro SARS-CoV-2 using AutoDock Vina. The drug candidates and the receptor were selected based on several previous studies. The drug candidates were obtained from the DrugBank database, focusing on antimalarial, anti-inflammatory, and antiviral drugs. The machine learning scoring function was implemented with Random Forest Regression because it performs well on the non-linear relationship between the receptor-ligand complex structure and binding affinity. In this process, the training data are used to generate the Random Forest-Score, which is then used to predict the binding affinity of the testing data. The Random Forest-Score obtained has relatively high accuracy, with an R value (Pearson correlation coefficient) of 0.97, which indicates a strong linear relationship between the two variables. In addition, the MAE (mean absolute error) and RMSE (root mean square error) values obtained are relatively small, at 0.28 and 0.41, respectively. Meanwhile, the prediction of binding affinity by Random Forest Regression also achieved relatively high accuracy, with R = 0.81, MAE = 0.61, and RMSE = 0.92. The Random Forest Regression model built is therefore suitable as a machine learning scoring function for predicting the binding affinity of drug candidates.
From the results of this study, hydrocortisone probutate was identified as a potential drug candidate predicted to be able to inhibit the activity of Mpro SARS-CoV-2.
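
The three evaluation metrics named in the summary (Pearson R, MAE, and RMSE) can be illustrated with a minimal Python sketch. This is not the thesis's code: the function names and the sample affinity values below are hypothetical and chosen only to show how each metric compares predicted binding affinities with docked ones.

```python
import math


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def mae(y_true, y_pred):
    """Mean absolute error: average size of the prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error: penalizes large errors more than MAE does."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


# Hypothetical values for illustration only (not data from the thesis).
docked = [-7.0, -6.0, -8.0, -5.0]      # affinities from a docking run
predicted = [-6.5, -6.5, -7.5, -5.5]   # affinities from a regression model

if __name__ == "__main__":
    print("R    =", round(pearson_r(docked, predicted), 3))
    print("MAE  =", round(mae(docked, predicted), 3))
    print("RMSE =", round(rmse(docked, predicted), 3))
```

R measures how closely the predictions track the reference values linearly, while MAE and RMSE measure the size of the errors in the same units as the binding affinity, which is why the summary reports all three together.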