PREDICTION OF FIRST-LINE ANTIBIOTIC RESISTANCE PHENOTYPES FROM MYCOBACTERIUM TUBERCULOSIS GENOMES USING K-MER ANALYSIS AND MACHINE LEARNING

Bibliographic Details
Main Author: Nasrulloh, Hilman
Format: Final Project
Language: Indonesian
Online Access: https://digilib.itb.ac.id/gdl/view/79535
Institution: Institut Teknologi Bandung
Description
Summary: Antibiotic susceptibility testing is essential before starting tuberculosis treatment, because it minimizes the chance of new antibiotic resistance cases. Unfortunately, the standard methods generally take 5 to 16 days to complete, mainly because M. tuberculosis grows very slowly. A genomic approach could serve as an alternative to standard susceptibility testing and could potentially be performed in less than 48 hours. This approach is supported by whole-genome shotgun sequencing, which allows the M. tuberculosis genome to be sequenced directly from a patient sample in about 24 hours, and by advances in methods for analyzing genomes at large scale, such as pangenome analysis, genome-wide association studies (GWAS), and machine learning.
Building on this potential, this study developed a proof-of-concept predictor of the antibiotic resistance phenotype in M. tuberculosis, specifically for the first-line antibiotics ethambutol, isoniazid, and rifampin, using GWAS, pangenome, and machine learning methods. A total of 669 M. tuberculosis genome samples and their susceptibility test results, obtained from the Bacterial and Viral Bioinformatics Resource Center (BV-BRC), were used. A case-control GWAS was conducted using a linear mixed model (LMM) to determine the association between k-mer presence and antibiotic resistance. The presence of significantly associated k-mers was then used to train the prediction model with the CatBoost library. The prediction model was integrated into a preprocessing pipeline that reproduces the earlier data transformations, converting a new genome sequence into a k-mer presence matrix ready for use by the model.
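The conversion of a genome sequence into a k-mer presence matrix can be illustrated with a minimal sketch. It is not the study's pipeline: the function names, the toy sequences, and the list of "selected" k-mers are all hypothetical, and the sketch assumes exact substring matching on a single strand, without reverse complements or quality filtering.

```python
from typing import List, Sequence, Set


def extract_kmers(sequence: str, k: int) -> Set[str]:
    """Return the set of distinct length-k substrings of a sequence."""
    seq = sequence.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def presence_row(sequence: str, selected_kmers: Sequence[str]) -> List[int]:
    """Encode one genome as 0/1 flags, one per selected k-mer, in a fixed order."""
    if not selected_kmers:
        return []
    k = len(selected_kmers[0])
    if any(len(kmer) != k for kmer in selected_kmers):
        raise ValueError("all selected k-mers must have the same length")
    found = extract_kmers(sequence, k)
    return [1 if kmer.upper() in found else 0 for kmer in selected_kmers]


def presence_matrix(genomes: Sequence[str],
                    selected_kmers: Sequence[str]) -> List[List[int]]:
    """Stack one presence row per genome (rows: genomes, columns: k-mers)."""
    return [presence_row(genome, selected_kmers) for genome in genomes]


# Hypothetical toy data: real k-mers would come from the GWAS step.
selected = ["ACGT", "GGCC", "TTAA"]
genomes = ["AACGTTGGCC", "TTAACCGG"]
matrix = presence_matrix(genomes, selected)
print(matrix)  # [[1, 1, 0], [0, 0, 1]]
```

Keeping the column order fixed matters here: a model trained on one ordering of k-mers needs new genomes encoded in the same order.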
The predictor was successfully developed. Its predictive performance, measured by the area under the receiver operating characteristic curve (ROC-AUC), was 92% for ethambutol, 95% for isoniazid, and 93% for rifampin. These scores indicate that the predictor discriminates well between resistant and susceptible isolates. In the future, this method could be used to predict new genotypes associated with resistance to specific antibiotics and to improve the accuracy of diagnosing one or more resistance genotypes in a short time.
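For readers unfamiliar with the metric, ROC-AUC can be read as the probability that a randomly chosen resistant isolate receives a higher predicted score than a randomly chosen susceptible one. The sketch below computes it directly from that definition. The labels and scores are invented for illustration and are not data from the study; the pairwise loop is chosen for clarity, not speed.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = resistant, 0 = susceptible.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.1]
auc = roc_auc(labels, scores)
print(f"{auc:.3f}")  # 0.889
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the reported 92% to 95% scores should be read.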