TREE-BASED AND NEURAL NETWORK-BASED MODELS EXPERIMENTATION FOR PREDICTION OF ACTIVITY AND SELECTIVITY PROFILES OF HUMAN CARBONIC ANHYDRASE (HCA) ISOFORM II, IX, AND XII INHIBITORS

Bibliographic Details
Main Author: Prasidha Bhawarnawa, Gede
Format: Final Project
Language: Indonesia
Online Access: https://digilib.itb.ac.id/gdl/view/87583
Institution: Institut Teknologi Bandung
Description
Summary: The human Carbonic Anhydrase (hCA) enzyme plays a crucial role in human metabolism, including pH regulation, fluid secretion, and gas transport. However, overexpression of isoforms IX and XII of this enzyme is associated with the development of various cancers, such as lung, breast, and brain cancers. This motivates methods for discovering drug compounds that inhibit isoforms IX and XII, one of which is virtual screening with machine learning models. The reference study for this thesis trained machine learning models of various types to classify compound activity against the hCA II, IX, and XII isoforms individually, and found that decision tree-based models and ensemble methods performed best. From the high performance of each individual model, the same study inferred that the models could also predict the selectivity profiles of compounds. This research uses state-of-the-art decision tree-based and neural network-based models as alternative solutions for predicting compound activity and selectivity profiles against the hCA II, IX, and XII isoforms. The models used (ExtraTrees, XGBoost, GRANDE, DeepTLF, NCART, TabPFN) are designed to adapt decision trees with gradient-based learning modifications or employ Transformer-based architectures, with the aim of improving classification performance beyond the reference study's results. The research found that all of the alternative models achieved high, statistically equivalent performance when classifying compound activity against each isoform individually. However, the models failed to predict the selectivity profiles of compounds from the available data, contrary to the claim made by the reference study.