EXPLAINABLE AI IN GRADIENT BOOSTING ALGORITHM FOR PHISHING WEBSITE DETECTION


Bibliographic Details
Main Author: Rizqi Alfisyahrin, Alvin
Format: Final Project
Language: Indonesian
Online Access: https://digilib.itb.ac.id/gdl/view/78308
Institution: Institut Teknologi Bandung
Description
Summary: As demand for higher AI performance has grown, models have become more complex, and their predictions have become harder to explain. In recent years, three gradient boosting methods based on decision trees (XGBoost, CatBoost, and LightGBM) have been proposed; they deliver competitive performance with fast training times. In critical domains such as security, medicine, and finance, however, AI stakeholders increasingly require transparency, and Explainable AI (XAI) has been introduced to meet that need. One pressing security concern is the proliferation of phishing websites. Experiments were conducted on three phishing website datasets using the three gradient boosting algorithms with hyperparameter tuning. For every dataset, the most accurate model was CatBoost tuned with Randomized Search. XAI methods were then applied to the selected dataset to provide global explanations (SHAP and PDP) and local explanations (LIME and Anchor). The results identified several features that were consistently important to the model's predictions, including "length_url," "time_domain_activation," and "directory_length." Even when only these three of the original 90 features were used, high accuracy was maintained.
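
Illustration: the summary refers to gradient boosting over decision trees and to ranking features by importance. The sketch below shows the general idea in plain Python, using one-split trees (stumps) fitted to the gradient of the log loss. It is not the thesis code and does not use XGBoost, CatBoost, or LightGBM. The data are synthetic, the labelling rule is invented for the example, and the split-count "importance" is a crude stand-in for SHAP; only the three feature names are taken from the summary.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Feature names borrowed from the summary; every value below is synthetic.
FEATURES = ["length_url", "time_domain_activation", "directory_length"]
THRESHOLDS = [k / 10 for k in range(1, 10)]  # candidate split points 0.1 .. 0.9

# 200 rows of uniform random values in [0, 1).
# Invented labelling rule (an assumption, NOT from the thesis): label 1 when the
# first feature is large or the second is small; the third feature is noise.
X = [[random.random() for _ in FEATURES] for _ in range(200)]
y = [1 if (row[0] > 0.6 or row[1] <= 0.2) else 0 for row in X]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(X, residuals):
    """Return (feature, threshold, left_value, right_value) with least squared error."""
    best = None
    for j in range(len(X[0])):
        for t in THRESHOLDS:
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue  # skip splits that leave one side empty
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            sse = sum((r - left_mean) ** 2 for r in left)
            sse += sum((r - right_mean) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, left_mean, right_mean)
    return best[1:]


def fit_boosting(X, y, n_rounds=60, learning_rate=0.5):
    """Fit an additive model of stumps by gradient descent on the log loss."""
    positive_rate = sum(y) / len(y)
    base_score = math.log(positive_rate / (1.0 - positive_rate))  # prior log-odds
    scores = [base_score] * len(X)
    stumps = []
    for _ in range(n_rounds):
        # Negative gradient of the log loss with respect to the current score.
        residuals = [label - sigmoid(s) for label, s in zip(y, scores)]
        j, t, left_value, right_value = fit_stump(X, residuals)
        stumps.append((j, t, left_value, right_value))
        scores = [
            s + learning_rate * (left_value if row[j] <= t else right_value)
            for s, row in zip(scores, X)
        ]
    return {"base": base_score, "lr": learning_rate, "stumps": stumps}


def predict_proba(model, row):
    """Probability of the positive class for one row."""
    score = model["base"]
    for j, t, left_value, right_value in model["stumps"]:
        score += model["lr"] * (left_value if row[j] <= t else right_value)
    return sigmoid(score)


model = fit_boosting(X, y)

# In-sample accuracy only; a real study would evaluate on held-out data.
predictions = [1 if predict_proba(model, row) >= 0.5 else 0 for row in X]
accuracy = sum(p == label for p, label in zip(predictions, y)) / len(y)

# Crude global importance: how often each feature was chosen for a split.
split_counts = {name: 0 for name in FEATURES}
for j, _, _, _ in model["stumps"]:
    split_counts[FEATURES[j]] += 1

print("training accuracy:", accuracy)
print("split counts:", split_counts)
```

Counting splits only shows which features the model relied on. Methods such as SHAP additionally attribute each individual prediction to feature contributions, which is what supports the global and local explanations mentioned in the summary.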