EXPLAINABLE AI IN GRADIENT BOOSTING ALGORITHM FOR PHISHING WEBSITE DETECTION

As the demand for optimal AI performance has grown, complex models have been developed, leading to a lack of transparency in explaining prediction outcomes. In recent years, three complex gradient boosting methods based on decision trees, namely XGBoost, CatBoost, and LightGBM, have been proposed...

Bibliographic Details
Main Author: Rizqi Alfisyahrin, Alvin
Format: Final Project
Language: Indonesia
Online Access: https://digilib.itb.ac.id/gdl/view/78308
Institution: Institut Teknologi Bandung
id id-itb.:78308
timestamp 2023-09-18T23:04:40Z
keywords explainable ai, gradient boosting, hyperparameter tuning, phishing website
institution Institut Teknologi Bandung
building Institut Teknologi Bandung Library
continent Asia
country Indonesia
content_provider Institut Teknologi Bandung
collection Digital ITB
language Indonesia
description As demand for higher AI performance has grown, models have become more complex and, as a result, less transparent about how they arrive at their predictions. In recent years, three gradient boosting methods based on decision trees (XGBoost, CatBoost, and LightGBM) have been proposed, offering competitive performance with fast training times. In critical domains such as security, medicine, and finance, however, AI stakeholders increasingly require transparency, which has motivated the concept of Explainable AI (XAI). One pressing security concern is the proliferation of phishing websites. Experiments were conducted on three phishing website datasets using the three gradient boosting algorithms with hyperparameter tuning. On every dataset, the most accurate model was CatBoost tuned with Randomized Search. XAI methods were then applied to the selected dataset to provide global explanations (SHAP and PDP) and local explanations (LIME and Anchor). The results identified features that were consistently important to the model's predictions, including "length_url", "time_domain_activation", and "directory_length". Notably, a model using only these three of the original 90 features still maintained a high level of accuracy.
format Final Project
author Rizqi Alfisyahrin, Alvin
title EXPLAINABLE AI IN GRADIENT BOOSTING ALGORITHM FOR PHISHING WEBSITE DETECTION
url https://digilib.itb.ac.id/gdl/view/78308