EXPLAINABLE AI IN GRADIENT BOOSTING ALGORITHM FOR PHISHING WEBSITE DETECTION
As the demand for optimal AI performance has grown, complex models have been developed, leading to a lack of transparency in explaining prediction outcomes. In recent years, three complex gradient boosting methods based on decision trees, namely XGBoost, CatBoost, and LightGBM, have been proposed...
Saved in:
Main Author: Rizqi Alfisyahrin, Alvin
Format: Final Project
Language: Indonesian
Online Access: https://digilib.itb.ac.id/gdl/view/78308
Institution: Institut Teknologi Bandung
id: id-itb.:78308
keywords: explainable ai, gradient boosting, hyperparameter tuning, phishing website
institution: Institut Teknologi Bandung
building: Institut Teknologi Bandung Library
continent: Asia
country: Indonesia
content_provider: Institut Teknologi Bandung
collection: Digital ITB
language: Indonesian
description:
As the demand for optimal AI performance has grown, increasingly complex models have been developed, leading to a lack of transparency in explaining prediction outcomes. In recent years, three complex gradient boosting methods based on decision trees, namely XGBoost, CatBoost, and LightGBM, have been proposed, demonstrating competitive performance with fast training times. However, in critical contexts such as security, medicine, and finance, various AI stakeholders have called for greater transparency. One pressing security concern is the proliferation of phishing websites. Hence, a new concept called Explainable AI (XAI) has been introduced.
Experiments were conducted on three phishing website datasets with the three gradient boosting algorithms and hyperparameter tuning. These experiments showed that the model achieving the best accuracy on each dataset was CatBoost tuned with Randomized Search. XAI methods were then applied to the selected dataset to provide global (SHAP and PDP) and local (LIME and Anchor) explanations. The experimental results identified certain features that were consistently important to the model's predictions, including "length_url," "time_domain_activation," and "directory_length." Remarkably, even when only these three features out of the initial 90 were used, a high level of accuracy was maintained.
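As a rough illustration of the workflow the abstract describes, the sketch below tunes a CatBoostClassifier with scikit-learn's RandomizedSearchCV and then produces a global SHAP feature-importance summary. It is not the author's code: the dataset file name (phishing_dataset.csv), the "phishing" label column, and the parameter ranges are assumptions made only to keep the example self-contained.

```python
# Minimal sketch of the pipeline described above (not the author's code).
# Assumptions: a CSV file "phishing_dataset.csv" with numeric URL features
# (e.g. length_url, time_domain_activation, directory_length) and a binary
# "phishing" label column; illustrative CatBoost parameter ranges.
import pandas as pd
import shap
from catboost import CatBoostClassifier
from sklearn.model_selection import RandomizedSearchCV, train_test_split

df = pd.read_csv("phishing_dataset.csv")                 # hypothetical path
X, y = df.drop(columns=["phishing"]), df["phishing"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Hyperparameter tuning with Randomized Search over a small CatBoost space.
param_distributions = {
    "depth": [4, 6, 8, 10],
    "learning_rate": [0.01, 0.05, 0.1, 0.3],
    "iterations": [200, 500, 1000],
    "l2_leaf_reg": [1, 3, 5, 7],
}
search = RandomizedSearchCV(
    CatBoostClassifier(verbose=0, random_seed=42),
    param_distributions=param_distributions,
    n_iter=20,
    scoring="accuracy",
    cv=5,
    random_state=42,
)
search.fit(X_train, y_train)
best_model = search.best_estimator_
print("test accuracy:", best_model.score(X_test, y_test))

# Global explanation: SHAP values from the fitted tree ensemble, summarized
# as a bee-swarm plot that ranks features by their contribution.
explainer = shap.TreeExplainer(best_model)
shap_values = explainer.shap_values(X_test)
shap.summary_plot(shap_values, X_test)
```

Under the same assumptions, the reduced-feature result mentioned in the abstract could be checked by retraining on only length_url, time_domain_activation, and directory_length and comparing test accuracy; local explanations for individual URLs could likewise be layered on the fitted model with LIME or Anchor.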
format: Final Project
author: Rizqi Alfisyahrin, Alvin
title: EXPLAINABLE AI IN GRADIENT BOOSTING ALGORITHM FOR PHISHING WEBSITE DETECTION
url: https://digilib.itb.ac.id/gdl/view/78308