SHAPLEY ADDITIVE EXPLANATION (SHAP) AS FEATURE SELECTION FOR ACUTE ARTERY DISEASE CLASSIFICATION MODEL DEVELOPMENT
Main Author: | Afif Rizky A, M |
---|---|
Format: | Theses |
Language: | Indonesian |
Online Access: | https://digilib.itb.ac.id/gdl/view/86189 |
Institution: | Institut Teknologi Bandung |
id |
id-itb.:86189 |
---|---|
keywords |
Feature selection, Classification, Acute Artery Disease, Shapley Additive Explanation, SHAP |
timestamp |
2024-09-16T14:15:52Z |
institution |
Institut Teknologi Bandung |
building |
Institut Teknologi Bandung Library |
continent |
Asia |
country |
Indonesia |
content_provider |
Institut Teknologi Bandung |
collection |
Digital ITB |
description |
Coronary heart disease is one of the leading causes of death worldwide, so accurate
classification models for predicting it are important. However, coronary heart
disease datasets are often small and low-dimensional, and training a classifier on
every available feature increases the risk of overfitting. A suitable feature
selection method is therefore needed to retain only the most relevant features.
Several studies suggest that Shapley Additive Explanations (SHAP) has potential as
a feature selection method. This study aims to demonstrate that SHAP can serve as
a feature selection method for classification models built on small,
low-dimensional coronary heart disease data.
The experiment used a small, low-dimensional coronary heart disease dataset.
SHAP-based feature selection was compared with two alternatives: Principal
Component Analysis (PCA) and expert validation. Classification models were built
with the random forest algorithm, and their ability to predict coronary heart
disease was evaluated using the area under the ROC curve (ROC-AUC) and the area
under the precision-recall curve (AU-PRC). The dataset was split into training and
testing sets, and each model was tested in several experimental scenarios to
assess the consistency of SHAP as a feature selection method.
The results show that SHAP-based feature selection improves the performance of the
coronary heart disease classification model. After feature selection, ROC-AUC
increased from 0.91 to 0.94 and AU-PRC from 0.81 to 0.97, relative to the models
using PCA and features selected through expert validation. These findings indicate
that SHAP improves the accuracy and efficiency of random forest classification
models for coronary heart disease, making it a useful feature selection method for
low-dimensional datasets, particularly for coronary heart disease cases. |
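The abstract describes ranking features by their SHAP contribution and keeping only the most relevant ones. The record gives no implementation details, so the following is only a minimal sketch of that general idea. It computes exact Shapley values in pure Python for a hypothetical three-feature scoring function (`toy_model`, a stand-in for the random forest used in the thesis), ranks features by mean absolute Shapley value, and keeps the top two. The data, feature indices, weights, and the cut-off `top_k` are all assumptions made for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, background):
    """Exact Shapley values for one sample x.

    Features outside a coalition take their values from each background
    row in turn, and the predictions are averaged over those rows.
    """
    n = len(x)

    def value(coalition):
        total = 0.0
        for b in background:
            z = [x[j] if j in coalition else b[j] for j in range(n)]
            total += predict(z)
        return total / len(background)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def rank_features(predict, samples, background):
    """Rank feature indices by mean absolute Shapley value (largest first)."""
    n = len(samples[0])
    importance = [0.0] * n
    for x in samples:
        phi = shapley_values(predict, x, background)
        for j in range(n):
            importance[j] += abs(phi[j]) / len(samples)
    order = sorted(range(n), key=lambda j: importance[j], reverse=True)
    return order, importance


def toy_model(z):
    # Hypothetical scoring function; feature 1 is deliberately ignored.
    return 2.0 * z[0] + 0.0 * z[1] + 0.5 * z[2]


# Hypothetical data: four samples, three features.
samples = [[0, 5, 1], [1, 3, 0], [0, 1, 1], [1, 7, 0]]

order, importance = rank_features(toy_model, samples, samples)
top_k = 2  # assumed cut-off; the thesis's selection rule is not given here
selected = order[:top_k]
print("ranking:", order)
print("selected features:", selected)
```

Exact enumeration grows exponentially with the number of features, so it is practical only for low-dimensional data like the kind the abstract describes. For tree ensembles, dedicated SHAP implementations are normally used instead.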
author |
Afif Rizky A, M |