Exploring the similarities of XAI approaches in finding the influential factors in lung cancer survival rate
Lung cancer is the most common cancer and the leading cause of cancer-related deaths globally. While Artificial Intelligence (AI) offers promise in early detection of high-risk patients and providing treatment decision support, the black-box nature of AI models raises trust concerns. Therefore, eXpl...
Main Author: | Tan, Elise Zining |
---|---|
Other Authors: | Liu Siyuan |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2024 |
Subjects: | Computer and Information Science; Explainable artificial intelligence |
Online Access: | https://hdl.handle.net/10356/175082 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-175082 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-175082 2024-04-19T15:42:09Z Exploring the similarities of XAI approaches in finding the influential factors in lung cancer survival rate Tan, Elise Zining Liu Siyuan School of Computer Science and Engineering SYLiu@ntu.edu.sg Computer and Information Science Explainable artificial intelligence Lung cancer is the most common cancer and the leading cause of cancer-related deaths globally. While Artificial Intelligence (AI) offers promise in early detection of high-risk patients and providing treatment decision support, the black-box nature of AI models raises trust concerns. Therefore, eXplainable Artificial Intelligence (XAI) is important to explain models and identify critical factors affecting lung cancer survival. In addition, real datasets frequently contain missing values due to diverse reasons. In this work, the Simulacrum dataset was used, and different statistical and machine learning imputation methods like Mean-Mode, K-Nearest Neighbors (KNN), MissForest were used to impute the missing data. An innovative neural network imputation method was proposed to evaluate whether the sequence of imputing the features with missing values matters. Neural network models were then trained on the imputed datasets to predict lung cancer patient survival. The results revealed that the models trained on imputed datasets obtained from machine learning imputation methods had better performance and the sequence in which features were imputed had minimal impact on the performance. For interpretability, different XAI approaches were deployed, namely, Expected Gradients (EG), SHapley Additive exPlanations (SHAP) and Local Interpretable Model-Agnostic Explanations (LIME). The analysis revealed that important features were related to the tumor condition, age and dose administration, largely unaffected by different imputation methods. Pairwise similarity metrics such as Jaccard Similarity, Pearson Correlation Coefficient and Cosine Similarity were used to compare the XAI approaches. EG and SHAP exhibited the highest similarity to each other, though there were some notable inconsistencies between their explanations. This paper underscores the usefulness of XAI for medical professionals in understanding model decisions and emphasizes the importance of cross-validating explanations from various XAI approaches to evaluate model reliability. Bachelor's degree 2024-04-19T04:27:16Z 2024-04-19T04:27:16Z 2024 Final Year Project (FYP) Tan, E. Z. (2024). Exploring the similarities of XAI approaches in finding the influential factors in lung cancer survival rate. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/175082 https://hdl.handle.net/10356/175082 en SCSE23-0510 application/pdf Nanyang Technological University |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Computer and Information Science; Explainable artificial intelligence |
spellingShingle |
Computer and Information Science Explainable artificial intelligence Tan, Elise Zining Exploring the similarities of XAI approaches in finding the influential factors in lung cancer survival rate |
description |
Lung cancer is the most common cancer and the leading cause of cancer-related deaths globally. While Artificial Intelligence (AI) offers promise in the early detection of high-risk patients and in treatment decision support, the black-box nature of AI models raises trust concerns. Therefore, eXplainable Artificial Intelligence (XAI) is important for explaining models and identifying critical factors affecting lung cancer survival. In addition, real datasets frequently contain missing values for diverse reasons. In this work, the Simulacrum dataset was used, and different statistical and machine learning imputation methods, such as Mean-Mode, K-Nearest Neighbors (KNN), and MissForest, were used to impute the missing data. An innovative neural network imputation method was proposed to evaluate whether the sequence in which features with missing values are imputed matters. Neural network models were then trained on the imputed datasets to predict lung cancer patient survival. The results revealed that models trained on datasets imputed with machine learning methods performed better, and that the sequence in which features were imputed had minimal impact on performance. For interpretability, three XAI approaches were deployed: Expected Gradients (EG), SHapley Additive exPlanations (SHAP), and Local Interpretable Model-Agnostic Explanations (LIME). The analysis revealed that the important features were related to tumor condition, age, and dose administration, and were largely unaffected by the choice of imputation method. Pairwise similarity metrics, namely Jaccard Similarity, Pearson Correlation Coefficient, and Cosine Similarity, were used to compare the XAI approaches. EG and SHAP exhibited the highest similarity to each other, though there were some notable inconsistencies between their explanations. This paper underscores the usefulness of XAI for medical professionals in understanding model decisions and emphasizes the importance of cross-validating explanations from various XAI approaches to evaluate model reliability. |
author2 |
Liu Siyuan |
author_facet |
Liu Siyuan; Tan, Elise Zining |
format |
Final Year Project |
author |
Tan, Elise Zining |
author_sort |
Tan, Elise Zining |
title |
Exploring the similarities of XAI approaches in finding the influential factors in lung cancer survival rate |
publisher |
Nanyang Technological University |
publishDate |
2024 |
url |
https://hdl.handle.net/10356/175082 |
_version_ |
1800916339917848576 |
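
The description in this record mentions comparing XAI approaches pairwise with Jaccard Similarity, Pearson Correlation Coefficient and Cosine Similarity. The following is a minimal sketch of how such a comparison could look, not the project's actual code. It assumes each approach yields one attribution score per feature; the feature names, the attribution values, the choice of `k`, and the helper names are all hypothetical and serve illustration only.

```python
import math


def top_k_features(attributions, k):
    """Return the set of the k feature names with the largest absolute attribution."""
    ranked = sorted(attributions, key=lambda name: abs(attributions[name]), reverse=True)
    return set(ranked[:k])


def jaccard_similarity(set_a, set_b):
    """Size of the intersection divided by size of the union of two sets."""
    union = set_a | set_b
    if not union:
        return 1.0  # convention chosen here: two empty sets count as identical
    return len(set_a & set_b) / len(union)


def pearson_correlation(x, y):
    """Pearson correlation coefficient between two equal-length score vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("Pearson correlation is undefined for a constant vector")
    return cov / math.sqrt(var_x * var_y)


def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length score vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("Cosine similarity is undefined for a zero vector")
    return dot / (norm_x * norm_y)


# Hypothetical attribution scores (made-up numbers, not results from the project).
features = ["tumour_stage", "age", "dose", "sex"]
eg = {"tumour_stage": 0.40, "age": 0.30, "dose": 0.20, "sex": 0.10}
shap_scores = {"tumour_stage": 0.35, "age": 0.25, "dose": 0.30, "sex": 0.10}

eg_vector = [eg[name] for name in features]
shap_vector = [shap_scores[name] for name in features]

print("Jaccard (top-2):", jaccard_similarity(top_k_features(eg, 2), top_k_features(shap_scores, 2)))
print("Pearson:", pearson_correlation(eg_vector, shap_vector))
print("Cosine:", cosine_similarity(eg_vector, shap_vector))
```

Jaccard similarity here compares only which features land in the top-k set, while Pearson and cosine compare the full score vectors, so the three metrics can disagree for the same pair of explanations.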