Exploring the similarities of XAI approaches in finding the influential factors in lung cancer survival rate

Lung cancer is the most common cancer and the leading cause of cancer-related deaths globally. While Artificial Intelligence (AI) offers promise in early detection of high-risk patients and providing treatment decision support, the black-box nature of AI models raises trust concerns. Therefore, eXpl...

Bibliographic Details
Main Author: Tan, Elise Zining
Other Authors: Liu Siyuan
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2024
Subjects: Computer and Information Science; Explainable artificial intelligence
Online Access: https://hdl.handle.net/10356/175082
Institution: Nanyang Technological University
id sg-ntu-dr.10356-175082
record_format dspace
spelling sg-ntu-dr.10356-175082
2024-04-19T15:42:09Z
Tan, Elise Zining
Liu Siyuan
School of Computer Science and Engineering
SYLiu@ntu.edu.sg
Computer and Information Science
Explainable artificial intelligence
Bachelor's degree
2024-04-19T04:27:16Z
2024
Final Year Project (FYP)
Tan, E. Z. (2024). Exploring the similarities of XAI approaches in finding the influential factors in lung cancer survival rate. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/175082
en
SCSE23-0510
application/pdf
Nanyang Technological University
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Computer and Information Science
Explainable artificial intelligence
description Lung cancer is the most common cancer and the leading cause of cancer-related deaths globally. While Artificial Intelligence (AI) offers promise in the early detection of high-risk patients and in treatment decision support, the black-box nature of AI models raises trust concerns. eXplainable Artificial Intelligence (XAI) is therefore important for explaining models and identifying critical factors affecting lung cancer survival. In addition, real datasets frequently contain missing values for diverse reasons. In this work, the Simulacrum dataset was used, and statistical and machine learning imputation methods such as Mean-Mode, K-Nearest Neighbors (KNN), and MissForest were used to impute the missing data. An innovative neural network imputation method was proposed to evaluate whether the order in which features with missing values are imputed matters. Neural network models were then trained on the imputed datasets to predict lung cancer patient survival. The results showed that models trained on datasets imputed with machine learning methods performed better, and that the order in which features were imputed had minimal impact on performance. For interpretability, three XAI approaches were deployed: Expected Gradients (EG), SHapley Additive exPlanations (SHAP), and Local Interpretable Model-Agnostic Explanations (LIME). The analysis revealed that the important features were related to tumor condition, age, and dose administration, and that these were largely unaffected by the choice of imputation method. Pairwise similarity metrics such as Jaccard Similarity, Pearson Correlation Coefficient, and Cosine Similarity were used to compare the XAI approaches. EG and SHAP exhibited the highest similarity to each other, though there were some notable inconsistencies between their explanations. This paper underscores the usefulness of XAI for medical professionals in understanding model decisions and emphasizes the importance of cross-validating explanations from various XAI approaches to evaluate model reliability.
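
The pairwise comparison of XAI approaches mentioned in the description can be illustrated with a minimal Python sketch. It is not taken from the project itself: the function names, the choice to compute Jaccard similarity on the top-k features ranked by absolute attribution, and the feature names and attribution values in the usage note are all assumptions made for illustration only.

```python
import math


def top_k(attributions, k):
    """Return the set of the k feature names with the largest absolute attribution."""
    ranked = sorted(attributions, key=lambda name: abs(attributions[name]), reverse=True)
    return set(ranked[:k])


def jaccard(a, b):
    """Jaccard similarity of two sets: |A intersect B| / |A union B|."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("Pearson correlation is undefined for a constant vector")
    return cov / (sd_x * sd_y)


def cosine(x, y):
    """Cosine similarity of two equal-length sequences."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("Cosine similarity is undefined for a zero vector")
    return dot / (norm_x * norm_y)


def compare(attr_a, attr_b, k=3):
    """Compare two feature-attribution dicts that share the same feature names."""
    features = sorted(attr_a)
    if features != sorted(attr_b):
        raise ValueError("Both explanations must cover the same features")
    x = [attr_a[f] for f in features]
    y = [attr_b[f] for f in features]
    return {
        "jaccard_top_k": jaccard(top_k(attr_a, k), top_k(attr_b, k)),
        "pearson": pearson(x, y),
        "cosine": cosine(x, y),
    }
```

As a usage example with made-up numbers (not results from the project), `compare({"tumour_size": 0.40, "age": 0.25, "dose": 0.20, "sex": 0.05}, {"tumour_size": 0.35, "age": 0.10, "dose": 0.30, "sex": 0.15}, k=2)` returns one score per metric for a hypothetical pair of explanations. Jaccard looks only at whether the top-ranked feature sets overlap, while Pearson and cosine compare the full attribution vectors, so the three scores can disagree.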
author2 Liu Siyuan
author_facet Liu Siyuan
Tan, Elise Zining
format Final Year Project
author Tan, Elise Zining
author_sort Tan, Elise Zining
title Exploring the similarities of XAI approaches in finding the influential factors in lung cancer survival rate
publisher Nanyang Technological University
publishDate 2024
url https://hdl.handle.net/10356/175082