Exploring the similarities of XAI approaches in finding the influential factors in lung cancer survival rate

Lung cancer is the most common cancer and the leading cause of cancer-related deaths globally. While Artificial Intelligence (AI) offers promise in early detection of high-risk patients and providing treatment decision support, the black-box nature of AI models raises trust concerns. Therefore, eXpl...

Bibliographic Details
Main Author:	Tan, Elise Zining
Other Authors:	Liu Siyuan
Format:	Final Year Project
Language:	English
Published:	Nanyang Technological University 2024
Subjects:	Computer and Information Science Explainable artificial intelligence
Online Access:	https://hdl.handle.net/10356/175082
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-175082
record_format	dspace
spelling	sg-ntu-dr.10356-175082 2024-04-19T15:42:09Z Exploring the similarities of XAI approaches in finding the influential factors in lung cancer survival rate Tan, Elise Zining Liu Siyuan School of Computer Science and Engineering SYLiu@ntu.edu.sg Computer and Information Science Explainable artificial intelligence Bachelor's degree 2024-04-19T04:27:16Z 2024-04-19T04:27:16Z 2024 Final Year Project (FYP) Tan, E. Z. (2024). Exploring the similarities of XAI approaches in finding the influential factors in lung cancer survival rate. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/175082 https://hdl.handle.net/10356/175082 en SCSE23-0510 application/pdf Nanyang Technological University
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	Computer and Information Science Explainable artificial intelligence
spellingShingle	Computer and Information Science Explainable artificial intelligence Tan, Elise Zining Exploring the similarities of XAI approaches in finding the influential factors in lung cancer survival rate
description	Lung cancer is the most common cancer and the leading cause of cancer-related deaths globally. While Artificial Intelligence (AI) offers promise in early detection of high-risk patients and in providing treatment decision support, the black-box nature of AI models raises trust concerns. Therefore, eXplainable Artificial Intelligence (XAI) is important to explain models and identify critical factors affecting lung cancer survival. In addition, real datasets frequently contain missing values for various reasons. In this work, the Simulacrum dataset was used, and different statistical and machine learning imputation methods such as Mean-Mode, K-Nearest Neighbors (KNN) and MissForest were used to impute the missing data. An innovative neural network imputation method was proposed to evaluate whether the sequence in which features with missing values are imputed matters. Neural network models were then trained on the imputed datasets to predict lung cancer patient survival. The results revealed that the models trained on datasets imputed with machine learning methods performed better, and that the sequence in which features were imputed had minimal impact on performance. For interpretability, different XAI approaches were deployed, namely Expected Gradients (EG), SHapley Additive exPlanations (SHAP) and Local Interpretable Model-Agnostic Explanations (LIME). The analysis revealed that the important features were related to tumor condition, age and dose administration, and were largely unaffected by the choice of imputation method. Pairwise similarity metrics such as Jaccard Similarity, Pearson Correlation Coefficient and Cosine Similarity were used to compare the XAI approaches. EG and SHAP exhibited the highest similarity to each other, though there were some notable inconsistencies between their explanations. This paper underscores the usefulness of XAI for medical professionals in understanding model decisions and emphasizes the importance of cross-validating explanations from various XAI approaches to evaluate model reliability.
author2	Liu Siyuan
author_facet	Liu Siyuan Tan, Elise Zining
format	Final Year Project
author	Tan, Elise Zining
author_sort	Tan, Elise Zining
title	Exploring the similarities of XAI approaches in finding the influential factors in lung cancer survival rate
title_short	Exploring the similarities of XAI approaches in finding the influential factors in lung cancer survival rate
title_full	Exploring the similarities of XAI approaches in finding the influential factors in lung cancer survival rate
title_fullStr	Exploring the similarities of XAI approaches in finding the influential factors in lung cancer survival rate
title_full_unstemmed	Exploring the similarities of XAI approaches in finding the influential factors in lung cancer survival rate
title_sort	exploring the similarities of xai approaches in finding the influential factors in lung cancer survival rate
publisher	Nanyang Technological University
publishDate	2024
url	https://hdl.handle.net/10356/175082
_version_	1800916339917848576
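
Illustrative note: the description above names three pairwise similarity metrics (Jaccard Similarity, Pearson Correlation Coefficient and Cosine Similarity) for comparing XAI approaches. The record does not say how the project computed them, so the following is only a minimal sketch under stated assumptions: each XAI method is assumed to yield one attribution score per feature, Jaccard is assumed to be taken over the top-k features ranked by absolute attribution, and the feature names and scores are hypothetical, not taken from the project.

```python
import math


def top_k_features(attributions, k):
    """Return the k feature names with the largest absolute attribution."""
    ranked = sorted(attributions, key=lambda name: abs(attributions[name]), reverse=True)
    return set(ranked[:k])


def jaccard_similarity(set_a, set_b):
    """Size of the intersection divided by size of the union."""
    union = set_a | set_b
    if not union:
        return 1.0  # convention chosen here: two empty sets count as identical
    return len(set_a & set_b) / len(union)


def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length score vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_x * norm_y)


def pearson_correlation(x, y):
    """Pearson r, computed as the cosine similarity of mean-centred vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return cosine_similarity([a - mean_x for a in x], [b - mean_y for b in y])


# Hypothetical attribution scores for two XAI methods (illustrative values only).
eg_scores = {"tumor_stage": 0.40, "age": 0.25, "dose": 0.20, "sex": 0.05}
shap_scores = {"tumor_stage": 0.35, "age": 0.30, "dose": 0.10, "sex": 0.15}

# Align both score vectors on the same feature order before comparing them.
features = sorted(eg_scores)
eg_vector = [eg_scores[name] for name in features]
shap_vector = [shap_scores[name] for name in features]

top3_jaccard = jaccard_similarity(top_k_features(eg_scores, 3), top_k_features(shap_scores, 3))
print("Jaccard (top 3):", top3_jaccard)
print("Pearson:", pearson_correlation(eg_vector, shap_vector))
print("Cosine:", cosine_similarity(eg_vector, shap_vector))
```

Jaccard compares only which features rank highest, while Pearson and cosine compare the full score vectors, so the three metrics can disagree for the same pair of explanations.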
