Counterfactual explanations for machine learning models on heterogeneous data


Bibliographic Details
Main Author: Wang, Yongjie
Other Authors: Miao Chun Yan
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University, 2023
Subjects: Engineering::Computer science and engineering
Online Access: https://hdl.handle.net/10356/169968
DOI: 10.32657/10356/169968
License: Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
Citation: Wang, Y. (2023). Counterfactual explanations for machine learning models on heterogeneous data. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/169968
Institution: Nanyang Technological University
Full description

Counterfactual explanation aims to identify the minimal and meaningful changes required in an input instance to produce a different prediction from a given model. Counterfactual explanations can help users understand the model's current prediction, detect model unfairness, and provide actionable recommendations to users who receive an undesired prediction. Consequently, counterfactual explanations have diverse applications in fields such as education, finance, marketing, and healthcare.

The counterfactual explanation problem is formulated as a constrained optimization problem whose goal is to minimize the cost between the input and the counterfactual explanation subject to certain constraints. Existing research has mainly focused on two areas: incorporating practical constraints and introducing various solving methods. However, counterfactual explanations are still far from practical deployment. In this thesis, we improve on this problem from the angles of trust, actionability, and safety, making counterfactual explanations more deployable.
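For concreteness, a common way to write this optimization (a generic sketch in the style of the standard counterfactual literature; the thesis's exact cost function and constraint set may differ) is

    \min_{x'} \; \mathrm{cost}(x, x')
    \quad \text{subject to} \quad f(x') = y' \neq f(x), \quad x' \in \mathcal{A},

where $x$ is the input instance, $f$ is the model, $y'$ is the desired outcome, and $\mathcal{A}$ denotes the set of plausible, actionable instances.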
One goal of counterfactual explanations is to seek action suggestions from the model. However, commonly used models such as ensembles and neural networks are black boxes with poor trustworthiness. Explaining the model can improve its trustworthiness; yet global explanations are too general to apply to every instance, while examining all local explanations one by one is a burden. We therefore propose a group-level summarization method that finds $k$ groups in a feature importance matrix, where each group is summarized by its distinct top-$l$ important features. This provides a compact summary that makes the model easier to understand and inspect.
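To illustrate the group-level idea (a hypothetical sketch, not the thesis's actual algorithm: here the rows of a local feature-importance matrix are clustered with k-means, and each group is described by its top-$l$ mean-importance features; all names are illustrative):

    # imp: (n_instances x n_features) local importance matrix, e.g. from SHAP.
    import numpy as np
    from sklearn.cluster import KMeans

    def summarize_importances(imp, feature_names, k=3, l=2):
        # Assign each instance's importance vector to one of k groups.
        groups = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(imp)
        summary = {}
        for g in range(k):
            # Summarize the group by its l features of highest mean importance.
            mean_imp = imp[groups == g].mean(axis=0)
            top = np.argsort(mean_imp)[::-1][:l]
            summary[g] = [feature_names[i] for i in top]
        return summary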
In real-life applications, it is difficult to compare changes across heterogeneous features with a single scalar cost function. Moreover, existing methods do not support interactive exploration by users. To address these issues, we propose a skyline method that treats the change in each incomparable feature as a separate objective to minimize and finds a set of non-dominated counterfactual explanations. Users can then interactively refine their requirements over this non-dominated set. Our experiments demonstrate that our method provides superior results compared with state-of-the-art methods.
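The skyline notion can be illustrated with a standard Pareto-dominance filter (a minimal sketch under assumed inputs; function and variable names are hypothetical): a candidate counterfactual is kept only if no other candidate changes every feature at most as much and at least one feature strictly less.

    import numpy as np

    def skyline(costs):
        # costs: (n_candidates x n_objectives), one per-feature change
        # magnitude per column; smaller is better in every column.
        costs = np.asarray(costs)
        keep = []
        for i, c in enumerate(costs):
            dominated = any(
                np.all(costs[j] <= c) and np.any(costs[j] < c)
                for j in range(len(costs)) if j != i
            )
            if not dominated:
                keep.append(i)
        return keep  # indices of non-dominated candidates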
Model security and privacy are critical concerns for model owners who want to deploy a counterfactual explanation service, yet these issues have received little attention in the literature. To address this gap, we propose an efficient and effective attack that extracts the pretrained model through counterfactual explanations (CFs). Specifically, our method treats CFs as ordinary queries to obtain counterfactual explanations of counterfactual explanations (CCFs), and then trains a substitute model on pairs of CFs and CCFs. Experiments show that our approach obtains a substitute model with higher agreement.
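A minimal sketch of such an attack loop, assuming a victim service that exposes a prediction endpoint and an explanation endpoint (query_predict and query_cf are hypothetical interfaces, and the substitute architecture is an arbitrary choice, not the one used in the thesis):

    import numpy as np
    from sklearn.neural_network import MLPClassifier

    def extract(query_predict, query_cf, seeds):
        X, y = [], []
        for x in seeds:
            cf = query_cf(x)    # CF of the seed crosses the decision boundary
            ccf = query_cf(cf)  # CF of the CF crosses back again
            for q in (x, cf, ccf):
                X.append(q)
                y.append(query_predict(q))
        # Fit a substitute model on the boundary-straddling query pairs.
        sub = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=500)
        sub.fit(np.array(X), np.array(y))
        return sub

Because CFs and CCFs lie close to the decision boundary on opposite sides, such pairs are especially informative training points for the substitute.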
In summary, our research helps to bridge the gap between the theoretical understanding and the practical use of counterfactual explanations, and it provides valuable insights for researchers and practitioners in various domains.