Counterfactual explanations for machine learning models on heterogeneous data
| Main Author: | Wang, Yongjie |
|---|---|
| Other Authors: | Miao Chun Yan |
| Format: | Thesis-Doctor of Philosophy |
| Language: | English |
| Published: | Nanyang Technological University, 2023 |
| Subjects: | Engineering::Computer science and engineering |
| Online Access: | https://hdl.handle.net/10356/169968 |
| Institution: | Nanyang Technological University |
| DOI: | 10.32657/10356/169968 |
| License: | Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) |
| Citation: | Wang, Y. (2023). Counterfactual explanations for machine learning models on heterogeneous data. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/169968 |
Description:
Counterfactual explanation aims to identify minimal and meaningful changes required in an input instance to produce a different prediction from a given model. Counterfactual explanations can assist users in comprehending the model's current prediction, detecting model unfairness, and providing actionable recommendations for users who receive an undesired prediction. Consequently, counterfactual explanations have diverse applications in fields such as education, finance, marketing, and healthcare.
The counterfactual explanation problem is formulated as a constrained optimization problem: minimize the cost between the input and the counterfactual explanation, subject to certain constraints. Existing research has mainly focused on two areas: incorporating practical constraints and introducing new solving methods. However, counterfactual explanations are still far from practical deployment. In this thesis, we improve this problem from the angles of trust, actionability, and safety, making counterfactual explanations more deployable.
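In its generic form (our notation; the thesis may define the terms differently), this optimization can be written as

$$ \min_{x'} \ \mathrm{cost}(x, x') \quad \text{subject to} \quad f(x') = y', \quad x' \in \mathcal{X}, $$

where $x$ is the input instance, $f$ the model, $y'$ the desired prediction, $\mathrm{cost}(\cdot,\cdot)$ the dissimilarity to be minimized, and $\mathcal{X}$ a feasible set encoding practical constraints such as immutable features.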
One goal of counterfactual explanations is to seek action suggestions from the model. However, commonly used models such as ensembles and neural networks are black boxes with poor trustworthiness. Explaining the model can improve its trustworthiness; yet global explanations are too general to apply to all instances, and examining local explanations one by one is a burden. Therefore, we propose a group-level summarization method that finds $k$ groups, where each group is summarized by its distinct top-$l$ important features drawn from a feature importance matrix. This compact summary makes the model easier to understand and inspect.
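One plausible reading of this group-level summary (an illustrative sketch only; the thesis's grouping objective and algorithm may differ) is to cluster the rows of the instance-by-feature importance matrix and report each cluster's top-$l$ features:

```python
# Illustrative sketch, not the thesis's algorithm: cluster local
# feature-importance vectors into k groups, then summarize each group
# by the l features with the highest mean importance.
import numpy as np
from sklearn.cluster import KMeans

def summarize_importance(importance: np.ndarray, k: int, l: int):
    """importance: (n_instances, n_features) matrix of local importances."""
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(importance)
    summaries = []
    for g in range(k):
        mean_imp = importance[labels == g].mean(axis=0)
        top_l = np.argsort(mean_imp)[::-1][:l]  # l most important feature indices
        summaries.append((g, top_l.tolist()))
    return labels, summaries
```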
In real-life applications, it is difficult to compare changes in heterogeneous features with a single scalar cost function. Moreover, existing methods do not support interactive exploration by users. To address these issues, we propose a skyline method that treats the change in each incomparable feature as a separate objective to minimize and finds a set of non-dominated counterfactual explanations. Users can interactively refine their requirements over this non-dominated set. Our experiments demonstrate that our method provides superior results compared to state-of-the-art methods.
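The skyline (Pareto front) keeps a candidate only if no other candidate changes every feature at least as cheaply and at least one feature strictly more cheaply. A minimal sketch of that filter (function and variable names are ours, not the thesis's):

```python
import numpy as np

def dominates(a: np.ndarray, b: np.ndarray) -> bool:
    """a dominates b: no worse on every objective, strictly better on one."""
    return bool(np.all(a <= b) and np.any(a < b))

def skyline(costs: np.ndarray) -> np.ndarray:
    """costs: (n_candidates, n_objectives) per-feature change costs to minimize.
    Returns indices of the non-dominated (skyline) candidates."""
    keep = [i for i in range(len(costs))
            if not any(dominates(costs[j], costs[i])
                       for j in range(len(costs)) if j != i)]
    return np.asarray(keep)

# e.g. skyline(np.array([[1, 3], [2, 2], [2, 4]])) -> [0, 1]; the third
# candidate is dominated by the first on both objectives.
```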
Model security and privacy are critical concerns for model owners who want to deploy a counterfactual explanation service, yet these issues have received little attention in the literature. To address this gap, we propose an efficient and effective attack that can extract the pretrained model through counterfactual explanations (CFs). Specifically, our method treats CFs as ordinary queries to obtain counterfactual explanations of counterfactual explanations (CCFs), and then trains a substitute model using pairs of CFs and CCFs. Experiments reveal that our approach obtains a substitute model with higher agreement with the target model.
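A minimal sketch of this query loop, under assumptions of ours: a black-box `explain` endpoint returning one counterfactual per query, a simple MLP substitute, and opposite-class labels for CF/CCF pairs (since each counterfactual crosses the decision boundary of the instance it explains):

```python
# Sketch only: the endpoint name, substitute architecture, and labeling
# scheme are our assumptions, not the thesis's exact attack.
import numpy as np
from sklearn.neural_network import MLPClassifier

def extract_substitute(explain, seeds: np.ndarray) -> MLPClassifier:
    """explain: callable mapping one instance to one counterfactual of it."""
    cfs = np.array([explain(x) for x in seeds])    # CFs of the seed queries
    ccfs = np.array([explain(cf) for cf in cfs])   # CCFs: counterfactuals of CFs
    # CFs and CCFs lie on opposite sides of the decision boundary, so label
    # them with opposite classes and fit the substitute on these tight pairs.
    X = np.vstack([cfs, ccfs])
    y = np.concatenate([np.ones(len(cfs)), np.zeros(len(ccfs))])
    return MLPClassifier(hidden_layer_sizes=(32,), max_iter=500).fit(X, y)
```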
In summary, our research helps bridge the gap between the theoretical understanding and the practical use of counterfactual explanations, and provides valuable insights for researchers and practitioners in various domains.