Exploiting the relationship between Kendall’s rank correlation and cosine similarity for attribution protection
Model attributions are important in deep neural networks as they aid practitioners in understanding the models, but recent studies reveal that attributions can be easily perturbed by adding imperceptible noise to the input. The non-differentiable Kendall's rank correlation is a key performance index for attribution protection. In this paper, we first show that the expected Kendall's rank correlation is positively correlated to cosine similarity and then indicate that the direction of attribution is the key to attribution robustness. Based on these findings, we explore the vector space of attribution to explain the shortcomings of attribution defense methods using $\ell_p$ norm and propose integrated gradient regularizer (IGR), which maximizes the cosine similarity between natural and perturbed attributions. Our analysis further exposes that IGR encourages neurons with the same activation states for natural samples and the corresponding perturbed samples, which is shown to induce robustness to gradient-based attribution methods. Our experiments on different models and datasets confirm our analysis on attribution protection and demonstrate a decent improvement in adversarial robustness.
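The abstract's central claim, that the expected Kendall's rank correlation between natural and perturbed attributions moves together with their cosine similarity, is easy to illustrate numerically. The sketch below is not from the paper; it uses random vectors as stand-ins for attribution maps and assumes NumPy and SciPy are installed.

```python
# Toy illustration (not the paper's experiment): as noise grows, cosine similarity
# and Kendall's tau between a vector and its perturbed copy fall together.
import numpy as np
from scipy.stats import kendalltau

rng = np.random.default_rng(0)
attr = rng.standard_normal(200)  # stand-in for a flattened natural attribution map

for noise in (0.1, 0.5, 1.0, 2.0):
    perturbed = attr + noise * rng.standard_normal(attr.shape)  # stand-in for a perturbed attribution
    cos = attr @ perturbed / (np.linalg.norm(attr) * np.linalg.norm(perturbed))
    tau, _ = kendalltau(attr, perturbed)
    print(f"noise={noise:.1f}  cosine={cos:+.3f}  kendall_tau={tau:+.3f}")
```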
| Main Authors: | Wang, Fan; Kong, Adams Wai Kin |
|---|---|
| Other Authors: | School of Computer Science and Engineering |
| Format: | Conference or Workshop Item |
| Language: | English |
| Published: | 2022 |
| Subjects: | Computer Science - Learning; Computer Science - Artificial Intelligence; Artificial Intelligence; Neural Networks |
| Online Access: | https://hdl.handle.net/10356/161935 https://nips.cc/ |
| Institution: | Nanyang Technological University |
id: sg-ntu-dr.10356-161935
record_format: dspace
spelling: sg-ntu-dr.10356-161935 2024-05-28T07:35:40Z. Exploiting the relationship between Kendall’s rank correlation and cosine similarity for attribution protection. Wang, Fan; Kong, Adams Wai Kin. School of Computer Science and Engineering; Interdisciplinary Graduate School (IGS); Rapid-Rich Object Search (ROSE) Lab. Thirty-Sixth Conference on Neural Information Processing Systems (NeurIPS 2022). Computer Science - Learning; Computer Science - Artificial Intelligence; Artificial Intelligence; Neural Networks. Model attributions are important in deep neural networks as they aid practitioners in understanding the models, but recent studies reveal that attributions can be easily perturbed by adding imperceptible noise to the input. The non-differentiable Kendall's rank correlation is a key performance index for attribution protection. In this paper, we first show that the expected Kendall's rank correlation is positively correlated to cosine similarity and then indicate that the direction of attribution is the key to attribution robustness. Based on these findings, we explore the vector space of attribution to explain the shortcomings of attribution defense methods using $\ell_p$ norm and propose integrated gradient regularizer (IGR), which maximizes the cosine similarity between natural and perturbed attributions. Our analysis further exposes that IGR encourages neurons with the same activation states for natural samples and the corresponding perturbed samples, which is shown to induce robustness to gradient-based attribution methods. Our experiments on different models and datasets confirm our analysis on attribution protection and demonstrate a decent improvement in adversarial robustness. Ministry of Education (MOE). Submitted/Accepted version. This work is partially supported by the Ministry of Education, Singapore through Academic Research Fund Tier 1, RG73/21. 2022-12-19T05:25:41Z 2022-12-19T05:25:41Z 2022. Conference Paper. Wang, F. & Kong, A. W. K. (2022). Exploiting the relationship between Kendall’s rank correlation and cosine similarity for attribution protection. Thirty-Sixth Conference on Neural Information Processing Systems (NeurIPS 2022). https://hdl.handle.net/10356/161935 https://nips.cc/ en. RG73/21. © 2022 The Author(s). All rights reserved. This paper was published in the Proceedings of Thirty-Sixth Conference on Neural Information Processing Systems (NeurIPS 2022) and is made available with permission of The Author(s). application/pdf
institution: Nanyang Technological University
building: NTU Library
continent: Asia
country: Singapore
content_provider: NTU Library
collection: DR-NTU
language: English
topic: Computer Science - Learning; Computer Science - Artificial Intelligence; Artificial Intelligence; Neural Networks
spellingShingle: Computer Science - Learning; Computer Science - Artificial Intelligence; Artificial Intelligence; Neural Networks; Wang, Fan; Kong, Adams Wai Kin; Exploiting the relationship between Kendall’s rank correlation and cosine similarity for attribution protection
description:
Model attributions are important in deep neural networks as they aid
practitioners in understanding the models, but recent studies reveal that
attributions can be easily perturbed by adding imperceptible noise to the
input. The non-differentiable Kendall's rank correlation is a key performance
index for attribution protection. In this paper, we first show that the
expected Kendall's rank correlation is positively correlated to cosine
similarity and then indicate that the direction of attribution is the key to
attribution robustness. Based on these findings, we explore the vector space of
attribution to explain the shortcomings of attribution defense methods using
$\ell_p$ norm and propose integrated gradient regularizer (IGR), which
maximizes the cosine similarity between natural and perturbed attributions. Our
analysis further exposes that IGR encourages neurons with the same activation
states for natural samples and the corresponding perturbed samples, which is
shown to induce robustness to gradient-based attribution methods. Our
experiments on different models and datasets confirm our analysis on
attribution protection and demonstrate a decent improvement in adversarial
robustness.
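The regularizer described above can be sketched in code. The snippet below is a minimal, hypothetical PyTorch rendering of the idea, not the authors' implementation: it approximates integrated-gradients attributions for a natural and a perturbed input and adds a penalty of one minus their cosine similarity to the classification loss. The names `integrated_gradients`, `igr_loss`, and `lambda_igr`, the step count, and the toy model are illustrative assumptions.

```python
# A hedged sketch of an IGR-style loss term, assuming PyTorch.
import torch
import torch.nn as nn
import torch.nn.functional as F

def integrated_gradients(model, x, target, baseline=None, steps=16):
    """Approximate integrated gradients of the target logit with respect to x."""
    if baseline is None:
        baseline = torch.zeros_like(x)
    grads = torch.zeros_like(x)
    for alpha in torch.linspace(0.0, 1.0, steps, device=x.device):
        xi = (baseline + alpha * (x - baseline)).requires_grad_(True)
        logit = model(xi).gather(1, target.view(-1, 1)).sum()
        # create_graph=True keeps the attribution differentiable for training
        grads = grads + torch.autograd.grad(logit, xi, create_graph=True)[0]
    return (x - baseline) * grads / steps

def igr_loss(model, x_nat, x_adv, target, lambda_igr=1.0):
    """Cross-entropy on natural inputs plus a cosine-similarity attribution penalty."""
    ce = F.cross_entropy(model(x_nat), target)
    ig_nat = integrated_gradients(model, x_nat, target).flatten(1)
    ig_adv = integrated_gradients(model, x_adv, target).flatten(1)
    cos = F.cosine_similarity(ig_nat, ig_adv, dim=1).mean()
    return ce + lambda_igr * (1.0 - cos)  # minimizing this maximizes cosine similarity

# Toy usage with random data standing in for images and their perturbed versions.
model = nn.Sequential(nn.Flatten(), nn.Linear(784, 10))
x_nat = torch.rand(4, 1, 28, 28)
x_adv = (x_nat + 0.03 * torch.randn_like(x_nat)).clamp(0, 1)
target = torch.randint(0, 10, (4,))
loss = igr_loss(model, x_nat, x_adv, target)
loss.backward()
print(float(loss))
```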
author2: School of Computer Science and Engineering
author_facet: School of Computer Science and Engineering; Wang, Fan; Kong, Adams Wai Kin
format: Conference or Workshop Item
author: Wang, Fan; Kong, Adams Wai Kin
author_sort: Wang, Fan
title: Exploiting the relationship between Kendall’s rank correlation and cosine similarity for attribution protection
title_short: Exploiting the relationship between Kendall’s rank correlation and cosine similarity for attribution protection
title_full: Exploiting the relationship between Kendall’s rank correlation and cosine similarity for attribution protection
title_fullStr: Exploiting the relationship between Kendall’s rank correlation and cosine similarity for attribution protection
title_full_unstemmed: Exploiting the relationship between Kendall’s rank correlation and cosine similarity for attribution protection
title_sort: exploiting the relationship between kendall’s rank correlation and cosine similarity for attribution protection
publishDate: 2022
url: https://hdl.handle.net/10356/161935 https://nips.cc/
_version_: 1800916256621068288