Detecting adversarial examples for deep neural networks via layer directed discriminative noise injection

Deep learning is a popular and powerful machine learning solution to computer vision tasks. The most criticized vulnerability of deep learning is its poor tolerance of adversarial images, which are obtained by deliberately adding imperceptibly small perturbations to clean inputs. Such negatives can delude a classifier into making wrong decisions. Previous defensive techniques mostly focus on refining the models or transforming the inputs; they have either been demonstrated only on small datasets or shown limited success. Furthermore, they are rarely scrutinized from the hardware perspective, even though Artificial Intelligence (AI) on a chip is the roadmap for embedded intelligence everywhere. In this paper we propose a new discriminative noise injection strategy that adaptively selects a few dominant layers and progressively discriminates adversarial from benign inputs. This is made possible by evaluating the differences in the label change rates of adversarial and natural images when different amounts of noise are injected into the weights of individual layers of the model. The approach is evaluated on the ImageNet dataset with 8-bit truncated models of state-of-the-art DNN architectures. The results show a high detection rate of up to 88.00% with a false positive rate of only approximately 5% for MobileNet. Both the detection rate and the false positive rate improve substantially on those of existing advanced defenses against the most practical non-invasive universal perturbation attack on deep-learning-based AI chips.
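The abstract above outlines the detection mechanism: noise is injected into the weights of a few dominant layers, and the resulting label change rate separates adversarial from benign inputs. Below is a minimal, illustrative PyTorch-style sketch of that idea; the function names (label_change_rate, flag_adversarial), the Gaussian noise model, the noise scale, the trial count, and the decision threshold are assumptions for illustration, not the authors' implementation.

import copy
import torch

@torch.no_grad()
def label_change_rate(model, x, layer_name, noise_std, n_trials=20):
    # Fraction of trials in which the predicted label of x (a single input with a
    # batch dimension) changes after Gaussian noise is added to one layer's weights.
    base_pred = model(x).argmax(dim=1)
    flips = 0
    for _ in range(n_trials):
        noisy = copy.deepcopy(model)  # perturb a throwaway copy, keep the original intact
        w = dict(noisy.named_parameters())[layer_name]
        w.add_(noise_std * torch.randn_like(w))  # inject noise into this layer only
        flips += int((noisy(x).argmax(dim=1) != base_pred).item())
    return flips / n_trials

@torch.no_grad()
def flag_adversarial(model, x, dominant_layers, noise_std=0.05, threshold=0.4):
    # Assumed decision rule: an input whose label flips noticeably more often than
    # benign inputs do under layer-wise weight noise is flagged as adversarial.
    rates = [label_change_rate(model, x, name, noise_std) for name in dominant_layers]
    return max(rates) > threshold

Here dominant_layers would hold the names of the few layers selected in advance, and the threshold would be calibrated so that benign inputs rarely exceed it.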

Bibliographic Details
Main Authors: Wang, Si; Liu, Wenye; Chang, Chip-Hong
Other Authors: School of Electrical and Electronic Engineering
Format: Conference or Workshop Item
Language: English
Published: 2020
Subjects: Engineering::Electrical and electronic engineering::Integrated circuits; Machine Learning Security; Adversarial Attack
Online Access:https://hdl.handle.net/10356/137128
https://doi.org/10.21979/N9/WCIL7X
Institution: Nanyang Technological University
Conference: 2019 IEEE Asian Hardware Oriented Security and Trust Symposium (AsianHOST)
Citation: Wang, S., Liu, W., & Chang, C.-H. (2019). Detecting adversarial examples for deep neural networks via layer directed discriminative noise injection. 2019 Asian Hardware Oriented Security and Trust Symposium (AsianHOST). doi:10.1109/AsianHOST47458.2019.9006702
DOI: 10.1109/AsianHOST47458.2019.9006702
Funding: MOE (Min. of Education, S'pore)
Version: Accepted version
Rights: © 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: https://doi.org/10.1109/AsianHOST47458.2019.9006702