Fired neuron rate based decision tree for detection of adversarial examples in DNNs
Main Authors:
Other Authors:
Format: Conference or Workshop Item
Language: English
Published: 2020
Subjects:
Online Access: https://hdl.handle.net/10356/144346 https://doi.org/10.21979/N9/YPY0EB
Institution: Nanyang Technological University
Summary: Deep neural networks (DNNs) are a prevalent machine learning solution to computer vision problems. The most criticized vulnerability of deep learning is its susceptibility to adversarial images, crafted by maliciously adding infinitesimal distortions to benign inputs. Such negative examples can fool a classifier. Existing countermeasures against these adversarial attacks are developed mainly on the software model of DNNs: modifying the training during learning or the input during testing, modifying the network or its loss/activation functions, or relying on add-on models to classify unseen examples. These approaches do not consider optimization for hardware implementation of the learning models. In this paper, a new thresholding method is proposed based on comparators integrated into the most discriminative layers of the DNN, where those layers are identified by how strongly their layer-wise fired neuron rates differ between adversarial and normal inputs. The effectiveness of the method is validated on the ImageNet dataset with 8-bit truncated models of state-of-the-art DNN architectures. A high detection rate of up to 98% is achieved with a false positive rate of only 4.5%. The results show a significant improvement in both detection rate and false positive rate over previous countermeasures against the most practical non-invasive universal perturbation attack on a deep learning based AI chip.
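
As a rough illustration of the fired neuron rate idea described in the summary (a minimal sketch, not the authors' implementation; the layer names, threshold ranges, and decision rule below are hypothetical), the rate of a layer can be computed as the fraction of post-activation values that are non-zero, and an input flagged when the rates of the monitored layers leave their calibrated normal ranges:

```python
import numpy as np

def fired_neuron_rate(activations: np.ndarray) -> float:
    # Fraction of neurons in a layer that "fire", i.e. produce a non-zero
    # (post-ReLU) activation for the current input.
    return float(np.count_nonzero(activations > 0)) / activations.size

def looks_adversarial(layer_activations: dict, bounds: dict) -> bool:
    # Hypothetical decision rule: flag the input if the fired neuron rate of
    # any monitored (most discriminative) layer falls outside its calibrated
    # normal range. The paper builds a decision tree over such comparators;
    # this sketch only shows a simple per-layer range check.
    for layer, (low, high) in bounds.items():
        rate = fired_neuron_rate(layer_activations[layer])
        if not (low <= rate <= high):
            return True
    return False

# Example with made-up activations and threshold ranges for two monitored layers.
acts = {
    "conv3": np.maximum(np.random.randn(256, 14, 14), 0),
    "fc1": np.maximum(np.random.randn(4096), 0),
}
bounds = {"conv3": (0.2, 0.6), "fc1": (0.1, 0.5)}
print(looks_adversarial(acts, bounds))
```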