Defence on unrestricted adversarial examples

Bibliographic Details
Main Author:	Chan, Jarod Yan Cheng
Other Authors:	Jun Zhao
Format:	Final Year Project
Language:	English
Published:	Nanyang Technological University 2021
Subjects:	Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
Online Access:	https://hdl.handle.net/10356/149008
Institution:	Nanyang Technological University

Description
Summary:	Deep neural networks for image classification have gained popularity in recent years and, as a result, have become targets of attack. Adversarial samples are inputs crafted to fool neural networks into misclassification. They come in two forms: the first is created by adding specific perturbations to the pixels of an image; the second, called unrestricted adversarial samples, is created through generative models or transformations and is the focus of this paper. Conventional methods that make use of the neural network’s gradients are less effective against unrestricted adversarial samples. This paper proposes making use of Generative Adversarial Networks (GANs), neural networks that learn to generate images by learning the differences between real and fake images. Transfer learning from parts of the GAN is used to train a general network to distinguish images created by generative models from real images. Neural networks can then be protected from unrestricted adversarial attacks by detecting adversarial images and preventing them from being input to the networks. Experiments from the project show that, when trained on a dataset of real and adversarial images, the model can differentiate between these two classes of images. Testing on images outside the dataset distribution, however, yields worse results.
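
The detect-then-block idea in the summary can be illustrated with a minimal sketch. This is not the project's implementation: the names `guarded_classify`, `toy_detector`, `toy_classifier` and the 0.5 threshold are hypothetical, and the toy detector is a stand-in for the GAN-derived network the summary describes. The sketch assumes only that a detector returns an estimated probability that an image was generated, and that flagged images are never passed on to the classifier.

```python
from typing import Callable, Optional, Sequence

# Stand-in for pixel data; a real system would use image tensors.
Image = Sequence[float]


def guarded_classify(
    image: Image,
    detector: Callable[[Image], float],
    classifier: Callable[[Image], str],
    threshold: float = 0.5,  # assumed cut-off, not taken from the project
) -> Optional[str]:
    """Return the classifier's label, or None if the detector flags the image."""
    # The detector is assumed to return the estimated probability
    # that the image was produced by a generative model.
    if detector(image) >= threshold:
        return None  # rejected: the image never reaches the classifier
    return classifier(image)


# Toy stand-ins, purely illustrative. In the project, the detector would be
# a network trained via transfer learning from parts of a GAN.
def toy_detector(image: Image) -> float:
    return 0.9 if max(image) > 1.0 else 0.1


def toy_classifier(image: Image) -> str:
    return "cat"


if __name__ == "__main__":
    print(guarded_classify([0.2, 0.4], toy_detector, toy_classifier))
    print(guarded_classify([0.2, 1.5], toy_detector, toy_classifier))
```

Keeping the detector as a separate gate in front of the classifier means the classifier itself needs no retraining. The trade-off, in line with the summary's last sentence, is that protection is only as good as the detector's behaviour on images unlike those it was trained on.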

Similar Items