Demystifying adversarial attacks on neural networks

Bibliographic Details
Main Author: Yip, Lionell En Zhi
Other Authors: Anupam Chattopadhyay
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2020
Subjects:
Online Access:https://hdl.handle.net/10356/137946
Institution: Nanyang Technological University
Description
Summary: The prevalent use of neural networks for classification tasks has drawn attention to the security and integrity of the neural networks that industries rely on. Adversarial examples remain conspicuous to humans, yet neural networks struggle to classify images correctly in the presence of adversarial perturbations. I introduce a framework for understanding how neural networks perceive inputs and how that perception relates to adversarial attack methods. I demonstrate that there is no correlation between the region of importance and the region of attack. I also demonstrate that, across a class in a dataset, a frequently perturbed region of the adversarial examples exists. Finally, I attempt to improve classification performance by exploiting the differences between the input and the adversarial attack, and I demonstrate a novel augmentation method for improving prediction performance on adversarial samples.
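
The abstract contrasts a "region of importance" with a "region of attack". The thesis does not specify its tools in the abstract, so the following is only a minimal sketch of those two notions, using a gradient saliency map as a stand-in for the region of importance and an FGSM perturbation as a stand-in for the region of attack; the toy model, random input, and epsilon are placeholders chosen purely for illustration.

import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Toy CNN classifier standing in for the networks studied in the thesis.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10),
).eval()

x = torch.rand(1, 3, 32, 32, requires_grad=True)  # placeholder input image
y = torch.tensor([3])                             # placeholder ground-truth label

# Backpropagate the classification loss to obtain per-pixel gradients.
loss = F.cross_entropy(model(x), y)
loss.backward()

# "Region of importance": pixels with the largest gradient magnitude.
saliency = x.grad.abs().amax(dim=1)               # shape (1, 32, 32)

# "Region of attack": the perturbation actually applied to the image.
# Note: FGSM shifts every pixel by +/- epsilon, so its attack map is nearly
# flat; for a localized attack, substitute that attack's perturbation here.
epsilon = 0.05                                    # arbitrary choice for this sketch
x_adv = (x + epsilon * x.grad.sign()).clamp(0, 1).detach()
attack = (x_adv - x).abs().amax(dim=1)            # shape (1, 32, 32)

# Compare the top-10% pixels of each map with intersection-over-union.
k = int(0.1 * saliency.numel())
imp_idx = set(saliency.flatten().topk(k).indices.tolist())
atk_idx = set(attack.flatten().topk(k).indices.tolist())
iou = len(imp_idx & atk_idx) / len(imp_idx | atk_idx)
print(f"IoU of importance vs attack regions: {iou:.2f}")

# On a trained classifier the adversarial input typically changes the
# prediction; this untrained toy model only illustrates the mechanics.
print("clean prediction:", model(x).argmax(1).item(),
      "| adversarial prediction:", model(x_adv).argmax(1).item())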