Adversarial attack defences for neural networks
Main Author: | |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2022 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/157133 |
Institution: | Nanyang Technological University |
Summary: | Since the advent of deep learning, we have been wielding deep-learning models to solve intricate problems in fields such as natural language processing and image processing. Furthermore, we have been deploying complex deep-learning models in real-time systems such as autonomous vehicles and security cameras purely on the basis of their precision, only to realize that these high-precision models can be vulnerable to a variety of adversaries in the environment, which can hamper their overall robustness.
Contemporary defense strategies either cannot mitigate a variety of adversarial attacks, particularly in a white-box environment, or lack a standardized approach that can be applied to any complex deep-learning model to make it resilient to a variety of adversaries. There is therefore a need for standardized adversarial defense strategies that mitigate a variety of adversarial attacks and make our models more robust in a white-box environment.
In this project, we use three state-of-the-art deep-learning architectures trained on two benchmark datasets, CIFAR-10 and CIFAR-100, to analyze how these models perform in the absence of an adversary and in the presence of an adversary in a white-box environment. We primarily use two white-box attack methodologies, the Fast Gradient Sign Method (FGSM) and Projected Gradient Descent (PGD), to craft adversarial samples with epsilon values ranging from 0.1 to 0.8. Furthermore, we devise a defense strategy, Defensive Distillation, that can be applied to a deep-learning architecture to reduce the overall efficacy of FGSM and PGD attacks.
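
FGSM and PGD are gradient-based white-box attacks: FGSM takes a single step of size epsilon along the sign of the input gradient of the loss, while PGD repeats smaller signed steps and projects the result back into an epsilon-ball around the original input. The following is a minimal sketch of these attacks, assuming a PyTorch setup; the function names, step size `alpha`, and iteration count are illustrative assumptions and not the project's code.

```python
# Illustrative white-box attack sketch (assumed PyTorch setup, not the
# project's code).
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon):
    """Single-step attack: x_adv = clip(x + epsilon * sign(grad_x loss))."""
    x_adv = x.clone().detach().requires_grad_(True)
    F.cross_entropy(model(x_adv), y).backward()
    x_adv = x_adv + epsilon * x_adv.grad.sign()
    return torch.clamp(x_adv, 0.0, 1.0).detach()

def pgd_attack(model, x, y, epsilon, alpha=0.01, steps=10):
    """Iterative FGSM with projection onto the L-infinity epsilon-ball."""
    x_orig = x.clone().detach()
    x_adv = x_orig.clone()
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        F.cross_entropy(model(x_adv), y).backward()
        x_adv = x_adv + alpha * x_adv.grad.sign()
        # Keep the perturbation within epsilon of the original input and
        # within the valid pixel range.
        x_adv = torch.clamp(x_adv, x_orig - epsilon, x_orig + epsilon)
        x_adv = torch.clamp(x_adv, 0.0, 1.0)
    return x_adv.detach()
```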
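
Defensive Distillation trains a second network on the softened class probabilities that a first network produces at an elevated softmax temperature, which tends to smooth the gradients that attacks such as FGSM and PGD exploit. The sketch below illustrates the idea under the same assumed PyTorch setup; the helper names and training-step structure are illustrative assumptions rather than the project's implementation.

```python
# Illustrative defensive-distillation sketch (assumed setup, not the
# project's code). A teacher is trained first; its softened predictions at
# temperature T then serve as soft labels for a student of the same
# architecture. At test time the student is used at T = 1.
import torch
import torch.nn.functional as F

def soft_labels(teacher, x, temperature):
    """Teacher class probabilities computed with a softened softmax."""
    with torch.no_grad():
        return F.softmax(teacher(x) / temperature, dim=1)

def distillation_loss(student_logits, teacher_probs, temperature):
    """Cross-entropy between the student's softened output and the
    teacher's soft labels."""
    log_p = F.log_softmax(student_logits / temperature, dim=1)
    return -(teacher_probs * log_p).sum(dim=1).mean()

def train_student_step(student, teacher, x, temperature, optimizer):
    """One training step of the student on the teacher's soft labels."""
    optimizer.zero_grad()
    loss = distillation_loss(student(x), soft_labels(teacher, x, temperature),
                             temperature)
    loss.backward()
    optimizer.step()
    return loss.item()
```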