Evaluation of adversarial attacks against deep learning models
Format: Final Year Project
Language: English
Published: Nanyang Technological University, 2023
Online Access: https://hdl.handle.net/10356/171835
Institution: Nanyang Technological University
Summary: Machine learning models have become increasingly prevalent in aiding us in our day-to-day lives. They have been, and remain, useful for tasks in fields such as Computer Vision and Natural Language Processing. However, they are also increasingly targeted by adversaries who aim to degrade their effectiveness, rendering them unreliable and unpredictable. Hence, there is a need to improve the robustness of current machine learning models to deter adversarial attacks.
Existing defences have proven useful in deterring known attacks such as the Fast Gradient Sign Method (FGSM), Projected Gradient Descent (PGD), and the Carlini and Wagner (C&W) attack. In recent times, however, adaptive attacks such as Backward Pass Differentiable Approximation (BPDA) and AutoAttack (AA) have been able to counteract existing defence techniques, rendering them ineffective.
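As an illustration of the gradient-based attacks listed above, the sketch below shows a minimal FGSM step in PyTorch. The function name, the assumed pixel range of [0, 1], and the epsilon value are illustrative assumptions and do not reflect the project's actual experimental settings.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon=8 / 255):
    """Perturb a batch of images x (values in [0, 1]) with a single FGSM step."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # Step in the direction that increases the loss, then clip back to the
    # valid pixel range so the result is still a legal image.
    x_adv = x + epsilon * x.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()
```

PGD can be viewed as an iterated version of this step with a projection back onto the allowed perturbation ball after each iteration.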
In this project, we focus on adversarial defences in the field of Computer Vision. In our experiments, we employed various input-preprocessing techniques as defences, such as JPEG compression, Total Variance Minimization (TVM), Spatial Smoothing, Bit-depth Reduction, Principal Component Analysis (PCA), and Pixel Deflection, to remove adversarial perturbations from input data. These defence techniques were evaluated on the ResNet-20 and ResNet-56 networks, trained on the CIFAR-10 and CIFAR-100 datasets. The image inputs were adversarially perturbed using several known attacks, namely C&W, PGD, and AA.
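To make the input-preprocessing step concrete, the sketch below shows two of the named defences, bit-depth reduction and JPEG compression, applied to a single image. The function names and the bit-depth and quality parameters are illustrative assumptions rather than the project's actual configuration.

```python
import io
import numpy as np
from PIL import Image

def bit_depth_reduce(img, bits=4):
    """Quantise a float image in [0, 1] to 2**bits levels, discarding the
    low-order variation in which small adversarial perturbations live."""
    levels = 2 ** bits - 1
    return np.round(img * levels) / levels

def jpeg_compress(img, quality=75):
    """Round-trip a float RGB image in [0, 1] through JPEG compression to
    smooth out high-frequency adversarial noise."""
    pil = Image.fromarray((img * 255).astype(np.uint8))
    buf = io.BytesIO()
    pil.save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return np.asarray(Image.open(buf), dtype=np.float32) / 255.0
```

Both transformations are applied to the (possibly adversarial) input before it is fed to the classifier, so the network itself does not need to be retrained.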