Evaluation of adversarial attacks against deep learning models

Bibliographic Details
Main Author: Chua, Jonathan Wen Rong
Other Authors: Zhang Tianwei
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2023
Online Access: https://hdl.handle.net/10356/171835
Institution: Nanyang Technological University
Description
Summary: Machine learning models have become increasingly prevalent in aiding us in our day-to-day lives. They have been, and continue to be, useful for performing tasks in fields such as Computer Vision and Natural Language Processing. However, they are also increasingly targeted by adversaries who aim to reduce their effectiveness, rendering them useless and unpredictable. Hence, there is a need to improve the robustness of current machine learning models to deter adversarial attacks. Existing defences have proven useful in deterring known attacks such as the Fast Gradient Sign Method (FGSM), Projected Gradient Descent (PGD) and Carlini & Wagner (C&W) attacks. However, in recent times, adaptive attacks such as Backward Pass Differentiable Approximation (BPDA) and AutoAttack (AA) have been able to counteract existing defence techniques, rendering them ineffective. In this project, we focus on adversarial defences in the field of Computer Vision. In our experiments, we employed various input-preprocessing techniques as defences, such as JPEG compression, Total Variance Minimization (TVM), Spatial Smoothing, Bit-depth Reduction, Principal Component Analysis (PCA) and Pixel Deflection, to remove adversarial perturbations from the input data. These defence techniques were evaluated on ResNet-20 and ResNet-56 networks trained on the CIFAR-10 and CIFAR-100 datasets. The image inputs were adversarially perturbed using several known attacks such as C&W, PGD and AA.
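
As an illustration of the attack-and-defence pipeline described above, the following is a minimal PyTorch sketch (not taken from the project itself) of an L-infinity PGD attack followed by bit-depth reduction as an input-preprocessing defence. The model constructor resnet20 and the batch (x, y) in the usage comments are hypothetical placeholders, and the attack parameters are common defaults rather than the project's settings.

    # Illustrative sketch only: PGD attack + bit-depth reduction defence.
    import torch
    import torch.nn.functional as F

    def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
        """L-infinity PGD: step along the sign of the input gradient and
        project back into the eps-ball around the clean input."""
        x_adv = x.clone().detach() + torch.empty_like(x).uniform_(-eps, eps)
        x_adv = x_adv.clamp(0, 1)
        for _ in range(steps):
            x_adv.requires_grad_(True)
            loss = F.cross_entropy(model(x_adv), y)
            grad = torch.autograd.grad(loss, x_adv)[0]
            x_adv = x_adv.detach() + alpha * grad.sign()
            x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
        return x_adv.detach()

    def bit_depth_reduction(x, bits=4):
        """Quantise pixels to 2**bits levels to squeeze out small
        adversarial perturbations before classification."""
        levels = 2 ** bits - 1
        return torch.round(x * levels) / levels

    # Usage (hypothetical model and data batch):
    # model = resnet20().eval()                   # e.g. a CIFAR-10 ResNet-20
    # x_adv = pgd_attack(model, x, y)             # craft adversarial examples
    # logits = model(bit_depth_reduction(x_adv))  # classify defended inputs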