Evaluation of adversarial attacks against deep learning models
| | |
|---|---|
| Main Author: | |
| Other Authors: | |
| Format: | Final Year Project |
| Language: | English |
| Published: | Nanyang Technological University, 2023 |
| Subjects: | |
| Online Access: | https://hdl.handle.net/10356/165158 |
| Institution: | Nanyang Technological University |
Summary:

The aim of the project is to evaluate and improve adversarial attacks against deep learning models. Specifically, this project focuses on attacks against deep learning models for image processing and classification, and draws on defense methods only as references. For this project, evaluation and improvement means improving attack timings under the same computing resources and the same target model.
A specific adversarial attack was chosen as the focus of this project, so understanding the various adversarial attacks on deep learning image classification models was essential in choosing which attack to focus on. The Projected Gradient Descent (PGD) attack was chosen for in-depth evaluation and improvement. The PGD attack explored in this project is a white-box, untargeted attack bounded under the ℓ∞ norm. Two datasets were used: the ImageNet dataset and the CIFAR-10 dataset.
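For readers unfamiliar with the attack, the sketch below shows a minimal white-box, untargeted, ℓ∞-bounded PGD loop in PyTorch. It illustrates the general attack family only; the function name `pgd_linf`, the default `eps`, `alpha` and `steps` values, and the cross-entropy surrogate loss are assumptions, not the implementation used in this project.

```python
import torch
import torch.nn.functional as F

def pgd_linf(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Minimal white-box, untargeted, l-infinity-bounded PGD.

    Illustrative sketch only: names, defaults and the loss choice are
    assumptions, not the project's implementation.
    """
    # Random start inside the eps-ball, kept within the valid pixel range [0, 1].
    delta = torch.empty_like(x).uniform_(-eps, eps)
    delta = (x + delta).clamp(0, 1) - x

    for _ in range(steps):
        delta.requires_grad_(True)
        loss = F.cross_entropy(model(x + delta), y)      # surrogate loss
        grad = torch.autograd.grad(loss, delta)[0]
        # Gradient-sign ascent step, then projection back into the eps-ball.
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps).detach()
        delta = (x + delta).clamp(0, 1) - x              # keep the image valid
    return (x + delta).detach()
```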
Methods to improve the effectiveness of the PGD attack were designed and tested, with the aim of improving attack timings. This was done in the form of attack variants. Four PGD variants were crafted to test and improve effectiveness relative to the standard PGD attack, and they are compared in two areas: attack timing and attack success rate (ASR). These four variants centre around randomness and decay in step sizes, as well as the surrogate loss function of the PGD attack; a rough sketch of such a step-size schedule is given below. In total, six PGD attacks were crafted: the four variants and two standard PGD attacks for comparison. These six PGD attacks were then implemented on the two datasets mentioned above, ImageNet and CIFAR-10, giving a total of twelve attack configurations designed and implemented.
The experimental results show that the PGD variants do improve effectiveness compared to the standard PGD attacks. However, each variant is more effective in only one of the two areas, either attack timing or ASR. Brief improvements were also attempted on the variants; since these attempts were brief, more research is needed to improve the variants successfully. The experimental results also show that the properties of the variants, together with the choice of surrogate loss function, produce different results on the different datasets.
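The abstract does not name the surrogate loss functions that were compared; as a hedged illustration only, two common choices for untargeted attacks are the cross-entropy loss and a margin-style (Carlini-Wagner) loss, sketched below. Both functions are hypothetical stand-ins rather than the losses used in this project.

```python
import torch
import torch.nn.functional as F

def cross_entropy_loss(logits, y):
    # Standard surrogate: cross-entropy of the true class (the attack maximises it).
    return F.cross_entropy(logits, y)

def margin_loss(logits, y):
    # Margin-style (Carlini-Wagner) surrogate: best wrong-class logit minus the
    # true-class logit. Hypothetical stand-in, not necessarily the project's loss.
    true = logits.gather(1, y.unsqueeze(1)).squeeze(1)
    wrong = logits.masked_fill(
        F.one_hot(y, logits.size(1)).bool(), float("-inf")
    ).max(dim=1).values
    return (wrong - true).mean()
```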
Overall, this project has shown that randomness in step size and decay in step size are important factors in improving the effectiveness of the PGD attack. Additionally, it has shown that certain surrogate loss functions are better suited to certain datasets and improve the effectiveness of PGD attacks more on those datasets. This project opens clear opportunities for future work.