Towards deep neural networks robust to adversarial examples

Deep learning has become the dominant approach to problems where learning from data is necessary, e.g., recognizing objects or understanding natural language. If the data is the "nail", then deep learning is the "hammer". Nevertheless, state-of-the-art deep neural networks are vulnerable to small perturbations of the input data. For example, recent experiments have shown that adding adversarial noise to an input creates an image that is visually indistinguishable from the original but that the neural network misclassifies with high confidence. These adversarially crafted modifications of the input, so-called adversarial examples, are the "blind spots" of neural networks and are the main subject of this dissertation. In this thesis, we outline the problem of adversarial examples and present several partial solutions to it. The existence of adversarial examples has spurred significant interest in deep learning research, which can be broadly divided into research on attacks and research on defenses. We make original contributions to both.

First, we establish a connection between a classifier's margin and its robustness. We generalize the support vector machine (SVM) margin-maximization objective to deep neural networks and prove that our formulation is equivalent to robust optimization.

In subsequent work, we argue that, ideally, adversarial examples for a robust classifier should be indistinguishable from regular data. Unlike approaches based on robust optimization, we do not require that the input perturbation leave the label of the input unchanged. We formulate the problem of learning a robust classifier in the framework of generative adversarial networks (GANs): an auxiliary network, the adversary discriminator, is trained to distinguish regular from adversarial data, while the robust classifier is trained to classify the original inputs correctly and to fool the discriminator with its adversarial examples.

Finally, accurately estimating a model's robustness is a challenging task. Existing attack methods require multiple restarts or do not explicitly minimize the norm of the perturbation. To address these limitations, we propose a primal-dual proximal gradient attack algorithm that is both fast and accurate. It directly solves the attacker's problem for any Lp-norm whose proximal operator can be computed in closed form.

In summary, this thesis presents two defenses and one white-box attack. Future efforts should address the robustness of deep neural networks to unrestricted adversarial examples, provide strong theoretical guarantees on a model's performance, and verify model robustness to enable a comprehensive comparison of defenses.
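The margin/robust-optimization connection named in the first contribution above has a well-known linear special case, which the display below illustrates. This is a schematic reconstruction in standard SVM notation, not the thesis's deep-network formulation: for a linear classifier with the hinge loss, the worst-case loss under an l2-bounded perturbation has a closed form, so robust training amounts to a norm (margin) penalty.

```latex
% Schematic linear special case (assumed for illustration; the thesis
% generalizes this margin/robust-optimization link to deep networks).
% The inner maximization has a closed form because
%   max_{||d||_2 <= eps} ( -y * w^T d ) = eps * ||w||_2 .
\min_{w,b}\ \sum_{i=1}^{n}\
\max_{\lVert \delta_i \rVert_2 \le \varepsilon}
\max\!\bigl(0,\ 1 - y_i\,(w^{\top}(x_i + \delta_i) + b)\bigr)
\;=\;
\min_{w,b}\ \sum_{i=1}^{n}
\max\!\bigl(0,\ 1 - y_i\,(w^{\top}x_i + b) + \varepsilon\,\lVert w \rVert_2\bigr)
```

The additive term ε‖w‖₂ acts exactly like the SVM's norm penalty: shrinking ‖w‖₂ widens the margin, which is why margin maximization and robust optimization coincide in this linear setting.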
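The adversary-discriminator defense summarized above also admits a compact sketch. The PyTorch loop below is a minimal illustration under assumptions of my own (toy fully connected models, random stand-in data, a single normalized-gradient-step attack, arbitrary hyperparameters); it shows the two-player structure, not the thesis's actual algorithm.

```python
# Minimal sketch of a GAN-style robust-training loop (assumptions throughout:
# toy models, random data, one-step attack; NOT the thesis's exact method).
import torch
import torch.nn as nn
import torch.nn.functional as F

classifier = nn.Sequential(nn.Flatten(), nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
discriminator = nn.Sequential(nn.Flatten(), nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 1))
opt_c = torch.optim.Adam(classifier.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-4)

def perturb(x, y, eps=0.5, keep_graph=False):
    # One normalized-gradient step. With keep_graph=True the perturbation stays
    # differentiable w.r.t. the classifier's parameters, so the "fool the
    # discriminator" loss below can backpropagate into the classifier.
    x = x.clone().requires_grad_(True)
    loss = F.cross_entropy(classifier(x), y)
    grad, = torch.autograd.grad(loss, x, create_graph=keep_graph)
    x_adv = x + eps * grad / (grad.flatten(1).norm(dim=1).view(-1, 1, 1, 1) + 1e-12)
    return x_adv if keep_graph else x_adv.detach()

for step in range(100):
    x = torch.rand(64, 1, 28, 28)            # stand-in for real images in [0, 1]
    y = torch.randint(0, 10, (64,))

    # 1) Discriminator: regular inputs -> 1, adversarial inputs -> 0.
    d_loss = F.binary_cross_entropy_with_logits(discriminator(x), torch.ones(64, 1)) \
           + F.binary_cross_entropy_with_logits(discriminator(perturb(x, y)), torch.zeros(64, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # 2) Classifier: classify correctly AND make its own adversarial
    #    examples look "regular" to the discriminator.
    x_adv = perturb(x, y, keep_graph=True)
    c_loss = F.cross_entropy(classifier(x), y) \
           + F.binary_cross_entropy_with_logits(discriminator(x_adv), torch.ones(64, 1))
    opt_c.zero_grad(); c_loss.backward(); opt_c.step()
```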
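The attack described above is primal-dual; as a rough illustration of where a closed-form proximal operator enters such a method, here is a much-simplified, purely primal proximal-gradient sketch for the l1 norm. The function names, the margin-style loss, and all parameters are assumptions for illustration, not the thesis's algorithm.

```python
# Simplified, purely primal proximal-gradient attack sketch (the thesis's
# method is primal-dual; this only shows the prox step). Assumed objective:
#   minimize  lam * ||delta||_1 + margin_loss(model(x + delta), y)
import torch

def soft_threshold(z, t):
    # Closed-form proximal operator of t * ||.||_1.
    return torch.sign(z) * torch.clamp(z.abs() - t, min=0.0)

def l1_prox_attack(model, x, y, lam=0.01, lr=0.1, steps=200):
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        logits = model(x + delta)
        # Margin loss: positive while the true class still has the top logit.
        true = logits.gather(1, y.view(-1, 1)).squeeze(1)
        other = logits.scatter(1, y.view(-1, 1), float("-inf")).max(dim=1).values
        loss = torch.clamp(true - other, min=0.0).sum()
        grad, = torch.autograd.grad(loss, delta)
        with torch.no_grad():
            # Gradient step on the smooth term, then prox step on lam*||.||_1.
            delta.copy_(soft_threshold(delta - lr * grad, lr * lam))
    return (x + delta).detach()
```

Any other Lp-norm whose proximal operator is available in closed form (e.g., l2 block soft-thresholding) can be substituted for soft_threshold, mirroring the abstract's "any Lp-norm" claim.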


Bibliographic Details
Main Author: Matyasko, Alexander
Other Authors: Lap-Pui Chau
Format: Thesis-Doctor of Philosophy
Language:English
Published: Nanyang Technological University 2020
Subjects: Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision
Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
Online Access:https://hdl.handle.net/10356/143316
Institution: Nanyang Technological University
Record Details
Record ID: sg-ntu-dr.10356-143316
School: School of Electrical and Electronic Engineering
Supervisor contact: elpchau@ntu.edu.sg
Deposited: 2020-08-24
Citation: Matyasko, A. (2020). Towards deep neural networks robust to adversarial examples. Doctoral thesis, Nanyang Technological University, Singapore.
DOI: 10.32657/10356/143316
Handle: https://hdl.handle.net/10356/143316
License: Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
File format: application/pdf
Collection: DR-NTU (NTU Library, Singapore)