Towards deep neural networks robust to adversarial examples
Deep learning has become the dominant approach to problems where learning from data is necessary, e.g. recognizing objects or understanding natural language. If the data is the "nail", then deep learning is the "hammer". Nevertheless, state-of-the-art deep neural networks are prone to small perturbations of the input data. For example, recent experiments have shown that adding adversarial noise to inputs creates images that are visually indistinguishable from the original data, yet the neural network misclassifies them with high confidence. These adversarially crafted modifications of the input, so-called adversarial examples, are the neural network's "blind spots" and are the main subject of this dissertation. In this thesis, we outline the problem of adversarial examples and present several partial solutions to it. The existence of adversarial examples has spurred significant interest in deep learning research, which can be broadly divided into research on attacks and research on defenses; we make original contributions to both. First, we establish a connection between a classifier's margin and its robustness. We generalize the support vector machine (SVM) margin maximization objective to deep neural networks and prove that our formulation is equivalent to robust optimization. In subsequent work, we suggest that, ideally, adversarial examples for a robust classifier should be indistinguishable from regular data. Unlike approaches based on robust optimization, we do not require that the input noise leave the label of the input unchanged. We formulate the problem of learning a robust classifier in the framework of generative adversarial networks (GAN), where an auxiliary network, an adversary discriminator, is trained to distinguish regular and adversarial data; the robust classifier is then trained to classify the original inputs correctly and to fool the discriminator with its adversarial examples. Finally, accurately estimating a model's robustness is a challenging task: existing attack methods require multiple restarts or do not explicitly minimize the norm of the perturbation. To address these limitations, we propose a primal-dual proximal gradient attack algorithm that is fast and accurate, and directly solves the attacker's problem for any Lp norm whose proximal operator can be computed in closed form. In summary, we present two defenses and one white-box attack in this thesis. Future efforts should address the robustness of deep neural networks to unrestricted adversarial examples, provide strong theoretical guarantees on model performance, and verify model robustness to enable a comprehensive comparison of various defenses.
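The attack contribution summarized above relies on Lp norms whose proximal operators have a closed form. As a rough illustration of that ingredient only (not the primal-dual algorithm proposed in the thesis), the sketch below shows the standard closed-form proximal operators for the L1 and L2 norms and a single proximal-gradient step on an adversarial perturbation; the function names, the plain cross-entropy objective, and the step sizes are assumptions made for this example.

```python
# Illustrative sketch only: closed-form proximal operators and one proximal-gradient
# step on an adversarial perturbation. Names, losses, and step sizes are assumptions,
# not the primal-dual attack proposed in the thesis.
import torch
import torch.nn.functional as F

def prox_l1(v, lam):
    # prox of lam * ||.||_1: elementwise soft-thresholding.
    return torch.sign(v) * torch.clamp(v.abs() - lam, min=0.0)

def prox_l2(v, lam):
    # prox of lam * ||.||_2: shrink the whole vector towards zero.
    norm = v.norm(p=2).clamp_min(1e-12)
    return torch.clamp(1.0 - lam / norm, min=0.0) * v

def prox_gradient_attack_step(model, x, y, delta, step=0.01, lam=0.001, prox=prox_l1):
    """One step of proximal gradient descent on
    minimize_delta  -CE(model(x + delta), y) + lam * ||delta||_p,
    i.e. push the prediction away from the true label y while keeping delta small."""
    delta = delta.detach().requires_grad_(True)
    loss = -F.cross_entropy(model(x + delta), y)
    grad, = torch.autograd.grad(loss, delta)
    with torch.no_grad():
        return prox(delta - step * grad, step * lam)
```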
Main Author: | Matyasko, Alexander |
---|---|
Other Authors: | Lap-Pui Chau |
School: | School of Electrical and Electronic Engineering |
Format: | Thesis-Doctor of Philosophy |
Language: | English |
Published: | Nanyang Technological University, 2020 |
Subjects: | Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision; Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence |
Online Access: | https://hdl.handle.net/10356/143316 |
DOI: | 10.32657/10356/143316 |
Rights: | Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0) |
Institution: | Nanyang Technological University |