Statistical diagnosis system for adversarial examples

Bibliographic Details
Main Author: Wu, Yuting
Other Authors: Wang, Dan Wei
Format: Thesis-Master by Coursework
Language: English
Published: Nanyang Technological University 2020
Online Access:https://hdl.handle.net/10356/140900
Institution: Nanyang Technological University
Description
Summary: Deep Neural Networks (DNNs) are powerful tools for classification tasks, uncovering potential links within datasets with high accuracy and speed. However, DNNs are also fragile to intentionally crafted adversarial attacks, especially in the field of image analysis, where the concept of adversarial examples first emerged. These adversarial perturbations are designed to be quasi-imperceptible to human vision, yet they can easily fool deep models with high confidence. This has aroused great interest among researchers in detecting and defending against adversarial examples in order to improve the reliability of deep neural networks, which will play an important role in future safety and security systems.

In view of this, this work first gives a brief overview of common attacks on the MNIST and CIFAR-10 datasets, along with a general introduction to what adversarial examples are and how they are generated. After that, different kinds of defense methods are introduced, with the main focus on statistical defenses. Experiments are conducted to evaluate the merits and demerits of these existing defense methods.

In Chapter 4, an improvement to the Principal Component Analysis with Gaussian Mixture Model (PCA-GMM) method is proposed, enabling it to detect adversarial examples in datasets attacked by the C&W attack. In Chapter 5, this dissertation proposes an improved Kernel Density Estimation (KDE) detection method based on Deep Graph Infomax. We assume that with a simple modification to the loss function, namely an extra term that maximizes the mutual information between input images and their deep representations, DNN models can extract more key information from the input images into their deep feature maps. After this modification, the model can better capture the distinctive features of adversarial examples, improving the detection results. The experiments in Chapter 5 verify this assumption.
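
To make the notion of generating adversarial examples concrete, the sketch below implements the Fast Gradient Sign Method (FGSM), one of the common attacks typically surveyed on MNIST and CIFAR-10. This is a generic illustration rather than the thesis's code; the `model`, the cross-entropy loss choice, and the `epsilon` budget are assumptions.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon=0.03):
    """Fast Gradient Sign Method: take one step in the direction that
    increases the classification loss, bounded by epsilon per pixel."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # Perturb along the sign of the input gradient, then clip to valid range.
    x_adv = x + epsilon * x.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()
```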
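The Chapter 4 detector builds on PCA combined with a Gaussian Mixture Model. The abstract does not give implementation details, so the following is a minimal sketch of one plausible form of such a detector, assuming the GMM is fitted on PCA projections of clean training images and low-likelihood inputs are flagged as adversarial; the component counts and threshold rule are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture

def fit_detector(X_clean, n_pca=50, n_gmm=10):
    """Fit PCA on clean, flattened images (e.g. MNIST: shape (N, 784)),
    then model the projected data with a Gaussian mixture."""
    pca = PCA(n_components=n_pca).fit(X_clean)
    gmm = GaussianMixture(n_components=n_gmm).fit(pca.transform(X_clean))
    return pca, gmm

def adversarial_scores(pca, gmm, X):
    # Lower log-likelihood under the clean-data GMM suggests an
    # adversarial (off-manifold) input.
    return gmm.score_samples(pca.transform(X))

# An input can be flagged when its score falls below a threshold
# calibrated on clean data, e.g. the 5th percentile of clean scores
# (an illustrative choice, not taken from the thesis).
```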
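The Chapter 5 modification adds a loss term that maximizes mutual information between inputs and their deep representations, in the spirit of the Infomax approach the abstract cites. Below is a sketch of a Jensen-Shannon-style mutual-information lower bound of the kind used in Deep InfoMax; the discriminator `T` and the weight `lambda_mi` are hypothetical names, not taken from the thesis.

```python
import torch
import torch.nn.functional as F

def jsd_mi_lower_bound(T, x, z):
    """Jensen-Shannon estimate of mutual information between inputs x
    and representations z. T scores (x, z) pairs; shuffling z within the
    batch breaks the pairing to sample the product of marginals."""
    z_shuffled = z[torch.randperm(z.size(0))]
    pos = -F.softplus(-T(x, z)).mean()          # expectation over the joint
    neg = F.softplus(T(x, z_shuffled)).mean()   # expectation over marginals
    return pos - neg

# Modified training objective (sketch): standard cross-entropy minus a
# weighted MI term, so that maximizing MI lowers the loss.
# loss = F.cross_entropy(logits, y) - lambda_mi * jsd_mi_lower_bound(T, x, z)
```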
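Finally, the improved detector scores inputs with kernel density estimation in deep feature space. A minimal sketch follows, assuming a per-class Gaussian KDE fitted on penultimate-layer features of clean training data, with low density under the predicted class indicating a likely adversarial example; the bandwidth is an illustrative assumption.

```python
import numpy as np
from sklearn.neighbors import KernelDensity

def fit_kde_per_class(features, labels, bandwidth=1.0):
    """Fit one Gaussian KDE per class on deep features of clean data."""
    return {c: KernelDensity(kernel="gaussian", bandwidth=bandwidth)
                .fit(features[labels == c])
            for c in np.unique(labels)}

def kde_score(kdes, feature, predicted_class):
    # Adversarial inputs tend to have low density under the KDE of their
    # predicted class, since they leave the clean-data feature manifold.
    return kdes[predicted_class].score_samples(feature.reshape(1, -1))[0]
```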