Detecting adversarial samples for deep neural networks through mutation testing
Deep Neural Networks (DNNs) are adept at many tasks; one of the best known is image recognition, which uses a subset of DNNs called Convolutional Neural Networks (CNNs). However, DNNs are prone to adversarial attacks. Adversarial attacks are malicious modifications made on input sam...
Main Author: | Tan, Kye Yen |
---|---|
Other Authors: | Chang Chip Hong |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2020 |
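Subjects: | Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence; Engineering::Electrical and electronic engineering |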
Online Access: | https://hdl.handle.net/10356/138719 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-138719 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-138719 2023-07-07T18:17:52Z Detecting adversarial samples for deep neural networks through mutation testing Tan, Kye Yen Chang Chip Hong School of Electrical and Electronic Engineering echchang@ntu.edu.sg Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence Engineering::Electrical and electronic engineering Deep Neural Networks (DNNs) are adept at many tasks; one of the best known is image recognition, which uses a subset of DNNs called Convolutional Neural Networks (CNNs). However, DNNs are prone to adversarial attacks. Adversarial attacks are malicious modifications made to input samples that cause the DNN to fail at its task. In the case of image recognition, which is the focus of this project, adversarial attacks result in misclassification of images by the CNN. These attacks are conducted by deliberately adding perturbations imperceptible to humans to images before they are fed into the CNN. This is a serious security vulnerability in CNNs which may lead to disastrous consequences in security-reliant applications. Finding a defence mechanism against these attacks is imperative to ensure the safe operation of CNNs. The first line of defence for CNNs against adversarial attacks is the detection of adversarial images. This method of defence has been studied closely, with the aim of achieving not only high accuracy but also real-time operation. Currently, achieving a high detection rate is computationally intensive, which increases the time needed to detect adversarial images. Therefore, in this final year project, two methods are proposed to detect adversarial images with lower computational effort. The first method employs the concept of network prediction inconsistency, which rests on the observation that adversarial inputs are more sensitive to model mutation than natural inputs.
It optimizes the previous mutation testing method by applying partial mutation to the statistically determined most distinguishable areas of the CNN, instead of blindly applying random mutations. These targeted mutations cause changes in the output prediction, which are used to identify inputs as adversarial. The second method makes use of the difference in the layer-wise firing neuron rate distribution between adversarial and normal images to build a decision tree for adversarial detection. Both methods showed reasonable detection rates. Bachelor of Engineering (Electrical and Electronic Engineering) 2020-05-12T04:04:01Z 2020-05-12T04:04:01Z 2020 Final Year Project (FYP) https://hdl.handle.net/10356/138719 en A2040-191 application/pdf Nanyang Technological University |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence Engineering::Electrical and electronic engineering |
spellingShingle |
Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence Engineering::Electrical and electronic engineering Tan, Kye Yen Detecting adversarial samples for deep neural networks through mutation testing |
description |
Deep Neural Networks (DNNs) are adept at many tasks; one of the best known is image recognition, which uses a subset of DNNs called Convolutional Neural Networks (CNNs). However, DNNs are prone to adversarial attacks. Adversarial attacks are malicious modifications made to input samples that cause the DNN to fail at its task. In the case of image recognition, which is the focus of this project, adversarial attacks result in misclassification of images by the CNN. These attacks are conducted by deliberately adding perturbations imperceptible to humans to images before they are fed into the CNN. This is a serious security vulnerability in CNNs which may lead to disastrous consequences in security-reliant applications. Finding a defence mechanism against these attacks is imperative to ensure the safe operation of CNNs. The first line of defence for CNNs against adversarial attacks is the detection of adversarial images. This method of defence has been studied closely, with the aim of achieving not only high accuracy but also real-time operation. Currently, achieving a high detection rate is computationally intensive, which increases the time needed to detect adversarial images. Therefore, in this final year project, two methods are proposed to detect adversarial images with lower computational effort. The first method employs the concept of network prediction inconsistency, which rests on the observation that adversarial inputs are more sensitive to model mutation than natural inputs. It optimizes the previous mutation testing method by applying partial mutation to the statistically determined most distinguishable areas of the CNN, instead of blindly applying random mutations. These targeted mutations cause changes in the output prediction, which are used to identify inputs as adversarial. The second method makes use of the difference in the layer-wise firing neuron rate distribution between adversarial and normal images to build a decision tree for adversarial detection.
Both methods showed reasonable detection rates. |
author2 |
Chang Chip Hong |
author_facet |
Chang Chip Hong Tan, Kye Yen |
format |
Final Year Project |
author |
Tan, Kye Yen |
author_sort |
Tan, Kye Yen |
title |
Detecting adversarial samples for deep neural networks through mutation testing |
title_short |
Detecting adversarial samples for deep neural networks through mutation testing |
title_full |
Detecting adversarial samples for deep neural networks through mutation testing |
title_fullStr |
Detecting adversarial samples for deep neural networks through mutation testing |
title_full_unstemmed |
Detecting adversarial samples for deep neural networks through mutation testing |
title_sort |
detecting adversarial samples for deep neural networks through mutation testing |
publisher |
Nanyang Technological University |
publishDate |
2020 |
url |
https://hdl.handle.net/10356/138719 |
_version_ |
1772828663858855936 |
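
The first method in the description, flagging inputs whose prediction changes under model mutation, can be illustrated with a minimal sketch. It is not the project's implementation. It assumes a tiny fully connected ReLU network in place of a CNN, Gaussian weight noise as the mutation operator, a hand-picked list of layers to mutate in place of the statistically determined areas, and an arbitrary decision threshold. All function names and parameters (`predict`, `mutate`, `label_change_rate`, `is_adversarial`, `scale`, `threshold`) are hypothetical.

```python
import numpy as np


def predict(weights, x):
    """Forward pass of a small ReLU network; returns the predicted class index."""
    h = x
    for w in weights[:-1]:
        h = np.maximum(h @ w, 0.0)
    return int(np.argmax(h @ weights[-1]))


def mutate(weights, layers, scale, rng):
    """Copy the weights and add Gaussian noise only to the chosen layers.

    Restricting the noise to `layers` stands in for partial mutation; the
    noise operator itself is an assumption, not taken from the record.
    """
    mutant = [w.copy() for w in weights]
    for i in layers:
        sigma = scale * np.std(weights[i])
        mutant[i] += rng.normal(0.0, sigma, size=weights[i].shape)
    return mutant


def label_change_rate(weights, x, layers, n_mutants=50, scale=0.1, seed=0):
    """Fraction of mutated models whose prediction differs from the original."""
    rng = np.random.default_rng(seed)
    original = predict(weights, x)
    changed = sum(
        predict(mutate(weights, layers, scale, rng), x) != original
        for _ in range(n_mutants)
    )
    return changed / n_mutants


def is_adversarial(weights, x, layers, threshold=0.05, **kwargs):
    """Flag an input when its label change rate exceeds an assumed threshold."""
    return label_change_rate(weights, x, layers, **kwargs) > threshold


# Toy usage with random weights and a random input (illustrative data only).
demo_rng = np.random.default_rng(1)
demo_weights = [demo_rng.normal(size=(8, 16)), demo_rng.normal(size=(16, 4))]
demo_x = demo_rng.normal(size=8)
print(label_change_rate(demo_weights, demo_x, layers=[1]))
print(is_adversarial(demo_weights, demo_x, layers=[1]))
```

Mutating only the layers listed in `layers` keeps the cost of each mutant low, which is the motivation the description gives for partial mutation. In practice the layer selection and the threshold would have to be estimated from labelled natural and adversarial images.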