Detection of attacks on artificial intelligence systems
Saved in:
Main Author:
Other Authors:
Format: Thesis-Master by Coursework
Language: English
Published: Nanyang Technological University, 2021
Subjects:
Online Access: https://hdl.handle.net/10356/152977
Institution: Nanyang Technological University
Summary: Artificial intelligence (AI) is gradually and profoundly changing production and daily life, and is widely used in fields such as visual information processing, autonomous systems, and safety diagnosis and protection. Security issues will eventually become its biggest challenge. The adversarial attack is a powerful security threat to Deep Neural Networks (DNNs). This dissertation focuses on a passive defence method: the detection of adversarial samples. An adversarial sample is essentially different from a normal sample: the dimension of the high-dimensional continuous space in which it lies is much larger than the intrinsic dimensionality of any given data submanifold. Building on the notion of Local Intrinsic Dimensionality (LID), an improved detector, the LID-based classifier, is studied. Four attack methods were used to conduct experiments on two common datasets. The experiments show that the LID-based classifier substantially outperforms the single-characteristic classifiers based on Kernel Density (KD) and Bayesian Network Uncertainty (BNU), as well as the combined KD&BNU classifier, with improvements of up to 37.44% over the single-characteristic classifiers and up to 18.65% over the combined classifier. The experiments further show that an LID-based classifier trained on one attack can detect adversarial samples generated by other attack methods. Classifiers trained on weaker attacks perform better against adversarial samples generated by stronger attacks than they do when tested on the same attack, and vice versa. This demonstrates that the LID-based classifier is a very effective means of detecting adversarial samples, with a degree of universality and transferability. Finally, the dissertation makes predictions and recommendations for possible directions of future work.
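
The LID characteristic on which the detector described in the summary is built can be estimated from nearest-neighbour distances alone. Below is a minimal sketch of the standard maximum-likelihood LID estimator used in the LID-based detection literature (e.g. Ma et al., ICLR 2018); the function name `lid_mle`, the neighbourhood size `k = 20`, and the synthetic data are illustrative assumptions, not taken from the dissertation.

```python
import numpy as np

def lid_mle(sample, reference_batch, k=20):
    """Maximum-likelihood estimate of the Local Intrinsic Dimensionality (LID)
    of `sample`, measured against a batch of reference points.

    LID(x) ~= -( (1/k) * sum_{i=1..k} log(r_i / r_k) )^{-1},
    where r_1 <= ... <= r_k are the distances from x to its k nearest
    neighbours in the reference batch.
    """
    # Euclidean distances from the sample to every point in the batch
    dists = np.linalg.norm(reference_batch - sample, axis=1)
    # Keep the k smallest strictly positive distances
    # (a zero distance means the sample itself is in the batch).
    dists = np.sort(dists[dists > 0])[:k]
    log_ratios = np.log(dists / dists[-1])
    # The mean of the log-ratios is non-positive; the small offset guards
    # against division by zero when all k distances happen to be equal.
    return -1.0 / (np.mean(log_ratios) - 1e-12)


# Illustrative usage: estimate LID for each sample in a minibatch of
# layer activations (random data stands in for real DNN activations here).
rng = np.random.default_rng(0)
activations = rng.normal(size=(128, 64))   # 128 samples, 64-dim features
lids = np.array([lid_mle(a, activations, k=20) for a in activations])
print(lids.shape, lids.mean())
```

In a detector of this kind, LID estimates are typically computed from the activations of each DNN layer for both normal and adversarial samples, and a simple classifier is then trained on the resulting per-layer LID feature vectors to flag adversarial inputs.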