Investigating the causes of the vulnerability of CNNs to adversarial perturbations: learning objective, model components, and learned representations

Bibliographic Details
Main Author: Coppola, Davide
Other Authors: Guan, Cuntai
Format: Thesis-Master by Research
Language: English
Published: Nanyang Technological University 2023
Online Access: https://hdl.handle.net/10356/171336
Institution: Nanyang Technological University
Description
Summary: This work focuses on understanding how adversarial perturbations can disrupt the behavior of Convolutional Neural Networks (CNNs). Unlike other research that considers a model vulnerable as a whole, it is hypothesized here that some components may be more vulnerable than others. Identifying model-specific vulnerabilities can help develop ad hoc defense mechanisms that effectively patch trained models without having to retrain them. For this purpose, analytical frameworks were developed to serve two purposes: 1) to diagnose trained models and reveal model-specific vulnerabilities; and 2) to understand how the learned hidden representations of a CNN are affected by adversarial perturbations. Empirical results verified that the shallow layers play a major role in the vulnerability of the entire model. Furthermore, a few channels in the shallow layers were found to be significantly more vulnerable than others in the same layers, identifying them as the main causes of a model's weakness to adversarial perturbations.
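As an illustration of the kind of per-component diagnosis described above, the following is a minimal sketch, not the thesis's actual framework: it estimates a vulnerability score for each channel of a shallow convolutional block by comparing its activations on clean versus adversarially perturbed inputs. The model (ResNet-18), the chosen layer (`layer1`), the attack (one-step FGSM), and the relative-change metric are all assumptions made for the example.

```python
# Sketch only: per-channel vulnerability via clean-vs-adversarial activation shift.
import torch
import torch.nn.functional as F
import torchvision

model = torchvision.models.resnet18(weights="DEFAULT").eval()
layer = model.layer1  # a shallow block; hypothetical choice of layer

# Capture the layer's output with a forward hook.
activations = {}
def hook(_module, _inputs, output):
    activations["out"] = output.detach()
handle = layer.register_forward_hook(hook)

def fgsm(x, y, eps=4 / 255):
    """One-step FGSM perturbation within an L-infinity ball of radius eps."""
    x = x.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()

def channel_vulnerability(x_clean, y):
    """Mean relative change of each channel's activation under attack."""
    model(x_clean)
    clean = activations["out"]                            # (N, C, H, W)
    model(fgsm(x_clean, y))
    adv = activations["out"]
    diff = (adv - clean).flatten(2).norm(dim=2)           # (N, C)
    ref = clean.flatten(2).norm(dim=2).clamp_min(1e-8)
    return (diff / ref).mean(dim=0)                       # one score per channel

# Example with random data; a real diagnosis would use a labeled test set.
x = torch.rand(8, 3, 224, 224)
y = torch.randint(0, 1000, (8,))
scores = channel_vulnerability(x, y)
print(scores.topk(5).indices)  # channels whose representations shift the most
handle.remove()
```

Channels with the largest scores are those whose representations drift the most under perturbation, which is one plausible way to surface the "few vulnerable channels in shallow layers" that the abstract refers to.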