Adversarial attack defenses for neural networks
Main Author:
Other Authors:
Format: Final Year Project
Language: English
Published: Nanyang Technological University, 2024
Subjects:
Online Access: https://hdl.handle.net/10356/175196
Institution: Nanyang Technological University
Summary: The widespread adoption of deep neural networks (DNNs) across various domains has led to the creation of high-performance models trained on extensive datasets. As a result, there is a growing need to protect the intellectual property of these models, leading to the development of various watermarking techniques. However, these techniques are not impervious to attack. In this report, I explore the vulnerabilities of state-of-the-art neural network watermarking techniques and propose a novel framework for attacking and neutralizing these watermarks. The proposed approach focuses on removing embedded watermarks using adversarial, out-of-distribution (OOD), and random-label trigger data, demonstrating effective strategies for their detection and removal. By providing a comprehensive analysis of the weaknesses in current watermarking methods, this work contributes to the ongoing discussion on model security and intellectual property protection in the realm of deep learning.
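The summary describes removing backdoor-style watermarks and checking whether they survive. The sketch below is not taken from the report; it is a minimal illustration of that general idea in PyTorch, assuming hypothetical placeholders `model` (a watermarked classifier), `clean_loader` (clean in-distribution data), and `trigger_loader` (the owner's trigger set). It fine-tunes on clean data and compares trigger-set accuracy before and after; a sharp drop in trigger-set accuracy alongside stable clean accuracy would suggest the watermark has been weakened or removed.

```python
# Minimal sketch (not the report's actual code) of a fine-tuning-based
# watermark-removal check. `model`, `clean_loader`, and `trigger_loader`
# are hypothetical placeholders supplied by the caller.
import torch
import torch.nn as nn


def accuracy(model, loader, device="cpu"):
    """Fraction of correctly classified samples in `loader`."""
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            pred = model(x).argmax(dim=1)
            correct += (pred == y).sum().item()
            total += y.numel()
    return correct / max(total, 1)


def finetune_removal(model, clean_loader, trigger_loader,
                     epochs=5, lr=1e-3, device="cpu"):
    """Fine-tune on clean data and report clean vs. trigger-set accuracy.

    A large drop in trigger-set accuracy with little change in clean
    accuracy suggests the embedded watermark has been weakened.
    """
    model.to(device)
    before_clean = accuracy(model, clean_loader, device)
    before_trig = accuracy(model, trigger_loader, device)

    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for x, y in clean_loader:
            x, y = x.to(device), y.to(device)
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()

    after_clean = accuracy(model, clean_loader, device)
    after_trig = accuracy(model, trigger_loader, device)
    print(f"clean accuracy:   {before_clean:.3f} -> {after_clean:.3f}")
    print(f"trigger accuracy: {before_trig:.3f} -> {after_trig:.3f}")
```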