Adversarial attack defenses for neural networks

Bibliographic Details
Main Author: Puah, Yi Hao
Other Authors: Anupam Chattopadhyay
Format: Final Year Project
Language: English
Published: Nanyang Technological University, 2024
Subjects:
Online Access: https://hdl.handle.net/10356/175196
Description
Summary: The widespread adoption of deep neural networks (DNNs) across various domains has led to the creation of high-performance models trained on extensive datasets. As a result, there is a growing need to protect the intellectual property of these models, which has driven the development of various watermarking techniques. However, these techniques are not impervious to attack. In this report, I explore the vulnerabilities of state-of-the-art neural network watermarking techniques and propose a novel framework for attacking and neutralizing these watermarks. The proposed approach focuses on removing embedded watermarks using adversarial, out-of-distribution (OOD), and random-label trigger data, demonstrating effective strategies for their detection and removal. By providing a comprehensive analysis of the weaknesses in current watermarking methods, this work contributes to the ongoing discussion on model security and intellectual property protection in deep learning.
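
To make the random-label trigger idea concrete, the following is a minimal, hypothetical PyTorch sketch, not code from the report: it fine-tunes a watermarked classifier on clean data mixed with suspected trigger inputs whose labels are drawn at random, so that the trigger-to-label association embedded by the watermark is overwritten. The model, data loaders, class count, and hyperparameters are all illustrative assumptions.

import torch
import torch.nn.functional as F

def remove_watermark(model, clean_loader, trigger_loader, num_classes,
                     epochs=5, lr=1e-4, device="cpu"):
    # Hypothetical sketch: fine-tune on clean batches plus suspected
    # trigger batches relabeled at random, unlearning the watermark's
    # trigger->label mapping while the clean data preserves task accuracy.
    model.to(device).train()
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for (x_clean, y_clean), (x_trig, _) in zip(clean_loader, trigger_loader):
            # Random labels break any consistent trigger->label association.
            y_rand = torch.randint(0, num_classes, (x_trig.size(0),))
            x = torch.cat([x_clean, x_trig]).to(device)
            y = torch.cat([y_clean, y_rand]).to(device)
            opt.zero_grad()
            loss = F.cross_entropy(model(x), y)
            loss.backward()
            opt.step()
    return model

In practice the true trigger set is unknown to the attacker, so, as the summary notes, adversarial or OOD inputs would stand in for the suspected trigger data in such a pipeline.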