Adversarial attack defenses for neural networks
Main Author:
Other Authors:
Format: Final Year Project
Language: English
Published: Nanyang Technological University, 2024
Subjects:
Online Access: https://hdl.handle.net/10356/175196
Abstract: The widespread adoption of deep neural networks (DNNs) across various domains has led to the creation of high-performance models trained on extensive datasets. As a result, there is a growing need to protect the intellectual property of these models, which has driven the development of various watermarking techniques.

However, these techniques are not impervious to attack. In this report, we explore the vulnerabilities of state-of-the-art neural network watermarking techniques and propose a novel framework for attacking and neutralizing the watermarks they embed.

Our approach focuses on detecting and removing embedded watermarks whose trigger sets are built from adversarial, out-of-distribution (OOD), or randomly labeled data, and demonstrates effective strategies for both steps. By providing a comprehensive analysis of the weaknesses in current watermarking methods, our work contributes to the ongoing discussion on model security and intellectual property protection in deep learning.
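The core idea summarized in the abstract (a trigger-set watermark is a set of unusual inputs the model has memorized with attacker-chosen labels, and it can be erased by continued training on data that does not reinforce that memorization) can be illustrated with a deliberately simplified, hypothetical sketch. Here logistic regression stands in for a DNN, and the trigger construction, hyperparameters, and removal-by-weight-decay strategy are all illustrative assumptions, not the report's actual method:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(X, y, w=None, epochs=500, lr=0.1, l2=0.0):
    """Logistic regression via batch gradient descent, with optional L2 decay."""
    if w is None:
        w = np.zeros(X.shape[1])
    for _ in range(epochs):
        p = sigmoid(X @ w)
        w -= lr * (X.T @ (p - y) / len(y) + l2 * w)
    return w

def predict(X, w):
    return (sigmoid(X @ w) >= 0.5).astype(int)

# Clean task: label is 1 iff x0 > 0; x1 carries only small noise.
# (The third column is a constant bias feature.)
n = 200
X_clean = np.column_stack([rng.normal(0, 1, n), rng.normal(0, 0.5, n), np.ones(n)])
y_clean = (X_clean[:, 0] > 0).astype(int)

# Hypothetical OOD trigger set: points with x1 far outside the clean range,
# all assigned label 1 regardless of x0.
m = 40
X_trig = np.column_stack([rng.normal(0, 1, m), rng.normal(12, 0.5, m), np.ones(m)])
y_trig = np.ones(m, dtype=int)

# Embed the watermark: train jointly on clean and trigger data, so the model
# learns to use the x1 direction only to satisfy the trigger set.
w = train(np.vstack([X_clean, X_trig]), np.concatenate([y_clean, y_trig]))
wm_before = (predict(X_trig, w) == y_trig).mean()

# Remove it: fine-tune on clean data only, with weight decay shrinking the
# x1 direction that the clean task never needed.
w_clean = train(X_clean, y_clean, w=w.copy(), epochs=2000, lr=0.1, l2=0.05)
wm_after = (predict(X_trig, w_clean) == y_trig).mean()
clean_acc = (predict(X_clean, w_clean) == y_clean).mean()
```

In this sketch, trigger accuracy starts near 1.0 after embedding and drops toward chance after fine-tuning, while clean accuracy stays high, mirroring the goal of a removal attack: neutralize the watermark without degrading the model's utility.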