Fingerprinting deep neural networks - a DeepFool approach
A well-trained deep learning classifier is an expensive intellectual property of the model owner. However, recently proposed model extraction attacks and reverse engineering techniques make model theft possible and similar quality deep learning solution reproducible at a low cost. To protect the int...
Saved in:
Main Authors: | , |
---|---|
Other Authors: | |
Format: | Conference or Workshop Item |
Language: | English |
Published: |
2021
|
Subjects: | |
Online Access: | https://hdl.handle.net/10356/147023 |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Institution: | Nanyang Technological University |
Language: | English |
Summary: | A well-trained deep learning classifier is an expensive intellectual property of the model owner. However, recently proposed model extraction attacks and reverse engineering techniques make model theft possible and similar quality deep learning solution reproducible at a low cost. To protect the interest and revenue of the model owner, watermarking on Deep Neural Network (DNN) has been proposed. However, the extra components and computations due to the embedded watermark tend to interfere with the model training process and result in inevitable degradation in classification accuracy. In this paper, we utilize the geometry characteristics inherited in the DeepFool algorithm to extract data points near the classification boundary of the target model for ownership verification. As the fingerprint is extracted after the training process has been completed, the original achievable classification accuracy will not be compromised. This
countermeasure is founded on the hypothesis that different models possess different classification boundaries determined solely by the hyperparameters of the DNN and the training it has undergone. Therefore, given a set of fingerprint data points, a pirated model or its post-processed version will produce similar prediction but another originally designed and trained DNN for the same task will produce very different prediction even if they have similar or better classification accuracy. The effectiveness of the proposed Intellectual Property (IP) protection method is validated on the CIFAR-10, CIFAR-100 and ImageNet datasets. The results show a detection rate of 100% and a false positive rate of 0% for each
dataset. More importantly, the fingerprint extraction and its runtime are both dataset independent. It is on average ∼130× faster than two state-of-the-art fingerprinting methods. |
---|