SNIFF : reverse engineering of neural networks with fault attacks

Neural networks have been shown to be vulnerable against fault injection attacks. These attacks change the physical behavior of the device during the computation, resulting in a change of value that is currently being computed. They can be realized by various techniques, ranging from clock/voltage g...

Full description

Saved in:
Bibliographic Details
Main Authors: Breier, Jakub, Jap, Dirmanto, Hou, Xiaolu, Bhasin, Shivam, Liu, Yang
Other Authors: School of Computer Science and Engineering
Format: Article
Language:English
Published: 2022
Subjects:
Online Access:https://hdl.handle.net/10356/155678
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Nanyang Technological University
Language: English
Description
Summary:Neural networks have been shown to be vulnerable against fault injection attacks. These attacks change the physical behavior of the device during the computation, resulting in a change of value that is currently being computed. They can be realized by various techniques, ranging from clock/voltage glitching to application of lasers to rowhammer. Previous works have mostly explored fault attacks for output misclassification, thus affecting the reliability of neural networks. In this article, we investigate the possibility to reverse engineer neural networks with fault attacks. Sign bit flip fault attack enables the reverse engineering by changing the sign of intermediate values. We develop the first exact extraction method on deep-layer feature extractor networks that provably allows the recovery of proprietary model parameters. Our experiments with Keras library show that the precision error for the parameter recovery for the tested networks is less than <formula><tex>$10^{-13}$</tex></formula>with the usage of 64-bit floats, which improves the current state of the art by six orders of magnitude.