Live demonstration: man-in-the-middle attack on edge artificial intelligence

Deep neural networks (DNNs) are susceptible to evasion attacks. However, digital adversarial examples are typically applied to pre-captured static images. The perturbations are generated by loss optimization with knowledge of target model hyperparameters and are added offline. Physical adversarial e...

Full description

Saved in:
Bibliographic Details
Main Authors: Hu, Bowen, He, Weiyang, Wang, Si, Liu, Wenye, Chang, Chip Hong
Other Authors: School of Electrical and Electronic Engineering
Format: Conference or Workshop Item
Language:English
Published: 2024
Subjects:
Online Access:https://hdl.handle.net/10356/174146
https://2024.ieee-iscas.org/
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Nanyang Technological University
Language: English
Description
Summary:Deep neural networks (DNNs) are susceptible to evasion attacks. However, digital adversarial examples are typically applied to pre-captured static images. The perturbations are generated by loss optimization with knowledge of target model hyperparameters and are added offline. Physical adversarial examples, on the other hand, tamper with the physical target or use a realistically fabricated target to fool the DNN. A sufficient number of pristine target samples captured under different varying environmental conditions are required to create the physical adversarial perturbations. Both digital and physical input evasion attacks are not robust against dynamic object scene variations and the adversarial effects are often weakened by model reduction and quantization when the DNNs are implemented on edge artificial intelligence (AI) accelerator platforms. This demonstration presents a practical man-in-the-middle (MITM) attack on an edge DNN first reported in [1]. A tiny MIPI FPGA chip with hardened CSI-2 and D-PHY blocks is attached between the camera and the edge AI accelerator to inject unobtrusive stripes onto the RAW image data. The attack is less influenced by dynamic context variations such as changes in viewing angle, illumination, and distance of the target from the camera.