SIMC 2.0: Improved secure ML inference against malicious clients

In this paper, we study the problem of secure ML inference against a malicious client and a semi-trusted server such that the client only learns the inference output while the server learns nothing. This problem is first formulated by Lehmkuhl et al. with a solution (MUSE, Usenix Security’21), whose...

Full description

Saved in:
Bibliographic Details
Main Authors: XU, Guowen, HAN, Xingshuo, ZHANG, Tianwei, XU, Shengmin, NING, Jianting, HUANG, Xinyi, LI, Hongwei, DENG, Robert H.
Format: text
Language:English
Published: Institutional Knowledge at Singapore Management University 2024
Subjects:
Online Access:https://ink.library.smu.edu.sg/sis_research/9816
https://ink.library.smu.edu.sg/context/sis_research/article/10816/viewcontent/2207.04637v2.pdf
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Singapore Management University
Language: English
Description
Summary:In this paper, we study the problem of secure ML inference against a malicious client and a semi-trusted server such that the client only learns the inference output while the server learns nothing. This problem is first formulated by Lehmkuhl et al. with a solution (MUSE, Usenix Security’21), whose performance is then substantially improved by Chandran et al.'s work (SIMC, USENIX Security’22). However, there still exists a nontrivial gap in these efforts towards practicality, giving the challenges of overhead reduction and secure inference acceleration in an all-round way. Based on this, we propose SIMC 2.0, which complies with the underlying structure of SIMC, but significantly optimizes both the linear and non-linear layers of the model. Specifically, (1) we design a new coding method for parallel homomorphic computation between matrices and vectors. (2) We reduce the size of the garbled circuit (GC) (used to calculate non-linear activation functions, e.g., ReLU) in SIMC by about two thirds. Compared with SIMC, our experiments show that SIMC 2.0 achieves a significant speedup by up to 17.4×17.4× for linear layer computation, and at least 1.3×1.3× reduction of both the computation and communication overhead in the implementation of non-linear layers under different data dimensions. Meanwhile, SIMC 2.0 demonstrates an encouraging runtime boost by 2.3∼4.3×2.3∼4.3× over SIMC on different state-of-the-art ML models.