MPC-enabled privacy-preserving machine learning

Bibliographic Details
Main Author: Liu, Ziyao
Other Authors: Lam Kwok Yan
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University 2023
Online Access: https://hdl.handle.net/10356/167272
Institution: Nanyang Technological University
Description
Summary: Privacy-preserving machine learning (PPML) has received much attention across the machine learning community, from academic researchers and industry practitioners to government regulators. PPML systems are typically built on one of two types of techniques: (i) pure cryptographic constructions, e.g., secure multi-party computation (MPC), and (ii) federated learning. The former provides strong security guarantees but incurs large overheads. The latter allows participants to train their ML models individually and then aggregates them into a global model, but the aggregation step may leak private information. This thesis proposes three PPML methods that aim to address these issues, covering both a pure MPC-based construction and MPC-based aggregation protocols for privacy-enhanced federated learning. The major contributions are as follows. Firstly, a set of customized MPC protocols is proposed to improve the efficiency of secure neural network training in the active (malicious) setting. Theoretical analysis and experimental results show their efficiency compared to protocols in the semi-honest setting: the proposed protocols offer active security with affordable overheads of around 2 times and 2.7 times in LAN and WAN time, respectively. Secondly, to improve both the dropout resilience and the communication efficiency of federated learning systems, a privacy-preserving aggregation protocol is proposed that outperforms state-of-the-art schemes by up to 6.37 times in runtime and provides stronger dropout resilience. Thirdly, to address the privacy risks caused by multi-round training in federated learning, a scheme is proposed that combines a strategy with a set of security protocols. This scheme yields the first long-term privacy-preserving aggregation protocol that provides both single-round and multi-round privacy guarantees.
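Illustration: the summary refers to MPC-based aggregation, in which a global sum is computed without revealing any participant's individual update. The sketch below shows only the generic idea, using textbook additive secret sharing over a ring. It is not the thesis's protocol: it assumes semi-honest parties, no dropouts, and updates already quantized to integers, and the names `MODULUS`, `share`, and `secure_sum` are hypothetical.

```python
import secrets

MODULUS = 2**32  # assumed ring size, chosen for illustration only


def share(value, n_parties, modulus=MODULUS):
    """Split an integer into n additive shares that sum to value mod modulus."""
    shares = [secrets.randbelow(modulus) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % modulus
    return shares + [last]


def secure_sum(private_values, modulus=MODULUS):
    """Simulate all parties in one process and return the sum of their inputs."""
    n = len(private_values)
    # Party i splits its value into n shares; share j is sent to party j.
    all_shares = [share(v, n, modulus) for v in private_values]
    # Party j adds up the shares it received, one from every party.
    partial_sums = [
        sum(all_shares[i][j] for i in range(n)) % modulus for j in range(n)
    ]
    # Only the partial sums are published; together they reveal just the total.
    return sum(partial_sums) % modulus


updates = [12, 7, 30]  # hypothetical quantized model updates
total = secure_sum(updates)
```

Each individual share is uniformly random, so a party that sees fewer than all shares of a value learns nothing about it. A practical protocol would additionally need to handle dropouts and malicious behavior, which are the concerns the thesis addresses.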