Privacy-preserving data sharing for the future economy

Bibliographic Details
Main Author: Sea, Xin
Other Authors: Wang Huaxiong
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2023
Online Access: https://hdl.handle.net/10356/166478
Institution: Nanyang Technological University
Description
Summary: Preserving data privacy is a rising concern because advances in artificial intelligence rely heavily on data. Differential privacy offers a precisely defined and mathematically rigorous notion of privacy that gives robust and meaningful guarantees. One promising implementation is Differentially Private Stochastic Gradient Descent (DP-SGD), which ensures privacy by clipping gradients and adding noise. However, training machine learning models or deep neural networks on sensitive data poses significant privacy risks. To address this, researchers have improved privacy budgets by embedding generative models such as generative adversarial networks (GANs) into their frameworks. One differentially private generative framework is G-PATE, which is based on the Private Aggregation of Teacher Ensembles (PATE) framework. G-PATE ensures differential privacy by training the generator without direct access to sensitive information. Another approach is to train a framework without using sensitive data at all. Knowledge Extraction with Generative Network (KEGNET) combines a generative adversarial network with knowledge distillation and was able to generate good data points and achieve high classification accuracy. This paper proposes a new model that combines the privacy-preserving generative model of G-PATE with KEGNET. The new model aims to provide a differentially private generative model through knowledge extraction, without any observable data, while retaining high data utility. The results were unexpected: the model produced low-quality images, which the paper attributes to an insufficient number of pre-trained teacher classifiers and to the hyperparameters set in the mechanism.
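The clip-then-add-noise step that the summary attributes to DP-SGD can be sketched roughly as follows. This is a minimal illustration in plain Python, not the project's code: the function names, the `clip_norm` and `noise_multiplier` parameters, and the toy gradients are all assumptions made for the example.

```python
import math
import random


def clip_gradient(grad, clip_norm):
    """Scale one per-example gradient so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dp_sgd_gradient(per_example_grads, clip_norm, noise_multiplier, rng):
    """Return a noisy average gradient for one batch (hypothetical helper).

    Each example's gradient is clipped, the clipped gradients are summed,
    Gaussian noise with standard deviation noise_multiplier * clip_norm is
    added to every coordinate, and the result is divided by the batch size.
    """
    clipped = [clip_gradient(g, clip_norm) for g in per_example_grads]
    dim = len(clipped[0])
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    sigma = noise_multiplier * clip_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [v / len(per_example_grads) for v in noisy]


# Toy batch of two per-example gradients (made-up numbers).
rng = random.Random(0)
batch_grads = [[3.0, 4.0], [0.0, 0.5]]
update = dp_sgd_gradient(batch_grads, clip_norm=1.0, noise_multiplier=1.1, rng=rng)
```

Clipping bounds how much any single example can move the update, which is what lets a fixed noise scale translate into a privacy guarantee. The privacy accounting itself is not shown here.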
The paper recommends several directions for future work, such as increasing the number of pre-trained teacher classifiers and lowering the noise level in line with the number of teacher classifiers set. It also recommends training on more image datasets with differing variability and comparing the results with the G-PATE framework.
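To illustrate why the number of teachers and the noise level interact, here is a minimal sketch of noisy vote aggregation in the style of the original PATE framework. It is the label-voting form of PATE, not the G-PATE aggregator or the project's mechanism. The helper names, the Laplace noise choice, and the vote counts are assumptions made for the example.

```python
import random
from collections import Counter


def laplace(rng, scale):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_vote(teacher_labels, num_classes, noise_scale, rng):
    """Return the class with the highest noisy vote count (hypothetical helper).

    teacher_labels holds one predicted class per teacher. Independent
    Laplace noise is added to each class count before taking the argmax.
    """
    counts = Counter(teacher_labels)
    noisy = []
    for c in range(num_classes):
        noise = laplace(rng, noise_scale) if noise_scale > 0 else 0.0
        noisy.append(counts.get(c, 0) + noise)
    return max(range(num_classes), key=lambda c: noisy[c])


rng = random.Random(0)
# With few teachers the vote margin is small, so the same noise scale can
# easily flip the outcome; with many teachers the margin dominates the noise.
few_teachers = [1, 1, 1, 0, 0]
many_teachers = [1] * 900 + [0] * 100
label_few = noisy_vote(few_teachers, num_classes=2, noise_scale=5.0, rng=rng)
label_many = noisy_vote(many_teachers, num_classes=2, noise_scale=5.0, rng=rng)
```

In this sketch the small ensemble's margin of one vote is well within the noise, while the large ensemble's margin of 800 votes is not. That is the intuition behind matching the noise level to the number of teachers.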