Privacy-preserving data sharing for the future economy
Preserving data privacy is a growing concern because advances in artificial intelligence rely heavily on data. Differential privacy offers a precisely defined, mathematically rigorous notion of privacy that gives robust and meaningful guarantees. One promising implementation of the concept...
Main Author: | Sea, Xin |
---|---|
Other Authors: | Wang Huaxiong |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2023 |
Subjects: | Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence |
Online Access: | https://hdl.handle.net/10356/166478 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-166478 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-166478 2023-05-08T15:38:48Z Privacy-preserving data sharing for the future economy Sea, Xin Wang Huaxiong School of Physical and Mathematical Sciences Tan Hong Meng Benjamin HXWang@ntu.edu.sg, benjamin_tan@i2r.a-star.edu.sg Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence Bachelor of Science in Mathematical Sciences and Economics 2023-05-02T05:12:22Z 2023-05-02T05:12:22Z 2023 Final Year Project (FYP) Sea, X. (2023). Privacy-preserving data sharing for the future economy. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/166478 https://hdl.handle.net/10356/166478 en application/pdf Nanyang Technological University |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence |
spellingShingle |
Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence Sea, Xin Privacy-preserving data sharing for the future economy |
description |
Preserving data privacy is a growing concern because advances in artificial intelligence rely heavily on data. Differential privacy offers a precisely defined, mathematically rigorous notion of privacy that gives robust and meaningful guarantees. One promising implementation is Differentially Private Stochastic Gradient Descent (DP-SGD), which ensures privacy by clipping gradients and adding noise to them.
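The clip-then-add-noise idea behind DP-SGD can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation used in the project: the function names `clip_gradient` and `dp_sgd_step`, the parameters `clip_norm` and `noise_multiplier`, and the plain-list gradient representation are all hypothetical, and no privacy accounting is performed.

```python
import math
import random

def clip_gradient(grad, clip_norm):
    """Scale one example's gradient so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]

def dp_sgd_step(params, per_example_grads, lr, clip_norm, noise_multiplier, rng):
    """One update: clip each per-example gradient, sum, add Gaussian noise, average."""
    clipped = [clip_gradient(g, clip_norm) for g in per_example_grads]
    dim = len(params)
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    # Assumption: the noise standard deviation is noise_multiplier * clip_norm.
    sigma = noise_multiplier * clip_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    batch_size = len(per_example_grads)
    return [p - lr * n / batch_size for p, n in zip(params, noisy)]

if __name__ == "__main__":
    rng = random.Random(0)
    grads = [[3.0, 4.0], [0.3, 0.4]]  # hypothetical per-example gradients
    print(dp_sgd_step([0.0, 0.0], grads, lr=0.1, clip_norm=1.0,
                      noise_multiplier=1.1, rng=rng))
```

Clipping bounds how much any single example can move the parameters, which is what lets a fixed amount of noise mask that example's contribution.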
However, training machine learning models or deep neural networks on sensitive data still poses significant privacy risks. To address this issue, researchers have improved privacy budgets by incorporating generative models such as generative adversarial networks (GANs). One differentially private generative framework is G-PATE, which builds on the Private Aggregation of Teacher Ensembles (PATE) framework. G-PATE ensures differential privacy by training the generator without direct access to the sensitive data.
Another approach to ensuring privacy is to train a model without using the sensitive data at all. Knowledge Extraction with Generative Network (KEGNET) combines a generative adversarial network with knowledge distillation, and it was able to generate useful data points and achieve high classification accuracy.
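Knowledge distillation, mentioned above, trains a student to match a teacher's softened output distribution. The sketch below shows one common form of the distillation loss, a KL divergence between temperature-scaled softmax outputs. It is an assumption-laden illustration: the names `softmax` and `distillation_loss` and the default temperature are hypothetical, and KEGNET's actual losses are not reproduced here.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions.

    Assumes moderate logits so no student probability underflows to zero.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

if __name__ == "__main__":
    teacher = [2.0, 1.0, 0.0]   # hypothetical teacher logits for one input
    student = [0.5, 1.5, 0.0]   # hypothetical student logits for the same input
    print(distillation_loss(teacher, student))
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge, so minimising it on generated inputs transfers the teacher's behaviour without touching the original data.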
This paper proposes a new model that combines the privacy-preserving generative model of G-PATE with Knowledge Extraction with Generative Network (KEGNET). The new model aims to provide a differentially private generative model through knowledge extraction with no observable data, while retaining high data utility. The results were unexpected: the model produced low-quality images, which is attributed to an insufficient number of pre-trained teacher classifiers and to the hyperparameters set in the mechanism.
This paper recommends several directions for future work, such as increasing the number of pre-trained teacher classifiers and lowering the noise level to match the number of teacher classifiers used. It is also recommended to train on more image datasets with varying characteristics and to compare the results with the G-PATE framework. |
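The link between the number of teachers and the noise level can be illustrated with a PATE-style noisy vote. This is a simplified sketch, assuming Laplace noise on label counts and a fixed 60/40 split of teacher votes; G-PATE itself aggregates teacher gradients rather than labels, and the names `laplace`, `noisy_vote`, and `agreement_rate` are hypothetical.

```python
import random
from collections import Counter

def laplace(rng, scale):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    if scale <= 0:
        return 0.0
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_vote(teacher_labels, num_classes, noise_scale, rng):
    """Return the class with the highest noise-perturbed vote count."""
    counts = Counter(teacher_labels)
    noisy = [counts.get(c, 0) + laplace(rng, noise_scale) for c in range(num_classes)]
    return max(range(num_classes), key=lambda c: noisy[c])

def agreement_rate(num_teachers, noise_scale, trials=2000, seed=0):
    """Fraction of trials in which the noisy vote matches the true majority.

    Assumption: three fifths of the teachers vote for class 0, the rest for class 1.
    """
    rng = random.Random(seed)
    majority = (3 * num_teachers) // 5
    votes = [0] * majority + [1] * (num_teachers - majority)
    hits = sum(noisy_vote(votes, 2, noise_scale, rng) == 0 for _ in range(trials))
    return hits / trials

if __name__ == "__main__":
    for n in (10, 100):
        print(n, "teachers:", agreement_rate(n, noise_scale=5.0))
```

With the noise scale held fixed, a small ensemble has a narrow vote margin that the noise can easily overturn, while a larger ensemble's margin survives it. That is the intuition behind adding teachers or reducing the noise for a small ensemble.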
author2 |
Wang Huaxiong |
author_facet |
Wang Huaxiong Sea, Xin |
format |
Final Year Project |
author |
Sea, Xin |
author_sort |
Sea, Xin |
title |
Privacy-preserving data sharing for the future economy |
title_short |
Privacy-preserving data sharing for the future economy |
title_full |
Privacy-preserving data sharing for the future economy |
title_fullStr |
Privacy-preserving data sharing for the future economy |
title_full_unstemmed |
Privacy-preserving data sharing for the future economy |
title_sort |
privacy-preserving data sharing for the future economy |
publisher |
Nanyang Technological University |
publishDate |
2023 |
url |
https://hdl.handle.net/10356/166478 |
_version_ |
1770564638381965312 |