Privacy-preserving data sharing for the future economy

Preserving data privacy is a rising concern because advances in artificial intelligence rely heavily on data. Differential privacy offers a precisely defined, mathematically rigorous notion of privacy that ensures robust and meaningful privacy preservation. One promising implementation of the concept...

Bibliographic Details
Main Author: Sea, Xin
Other Authors: Wang Huaxiong
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2023
Subjects:
Online Access: https://hdl.handle.net/10356/166478
Institution: Nanyang Technological University
id sg-ntu-dr.10356-166478
record_format dspace
spelling sg-ntu-dr.10356-166478 2023-05-08T15:38:48Z Privacy-preserving data sharing for the future economy Sea, Xin Wang Huaxiong School of Physical and Mathematical Sciences Tan Hong Meng Benjamin HXWang@ntu.edu.sg, benjamin_tan@i2r.a-star.edu.sg Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence Bachelor of Science in Mathematical Sciences and Economics 2023-05-02T05:12:22Z 2023-05-02T05:12:22Z 2023 Final Year Project (FYP) Sea, X. (2023). Privacy-preserving data sharing for the future economy. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/166478 https://hdl.handle.net/10356/166478 en application/pdf Nanyang Technological University
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
description Preserving data privacy is a rising concern because advances in artificial intelligence rely heavily on data. Differential privacy offers a precisely defined, mathematically rigorous notion of privacy that ensures robust and meaningful privacy preservation. One promising implementation of the concept is Differentially Private Stochastic Gradient Descent (DP-SGD), which ensures privacy by clipping gradients and adding noise to them. However, training machine learning models or deep neural networks on sensitive data still poses significant privacy risks. To address this issue, researchers have improved privacy budgets by embedding generative models such as generative adversarial networks (GANs) into their frameworks. One differentially private generative framework is G-PATE, which is based on the Private Aggregation of Teacher Ensembles (PATE) framework. G-PATE ensures differential privacy by training the generator without direct access to sensitive information. Another approach is to train a framework without using sensitive data at all. Knowledge Extraction with Generative Network (KEGNET) utilises a generative adversarial network together with knowledge distillation, and is able to generate good data points and achieve high classification accuracy. This paper proposes a new model that combines the privacy-preserving generative model based on G-PATE with KEGNET. The new model aims to provide a differentially private generative model through knowledge extraction without any observable data, while keeping data utility high. The experiments produced unexpectedly low-quality images, which the paper attributes to an insufficient number of pre-trained teacher classifiers and to the hyperparameters set in the mechanism.
This paper recommends several directions for future work, such as increasing the number of pre-trained teacher classifiers and lowering the noise level to match the number of pre-trained teacher classifiers used. It also recommends training on more image datasets with different degrees of variability and comparing the results with the G-PATE framework.
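The link between the number of teacher classifiers and the noise level can be illustrated with a minimal sketch of noisy vote aggregation in the style of the original PATE framework. This is not the thesis's code and not G-PATE's gradient aggregation. The function names, the Laplace noise, the 10-class setup and the assumption that each teacher votes for the correct class with probability 0.7 are all hypothetical choices made for illustration.

```python
import random
from collections import Counter


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_vote_aggregate(teacher_votes, num_classes, noise_scale, rng):
    """Return the class whose noise-perturbed vote count is largest."""
    counts = Counter(teacher_votes)
    noisy = [counts.get(c, 0) + laplace_noise(noise_scale, rng)
             for c in range(num_classes)]
    return max(range(num_classes), key=noisy.__getitem__)


def agreement_rate(num_teachers, noise_scale, trials=2000, seed=0,
                   num_classes=10, p_correct=0.7):
    """Fraction of trials in which the noisy winner is the true class 0.

    Hypothetical setup: every teacher votes for class 0 with probability
    `p_correct`, otherwise for a uniformly random wrong class.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        votes = [0 if rng.random() < p_correct
                 else rng.randrange(1, num_classes)
                 for _ in range(num_teachers)]
        if noisy_vote_aggregate(votes, num_classes, noise_scale, rng) == 0:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    # Same noise scale, different ensemble sizes.
    for n in (10, 50, 200):
        print(n, agreement_rate(n, noise_scale=20.0))
```

Under these assumptions, a small ensemble is swamped by noise of a fixed scale while a large ensemble keeps a clear majority. This is consistent with the recommendation to increase the number of teachers, or to lower the noise when few teachers are available.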
author2 Wang Huaxiong
author_facet Wang Huaxiong
Sea, Xin
format Final Year Project
author Sea, Xin
author_sort Sea, Xin
title Privacy-preserving data sharing for the future economy
publisher Nanyang Technological University
publishDate 2023
url https://hdl.handle.net/10356/166478
_version_ 1770564638381965312