Robust AI: security and privacy issues in machine learning

Bibliographic Details
Main Author: Chattopadhyay, Nandish
Other Authors: Anupam Chattopadhyay
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University 2023
Subjects: Engineering::Computer science and engineering
Online Access:https://hdl.handle.net/10356/165248
id sg-ntu-dr.10356-165248
record_format dspace
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Engineering::Computer science and engineering
description Machine learning-based decision making can be adopted in practice as a driver of most applications only when there are strong guarantees on its reliability. The trust of stakeholders needs to be established before it can become more ubiquitous and acceptable. In general, reliability in machine learning can be construed as the sum of two parts, Robustness and Resilience. Since reliability is concerned with providing assurances against malfunctions or errors, it can also be classified by the types of those errors. This thesis deals with Robustness, that is, robustness against attacks, which generate intentional and malicious errors. Robustness must therefore be studied in conjunction with the mechanisms an adversary may adopt to disrupt the machine learning application. To investigate Robustness of ML algorithms against various forms of attack, we break each down into components and consider all possible combinations. Typically, Robustness is studied with respect to security and privacy. The ML algorithm itself comprises the trained model and the training data, so we consider security- and privacy-related problems pertaining to both components, the model and the data. The work touches upon many important problems in this regard, but it is not exhaustive. We group attacks that jeopardize the fundamental ML task itself under security issues, and attacks that leak secret and sensitive information about the system under privacy.

The primary security vulnerability of machine learning models is adversarial attacks. In the first part, we consider security for models and study adversarial attacks through the lens of dimensionality. We assert that the high-dimensional landscape in which neural network models optimize facilitates the generation of adversarial examples, and that dimensionality reduction enhances adversarial robustness. We explore the mathematical background for this proposition, studying the properties of data distributions in high-dimensional spaces and the nature of the trained manifolds, and then justify the idea empirically. We extend this notion of the influence of dimensionality on adversarial sample generation from images to videos and text, and provide practical and efficient defences that combine adversarial sample detection with dimensionality reduction. Reducing the dimensionality incurs an additional computational cost and can, in some cases, also have an adverse effect on the fundamental machine learning task itself. We therefore optimise the dimensionality reduction for each task and use case, carefully choosing the amount of variance to preserve so that adversarial noise is effectively eliminated while the information needed for classification, object detection, etc. is retained. Additionally, in one of the works, we run the classification task and adversarial sample detection on parallel channels and apply dimensionality reduction only to samples detected as adversarial, which significantly improves the efficiency of the overall system and also benefits accuracy.
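
As an illustration of the variance-preserving dimensionality reduction described above, here is a minimal sketch assuming a PCA-style projection applied before classification; the variance threshold and the function names (fit_pca, reduce_and_reconstruct) are illustrative assumptions, not the thesis's actual pipeline.

```python
# Illustrative sketch only: PCA-style dimensionality reduction as an
# adversarial pre-processing step. Names and the variance threshold are
# assumptions, not the thesis's actual pipeline.
import numpy as np

VARIANCE_TO_KEEP = 0.95  # fraction of variance to preserve; tuned per task


def fit_pca(train_images: np.ndarray):
    """Fit a PCA basis on flattened training images (n_samples x n_pixels)."""
    mean = train_images.mean(axis=0)
    centred = train_images - mean
    # SVD gives principal directions sorted by explained variance.
    _, s, vt = np.linalg.svd(centred, full_matrices=False)
    explained = (s ** 2) / np.sum(s ** 2)
    k = int(np.searchsorted(np.cumsum(explained), VARIANCE_TO_KEEP)) + 1
    return mean, vt[:k]  # keep only the top-k principal directions


def reduce_and_reconstruct(x: np.ndarray, mean: np.ndarray, components: np.ndarray):
    """Project inputs onto the retained subspace and map them back.

    Perturbations lying outside the retained low-dimensional subspace are
    discarded before the classifier sees the reconstructed input.
    """
    codes = (x - mean) @ components.T
    return codes @ components + mean
```

In the parallel-channel variant described above, such a reconstruction would be applied only to the inputs flagged by the adversarial-sample detector.
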
Thereafter, we study adversarial attacks from the perspective of vulnerable features within the data. We analyse spatially correlated patterns within adversarial images and, class-wise, split images into two key parts, the Region of Importance (RoI) and the Region of Attack (RoA), such that the RoI is the region to which the classifier is particularly sensitive during classification and the RoA is the region that the adversarial attack modifies. The goal of this exercise is to identify areas within the images that do not contribute to classification but are adversarially vulnerable. The proposed defence mechanism is to neutralize that region, thereby reducing adversarial vulnerability without compromising classification accuracy. The idea is demonstrated on benchmark datasets and models.

Moving on to privacy in the second part, we look at a different set of problems: first the privacy of the models, then that of the data. To preserve the privacy of trained neural network models, we direct our attention to protecting their ownership and IP rights using watermarking. We review the state of the art in watermarking schemes for neural networks and select the most appropriate ones to study; watermarking using backdooring is the scheme of choice here. We investigate the vulnerabilities of this scheme and break it using synthesis. In our proposition titled Re-Markable, we assume that the adversary has very limited compute power and access to samples from the data distribution relevant to the task. We train a Generative Adversarial Network (GAN) to synthesize additional samples and use them to re-train only the fully connected layers of the watermarked model. As demonstrated, this minimal computation is sufficient to eliminate the embedded watermarks from the model, a vulnerability that makes the existing scheme extremely unreliable. To solve the problem thus discovered, we develop ROWBACK, a robust watermarking scheme for neural networks using backdooring. It uses a redesigned mechanism for generating the Trigger Set (adversarial examples with explicit labelling), which serves as the private key for the watermark, and explicitly marks every layer of the neural network with the embedded watermark. The goal is to ensure that an adversary interested in extracting the network would need to re-train every layer of the model, which is as good as training a fresh model from scratch, as it would require extensive training samples and compute power. We also extend robust watermarking to natural language processing, particularly text classifiers: TextBack embeds watermarks in text classifiers using backdooring, with a marking scheme that uses the Trigger samples and clean samples together, unlike in images, a property observed only in sequential models such as recurrent neural networks and LSTM-based models.
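
To make the backdoor-based watermarking mechanism concrete, here is a minimal sketch of trigger-set ownership verification, assuming a suspect model exposed as a callable that returns predicted labels; the threshold and the function name verify_watermark are illustrative assumptions rather than the exact protocols of Re-Markable, ROWBACK or TextBack.

```python
# Illustrative sketch only: trigger-set ownership verification for
# backdoor-based neural network watermarking. The threshold and the callable
# `model` interface are assumptions made for this example.
import numpy as np

VERIFICATION_THRESHOLD = 0.9  # illustrative fraction of triggers that must match


def verify_watermark(model, trigger_inputs: np.ndarray,
                     trigger_labels: np.ndarray) -> bool:
    """Claim ownership if a suspect model reproduces the secret trigger labelling.

    The trigger set acts as the private key: its inputs carry labels that a
    model trained without the watermark would rarely predict by chance, so a
    high agreement rate is strong evidence of the embedded watermark.
    """
    predictions = model(trigger_inputs)                 # suspect model's predicted labels
    agreement = np.mean(predictions == trigger_labels)  # fraction of triggers recovered
    return bool(agreement >= VERIFICATION_THRESHOLD)
```

Under this view, the Re-Markable attack succeeds when re-training the fully connected layers pushes the trigger agreement below the verification threshold, while ROWBACK's per-layer marking is intended to keep the agreement high unless every layer is re-trained.
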
Finally, to cover the privacy of data, we focus on collaborative ML. The motivation in this work is to protect the privacy of the data used for training, as many practical applications necessitate the use of highly sensitive data. This data is decentralised, resides with different non-collocated entities, and cannot be shared on privacy grounds. The straightforward solution available in the literature is Federated Learning. However, an adversary may still tap into the federated learning infrastructure in multiple ways and extract information. For example, Membership Inference attacks are possible on models deployed in the cloud (as is natural in many federated learning setups): the model is queried repeatedly and, based on the output probabilities it returns, a separate model can be trained to determine whether the queried input belonged to the training set. Repeating this process can, in theory, lead to reverse-engineering of the entire training set, which is a serious privacy violation. We address this problem using Differential Privacy, where the participants of the federated learning infrastructure use a differentially private learning algorithm for their local training, involving gradient trimming and the addition of sampled noise. We study practical applications of such collaborative learning systems and deploy the framework on edge devices by creating lightweight versions of the models that do not compromise accuracy.

Similarly, in another use case, the participants of the federated learning setup themselves could have malicious intentions and mount sabotaging attacks on the learning framework. Considering such potential single points of failure of the overall system, we propose a robust federated learning infrastructure that assigns coefficients to the updates the clients send to the server, tolerating faults in up to 50% of the clients. Overall, the ideas discussed in this thesis are a major step towards making machine learning systems more robust, and therefore a necessary step in the direction of reliable AI.
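
As a concrete illustration of the differentially private local training described above (gradient trimming is commonly realised as per-sample gradient clipping, as in DP-SGD), here is a minimal sketch of one privatised client update; the clip norm, noise multiplier, learning rate and function name dp_local_step are illustrative assumptions.

```python
# Illustrative sketch only: one differentially private local update of the kind
# used by federated clients (gradient trimming/clipping plus sampled noise, in
# the style of DP-SGD). The constants and names are assumptions for this example.
import numpy as np

CLIP_NORM = 1.0         # bound on each example's gradient norm
NOISE_MULTIPLIER = 1.1  # Gaussian noise scale relative to the clip norm
LEARNING_RATE = 0.05


def dp_local_step(weights: np.ndarray, per_sample_grads: np.ndarray,
                  rng: np.random.Generator) -> np.ndarray:
    """Apply one privatised gradient step on a client's local batch.

    per_sample_grads has shape (batch_size, n_params): one gradient per example.
    """
    # 1. Clip each example's gradient to bound its individual influence.
    norms = np.linalg.norm(per_sample_grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, CLIP_NORM / np.maximum(norms, 1e-12))
    clipped = per_sample_grads * scale

    # 2. Sum the clipped gradients, add calibrated Gaussian noise, and average.
    noise = rng.normal(0.0, NOISE_MULTIPLIER * CLIP_NORM, size=weights.shape)
    noisy_mean = (clipped.sum(axis=0) + noise) / per_sample_grads.shape[0]

    # 3. Take the step; only this privatised update ever leaves the client.
    return weights - LEARNING_RATE * noisy_mean
```

Only the final, noised update leaves the client; tolerance to faulty or malicious clients is handled separately on the server side, by the coefficients assigned to client updates during aggregation.
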
author2 Anupam Chattopadhyay
format Thesis-Doctor of Philosophy
author Chattopadhyay, Nandish
title Robust AI: security and privacy issues in machine learning
publisher Nanyang Technological University
publishDate 2023
url https://hdl.handle.net/10356/165248
spelling sg-ntu-dr.10356-165248 2023-04-04T02:58:00Z Robust AI: security and privacy issues in machine learning Chattopadhyay, Nandish Anupam Chattopadhyay School of Computer Science and Engineering Hardware & Embedded Systems Lab (HESL) anupam@ntu.edu.sg Engineering::Computer science and engineering Doctor of Philosophy 2023-03-21T07:46:19Z 2023-03-21T07:46:19Z 2023 Thesis-Doctor of Philosophy Chattopadhyay, N. (2023). Robust AI: security and privacy issues in machine learning. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/165248 10.32657/10356/165248 en This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). application/pdf Nanyang Technological University