Evaluating & enhancing deep learning systems via out-of-distribution detection

Deep Learning (DL) is being adopted in industrial applications at a rapidly increasing pace. This includes safety- and security-critical applications, where errors in the DL system can lead to massive or even fatal losses. With the rise of DL adoption, trustworthy AI initiatives have been introduced that cover quality assurance principles such as robustness, fairness, and security of a DL system. Robustness concerns the DL system's ability to correctly predict new, unseen inputs that are relevant to the DL application. Fairness aims to enable equal performance across demographics such as gender, age, or ethnicity. Security addresses new DL threats and defenses. Combining all three principles yields a more trustworthy DL system, which benefits both developers and end users: developers gain confidence in deployment, and end users gain trust in adopting the DL system. So far, the principles of robustness, fairness, and security have been outlined in trustworthy AI initiatives from an expectations point of view, for example under ISO standards or regulatory acts of the European Commission such as the AI Act. However, these initiatives give little attention to methods and quantitative assessment strategies for enabling and validating those principles. Without assessment strategies, forming benchmarks to determine the quality of a DL system becomes a challenging task. This motivates research questions such as: "At what point can a DL system be considered sufficient for deployment?" and "Is the data used for testing relevant and complete for the target application?" In this thesis, we address such questions and present foundational improvements by introducing out-of-distribution (OOD) awareness to robustness, fairness, and security. The research methodologies are ultimately integrated into a national standard of Singapore, a high-impact outcome of this work. First, we conduct a large-scale empirical study to analyse which OOD technique is most suited for real-world DL system testing. Then, we propose OOD testing criteria to enhance error discovery, and we propose a distribution-aware robustness enhancement, showing that filtering out errors that lie far from the trained distribution increases robustness by 21.5%. We then advance the field of OOD detection by introducing a fine-grained OOD technique that assesses the ground truth of data augmentation, increasing accuracy by 31.9% by guiding data augmentation selection with distribution awareness. We further apply the same augmentation technique to subsets of the training data to balance accuracy across demographics and thereby enhance fairness. Finally, we introduce OOD defense approaches against two novel DL threats that fully evade traditional defenses. The thesis concludes by broadening the impact of this quality assurance research, presenting the methodology that led to the launch of a national standard. The standard integrates our security research and evaluates it across four real-world use cases. This represents an important step in creating and validating trustworthy DL systems, increasing trust in the intelligent solution and thereby further accelerating adoption in a safeguarded manner.


Bibliographic Details
Main Author: Christopher, Berend David
Other Authors: Liu Yang
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University 2022
Subjects: Engineering::Computer science and engineering
Online Access:https://hdl.handle.net/10356/162032
School: School of Computer Science and Engineering
Contact: yangliu@ntu.edu.sg
Degree: Doctor of Philosophy
Deposited: 2022-10-10
Citation: Christopher, B. D. (2022). Evaluating & enhancing deep learning systems via out-of-distribution detection. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/162032
DOI: 10.32657/10356/162032
Funding: National Research Foundation (Award No. NRF2018NCR-NCR005-0001, AISG2-RP-2020-019, NRF-NRFI06-2020-0001, NRF2018NCR-NSOE003-0001); European Union's Horizon 2020 research and innovation programme (Award No. 830927)
License: This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).
File Format: application/pdf
Publisher: Nanyang Technological University