Evaluating & enhancing deep learning systems via out-of-distribution detection

Deep Learning (DL) is being adopted in industrial applications at a rapidly increasing pace. This includes safety- and security-critical applications, where errors in the DL system can lead to massive or even fatal losses. With the rise of DL adoption, trustworthy AI initiatives have been introduced that cover quality assurance principles such as robustness, fairness, and security of a DL system. Robustness concerns the DL system's ability to correctly predict new, unseen inputs that are relevant to the DL application. Fairness aims to enable equal performance across demographics such as gender, age, or ethnicity. Security addresses new DL threats and defenses. Combining all three principles yields a more trustworthy DL system, which benefits both developers and end users: developers gain confidence in deployment, and end users gain trust in adopting the DL system. So far, the principles of robustness, fairness, and security have been outlined in trustworthy AI initiatives from an expectations point of view, for example under ISO standards or regulatory acts of the European Commission such as the AI Act. However, these initiatives give little attention to methods and quantitative assessment strategies for enabling and validating those principles. Without assessment strategies, forming benchmarks to determine the quality of a DL system becomes a challenging task. This motivates research questions such as: "At what point can a DL system be considered sufficient for deployment?" and "Is the data used for testing relevant and complete for the target application?" In this thesis, we address such questions and present foundational improvements by introducing out-of-distribution (OOD) awareness to robustness, fairness, and security. The research methodologies are ultimately integrated into a national standard of Singapore, a high-impact outcome of this work. First, we conduct a large-scale empirical study to analyse which OOD technique is most suited for real-world DL system testing. Then, we propose OOD testing criteria to enhance error discovery, and we propose a distribution-aware robustness enhancement, showing that filtering out errors that lie far from the trained distribution increases robustness by 21.5%. We then advance the field of OOD detection by introducing a fine-grained OOD technique that assesses the ground truth of data augmentation, increasing accuracy by 31.9% by guiding data augmentation selection with distribution awareness. We further apply the same augmentation technique to subsets of the training data to balance accuracy across demographics and thereby enhance fairness. Finally, we introduce OOD defense approaches against two novel DL threats that fully evade traditional defenses. The thesis concludes by broadening the impact of this quality assurance research, presenting the methodology that led to the launch of a national standard. The standard integrates our security research and evaluates it across four real-world use cases. This represents an important step in creating and validating trustworthy DL systems, increasing trust in the intelligent solution and thereby further accelerating adoption in a safeguarded manner.


Bibliographic Details
Main Author: Christopher, Berend David
Other Authors: Liu Yang
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University 2022
Subjects: Engineering::Computer science and engineering
Online Access:https://hdl.handle.net/10356/162032
School: School of Computer Science and Engineering
Contact: yangliu@ntu.edu.sg
Degree: Doctor of Philosophy
Deposited: 2022-10-10
Citation: Christopher, B. D. (2022). Evaluating & enhancing deep learning systems via out-of-distribution detection. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/162032
DOI: 10.32657/10356/162032
Funding: National Research Foundation (Award No. NRF2018NCR-NCR005-0001, AISG2-RP-2020-019, NRF-NRFI06-2020-0001, NRF2018NCR-NSOE003-0001); European Union's Horizon 2020 research and innovation programme (Award No. 830927)
License: This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).
File Format: application/pdf
Publisher: Nanyang Technological University