Evaluating & enhancing deep learning systems via out-of-distribution detection
Saved in:

| Field | Value |
|---|---|
| Main Author | |
| Other Authors | |
| Format | Thesis (Doctor of Philosophy) |
| Language | English |
| Published | Nanyang Technological University, 2022 |
| Subjects | |
| Online Access | https://hdl.handle.net/10356/162032 |
| Institution | Nanyang Technological University |
Summary:

Deep Learning (DL) is being adopted in many industrial applications at a rapidly increasing pace. This includes safety- and security-critical applications, where errors in the DL system can lead to massive or even fatal losses.
With the rise of DL adoption, trustworthy AI initiatives have been introduced that cover quality assurance principles such as robustness, fairness, and security of a DL system. Robustness concerns the DL system's ability to correctly predict new, unseen inputs that are relevant to the DL application. Fairness aims to enable equal performance across demographics such as gender, age, or ethnicity. Security addresses new DL threats and defenses. Combining all three principles enables a more trustworthy DL system, which benefits both developers and end users: developers gain confidence in deployment, and end users gain trust in adopting the system.
So far, the principles of robustness, fairness, and security have been outlined, from an expectations point of view, in trustworthy-AI initiatives such as ISO standards and regulatory acts of the European Commission (e.g., the AI Act). However, these initiatives give little attention to methods and quantitative assessment strategies for enabling and validating those principles. Without assessment strategies, forming benchmarks to determine the quality of a DL system becomes a challenging task. This motivates research questions such as: "At what point can a DL system be considered sufficient for deployment?" and "Is the data used for testing relevant and complete for the target application?" In this thesis, we aim to address such questions and present foundational improvements by introducing out-of-distribution (OOD) awareness to robustness, fairness, and security.
Finally, the research methodologies are integrated into a national standard of Singapore, which represents a high-impact outcome of this work.
First, we conduct a large-scale empirical study to analyse which OOD detection technique is best suited for real-world DL system testing. Then, we propose an OOD testing criterion to enhance error discovery, and we propose distribution-aware robustness enhancement, showing that filtering out errors far from the trained distribution increases robustness by 21.5%. We then advance the field of OOD detection by introducing a fine-grained OOD technique that assesses the ground truth of data augmentation, increasing accuracy by 31.9% by guiding augmentation selection with distribution awareness. We further apply the same augmentation technique to subsets of the training data to balance accuracy across demographics and thereby enhance fairness. Finally, we introduce OOD-based defense approaches against two novel DL threats that fully evade traditional defenses.
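To make the distribution-aware filtering idea concrete, the following is a minimal sketch, not the thesis's actual implementation: it scores error-inducing inputs with a Mahalanobis distance over model features and keeps only those close to the trained distribution. The feature extraction step, the single-Gaussian fit, and the threshold value are all illustrative assumptions.

```python
import numpy as np

def fit_gaussian(train_features: np.ndarray):
    """Fit one Gaussian to training-set features (class-agnostic simplification)."""
    mu = train_features.mean(axis=0)
    cov_inv = np.linalg.pinv(np.cov(train_features, rowvar=False))
    return mu, cov_inv

def mahalanobis(x: np.ndarray, mu: np.ndarray, cov_inv: np.ndarray) -> float:
    """Mahalanobis distance of one feature vector from the training distribution."""
    d = x - mu
    return float(np.sqrt(d @ cov_inv @ d))

def filter_near_distribution_errors(error_features, mu, cov_inv, threshold=3.0):
    """Keep only error-inducing inputs whose features lie near the trained
    distribution; far-away (OOD) errors are discarded before retraining."""
    return [f for f in error_features if mahalanobis(f, mu, cov_inv) <= threshold]

# Hypothetical usage: extract_features is a placeholder for reading a model's
# penultimate-layer activations for a batch of inputs.
# train_feats = extract_features(model, train_inputs)
# error_feats = extract_features(model, misclassified_inputs)
# mu, cov_inv = fit_gaussian(train_feats)
# keep = filter_near_distribution_errors(error_feats, mu, cov_inv, threshold=3.0)
```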
Ultimately, the thesis concludes by increasing the impact of quality assurance research: we present the methodology that led to the launch of a national standard. The standard integrates our security research and evaluates it across four real-world use cases. This represents an important step in creating and validating trustworthy DL systems, increasing trust in the intelligent solution and thereby further accelerating adoption in a safeguarded manner.