Natural robustness of machine learning in the open world
Format: Thesis (Doctor of Philosophy)
Language: English
Published: Nanyang Technological University, 2023
Online Access: https://hdl.handle.net/10356/166625
Institution: Nanyang Technological University
Summary: Modern machine learning techniques have demonstrated excellent capabilities in many areas. Despite human-surpassing performance in experimental settings, many studies have revealed vulnerabilities of machine learning models caused by violations of fundamental assumptions in real-world applications. Such issues significantly hinder the applicability and reliability of machine learning. This motivates the need to preserve model performance under naturally induced data corruptions or alterations across the machine learning pipeline, which we term "natural robustness". To this end, this thesis starts by investigating two naturally occurring issues: label corruption and distribution shift. It then explores the value of out-of-distribution data for the robustness of machine learning.
Firstly, the observed labels of training examples are conventionally assumed to be ground truth. However, labels solicited from humans are often corrupted, leading to poor generalization performance. This motivates robustness against label corruption, where the goal is to train a robust classifier in the presence of noisy and erroneous labels. We first investigate how the diversity among multiple networks affects sample selection and overfitting to label noise. For the problem of learning with multiple noisy labels, we design an end-to-end learning framework that maximizes the likelihood of the union of the annotation information; the framework is not only theoretically consistent but also experimentally effective and efficient.
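To make the sample-selection idea concrete, below is a minimal sketch of the classic small-loss selection scheme with two networks (co-teaching, Han et al., 2018), which investigations of network diversity build on. The function and argument names are illustrative, and this is a standard baseline rather than the framework proposed in the thesis.

```python
import torch
import torch.nn.functional as F

def co_select_and_update(net_a, net_b, opt_a, opt_b, x, y, keep_ratio):
    """One small-loss selection step with two networks, in the spirit of
    co-teaching: each network picks the examples it finds easiest (small
    loss, hence likely clean) and its peer trains on that subset, which
    limits memorization of corrupted labels."""
    k = max(1, int(keep_ratio * x.size(0)))

    # Per-example losses; no gradients needed for the selection pass.
    with torch.no_grad():
        loss_a = F.cross_entropy(net_a(x), y, reduction="none")
        loss_b = F.cross_entropy(net_b(x), y, reduction="none")

    idx_for_b = torch.topk(loss_a, k, largest=False).indices  # A selects for B
    idx_for_a = torch.topk(loss_b, k, largest=False).indices  # B selects for A

    # Cross-update: diversity between the two networks keeps their
    # errors from reinforcing each other.
    opt_a.zero_grad()
    F.cross_entropy(net_a(x[idx_for_a]), y[idx_for_a]).backward()
    opt_a.step()

    opt_b.zero_grad()
    F.cross_entropy(net_b(x[idx_for_b]), y[idx_for_b]).backward()
    opt_b.step()
```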
Secondly, classic machine learning methods are built on the i.i.d. assumption that training and testing data are independent and identically distributed. However, neural networks deployed in the open world often struggle with out-of-distribution (OOD) inputs, producing abnormally high confidence on both in- and out-of-distribution inputs. To alleviate this issue, we first reveal why the cross-entropy loss encourages the model to be overconfident. We then design a simple fix to the cross-entropy loss that enhances many existing post-hoc methods for OOD detection. Trained with the proposed loss, the network tends to give conservative predictions, which yields strong separability of softmax confidence scores between in- and out-of-distribution inputs.
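The overconfidence mechanism can be illustrated numerically (the toy logits below are ours, not the thesis's): because cross-entropy keeps rewarding ever-larger logits on correctly classified examples, the max softmax score saturates near 1 regardless of the input, leaving post-hoc scores such as MSP little room to separate in- from out-of-distribution inputs. A minimal sketch:

```python
import torch
import torch.nn.functional as F

# Toy logits for an example whose predicted class (index 0) never changes.
logits = torch.tensor([2.0, 1.0, 0.5])
target = torch.tensor([0])

# Scaling the logits up strictly lowers the cross-entropy loss without
# changing the prediction, so unconstrained training keeps inflating the
# logit norm and pushes the max softmax score toward 1 on any input.
for scale in (1.0, 2.0, 5.0, 10.0):
    scaled = scale * logits
    loss = F.cross_entropy(scaled.unsqueeze(0), target)
    conf = F.softmax(scaled, dim=0).max()
    print(f"scale={scale:5.1f}  CE={loss.item():.4f}  max softmax={conf.item():.4f}")

# Standard post-hoc MSP detector (Hendrycks & Gimpel, 2017): flag inputs
# whose max softmax score falls below a threshold as out-of-distribution.
def msp_score(model, x):
    with torch.no_grad():
        return F.softmax(model(x), dim=-1).max(dim=-1).values
```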
Lastly, traditional machine learning algorithms exploit only in-distribution examples, which are typically expensive and challenging to collect. Exploring the value of out-of-distribution examples, which are available almost for free, is therefore of great theoretical and practical importance. We investigate how open-set noisy labels affect generalization and robustness against inherent noisy labels, analyze their effect theoretically from the perspective of SGD noise, and design algorithms that utilize out-of-distribution examples to improve label-noise robustness. In addition, we make the first attempt to utilize out-of-distribution data to rebalance the class priors of long-tailed datasets, and we study the effect of out-of-distribution data on the learned representations in long-tailed learning.
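As a rough illustration of how out-of-distribution data might rebalance class priors, the sketch below draws pseudo-labels for auxiliary OOD examples in inverse proportion to class frequency and mixes them into the training loss. The helper names and the `ood_weight` hyperparameter are hypothetical, and this sampling rule is one plausible instantiation rather than the algorithm proposed in the thesis.

```python
import torch
import torch.nn.functional as F

def complementary_label_dist(class_counts):
    """Label distribution for auxiliary OOD examples: tail classes receive
    proportionally more OOD pseudo-labels, counteracting the skewed prior
    of the long-tailed training set."""
    counts = torch.as_tensor(class_counts, dtype=torch.float)
    inv = 1.0 / counts
    return inv / inv.sum()

def mixed_batch_loss(model, x_in, y_in, x_ood, label_dist, ood_weight=0.5):
    """Cross-entropy on in-distribution data plus a weighted term on OOD
    data whose pseudo-labels are drawn from the complementary distribution."""
    loss_in = F.cross_entropy(model(x_in), y_in)
    y_ood = torch.multinomial(label_dist, x_ood.size(0), replacement=True)
    loss_ood = F.cross_entropy(model(x_ood), y_ood)
    return loss_in + ood_weight * loss_ood
```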
We evaluate the effectiveness and robustness of all the introduced methods on multiple simulated and real-world benchmarks. The reported results indicate that our methods outperform many state-of-the-art approaches at alleviating the corresponding issues. We hope our efforts provide insights that inspire specially designed methods for these robustness issues and expedite the exploration of out-of-distribution examples for designing effective and robust systems.