Removing bias for out-of-distribution generalization

Deep models have a strong ability to fit the training data and can therefore achieve high performance when the test data is sampled from the same distribution as the training data. In practice, however, the test data is usually Out-of-Distribution (OOD) with respect to the training data, and the models fail to perform well; this is known as the OOD Generalization problem. The underlying reason is that, during training, besides the causal effect, i.e., the causal relationships between inputs and outputs that describe the data generation process and do not change under any data distribution, the models also learn the bias, i.e., spurious correlations between inputs and outputs that exist only in the training distribution; learning such bias makes the models fail to generalize to OOD data. To achieve better OOD Generalization performance, we need to pursue the causal effect by removing the learned bias. However, because data organization formats and the given inputs vary across tasks, it is hard to propose one uniform bias-removal strategy. We therefore categorize OOD Generalization tasks into three camps and conduct a specific case study for each:

1) OOD Generalization with Multiple Modalities, where multiple modalities, such as language and images, are provided during training. We focus on a specific case, Visual Dialog, analyze the underlying causal relationships between the modalities, and propose two causal principles that remove the history bias and the user bias for better OOD performance.

2) OOD Generalization with Multiple Domains, where there is only one modality, images, but multiple training domains and their domain annotations are given. We focus on Domain Generalization (DG) and propose to create a new domain by cross-domain influence to remove the "spurious invariance" bias, helping current DG methods achieve better OOD performance.

3) OOD Generalization with No Additional Annotations, where only one modality, images, and a single training domain are given, without additional annotations such as domain or bias annotations. We focus on a specific case, Debiasing, and propose two algorithms for removing bias. First, we design a two-stage pipeline with re-weighting methods to effectively remove the underlying context bias. Second, because the context estimation used by current re-weighting methods hardly succeeds when the class effect and the context effect are entangled, we propose Invariant Risk Minimization for Context, which disentangles the context to enable better re-weighting for removing context bias, and thus better OOD Generalization for debiasing.
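For a concrete picture of the ingredients the third case study refers to, the sketch below combines inverse-frequency group re-weighting with the generic IRMv1 invariance penalty of Arjovsky et al. (2019). It is only an illustrative sketch under stated assumptions, not the thesis's "Invariant Risk Minimization for Context" algorithm: the model, the toy data, and the context annotation used as the environment label are hypothetical stand-ins.

```python
# Minimal sketch: inverse-frequency re-weighting + a generic IRMv1-style
# invariance penalty computed per context "environment". This is NOT the
# thesis's exact method; it only illustrates the two ingredients named in
# the abstract. The context labels below are a hypothetical annotation.
import torch
import torch.nn.functional as F


def irm_penalty(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """IRMv1 penalty: squared gradient of the risk w.r.t. a dummy scale."""
    scale = torch.ones(1, requires_grad=True, device=logits.device)
    loss = F.cross_entropy(logits * scale, labels)
    (grad,) = torch.autograd.grad(loss, scale, create_graph=True)
    return (grad ** 2).sum()


def reweighted_irm_loss(model, x, y, context, penalty_weight=1.0):
    """Cross-entropy re-weighted by inverse context frequency, plus an
    IRM penalty averaged over the contexts (treated as environments)."""
    logits = model(x)
    # Inverse-frequency weights over the context labels, normalized so the
    # weights sum to the batch size.
    counts = torch.bincount(context).float()
    weights = 1.0 / counts[context]
    weights = weights / weights.sum() * len(y)
    ce = (weights * F.cross_entropy(logits, y, reduction="none")).mean()

    penalty = torch.stack(
        [irm_penalty(logits[context == c], y[context == c])
         for c in context.unique()]
    ).mean()
    return ce + penalty_weight * penalty


if __name__ == "__main__":
    # Toy usage with random data and a linear model (all hypothetical).
    model = torch.nn.Linear(16, 3)
    x = torch.randn(64, 16)
    y = torch.randint(0, 3, (64,))
    context = torch.randint(0, 4, (64,))  # stand-in context annotation
    loss = reweighted_irm_loss(model, x, y, context)
    loss.backward()
    print(float(loss))
```

The IRM penalty pushes the classifier toward being simultaneously optimal in every context, while the inverse-frequency weights keep frequent contexts from dominating rare ones; the abstract's point is that such re-weighting only works once the context has been reliably estimated and disentangled from the class effect.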


Saved in:
Bibliographic Details
Main Author: Qi, Jiaxin
Other Authors: Zhang Hanwang
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University 2023
Subjects: Engineering::Computer science and engineering
Online Access: https://hdl.handle.net/10356/168654
Record metadata:
id: sg-ntu-dr.10356-168654
record last updated: 2023-07-04
school: School of Computer Science and Engineering
supervisor contact: hanwangzhang@ntu.edu.sg
subject: Engineering::Computer science and engineering
degree: Doctor of Philosophy
date deposited: 2023-06-14
date issued: 2023
citation: Qi, J. (2023). Removing bias for out-of-distribution generalization. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/168654
DOI: 10.32657/10356/168654
license: This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).
format: application/pdf
publisher: Nanyang Technological University
institution: Nanyang Technological University, NTU Library (Singapore)
collection: DR-NTU