Removing bias for out-of-distribution generalization

Deep models have a strong ability to fit the training data and can therefore achieve high performance when the test data is sampled from the same distribution as the training data. In practice, however, the test data is usually Out-of-Distribution (OOD) with respect to the training data, and the models fail to perform well; this is known as the OOD Generalization problem. The underlying reason is that, during training, besides the causal effect, i.e., the causal relationships between inputs and outputs that describe the data generation process and do not change under any data distribution, the models also learn the bias, i.e., spurious correlations between inputs and outputs that exist only in the training distribution; learning such bias makes the models fail to generalize to OOD data. To achieve better OOD Generalization performance, we need to pursue the causal effect by removing the learned bias. However, because data organization formats and the given inputs vary across tasks, it is hard to propose one uniform bias-removal strategy. We therefore categorize OOD Generalization tasks into three camps and conduct a specific case study for each:

1) OOD Generalization with Multiple Modalities, where multiple modalities, such as language and images, are provided during training. We focus on a specific case, Visual Dialog, analyze the underlying causal relationships between the modalities, and propose two causal principles that remove the history bias and the user bias for better OOD performance.

2) OOD Generalization with Multiple Domains, where there is only one modality, images, but multiple training domains and their domain annotations are given. We focus on Domain Generalization (DG) and propose to create a new domain by cross-domain influence to remove the "spurious invariance" bias, helping current DG methods achieve better OOD performance.

3) OOD Generalization with No Additional Annotations, where only one modality, images, and a single training domain are given, without additional annotations such as domain or bias annotations. We focus on a specific case, Debiasing, and propose two algorithms for removing bias. First, we design a two-stage pipeline with re-weighting methods to effectively remove the underlying context bias. Second, because the context estimation used by current re-weighting methods hardly succeeds when the class effect and the context effect are entangled, we propose Invariant Risk Minimization for Context, which disentangles the context to enable better re-weighting for removing context bias, and thus better OOD Generalization for debiasing.
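For a concrete picture of the ingredients the third case study refers to, the sketch below combines inverse-frequency group re-weighting with the generic IRMv1 invariance penalty of Arjovsky et al. (2019). It is only an illustrative sketch under stated assumptions, not the thesis's "Invariant Risk Minimization for Context" algorithm: the model, the toy data, and the context annotation used as the environment label are hypothetical stand-ins.

```python
# Minimal sketch: inverse-frequency re-weighting + a generic IRMv1-style
# invariance penalty computed per context "environment". This is NOT the
# thesis's exact method; it only illustrates the two ingredients named in
# the abstract. The context labels below are a hypothetical annotation.
import torch
import torch.nn.functional as F


def irm_penalty(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """IRMv1 penalty: squared gradient of the risk w.r.t. a dummy scale."""
    scale = torch.ones(1, requires_grad=True, device=logits.device)
    loss = F.cross_entropy(logits * scale, labels)
    (grad,) = torch.autograd.grad(loss, scale, create_graph=True)
    return (grad ** 2).sum()


def reweighted_irm_loss(model, x, y, context, penalty_weight=1.0):
    """Cross-entropy re-weighted by inverse context frequency, plus an
    IRM penalty averaged over the contexts (treated as environments)."""
    logits = model(x)
    # Inverse-frequency weights over the context labels, normalized so the
    # weights sum to the batch size.
    counts = torch.bincount(context).float()
    weights = 1.0 / counts[context]
    weights = weights / weights.sum() * len(y)
    ce = (weights * F.cross_entropy(logits, y, reduction="none")).mean()

    penalty = torch.stack(
        [irm_penalty(logits[context == c], y[context == c])
         for c in context.unique()]
    ).mean()
    return ce + penalty_weight * penalty


if __name__ == "__main__":
    # Toy usage with random data and a linear model (all hypothetical).
    model = torch.nn.Linear(16, 3)
    x = torch.randn(64, 16)
    y = torch.randint(0, 3, (64,))
    context = torch.randint(0, 4, (64,))  # stand-in context annotation
    loss = reweighted_irm_loss(model, x, y, context)
    loss.backward()
    print(float(loss))
```

The IRM penalty pushes the classifier toward being simultaneously optimal in every context, while the inverse-frequency weights keep frequent contexts from dominating rare ones; the abstract's point is that such re-weighting only works once the context has been reliably estimated and disentangled from the class effect.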


Saved in:
Bibliographic Details
Main Author: Qi, Jiaxin
Other Authors: Zhang Hanwang
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University 2023
Subjects: Engineering::Computer science and engineering
Online Access: https://hdl.handle.net/10356/168654
Record metadata:
id: sg-ntu-dr.10356-168654
record last updated: 2023-07-04
school: School of Computer Science and Engineering
supervisor contact: hanwangzhang@ntu.edu.sg
subject: Engineering::Computer science and engineering
degree: Doctor of Philosophy
date deposited: 2023-06-14
date issued: 2023
citation: Qi, J. (2023). Removing bias for out-of-distribution generalization. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/168654
DOI: 10.32657/10356/168654
license: This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).
format: application/pdf
publisher: Nanyang Technological University
institution: Nanyang Technological University, NTU Library (Singapore)
collection: DR-NTU