Transductive transfer learning for visual recognition

Bibliographic Details
Main Author: Huang, Jiaxing
Other Authors: Lu Shijian
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University 2023
Subjects:
Online Access:https://hdl.handle.net/10356/164573
Institution: Nanyang Technological University
Language: English
id sg-ntu-dr.10356-164573
record_format dspace
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Engineering::Computer science and engineering
spellingShingle Engineering::Computer science and engineering
Huang, Jiaxing
Transductive transfer learning for visual recognition
description In recent years, deep neural networks (DNNs) have brought great advances to various computer vision tasks such as image classification, object detection, and semantic segmentation. However, these considerable successes come at the cost of large amounts of densely labeled training images, which are extremely costly and laborious to collect. One way of circumventing this limitation is to utilize annotated images from existing related datasets (called the "source domain") in network training. Unfortunately, DNNs trained on a source domain often suffer a drastic performance degradation when applied to the "target domain" due to the distribution mismatch. In such scenarios, transfer learning (also called knowledge transfer) between domains is desirable and necessary. In this thesis, we explore transductive transfer learning for visual recognition, where the data distributions of the labeled source domain and the unlabeled target domain differ while the source and target tasks are the same. More specifically, we investigate three representative types of transductive transfer learning: domain generalization, unsupervised domain adaptation, and source-free unsupervised domain adaptation.

In domain generalization, given labeled source-domain data, the goal is to learn a generalized visual recognition model that performs well on unseen target-domain data. In other words, domain generalization aims to learn domain-invariant (or transferable) features without requiring target-domain data during training. In this thesis, we propose a novel domain generalization approach that randomizes source-domain images in frequency space, which encourages DNNs to learn style-invariant visual features that generalize well to unseen target domains.

In unsupervised domain adaptation, given labeled source-domain data and unlabeled target-domain data, the goal is to learn an adaptive visual recognition model that performs well on target-domain data. Unlike domain generalization, unsupervised domain adaptation can access the unlabeled target-domain data during training, so it largely focuses on exploiting that data to improve network performance. In this thesis, we develop four novel unsupervised domain adaptation techniques that effectively transfer knowledge from labeled source domains to the unlabeled target domain. More specifically, we design different unsupervised losses on unlabeled target-domain data for learning a well-performing model in the target domain.

In source-free unsupervised domain adaptation, given a source-trained model and unlabeled target-domain data, the goal is to adapt the source-trained model to perform well on the unlabeled target-domain data. Unlike unsupervised domain adaptation, the labeled source-domain data is not accessible during training: we aim to adapt source-trained models to fit the target data distribution without ever accessing the source-domain data. Under such a transfer learning setup, the only information carried forward is a portable source-trained model, which largely alleviates concerns over data privacy, data portability, and data transmission efficiency. To this end, we propose a novel source-free unsupervised domain adaptation approach that exploits historical source hypotheses to make up for the absence of source-domain data.

Experimental results over various visual recognition benchmarks show that our proposed transfer learning approaches achieve superior performance, enabling the transfer of DNNs across different domains.
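The frequency-space randomization mentioned in the abstract can be made concrete with a short sketch. The usual intuition behind such methods is that an image's amplitude spectrum largely encodes style while its phase spectrum encodes content, so perturbing low-frequency amplitudes randomizes appearance without destroying semantics. The function below is a minimal illustration under that assumption, not the thesis's exact method; `alpha` and `beta` are hypothetical knobs introduced only for this example.

```python
import numpy as np

def frequency_space_randomize(image, alpha=0.5, beta=0.1):
    """Randomize image style in frequency space (illustrative sketch).

    image: float array of shape (H, W, C) in [0, 1].
    alpha, beta: hypothetical noise strength / band size, not thesis parameters.
    """
    fft = np.fft.fft2(image, axes=(0, 1))
    amplitude, phase = np.abs(fft), np.angle(fft)

    # Perturb only a centered low-frequency band of the amplitude spectrum.
    amp_shifted = np.fft.fftshift(amplitude, axes=(0, 1))
    h, w = image.shape[:2]
    bh, bw = int(h * beta), int(w * beta)
    ch, cw = h // 2, w // 2
    noise = 1.0 + alpha * (np.random.rand(2 * bh, 2 * bw, image.shape[2]) - 0.5)
    amp_shifted[ch - bh:ch + bh, cw - bw:cw + bw] *= noise
    amplitude = np.fft.ifftshift(amp_shifted, axes=(0, 1))

    # Recombine the randomized amplitude with the original phase (content).
    randomized = np.fft.ifft2(amplitude * np.exp(1j * phase), axes=(0, 1))
    return np.clip(np.real(randomized), 0.0, 1.0)
```

Training a recognition model on such randomized copies, alongside the originals, encourages it to rely on cues that survive the style perturbation.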
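The abstract notes that the unsupervised domain adaptation techniques hinge on unsupervised losses computed over unlabeled target-domain data, without listing the losses themselves. As a hedged illustration of this family, and not the thesis's actual four techniques, the sketch below combines two widely used objectives: entropy minimization and confidence-thresholded self-training on pseudo-labels.

```python
import torch
import torch.nn.functional as F

def target_unsupervised_loss(logits, threshold=0.9):
    """A common family of unsupervised losses on unlabeled target data.

    logits: model outputs of shape (batch, num_classes) on target images.
    threshold: assumed confidence cutoff for pseudo-labels (illustrative).
    """
    probs = F.softmax(logits, dim=1)

    # Entropy minimization: push target predictions to be confident.
    entropy = -(probs * torch.log(probs.clamp(min=1e-8))).sum(dim=1).mean()

    # Self-training: cross-entropy against high-confidence pseudo-labels.
    confidence, pseudo_labels = probs.max(dim=1)
    mask = (confidence >= threshold).float()
    ce = F.cross_entropy(logits, pseudo_labels, reduction="none")
    self_training = (mask * ce).sum() / mask.sum().clamp(min=1.0)

    return entropy + self_training
```

In practice such a term is added to the supervised source-domain loss, so the network fits labeled source data while being shaped by the unlabeled target distribution.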
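For source-free adaptation, the abstract says only that historical source hypotheses compensate for the missing source data. One plausible reading, sketched below purely as an assumption rather than the thesis's algorithm, is to keep a frozen copy of the source-trained model as the "historical hypothesis" and regularize the adapting model to stay consistent with it while fitting the target distribution; `target_loader`, `steps`, and `weight` are hypothetical names for this example.

```python
import copy
import torch
import torch.nn.functional as F

def adapt_source_free(model, target_loader, steps=1000, lr=1e-4, weight=1.0):
    """Source-free adaptation with a frozen historical source hypothesis
    (illustrative sketch under assumed names, not the thesis's method)."""
    historical = copy.deepcopy(model).eval()  # frozen source-trained snapshot
    for p in historical.parameters():
        p.requires_grad_(False)

    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    data_iter = iter(target_loader)
    for _ in range(steps):
        try:
            images, _ = next(data_iter)
        except StopIteration:
            data_iter = iter(target_loader)
            images, _ = next(data_iter)

        logits = model(images)
        probs = F.softmax(logits, dim=1)

        # Fit the unlabeled target distribution: entropy minimization.
        entropy = -(probs * torch.log(probs.clamp(min=1e-8))).sum(dim=1).mean()

        # Stay consistent with the historical source hypothesis.
        with torch.no_grad():
            hist_probs = F.softmax(historical(images), dim=1)
        consistency = F.kl_div(torch.log(probs.clamp(min=1e-8)),
                               hist_probs, reduction="batchmean")

        loss = entropy + weight * consistency
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return model
```

The design point this illustrates is that no source images are ever touched: the only source-domain knowledge available is whatever the portable source-trained model already encodes.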
author2 Lu Shijian
author_facet Lu Shijian
Huang, Jiaxing
format Thesis-Doctor of Philosophy
author Huang, Jiaxing
author_sort Huang, Jiaxing
title Transductive transfer learning for visual recognition
title_short Transductive transfer learning for visual recognition
title_full Transductive transfer learning for visual recognition
title_fullStr Transductive transfer learning for visual recognition
title_full_unstemmed Transductive transfer learning for visual recognition
title_sort transductive transfer learning for visual recognition
publisher Nanyang Technological University
publishDate 2023
url https://hdl.handle.net/10356/164573
_version_ 1759855784143880192
spelling sg-ntu-dr.10356-1645732023-03-06T07:30:04Z Transductive transfer learning for visual recognition Huang, Jiaxing Lu Shijian School of Computer Science and Engineering Shijian.Lu@ntu.edu.sg Engineering::Computer science and engineering Doctor of Philosophy 2023-02-06T04:22:32Z 2023-02-06T04:22:32Z 2023 Thesis-Doctor of Philosophy Huang, J. (2023). Transductive transfer learning for visual recognition. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/164573 https://hdl.handle.net/10356/164573 10.32657/10356/164573 en This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). application/pdf Nanyang Technological University