Learning cross-domain semantic-visual relationships for transductive zero-shot learning
Zero-Shot Learning (ZSL) learns models for recognizing new classes. One of the main challenges in ZSL is the domain discrepancy caused by the category inconsistency between training and testing data. Domain adaptation is the most intuitive way to address this challenge. However, existing domain adap...
Main Authors: | Lv, Fengmao; Zhang, Jianyang; Yang, Guowu; Feng, Lei; Yu, Yufeng; Duan, Lixin |
---|---|
Other Authors: | School of Computer Science and Engineering |
Format: | Article |
Language: | English |
Published: | 2023 |
Subjects: | Engineering::Computer science and engineering; Transfer Learning; Domain Adaptation |
Online Access: | https://hdl.handle.net/10356/172041 |
Institution: | Nanyang Technological University |
id | sg-ntu-dr.10356-172041 |
---|---|
record_format | dspace |
spelling | sg-ntu-dr.10356-172041; 2023-11-20T04:58:52Z; Learning cross-domain semantic-visual relationships for transductive zero-shot learning; Lv, Fengmao; Zhang, Jianyang; Yang, Guowu; Feng, Lei; Yu, Yufeng; Duan, Lixin; School of Computer Science and Engineering; Engineering::Computer science and engineering; Transfer Learning; Domain Adaptation; Zero-Shot Learning (ZSL) learns models for recognizing new classes. One of the main challenges in ZSL is the domain discrepancy caused by the category inconsistency between training and testing data. Domain adaptation is the most intuitive way to address this challenge. However, existing domain adaptation techniques cannot be directly applied to ZSL because the label spaces of the source and target domains are disjoint. This work proposes the Transferrable Semantic-Visual Relation (TSVR) approach to transductive ZSL. TSVR redefines image recognition as predicting similarity/dissimilarity labels for semantic-visual fusions consisting of class attributes and visual features. After this transformation, the source and target domains share the same label space, which makes it possible to quantify the domain discrepancy. In the redefined problem, the number of similar semantic-visual pairs is significantly smaller than that of dissimilar ones. To this end, we further propose using Domain-Specific Batch Normalization to align the two domains. This paper is supported by the National Natural Science Foundation of China [grant numbers 62106204, 62172075], the Natural Science Foundation of Sichuan [grant numbers 2022NSFSC0911, 2022YFG0031], the Fundamental Research Funds for the Central Universities of China [grant number 2682022CX068], and the Science and Technology Planning Project of Guangzhou [grant number 202102020699]; 2023-11-20T04:58:52Z; 2023-11-20T04:58:52Z; 2023; Journal Article; Lv, F., Zhang, J., Yang, G., Feng, L., Yu, Y. & Duan, L. (2023). Learning cross-domain semantic-visual relationships for transductive zero-shot learning. Pattern Recognition, 141, 109591. https://dx.doi.org/10.1016/j.patcog.2023.109591; 0031-3203; https://hdl.handle.net/10356/172041; 10.1016/j.patcog.2023.109591; 2-s2.0-85153277243; 141; 109591; en; Pattern Recognition; © 2023 Elsevier Ltd. All rights reserved. |
institution | Nanyang Technological University |
building | NTU Library |
continent | Asia |
country | Singapore |
content_provider | NTU Library |
collection | DR-NTU |
language | English |
topic | Engineering::Computer science and engineering; Transfer Learning; Domain Adaptation |
spellingShingle | Engineering::Computer science and engineering; Transfer Learning; Domain Adaptation; Lv, Fengmao; Zhang, Jianyang; Yang, Guowu; Feng, Lei; Yu, Yufeng; Duan, Lixin; Learning cross-domain semantic-visual relationships for transductive zero-shot learning |
description | Zero-Shot Learning (ZSL) learns models for recognizing new classes. One of the main challenges in ZSL is the domain discrepancy caused by the category inconsistency between training and testing data. Domain adaptation is the most intuitive way to address this challenge. However, existing domain adaptation techniques cannot be directly applied to ZSL because the label spaces of the source and target domains are disjoint. This work proposes the Transferrable Semantic-Visual Relation (TSVR) approach to transductive ZSL. TSVR redefines image recognition as predicting similarity/dissimilarity labels for semantic-visual fusions consisting of class attributes and visual features. After this transformation, the source and target domains share the same label space, which makes it possible to quantify the domain discrepancy. In the redefined problem, the number of similar semantic-visual pairs is significantly smaller than that of dissimilar ones. To this end, we further propose using Domain-Specific Batch Normalization to align the two domains. |
author2 | School of Computer Science and Engineering |
author_facet | School of Computer Science and Engineering; Lv, Fengmao; Zhang, Jianyang; Yang, Guowu; Feng, Lei; Yu, Yufeng; Duan, Lixin |
format | Article |
author | Lv, Fengmao; Zhang, Jianyang; Yang, Guowu; Feng, Lei; Yu, Yufeng; Duan, Lixin |
author_sort | Lv, Fengmao |
title | Learning cross-domain semantic-visual relationships for transductive zero-shot learning |
title_short | Learning cross-domain semantic-visual relationships for transductive zero-shot learning |
title_full | Learning cross-domain semantic-visual relationships for transductive zero-shot learning |
title_fullStr | Learning cross-domain semantic-visual relationships for transductive zero-shot learning |
title_full_unstemmed | Learning cross-domain semantic-visual relationships for transductive zero-shot learning |
title_sort | learning cross-domain semantic-visual relationships for transductive zero-shot learning |
publishDate | 2023 |
url | https://hdl.handle.net/10356/172041 |
_version_ | 1783955625216049152 |
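
The description in this record recasts recognition as a binary similar/dissimilar prediction over semantic-visual pairs and mentions Domain-Specific Batch Normalization. The following is a minimal NumPy sketch of those two ideas, not the authors' implementation. It assumes the fusion is a plain concatenation of visual features and class attributes, which the record does not state. The function names build_pairs and domain_specific_batch_norm, the toy shapes, and the omission of learnable scale/shift parameters and running statistics are illustrative assumptions.

```python
import numpy as np


def build_pairs(features, labels, class_attributes):
    """Pair every visual feature with every class attribute vector.

    Returns fused inputs of shape (n * c, d_v + d_a) and binary pair labels:
    1 if the attribute vector belongs to the image's class, 0 otherwise.
    Concatenation as the fusion operator is an assumption of this sketch.
    """
    n, c = features.shape[0], class_attributes.shape[0]
    vis = np.repeat(features, c, axis=0)        # each feature repeated c times
    sem = np.tile(class_attributes, (n, 1))     # attribute table repeated n times
    fused = np.concatenate([vis, sem], axis=1)
    class_ids = np.tile(np.arange(c), n)        # class index of each paired attribute
    pair_labels = (np.repeat(labels, c) == class_ids).astype(np.int64)
    return fused, pair_labels


def domain_specific_batch_norm(x, domain_ids, eps=1e-5):
    """Normalize each domain's rows with that domain's own batch statistics.

    Simplified: no learnable affine parameters and no running averages.
    """
    out = np.empty_like(x, dtype=np.float64)
    for d in np.unique(domain_ids):
        mask = domain_ids == d
        mean = x[mask].mean(axis=0)
        var = x[mask].var(axis=0)
        out[mask] = (x[mask] - mean) / np.sqrt(var + eps)
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy source data: 4 images, 3-dim features, 3 seen classes, 2-dim attributes.
    features = rng.normal(size=(4, 3))
    labels = np.array([0, 2, 1, 2])
    attributes = rng.normal(size=(3, 2))

    fused, pair_labels = build_pairs(features, labels, attributes)
    print("fused shape:", fused.shape)
    print("similar pairs:", int(pair_labels.sum()), "of", pair_labels.size)

    # Toy target features are shifted to mimic a domain gap.
    target = rng.normal(loc=5.0, size=(4, 3))
    x = np.concatenate([features, target], axis=0)
    domain_ids = np.array([0] * 4 + [1] * 4)
    normalized = domain_specific_batch_norm(x, domain_ids)
    print("per-domain means after normalization:",
          normalized[domain_ids == 0].mean(axis=0),
          normalized[domain_ids == 1].mean(axis=0))
```

With c classes, each image contributes one similar pair and c - 1 dissimilar ones, which is the imbalance the description refers to. Because every pair is labeled only as similar or dissimilar, pairs built from unseen-class attributes in the target domain fall into the same two-label space as the source pairs.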