Zero-shot learning via category-specific visual-semantic mapping and label refinement
Zero-shot learning (ZSL) aims to classify a test instance from an unseen category based on training instances from seen categories, where the gap between seen and unseen categories is generally bridged via a visual-semantic mapping between the low-level visual feature space and the in...
Main Authors: | Niu, Li; Cai, Jianfei; Veeraraghavan, Ashok; Zhang, Liqing |
---|---|
Other Authors: | School of Computer Science and Engineering |
Format: | Article |
Language: | English |
Published: | 2020 |
Subjects: | Engineering::Computer science and engineering; Zero-shot Learning (ZSL); Domain Adaptation |
Online Access: | https://hdl.handle.net/10356/142785 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-142785 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-142785 2020-06-30T07:17:03Z Zero-shot learning via category-specific visual-semantic mapping and label refinement Niu, Li Cai, Jianfei Veeraraghavan, Ashok Zhang, Liqing School of Computer Science and Engineering Engineering::Computer science and engineering Zero-shot Learning (ZSL) Domain Adaptation Zero-shot learning (ZSL) aims to classify a test instance from an unseen category based on training instances from seen categories, where the gap between seen and unseen categories is generally bridged via a visual-semantic mapping between the low-level visual feature space and the intermediate semantic space. However, the visual-semantic mapping (i.e., projection) learnt from seen categories may not generalize well to unseen categories, which is known as the projection domain shift in ZSL. To address this projection domain shift issue, we propose a method named adaptive embedding ZSL (AEZSL) to learn an adaptive visual-semantic mapping for each unseen category, followed by progressive label refinement. Moreover, to avoid learning a separate visual-semantic mapping for each unseen category in large-scale classification tasks, we additionally propose a deep adaptive embedding model named deep AEZSL, which shares a similar idea with AEZSL (i.e., the visual-semantic mapping should be category-specific and related to the semantic space) and needs to be trained only once but can be applied to an arbitrary number of unseen categories. Extensive experiments demonstrate that our proposed methods achieve state-of-the-art results for image classification on three small-scale benchmark datasets and one large-scale benchmark dataset. Accepted version 2020-06-30T07:17:03Z 2020-06-30T07:17:03Z 2018 Journal Article Niu, L., Cai, J., Veeraraghavan, A., & Zhang, L. (2019). Zero-shot learning via category-specific visual-semantic mapping and label refinement. IEEE Transactions on Image Processing, 28(2), 965-979. doi:10.1109/tip.2018.2872916 1057-7149 https://hdl.handle.net/10356/142785 10.1109/TIP.2018.2872916 2 28 965 979 en IEEE Transactions on Image Processing © 2018 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: https://doi.org/10.1109/TIP.2018.2872916. application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
country |
Singapore |
collection |
DR-NTU |
language |
English |
topic |
Engineering::Computer science and engineering Zero-shot Learning (ZSL) Domain Adaptation |
description |
Zero-shot learning (ZSL) aims to classify a test instance from an unseen category based on training instances from seen categories, where the gap between seen and unseen categories is generally bridged via a visual-semantic mapping between the low-level visual feature space and the intermediate semantic space. However, the visual-semantic mapping (i.e., projection) learnt from seen categories may not generalize well to unseen categories, which is known as the projection domain shift in ZSL. To address this projection domain shift issue, we propose a method named adaptive embedding ZSL (AEZSL) to learn an adaptive visual-semantic mapping for each unseen category, followed by progressive label refinement. Moreover, to avoid learning a separate visual-semantic mapping for each unseen category in large-scale classification tasks, we additionally propose a deep adaptive embedding model named deep AEZSL, which shares a similar idea with AEZSL (i.e., the visual-semantic mapping should be category-specific and related to the semantic space) and needs to be trained only once but can be applied to an arbitrary number of unseen categories. Extensive experiments demonstrate that our proposed methods achieve state-of-the-art results for image classification on three small-scale benchmark datasets and one large-scale benchmark dataset. |
author2 |
School of Computer Science and Engineering |
author_facet |
School of Computer Science and Engineering Niu, Li Cai, Jianfei Veeraraghavan, Ashok Zhang, Liqing |
format |
Article |
author |
Niu, Li Cai, Jianfei Veeraraghavan, Ashok Zhang, Liqing |
author_sort |
Niu, Li |
title |
Zero-shot learning via category-specific visual-semantic mapping and label refinement |
publishDate |
2020 |
url |
https://hdl.handle.net/10356/142785 |
_version_ |
1681059335190347776 |