Zero-shot learning via category-specific visual-semantic mapping and label refinement

Zero-shot learning (ZSL) aims to classify a test instance from an unseen category based on the training instances from seen categories in which the gap between seen categories and unseen categories is generally bridged via visual-semantic mapping between the low-level visual feature space and the in...

Bibliographic Details
Main Authors: Niu, Li; Cai, Jianfei; Veeraraghavan, Ashok; Zhang, Liqing
Other Authors: School of Computer Science and Engineering
Format: Article
Language: English
Published: 2020
Subjects: Engineering::Computer science and engineering; Zero-shot Learning (ZSL); Domain Adaptation
Online Access: https://hdl.handle.net/10356/142785
Institution: Nanyang Technological University
id sg-ntu-dr.10356-142785
record_format dspace
spelling sg-ntu-dr.10356-142785
2020-06-30T07:17:03Z
Zero-shot learning via category-specific visual-semantic mapping and label refinement
Niu, Li
Cai, Jianfei
Veeraraghavan, Ashok
Zhang, Liqing
School of Computer Science and Engineering
Engineering::Computer science and engineering
Zero-shot Learning (ZSL)
Domain Adaptation
Accepted version
2018
Journal Article
Niu, L., Cai, J., Veeraraghavan, A., & Zhang, L. (2019). Zero-shot learning via category-specific visual-semantic mapping and label refinement. IEEE Transactions on Image Processing, 28(2), 965-979. doi:10.1109/tip.2018.2872916
1057-7149
https://hdl.handle.net/10356/142785
10.1109/TIP.2018.2872916
en
IEEE Transactions on Image Processing
© 2018 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: https://doi.org/10.1109/TIP.2018.2872916.
application/pdf
institution Nanyang Technological University
building NTU Library
country Singapore
collection DR-NTU
language English
topic Engineering::Computer science and engineering
Zero-shot Learning (ZSL)
Domain Adaptation
description Zero-shot learning (ZSL) aims to classify a test instance from an unseen category based on training instances from seen categories. The gap between seen and unseen categories is generally bridged via a visual-semantic mapping between the low-level visual feature space and the intermediate semantic space. However, a visual-semantic mapping (i.e., projection) learnt on seen categories may not generalize well to unseen categories, which is known as the projection domain shift in ZSL. To address this projection domain shift issue, we propose a method named adaptive embedding ZSL (AEZSL), which learns an adaptive visual-semantic mapping for each unseen category, followed by progressive label refinement. Moreover, to avoid learning a separate visual-semantic mapping for each unseen category in large-scale classification tasks, we additionally propose a deep adaptive embedding model named deep AEZSL. It shares the same idea as AEZSL (i.e., the visual-semantic mapping should be category-specific and related to the semantic space), but it only needs to be trained once and can then be applied to an arbitrary number of unseen categories. Extensive experiments demonstrate that our proposed methods achieve state-of-the-art results for image classification on three small-scale benchmark datasets and one large-scale benchmark dataset.
author2 School of Computer Science and Engineering
format Article
author Niu, Li
Cai, Jianfei
Veeraraghavan, Ashok
Zhang, Liqing
author_sort Niu, Li
title Zero-shot learning via category-specific visual-semantic mapping and label refinement
publishDate 2020
url https://hdl.handle.net/10356/142785