Cross-Modal Deep Variational Hashing
In this paper, we propose a cross-modal deep variational hashing (CMDVH) method to learn compact binary codes for cross-modality multimedia retrieval. Unlike most existing cross-modal hashing methods, which learn a single pair of projections to map each example into a binary vector, we design a deep fusion neural network to learn non-linear transformations from image-text input pairs, such that a unified binary code is achieved in a discrete and discriminative manner using a classification-based hinge-loss criterion. We then design modality-specific neural networks in a probabilistic manner, modeling a latent variable that is kept as close as possible to the inferred binary codes while its posterior distribution is regularized by a known prior, which makes the model suitable for out-of-sample extension. Experimental results on three benchmark datasets show the efficacy of the proposed approach.
Saved in: DR-NTU
Main Authors: Liong, Venice Erin; Lu, Jiwen; Zhou, Jie; Tan, Yap Peng
Other Authors: School of Electrical and Electronic Engineering
Format: Conference or Workshop Item
Language: English
Published: 2017
Subjects: Image Retrieval; Deep Learning
Online Access: https://hdl.handle.net/10356/85091
http://hdl.handle.net/10220/44014
http://openaccess.thecvf.com/content_iccv_2017/html/Liong_Cross-Modal_Deep_Variational_ICCV_2017_paper.html
Institution: Nanyang Technological University
id: sg-ntu-dr.10356-85091
record_format: dspace
spelling: sg-ntu-dr.10356-85091 | 2020-11-01T04:44:01Z
Cross-Modal Deep Variational Hashing
Liong, Venice Erin; Lu, Jiwen; Zhou, Jie; Tan, Yap Peng
School of Electrical and Electronic Engineering; Interdisciplinary Graduate School (IGS)
2017 IEEE International Conference on Computer Vision (ICCV 17)
Image Retrieval; Deep Learning
In this paper, we propose a cross-modal deep variational hashing (CMDVH) method to learn compact binary codes for cross-modality multimedia retrieval. Unlike most existing cross-modal hashing methods, which learn a single pair of projections to map each example into a binary vector, we design a deep fusion neural network to learn non-linear transformations from image-text input pairs, such that a unified binary code is achieved in a discrete and discriminative manner using a classification-based hinge-loss criterion. We then design modality-specific neural networks in a probabilistic manner, modeling a latent variable that is kept as close as possible to the inferred binary codes while its posterior distribution is regularized by a known prior, which makes the model suitable for out-of-sample extension. Experimental results on three benchmark datasets show the efficacy of the proposed approach.
Accepted version
2017-11-09T03:20:21Z; 2019-12-06T15:56:52Z; 2017
Conference Paper
Liong, V. E., Lu, J., Tan, Y.-P., & Zhou, J. (2017). Cross-Modal Deep Variational Hashing. 2017 IEEE International Conference on Computer Vision (ICCV 17), 4077-4085.
https://hdl.handle.net/10356/85091
http://hdl.handle.net/10220/44014
http://openaccess.thecvf.com/content_iccv_2017/html/Liong_Cross-Modal_Deep_Variational_ICCV_2017_paper.html
en
© 2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: http://openaccess.thecvf.com/content_iccv_2017/html/Liong_Cross-Modal_Deep_Variational_ICCV_2017_paper.html
9 p.
application/pdf
institution: Nanyang Technological University
building: NTU Library
continent: Asia
country: Singapore
content_provider: NTU Library
collection: DR-NTU
language: English
topic: Image Retrieval; Deep Learning
description: In this paper, we propose a cross-modal deep variational hashing (CMDVH) method to learn compact binary codes for cross-modality multimedia retrieval. Unlike most existing cross-modal hashing methods, which learn a single pair of projections to map each example into a binary vector, we design a deep fusion neural network to learn non-linear transformations from image-text input pairs, such that a unified binary code is achieved in a discrete and discriminative manner using a classification-based hinge-loss criterion. We then design modality-specific neural networks in a probabilistic manner, modeling a latent variable that is kept as close as possible to the inferred binary codes while its posterior distribution is regularized by a known prior, which makes the model suitable for out-of-sample extension. Experimental results on three benchmark datasets show the efficacy of the proposed approach.
author2: School of Electrical and Electronic Engineering
format: Conference or Workshop Item
author: Liong, Venice Erin; Lu, Jiwen; Zhou, Jie; Tan, Yap Peng
author_sort: Liong, Venice Erin
title: Cross-Modal Deep Variational Hashing
publishDate: 2017
url: https://hdl.handle.net/10356/85091
http://hdl.handle.net/10220/44014
http://openaccess.thecvf.com/content_iccv_2017/html/Liong_Cross-Modal_Deep_Variational_ICCV_2017_paper.html
_version_: 1683494629218975744
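The abstract describes a two-stage design: a fusion network infers a unified binary code for each image-text training pair under a classification-based hinge loss, and modality-specific networks are then trained variationally so that a latent variable stays close to that inferred code while its posterior is regularized toward a known prior, enabling out-of-sample (single-modality) queries. The PyTorch sketch below only illustrates that structure under assumed layer sizes, margin, and loss weights; it is not the authors' implementation, and names such as `FusionNet`, `ModalityEncoder`, and `stage2_loss` are hypothetical.

```python
# Minimal sketch of a CMDVH-style two-stage model (assumptions, not the paper's code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class FusionNet(nn.Module):
    """Stage 1: fuse paired image/text features into one K-bit unified code."""
    def __init__(self, img_dim=4096, txt_dim=1000, code_len=32, num_classes=10):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Linear(img_dim + txt_dim, 512), nn.ReLU(),
            nn.Linear(512, code_len), nn.Tanh(),   # relaxed binary code in (-1, 1)
        )
        self.classifier = nn.Linear(code_len, num_classes)

    def forward(self, img_feat, txt_feat):
        code = self.fuse(torch.cat([img_feat, txt_feat], dim=1))
        return code, self.classifier(code)

class ModalityEncoder(nn.Module):
    """Stage 2: modality-specific variational encoder used for out-of-sample queries."""
    def __init__(self, in_dim, code_len=32):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_dim, 512), nn.ReLU())
        self.mu = nn.Linear(512, code_len)
        self.logvar = nn.Linear(512, code_len)

    def forward(self, x):
        h = self.backbone(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        return z, mu, logvar

def stage1_loss(logits, labels, margin=1.0):
    # Multi-class hinge (margin) loss standing in for the classification-based
    # hinge criterion on the unified code.
    return F.multi_margin_loss(logits, labels, margin=margin)

def stage2_loss(z, mu, logvar, b_unified, kl_weight=1e-2):
    # Pull the latent variable toward the unified code inferred in stage 1,
    # and regularize the posterior toward a standard normal prior (KL term).
    fit = F.mse_loss(z, b_unified)
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return fit + kl_weight * kl

def hash_code(encoder, feat):
    # Out-of-sample extension: a single modality is encoded on its own.
    _, mu, _ = encoder(feat)
    return torch.sign(mu)  # {-1, +1}^K binary code
```

In this reading, retrieval across modalities works because both encoders were trained against the same unified codes, so `hash_code(image_encoder, img)` and `hash_code(text_encoder, txt)` land in a shared Hamming space even though no paired input is needed at query time.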