Cross-Modal Deep Variational Hashing

Bibliographic Details
Main Authors: Liong, Venice Erin; Lu, Jiwen; Zhou, Jie; Tan, Yap Peng
Other Authors: School of Electrical and Electronic Engineering
Format: Conference or Workshop Item
Language: English
Published: 2017
Online Access: https://hdl.handle.net/10356/85091
http://hdl.handle.net/10220/44014
http://openaccess.thecvf.com/content_iccv_2017/html/Liong_Cross-Modal_Deep_Variational_ICCV_2017_paper.html
Institution: Nanyang Technological University
Description
Summary: In this paper, we propose a cross-modal deep variational hashing (CMDVH) method to learn compact binary codes for cross-modality multimedia retrieval. Unlike most existing cross-modal hashing methods, which learn a single pair of projections to map each example into a binary vector, we design a deep fusion neural network to learn non-linear transformations from image-text input pairs, such that a unified binary code is obtained in a discrete and discriminative manner using a classification-based hinge-loss criterion. We then design modality-specific neural networks in a probabilistic manner: we model a latent variable that is as close as possible to the inferred binary codes and, at the same time, is approximated by a posterior distribution regularized by a known prior, which makes the model suitable for out-of-sample extension. Experimental results on three benchmark datasets show the efficacy of the proposed approach.
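To make the retrieval setting in the summary concrete, the following is a minimal sketch of how compact binary codes are commonly used for cross-modal lookup: real-valued outputs are binarized by sign, and database items are ranked by Hamming distance to the query code. It is not the CMDVH implementation. The function names (binarize, hamming, rank_by_hamming) and the toy vectors are hypothetical, and the learned image and text networks are replaced here by hand-written numbers.

```python
from typing import List, Sequence


def binarize(features: Sequence[float]) -> List[int]:
    """Map real-valued outputs to a {-1, +1} code by sign (0 maps to +1)."""
    return [1 if x >= 0 else -1 for x in features]


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions at which two equal-length codes differ."""
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def rank_by_hamming(query: Sequence[int],
                    database: Sequence[Sequence[int]]) -> List[int]:
    """Return database indices sorted by increasing Hamming distance."""
    return sorted(range(len(database)),
                  key=lambda i: hamming(query, database[i]))


# Toy stand-ins for network outputs (assumed values, not from the paper).
text_query = binarize([0.9, -0.2, 0.4, -0.7])
image_db = [
    binarize([-0.5, 0.3, -0.1, 0.8]),  # differs from the query in all 4 bits
    binarize([0.7, -0.6, 0.2, -0.1]),  # same code as the query
    binarize([0.4, 0.5, 0.3, -0.9]),   # differs from the query in 1 bit
]
ranking = rank_by_hamming(text_query, image_db)
print(ranking)
```

Because both modalities are mapped into the same code space, a text query can be compared directly against image codes; the ranking step itself is independent of how the codes were learned.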