Learning a cross-modal hashing network for multimedia search

In this paper, we propose a cross-modal hashing network (CMHN) method to learn compact binary codes for cross-modality multimedia search. Unlike most existing cross-modal hashing methods, which learn a single pair of projections to map each example into a binary vector, we design a deep neural network to learn multiple pairs of hierarchical non-linear transformations, under which the non-linear characteristics of samples are well exploited and the modality gap is effectively reduced. Our model is trained with an iterative optimization procedure that learns (1) a unified binary code, discretely and discriminatively, through a classification-based hinge-loss criterion, and (2) the cross-modal hashing network, one deep network per modality, by minimizing the quantization loss between the real-valued neural codes and the binary codes while maximizing the variance of the learned neural codes. Experimental results on two benchmark datasets show the efficacy of the proposed approach.
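The abstract describes an alternating optimization: step (1) fixes the networks and updates the unified binary codes; step (2) fixes the binary codes and updates one deep network per modality. As a minimal illustrative sketch only (not the authors' released implementation), the PyTorch-style code below shows how step (2) might look: each branch's real-valued neural codes are pulled toward the shared binary codes through a quantization loss while their variance is maximized. The layer sizes, input dimensions (4096-d image features, 300-d text features), code length, and trade-off weight lam are assumptions for illustration.

# Illustrative sketch of the network-update step (2); hypothetical
# architecture and hyperparameters, not the authors' code.
import torch
import torch.nn as nn

class HashNet(nn.Module):
    """One hashing branch: hierarchical non-linear transformations -> K-bit code."""
    def __init__(self, in_dim: int, code_len: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 512), nn.Tanh(),
            nn.Linear(512, 256), nn.Tanh(),
            nn.Linear(256, code_len), nn.Tanh(),  # real-valued codes in (-1, 1)
        )

    def forward(self, x):
        return self.net(x)

def branch_loss(h: torch.Tensor, b: torch.Tensor, lam: float = 0.1):
    # Quantization loss: pull real-valued neural codes toward the fixed binary codes.
    quant = ((h - b) ** 2).sum(dim=1).mean()
    # Variance of the learned neural codes; maximized, hence subtracted
    # from the minimized objective.
    var = h.var(dim=0, unbiased=False).sum()
    return quant - lam * var

# Toy usage: image and text branches share the unified binary codes B,
# which stay fixed during this step (they are updated in step (1)).
img_net, txt_net = HashNet(4096, 32), HashNet(300, 32)
x_img, x_txt = torch.randn(8, 4096), torch.randn(8, 300)
B = torch.sign(torch.randn(8, 32))
loss = branch_loss(img_net(x_img), B) + branch_loss(txt_net(x_txt), B)
loss.backward()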


Saved in:
Bibliographic Details
Main Authors: Tan, Yap Peng, Liong, Venice Erin, Lu, Jiwen
Other Authors: School of Electrical and Electronic Engineering
Format: Conference or Workshop Item
Language: English
Published: 2018
Subjects: Hashing; Cross-modal Retrieval
Online Access: https://hdl.handle.net/10356/85331
http://hdl.handle.net/10220/44604
Institution: Nanyang Technological University
id sg-ntu-dr.10356-85331
record_format dspace
spelling sg-ntu-dr.10356-85331 2020-11-01T04:43:03Z
  title: Learning a cross-modal hashing network for multimedia search
  authors: Tan, Yap Peng; Liong, Venice Erin; Lu, Jiwen
  affiliations: School of Electrical and Electronic Engineering; Interdisciplinary Graduate School (IGS)
  conference: 2017 IEEE International Conference on Image Processing (ICIP)
  subjects: Hashing; Cross-modal Retrieval
  version: Accepted version
  dates: accessioned 2018-03-23T06:16:21Z, 2019-12-06T16:01:45Z; available 2018-03-23T06:16:21Z, 2019-12-06T16:01:45Z; issued 2017
  type: Conference Paper
  citation: Liong, V. E., Lu, J., & Tan, Y.-P. (2017, September). Learning a cross-modal hashing network for multimedia search. Paper presented at 2017 IEEE International Conference on Image Processing (ICIP), Beijing, China (pp. 3700-3704). IEEE.
  identifiers: https://hdl.handle.net/10356/85331; http://hdl.handle.net/10220/44604; doi:10.1109/ICIP.2017.8296973
  language: en
  rights: © 2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: http://dx.doi.org/10.1109/ICIP.2017.8296973.
  extent: 5 p.
  format: application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Hashing
Cross-modal Retrieval
description In this paper, we propose a cross-modal hashing network (CMHN) method to learn compact binary codes for cross-modality multimedia search. Unlike most existing cross-modal hashing methods, which learn a single pair of projections to map each example into a binary vector, we design a deep neural network to learn multiple pairs of hierarchical non-linear transformations, under which the non-linear characteristics of samples are well exploited and the modality gap is effectively reduced. Our model is trained with an iterative optimization procedure that learns (1) a unified binary code, discretely and discriminatively, through a classification-based hinge-loss criterion, and (2) the cross-modal hashing network, one deep network per modality, by minimizing the quantization loss between the real-valued neural codes and the binary codes while maximizing the variance of the learned neural codes. Experimental results on two benchmark datasets show the efficacy of the proposed approach.
author2 School of Electrical and Electronic Engineering
author_facet School of Electrical and Electronic Engineering
Tan, Yap Peng
Liong, Venice Erin
Lu, Jiwen
format Conference or Workshop Item
author Tan, Yap Peng
Liong, Venice Erin
Lu, Jiwen
author_sort Tan, Yap Peng
title Learning a cross-modal hashing network for multimedia search
title_short Learning a cross-modal hashing network for multimedia search
title_full Learning a cross-modal hashing network for multimedia search
title_fullStr Learning a cross-modal hashing network for multimedia search
title_full_unstemmed Learning a cross-modal hashing network for multimedia search
title_sort learning a cross-modal hashing network for multimedia search
publishDate 2018
url https://hdl.handle.net/10356/85331
http://hdl.handle.net/10220/44604
_version_ 1683493137333354496