Online multimodal distance metric learning with application to image retrieval
Recent years have witnessed extensive studies on distance metric learning (DML) for improving similarity search in multimedia information retrieval tasks. Despite their successes, most existing DML methods suffer from two critical limitations: (i) they typically attempt to learn a linear distance fu...
Main Authors: | WU, Pengcheng; HOI, Steven C. H.; XIA, Hao; ZHAO, Peilin; WANG, Dayong; MIAO, Chunyan |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2013 |
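Subjects: | Deep learning; Distance metric learning; Image retrieval; Online learning; Similarity learning; Computer Sciences; Databases and Information Systems; Numerical Analysis and Scientific Computing |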
Online Access: | https://ink.library.smu.edu.sg/sis_research/2333 https://ink.library.smu.edu.sg/context/sis_research/article/3333/viewcontent/p153_wu.pdf |
Institution: | Singapore Management University |
id |
sg-smu-ink.sis_research-3333 |
---|---|
record_format |
dspace |
spelling |
sg-smu-ink.sis_research-3333; 2020-04-01T06:30:28Z; Online multimodal distance metric learning with application to image retrieval; WU, Pengcheng; HOI, Steven C. H.; XIA, Hao; ZHAO, Peilin; WANG, Dayong; MIAO, Chunyan; 2013-10-01T07:00:00Z; text; application/pdf; https://ink.library.smu.edu.sg/sis_research/2333; info:doi/10.1145/2502081.2502112; https://ink.library.smu.edu.sg/context/sis_research/article/3333/viewcontent/p153_wu.pdf; http://creativecommons.org/licenses/by-nc-nd/4.0/; Research Collection School Of Computing and Information Systems; eng; Institutional Knowledge at Singapore Management University; Deep learning; Distance metric learning; Image retrieval; Online learning; Similarity learning; Computer Sciences; Databases and Information Systems; Numerical Analysis and Scientific Computing |
institution |
Singapore Management University |
building |
SMU Libraries |
continent |
Asia |
country |
Singapore |
content_provider |
SMU Libraries |
collection |
InK@SMU |
language |
English |
topic |
Deep learning; Distance metric learning; Image retrieval; Online learning; Similarity learning; Computer Sciences; Databases and Information Systems; Numerical Analysis and Scientific Computing |
description |
Recent years have witnessed extensive studies on distance metric learning (DML) for improving similarity search in multimedia information retrieval tasks. Despite their successes, most existing DML methods suffer from two critical limitations: (i) they typically attempt to learn a linear distance function on the input feature space, and the linearity assumption limits their capacity to measure similarity on the complex patterns found in real-world applications; (ii) they are often designed for learning distance metrics on uni-modal data, and so may not effectively handle similarity measures for multimedia objects with multimodal representations. To address these limitations, we propose a novel framework of online multimodal deep similarity learning (OMDSL), which aims to optimally integrate multiple deep neural networks pretrained with stacked denoising autoencoders. In particular, the proposed framework explores a unified two-stage online learning scheme that consists of (i) learning a flexible nonlinear transformation function for each individual modality, and (ii) learning to find the optimal combination of multiple diverse modalities simultaneously in a coherent process. We conduct an extensive set of experiments to evaluate the performance of the proposed algorithms on multimodal image retrieval tasks; the encouraging results validate the effectiveness of the proposed technique. |
format |
text |
author |
WU, Pengcheng; HOI, Steven C. H.; XIA, Hao; ZHAO, Peilin; WANG, Dayong; MIAO, Chunyan |
title |
Online multimodal distance metric learning with application to image retrieval |
publisher |
Institutional Knowledge at Singapore Management University |
publishDate |
2013 |
url |
https://ink.library.smu.edu.sg/sis_research/2333 https://ink.library.smu.edu.sg/context/sis_research/article/3333/viewcontent/p153_wu.pdf |