Learning query and image similarities with ranking canonical correlation analysis

Bibliographic Details
Main Authors: YAO, Ting, MEI, Tao, NGO, Chong-wah
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2015
Online Access: https://ink.library.smu.edu.sg/sis_research/6519
https://ink.library.smu.edu.sg/context/sis_research/article/7522/viewcontent/Yao_Learning_Query_and_ICCV_2015_paper.pdf
Institution: Singapore Management University
Description
Summary: One of the fundamental problems in image search is learning the ranking function, i.e., the similarity between a query and an image. Research on this topic has evolved through two paradigms: the feature-based vector model and image ranker learning. The former relies on the text surrounding an image, while the latter learns a ranker from human-labeled query-image pairs. Each paradigm has its own limitation: the vector model is sensitive to the quality of the text descriptions, and the learning paradigm is difficult to scale up because human labels are expensive to obtain. We demonstrate in this paper that both limitations can be mitigated by jointly exploring subspace learning and click-through data. Specifically, we propose a novel Ranking Canonical Correlation Analysis (RCCA) for learning query and image similarities. RCCA first finds a common subspace between the query and image views by maximizing their correlation. It then simultaneously learns a bilinear query-image similarity function and adjusts the subspace to preserve the preference relations implicit in the click-through data. Once the subspace is finalized, the similarity of a query and an image is computed by applying the bilinear similarity function to their mappings in this subspace. On a large-scale click-based image dataset with 11.7 million queries and one million images, RCCA outperforms several state-of-the-art methods on both keyword-based and query-by-example image search tasks.
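
The two stages named in the summary can be illustrated with a small NumPy sketch. This is not the authors' implementation. It assumes plain regularized CCA for the first stage and a hinge-loss update on (query, clicked image, non-clicked image) triplets for the second. Unlike RCCA as described, it keeps the CCA projections fixed and updates only the bilinear matrix `S`. All function names, parameters, and the synthetic data are hypothetical.

```python
import numpy as np

def fit_cca(Q, V, dim, reg=1e-3):
    """Regularized CCA. Returns projections (Wq, Wv) and the view means."""
    mq, mv = Q.mean(axis=0), V.mean(axis=0)
    Qc, Vc = Q - mq, V - mv
    n = Q.shape[0]
    Cqq = Qc.T @ Qc / n + reg * np.eye(Q.shape[1])
    Cvv = Vc.T @ Vc / n + reg * np.eye(V.shape[1])
    Cqv = Qc.T @ Vc / n

    def inv_sqrt(C):
        # Inverse square root of a symmetric positive-definite matrix.
        w, U = np.linalg.eigh(C)
        return U @ np.diag(w ** -0.5) @ U.T

    Kq, Kv = inv_sqrt(Cqq), inv_sqrt(Cvv)
    # Singular vectors of the whitened cross-covariance give the
    # maximally correlated directions.
    U, _, Vt = np.linalg.svd(Kq @ Cqv @ Kv)
    return Kq @ U[:, :dim], Kv @ Vt.T[:, :dim], mq, mv

def similarity(q, v, Wq, Wv, mq, mv, S):
    """Bilinear similarity of one query and one image in the subspace."""
    return ((q - mq) @ Wq) @ S @ ((v - mv) @ Wv)

def fit_ranking(Q, V, triplets, Wq, Wv, mq, mv, lr=0.01, epochs=20, margin=1.0):
    """Learn S so that clicked images score above non-clicked ones.

    Assumption: projections stay fixed; only S is updated.
    """
    S = np.eye(Wq.shape[1])
    for _ in range(epochs):
        for qi, pos, neg in triplets:
            zq = (Q[qi] - mq) @ Wq
            diff = (V[pos] - V[neg]) @ Wv  # image means cancel in the difference
            if margin - zq @ S @ diff > 0:  # preference violated or inside margin
                S += lr * np.outer(zq, diff)  # subgradient step on the hinge loss
    return S

# Synthetic stand-in for click data: row i of Q "clicked" row i of V.
rng = np.random.default_rng(0)
Z = rng.normal(size=(200, 4))
Q = Z @ rng.normal(size=(4, 10)) + 0.1 * rng.normal(size=(200, 10))
V = Z @ rng.normal(size=(4, 12)) + 0.1 * rng.normal(size=(200, 12))

Wq, Wv, mq, mv = fit_cca(Q, V, dim=4)
# Pair each query with its clicked image and a different, random image.
triplets = [(i, i, (i + 1 + int(rng.integers(199))) % 200) for i in range(200)]
S = fit_ranking(Q, V, triplets, Wq, Wv, mq, mv)
score = similarity(Q[0], V[0], Wq, Wv, mq, mv, S)
```

At query time only `similarity` is needed, so images can be projected once and stored, leaving one projection and one bilinear product per query.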