A latent model for visual disambiguation of keyword-based image search

The problem of polysemy in keyword-based image search arises mainly from the inherent ambiguity in user queries. We propose a latent-model-based approach that resolves search ambiguity by allowing sense-specific diversity in search results. Given a query keyword and the images retrieved by issuing the query to an image search engine, we first learn a latent visual sense model of these polysemous images. Next, we use Wikipedia to disambiguate the word sense of the original query, and issue these Wiki-senses as new queries to retrieve sense-specific images. A sense-specific image classifier is then learnt by combining information from the latent visual sense model, and used to cluster and re-rank the polysemous images from the original query keyword into their specific senses. Results on a ground-truth set of 17K images returned by 10 keyword searches, covering 62 word senses, provide empirical indications that our method can improve upon existing keyword-based search engines. Our method learns the visual word sense models in a totally unsupervised manner, effectively filters out irrelevant images, and is able to mine the long tail of image search.

Bibliographic Details
Main Authors: WAN, Kong-Wah, TAN, Ah-hwee, LIM, Joo-Hwee, CHIA, Liang-Tien, ROY, Sujoy
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2009
Subjects: Databases and Information Systems; Graphics and Human Computer Interfaces
Online Access:https://ink.library.smu.edu.sg/sis_research/6745
https://ink.library.smu.edu.sg/context/sis_research/article/7748/viewcontent/Latent_BMVC_2009.pdf
Institution: Singapore Management University
Record ID: sg-smu-ink.sis_research-7748
Record format: dspace
DOI: 10.5244/C.23.67
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Date: 2009-09-01
Collection: Research Collection School Of Computing and Information Systems (InK@SMU, SMU Libraries)
Topics: Databases and Information Systems; Graphics and Human Computer Interfaces
Version: 1770576057937690624