Semantics-preserving bag-of-words models for efficient image annotation

The Bag-of-Words (BoW) model is a promising image representation for annotation. One critical limitation of existing BoW models is the semantic loss during the codebook generation process, in which BoW simply clusters visual words in Euclidean space. However, the distance between two visual words in Euc...

Bibliographic Details
Main Authors:	WU, Lei; HOI, Steven C. H.; YU, Nenghai
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2009
Subjects:	Distance metric learning; Bag-of-words model; Semantic gap; Image annotation; Databases and Information Systems; Data Storage Systems
Online Access:	https://ink.library.smu.edu.sg/sis_research/4189 https://ink.library.smu.edu.sg/context/sis_research/article/5192/viewcontent/p19_wu.pdf
Institution:	Singapore Management University

id	sg-smu-ink.sis_research-5192
record_format	dspace
spelling	sg-smu-ink.sis_research-5192 2018-12-13T09:28:07Z Semantics-preserving bag-of-words models for efficient image annotation WU, Lei HOI, Steven C. H. YU, Nenghai The Bag-of-Words (BoW) model is a promising image representation for annotation. One critical limitation of existing BoW models is the semantic loss during the codebook generation process, in which BoW simply clusters visual words in Euclidean space. However, the distance between two visual words in Euclidean space does not necessarily reflect the semantic distance between the two concepts, due to the semantic gap between low-level features and high-level semantics. In this paper, we propose a novel scheme for learning a codebook such that semantically related features will be mapped to the same visual word. In particular, we consider the distance between semantically identical features as a measurement of the semantic gap, and attempt to learn an optimized codebook by minimizing this gap. We refer to this new codebook method as Semantics-Preserving Codebook (SPC) and the corresponding model as the Semantics-Preserving Bag-of-Words model (SPBoW). This model generates a codebook for each object category and only needs to update the codebook of a specific category when a new object arrives, which makes it convenient to scale up with the increasing number of objects. Experiments on image annotation tasks with a public testbed from MIT's LabelMe project, which contains 11,281 objects of 495 categories, show that the SPC learning scheme is efficient in handling a large number of objects and is able to greatly improve the performance of the existing BoW model.
2009-10-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/4189 info:doi/10.1145/1631058.1631064 https://ink.library.smu.edu.sg/context/sis_research/article/5192/viewcontent/p19_wu.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Distance metric learning Bag-of-words model Semantic gap Image annotation Databases and Information Systems Data Storage Systems
institution	Singapore Management University
building	SMU Libraries
continent	Asia
country	Singapore
content_provider	SMU Libraries
collection	InK@SMU
language	English
topic	Distance metric learning Bag-of-words model Semantic gap Image annotation Databases and Information Systems Data Storage Systems
description	The Bag-of-Words (BoW) model is a promising image representation for annotation. One critical limitation of existing BoW models is the semantic loss during the codebook generation process, in which BoW simply clusters visual words in Euclidean space. However, the distance between two visual words in Euclidean space does not necessarily reflect the semantic distance between the two concepts, due to the semantic gap between low-level features and high-level semantics. In this paper, we propose a novel scheme for learning a codebook such that semantically related features will be mapped to the same visual word. In particular, we consider the distance between semantically identical features as a measurement of the semantic gap, and attempt to learn an optimized codebook by minimizing this gap. We refer to this new codebook method as Semantics-Preserving Codebook (SPC) and the corresponding model as the Semantics-Preserving Bag-of-Words model (SPBoW). This model generates a codebook for each object category and only needs to update the codebook of a specific category when a new object arrives, which makes it convenient to scale up with the increasing number of objects. Experiments on image annotation tasks with a public testbed from MIT's LabelMe project, which contains 11,281 objects of 495 categories, show that the SPC learning scheme is efficient in handling a large number of objects and is able to greatly improve the performance of the existing BoW model.
format	text
author	WU, Lei HOI, Steven C. H. YU, Nenghai
author_facet	WU, Lei HOI, Steven C. H. YU, Nenghai
author_sort	WU, Lei
title	Semantics-preserving bag-of-words models for efficient image annotation
publisher	Institutional Knowledge at Singapore Management University
publishDate	2009
url	https://ink.library.smu.edu.sg/sis_research/4189 https://ink.library.smu.edu.sg/context/sis_research/article/5192/viewcontent/p19_wu.pdf
_version_	1770574423438393344

Similar Items