Semantics-Preserving Bag-of-Words Models and Applications

The Bag-of-Words (BoW) model is a promising image representation technique for image categorization and annotation tasks. One critical limitation of existing BoW models is that much semantic information is lost during the codebook generation process, an important step of BoW. This is because the cod...

Bibliographic Details
Main Authors: WU, Lei, HOI, Steven C. H., YU, Nenghai
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2010
Subjects:
Online Access: https://ink.library.smu.edu.sg/sis_research/2309
https://ink.library.smu.edu.sg/context/sis_research/article/3309/viewcontent/TIP_05056_2009_SPBOW_1_.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-3309
record_format dspace
spelling sg-smu-ink.sis_research-3309 2018-12-06T00:39:02Z Semantics-Preserving Bag-of-Words Models and Applications WU, Lei HOI, Steven C. H. YU, Nenghai
2010-07-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/2309 info:doi/10.1109/TIP.2010.2045169 https://ink.library.smu.edu.sg/context/sis_research/article/3309/viewcontent/TIP_05056_2009_SPBOW_1_.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Image retrieval Research and development Image storage Image segmentation Image representation Loss measurement Particle measurements Object detection Testing Image databases Computer Sciences Databases and Information Systems
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Image retrieval
Research and development
Image storage
Image segmentation
Image representation
Loss measurement
Particle measurements
Object detection
Testing
Image databases
Computer Sciences
Databases and Information Systems
description The Bag-of-Words (BoW) model is a promising image representation technique for image categorization and annotation tasks. One critical limitation of existing BoW models is that much semantic information is lost during codebook generation, an important step of BoW. This is because the codebook is often built simply by clustering visual features in Euclidean space. However, visual features related to the same semantics may not form clusters in Euclidean space, primarily because of the semantic gap between low-level features and high-level semantics. In this paper, we propose a novel scheme to learn optimized BoW models, which aims to map semantically related features to the same visual words. In particular, we treat the distance between semantically identical features as a measure of the semantic gap, and attempt to learn an optimized codebook by minimizing this gap, so as to minimize the loss of semantics. We refer to this kind of codebook as a semantics-preserving codebook (SPC) and to the corresponding model as the Semantics-Preserving Bag-of-Words (SPBoW) model. Extensive experiments on image annotation and object detection tasks with public testbeds from MIT's Labelme and PASCAL VOC challenge databases show that the proposed SPC learning scheme is effective for optimizing the codebook generation process, and that the SPBoW model is able to greatly enhance the performance of the existing BoW model.
format text
author WU, Lei
HOI, Steven C. H.
YU, Nenghai
title Semantics-Preserving Bag-of-Words Models and Applications
publisher Institutional Knowledge at Singapore Management University
publishDate 2010
url https://ink.library.smu.edu.sg/sis_research/2309
https://ink.library.smu.edu.sg/context/sis_research/article/3309/viewcontent/TIP_05056_2009_SPBOW_1_.pdf
_version_ 1770572094187241472
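
The description above contrasts the proposed method with the conventional BoW codebook, which is built by clustering visual features in Euclidean space. As a rough illustration of that baseline only (not the SPC learning scheme, whose details this record does not give), the following Python sketch clusters feature vectors with plain k-means and encodes an image as a normalized histogram of visual words. The function names, the deterministic seeding, and the synthetic features are all assumptions made for the example.

```python
import numpy as np


def assign_words(features, centers):
    """Map each feature vector to the index of its nearest center (Euclidean)."""
    # Squared distances between every feature and every center: shape (n, k).
    d2 = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def build_codebook(features, k, iters=20, seed=0):
    """Baseline codebook: plain k-means on raw features (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # Initialize centers from k distinct feature vectors.
    picks = rng.choice(len(features), size=k, replace=False)
    centers = features[picks].astype(float)
    for _ in range(iters):
        labels = assign_words(features, centers)
        for j in range(k):
            members = features[labels == j]
            if len(members):  # leave an empty cluster's center unchanged
                centers[j] = members.mean(axis=0)
    return centers


def bow_histogram(features, centers):
    """Encode one image's features as a normalized visual-word histogram."""
    words = assign_words(features, centers)
    hist = np.bincount(words, minlength=len(centers)).astype(float)
    return hist / hist.sum()


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    # Synthetic stand-in for local descriptors pooled from training images.
    train = rng.normal(size=(200, 8))
    codebook = build_codebook(train, k=5)
    # Synthetic stand-in for the descriptors of a single image.
    image_features = rng.normal(size=(30, 8))
    print(bow_histogram(image_features, codebook))
```

Because the assignment depends only on Euclidean distance, two features with the same semantic label can land in different visual words whenever they are far apart in feature space. This is the loss of semantics that the description attributes to the conventional codebook.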