Scene recognition by semantic visual words
Main Authors: | Farahzadeh, Elahe; Cham, Tat Jen; Sluzek, Andrzej |
---|---|
Other Authors: | School of Computer Engineering, Nanyang Technological University; Centre for Computational Intelligence; Centre for Multimedia and Network Technology |
Format: | Journal Article |
Language: | English |
Published: | 2015 |
Subjects: | Scene recognition; Semantic vocabulary; Visual words |
Online Access: | https://hdl.handle.net/10356/80983 http://hdl.handle.net/10220/38975 |
ISSN: | 1863-1703 |
DOI: | 10.1007/s11760-014-0687-7 |
Institution: | Nanyang Technological University |
Citation: | Farahzadeh, E., Cham, T. J., & Sluzek, A. (2015). Scene recognition by semantic visual words. Signal, Image and Video Processing, 9(8), 1935-1944. |
Rights: | © 2014 Springer-Verlag London. This is the author-created (accepted) version of a work that has been peer reviewed and accepted for publication in Signal, Image and Video Processing, Springer-Verlag London. It incorporates the referees' comments, but changes resulting from the publishing process, such as copyediting and structural formatting, may not be reflected in this document. The published version is available at http://dx.doi.org/10.1007/s11760-014-0687-7. |
Description:

In this paper, we propose a novel approach to introducing semantic relations into the bag-of-words framework. We use latent semantic models, namely latent semantic analysis (LSA) and probabilistic latent semantic analysis (pLSA), to define semantically rich features and embed the visual features into a semantic space. In the LSA technique, the semantic features are derived from a low-rank approximation of the word–image occurrence matrix obtained by singular value decomposition. Similarly, in the pLSA approach, the topic-specific word distributions can be taken as the dimensions of a concept space. In the proposed space, the distances between words represent semantic distances, which are used to construct a discriminative and semantically meaningful vocabulary. Since position information is known to significantly improve scene recognition accuracy, we also incorporate position information into the proposed semantic vocabulary frameworks. We have tested our approach on the 15-Scene and MIT Indoor 67 datasets and achieved very promising results.
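As a rough illustration of the LSA variant described in the abstract, the following Python sketch embeds visual words into a semantic space via a truncated SVD of a word–image occurrence matrix, uses distances in that space to group words into a smaller semantic vocabulary, and pools per-image counts accordingly. This is not the authors' implementation; the matrix sizes, the rank k, the vocabulary size, and the use of plain k-means are illustrative assumptions.

```python
# Minimal sketch (not the paper's code): LSA-style semantic embedding of
# visual words and construction of a semantic vocabulary. All sizes and
# parameter values below are illustrative assumptions.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Toy word-image occurrence matrix: rows = initial visual words,
# columns = training images, entries = occurrence counts.
num_words, num_images = 1000, 200
occurrence = rng.poisson(lam=0.3, size=(num_words, num_images)).astype(float)

# LSA step: rank-k approximation of the occurrence matrix via SVD.
# Each visual word is embedded as a k-dimensional semantic vector.
k = 50
U, S, Vt = np.linalg.svd(occurrence, full_matrices=False)
word_embeddings = U[:, :k] * S[:k]          # shape (num_words, k)

# Distances in this space act as semantic distances between visual words;
# clustering merges semantically close words into a smaller vocabulary
# (plain k-means here as a stand-in for the paper's vocabulary construction).
vocab_size = 200
semantic_word_labels = KMeans(n_clusters=vocab_size, n_init=10,
                              random_state=0).fit_predict(word_embeddings)

def semantic_histogram(word_counts, labels, vocab_size):
    """Pool an image's original word counts into the semantic vocabulary."""
    hist = np.zeros(vocab_size)
    np.add.at(hist, labels, word_counts)
    return hist / max(hist.sum(), 1.0)

example_hist = semantic_histogram(occurrence[:, 0], semantic_word_labels, vocab_size)
print(example_hist.shape)  # (200,)
```

A pLSA variant would replace the SVD step with topic-specific word distributions P(word | topic) as the coordinates of each word, and the position-augmented version described in the abstract would additionally encode where in the image each word occurs.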