Training-free attentive-patch selection for visual place recognition
Visual Place Recognition (VPR) utilizing patch descriptors from Convolutional Neural Networks (CNNs) has shown impressive performance in recent years. Existing works either perform exhaustive matching of all patch descriptors, or employ complex networks to select good candidate patches for further geometric verification. In this work, we develop a novel two-step training-free patch selection method that is fast, while being robust to large occlusions and extreme viewpoint variations. In the first step, a self-attention mechanism is used to select sparse and evenly distributed discriminative patches in the query image. Next, a novel spatial-matching method is used to rapidly select corresponding patches with highly similar appearances between the query and each reference image. The proposed method is inspired by how humans perform place recognition by first identifying prominent regions in the query image, and then relying on back-and-forth visual inspection of the query and reference image to attentively identify similar regions while ignoring dissimilar ones. Extensive experimental results show that our proposed method outperforms state-of-the-art (SOTA) methods in both place recognition precision and runtime, under various challenging conditions.
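The abstract describes a two-step pipeline: self-attention is used to pick sparse, evenly distributed discriminative patches in the query image, and a spatial-matching step then pairs query and reference patches with similar appearance. The snippet below is a minimal conceptual sketch of that idea in Python/NumPy; the scoring rule (low average self-similarity as a proxy for distinctiveness), the grid-based spreading, the mutual nearest-neighbour matching, and all function names and parameters are assumptions made for illustration, not the authors' actual attention or spatial-matching formulations.

```python
import numpy as np

def select_query_patches(feat_map, per_cell=4, grid=4):
    """Step 1 (illustrative): pick sparse, evenly distributed patches
    from the query feature map.

    feat_map : (H, W, C) array of CNN patch descriptors.
    Returns a list of (row, col) indices of the selected patches.
    """
    H, W, C = feat_map.shape
    desc = feat_map.reshape(-1, C)
    desc = desc / (np.linalg.norm(desc, axis=1, keepdims=True) + 1e-8)
    attn = desc @ desc.T                      # self-attention-style similarity
    # Treat patches that resemble few other patches as discriminative
    # (an assumption; the paper defines its own attention-based score).
    score = (-attn.mean(axis=1)).reshape(H, W)

    selected = []
    for gy in range(grid):                    # even spatial distribution:
        for gx in range(grid):                # keep the best patches per grid cell
            ys = slice(gy * H // grid, (gy + 1) * H // grid)
            xs = slice(gx * W // grid, (gx + 1) * W // grid)
            cell = score[ys, xs]
            flat = np.argsort(cell, axis=None)[::-1][:per_cell]
            for f in flat:
                cy, cx = np.unravel_index(f, cell.shape)
                selected.append((ys.start + cy, xs.start + cx))
    return selected

def match_patches(query_desc, ref_desc, sim_thresh=0.8):
    """Step 2 (illustrative): keep query/reference patch pairs that are
    mutual nearest neighbours with similarity above a threshold.

    query_desc : (Nq, C) and ref_desc : (Nr, C), both L2-normalised.
    """
    sim = query_desc @ ref_desc.T
    best_ref = sim.argmax(axis=1)             # query -> reference
    best_q = sim.argmax(axis=0)               # reference -> query
    pairs = [(q, r) for q, r in enumerate(best_ref)
             if best_q[r] == q and sim[q, r] >= sim_thresh]
    return pairs                              # candidate pairs for place scoring
```

In this sketch, mutual nearest-neighbour filtering stands in for the paper's spatial-matching step, and the surviving pairs would be used to score each reference image instead of exhaustive all-patch matching.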
Saved in: DR-NTU (NTU Library)
Main Authors: Zhang, Dongshuo; Wu, Meiqing; Lam, Siew-Kei
Other Authors: College of Computing and Data Science; School of Computer Science and Engineering; Hardware & Embedded Systems Lab (HESL)
Format: Conference or Workshop Item (Conference Paper, Submitted/Accepted version)
Language: English
Published in: 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 9169-9174
Publication year: 2023 (deposited in DR-NTU 2024-06-25)
Subjects: Computer and Information Science; Robotics; Simultaneous localization and mapping; Visual place recognition
Citation: Zhang, D., Wu, M. & Lam, S. (2023). Training-free attentive-patch selection for visual place recognition. 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 9169-9174. https://dx.doi.org/10.1109/IROS55552.2023.10342347
ISBN: 978-1-6654-9190-7
DOI: 10.1109/IROS55552.2023.10342347
Online Access: https://hdl.handle.net/10356/178533
Record ID: sg-ntu-dr.10356-178533
Institution: Nanyang Technological University, Singapore
Funding: This work was supported in part by the Ministry of Education (MOE), Singapore, under its IEO Decentralized Funding (Grant NGF-2020-09-028), and in part by its Academic Research Fund Tier 1 (Grant RG78/21).
Rights: © 2023 IEEE. All rights reserved. This article may be downloaded for personal use only. Any other use requires prior permission of the copyright holder. The Version of Record is available online at http://doi.org/10.1109/IROS55552.2023.10342347.
File format: application/pdf