PRFusion: toward effective and robust multi-modal place recognition with image and point cloud fusion
Place recognition plays a crucial role in the fields of robotics and computer vision, finding applications in areas such as autonomous driving, mapping, and localization. Place recognition identifies a place using query sensor data and a known database. One of the main challenges is to develop a model that can deliver accurate results while being robust to environmental variations. We propose two multi-modal place recognition models, namely PRFusion and PRFusion++. PRFusion utilizes global fusion with manifold metric attention, enabling effective interaction between features without requiring camera-LiDAR extrinsic calibrations. In contrast, PRFusion++ assumes the availability of extrinsic calibrations and leverages pixel-point correspondences to enhance feature learning on local windows. Additionally, both models incorporate neural diffusion layers, which enable reliable operation even in challenging environments. We verify the state-of-the-art performance of both models on three large-scale benchmarks. Notably, they outperform existing models by a substantial margin of +3.0 AR@1 on the demanding Boreas dataset. Furthermore, we conduct ablation studies to validate the effectiveness of our proposed methods.
| Main Authors: | Wang, Sijie; Kang, Qiyu; She, Rui; Zhao, Kai; Song, Yang; Tay, Wee Peng |
|---|---|
| Other Authors: | School of Electrical and Electronic Engineering |
| Format: | Article |
| Language: | English |
| Published: | 2025 |
| Subjects: | Engineering; Place recognition; Multi-modal fusion |
| Online Access: | https://hdl.handle.net/10356/182557 |
| Institution: | Nanyang Technological University |
id: sg-ntu-dr.10356-182557
record_format: dspace
spelling:
  Record: sg-ntu-dr.10356-182557 (2025-02-10T01:01:19Z)
  Title: PRFusion: toward effective and robust multi-modal place recognition with image and point cloud fusion
  Authors: Wang, Sijie; Kang, Qiyu; She, Rui; Zhao, Kai; Song, Yang; Tay, Wee Peng
  Affiliation: School of Electrical and Electronic Engineering
  Subjects: Engineering; Place recognition; Multi-modal fusion
  Abstract: Place recognition plays a crucial role in the fields of robotics and computer vision, finding applications in areas such as autonomous driving, mapping, and localization. Place recognition identifies a place using query sensor data and a known database. One of the main challenges is to develop a model that can deliver accurate results while being robust to environmental variations. We propose two multi-modal place recognition models, namely PRFusion and PRFusion++. PRFusion utilizes global fusion with manifold metric attention, enabling effective interaction between features without requiring camera-LiDAR extrinsic calibrations. In contrast, PRFusion++ assumes the availability of extrinsic calibrations and leverages pixel-point correspondences to enhance feature learning on local windows. Additionally, both models incorporate neural diffusion layers, which enable reliable operation even in challenging environments. We verify the state-of-the-art performance of both models on three large-scale benchmarks. Notably, they outperform existing models by a substantial margin of +3.0 AR@1 on the demanding Boreas dataset. Furthermore, we conduct ablation studies to validate the effectiveness of our proposed methods.
  Funding: Ministry of Education (MOE). This work was supported by the Singapore Ministry of Education Academic Research Fund Tier 2 under Grant MOE-T2EP20220-0002.
  Dates: accessioned 2025-02-10T01:01:19Z; available 2025-02-10T01:01:19Z; issued 2024
  Type: Journal Article
  Citation: Wang, S., Kang, Q., She, R., Zhao, K., Song, Y. & Tay, W. P. (2024). PRFusion: toward effective and robust multi-modal place recognition with image and point cloud fusion. IEEE Transactions on Intelligent Transportation Systems, 25(12), 20523-20534. https://dx.doi.org/10.1109/TITS.2024.3465830
  ISSN: 1524-9050
  Handle: https://hdl.handle.net/10356/182557
  DOI: 10.1109/TITS.2024.3465830
  Scopus: 2-s2.0-85207135005
  Issue: 12; Volume: 25; Pages: 20523-20534
  Language: en
  Grant: MOE-T2EP20220-0002
  Journal: IEEE Transactions on Intelligent Transportation Systems
  Rights: © 2024 IEEE. All rights reserved.
institution: Nanyang Technological University
building: NTU Library
continent: Asia
country: Singapore
content_provider: NTU Library
collection: DR-NTU
language: English
topic: Engineering; Place recognition; Multi-modal fusion
spellingShingle: Engineering; Place recognition; Multi-modal fusion; Wang, Sijie; Kang, Qiyu; She, Rui; Zhao, Kai; Song, Yang; Tay, Wee Peng; PRFusion: toward effective and robust multi-modal place recognition with image and point cloud fusion
description: Place recognition plays a crucial role in the fields of robotics and computer vision, finding applications in areas such as autonomous driving, mapping, and localization. Place recognition identifies a place using query sensor data and a known database. One of the main challenges is to develop a model that can deliver accurate results while being robust to environmental variations. We propose two multi-modal place recognition models, namely PRFusion and PRFusion++. PRFusion utilizes global fusion with manifold metric attention, enabling effective interaction between features without requiring camera-LiDAR extrinsic calibrations. In contrast, PRFusion++ assumes the availability of extrinsic calibrations and leverages pixel-point correspondences to enhance feature learning on local windows. Additionally, both models incorporate neural diffusion layers, which enable reliable operation even in challenging environments. We verify the state-of-the-art performance of both models on three large-scale benchmarks. Notably, they outperform existing models by a substantial margin of +3.0 AR@1 on the demanding Boreas dataset. Furthermore, we conduct ablation studies to validate the effectiveness of our proposed methods.
author2: School of Electrical and Electronic Engineering
author_facet: School of Electrical and Electronic Engineering; Wang, Sijie; Kang, Qiyu; She, Rui; Zhao, Kai; Song, Yang; Tay, Wee Peng
format: Article
author: Wang, Sijie; Kang, Qiyu; She, Rui; Zhao, Kai; Song, Yang; Tay, Wee Peng
author_sort: Wang, Sijie
title: PRFusion: toward effective and robust multi-modal place recognition with image and point cloud fusion
title_short: PRFusion: toward effective and robust multi-modal place recognition with image and point cloud fusion
title_full: PRFusion: toward effective and robust multi-modal place recognition with image and point cloud fusion
title_fullStr: PRFusion: toward effective and robust multi-modal place recognition with image and point cloud fusion
title_full_unstemmed: PRFusion: toward effective and robust multi-modal place recognition with image and point cloud fusion
title_sort: prfusion: toward effective and robust multi-modal place recognition with image and point cloud fusion
publishDate: 2025
url: https://hdl.handle.net/10356/182557
_version_: 1823807398444269568