Accelerating learned descriptor generation for visual localization

Bibliographic Details
Main Author: Liu, Woon Kit
Other Authors: Lam Siew Kei
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2024
Online Access: https://hdl.handle.net/10356/175279
Institution: Nanyang Technological University
Description
Summary: Visual SLAM systems use traditional feature extractors to retrieve features, each a pair consisting of a keypoint and a descriptor, from images. These features can then be matched to estimate the camera pose. However, traditional feature extractors are surpassed by newer deep learning-based feature extractors in the presence of imaging noise, illumination changes, or viewpoint changes. Such deep learning models, in turn, may suffer performance issues when deployed to embedded devices, which prioritise low power consumption. This report investigates the potential of deep learning accelerator libraries to accelerate feature extractor models for application in visual SLAM systems, particularly on embedded devices. TensorRT is one such library, and it can help achieve a significant speedup compared to traditional feature extraction methods.