Visual localization at NTU campus

Visual localization is a key problem in various computer vision applications such as augmented reality and autonomous driving. Major challenges for visual localization include varying weather conditions, dynamic foregrounds, and varying viewpoints as seen in environments with dynamic objects such as...

Bibliographic Details
Main Author:	Abhinaya, Kesarimangalam Srinivasan
Other Authors:	Lin Weisi
Format:	Final Year Project
Language:	English
Published:	Nanyang Technological University 2023
Subjects:	Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision
Online Access:	https://hdl.handle.net/10356/165975
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-165975
record_format	dspace
spelling	sg-ntu-dr.10356-165975 2023-04-21T15:37:46Z Visual localization at NTU campus Abhinaya, Kesarimangalam Srinivasan Lin Weisi School of Computer Science and Engineering WSLin@ntu.edu.sg Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision Visual localization is a key problem in various computer vision applications such as augmented reality and autonomous driving. Major challenges for visual localization include varying weather conditions, dynamic foregrounds, and varying viewpoints, as seen in environments with dynamic objects such as the Nanyang Technological University Campus. Efficient image representations for the Visual Place Recognition task, such as Fisher Vectors (FV), Scale-Invariant Feature Transform (SIFT), and Vector of Locally Aggregated Descriptors (VLAD), can handle some of these challenges. Although VLAD provides a rich and effective method for image storage and retrieval, it models a static function. NetVLAD turns it into a trainable function that minimizes the Euclidean distance between the query and the correct positive image, and it is used as the baseline in this work. Soft assignment to clusters makes NetVLAD readily pluggable into Convolutional Neural Network architectures for end-to-end training. Instead of uniform pooling as in NetVLAD, Attention Pyramid Pooling of Salient Visual Residuals (APPSVR) uses attention, generated from semantic segmentation, to de-prioritize task-irrelevant features. Three levels of attention, in the form of local integration, global integration, and parametric pooling, handle task-irrelevant features, contextual information, and weighting between clusters, respectively. This paper studies the effect of semantic segmentation on visual localization and evaluates NetVLAD and APPSVR as potential solutions for visual localization in an indoor location like the Nanyang Technological University (NTU) Campus. Utilizing semantic information to generate attention has been shown to help, increasing Recall@1 from 0.8381 to 0.8563. Bachelor of Engineering (Computer Engineering) 2023-04-17T13:37:02Z 2023-04-17T13:37:02Z 2023 Final Year Project (FYP) Abhinaya, K. S. (2023). Visual localization at NTU campus. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/165975 https://hdl.handle.net/10356/165975 en application/pdf Nanyang Technological University
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision
spellingShingle	Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision Abhinaya, Kesarimangalam Srinivasan Visual localization at NTU campus
description	Visual localization is a key problem in various computer vision applications such as augmented reality and autonomous driving. Major challenges for visual localization include varying weather conditions, dynamic foregrounds, and varying viewpoints, as seen in environments with dynamic objects such as the Nanyang Technological University Campus. Efficient image representations for the Visual Place Recognition task, such as Fisher Vectors (FV), Scale-Invariant Feature Transform (SIFT), and Vector of Locally Aggregated Descriptors (VLAD), can handle some of these challenges. Although VLAD provides a rich and effective method for image storage and retrieval, it models a static function. NetVLAD turns it into a trainable function that minimizes the Euclidean distance between the query and the correct positive image, and it is used as the baseline in this work. Soft assignment to clusters makes NetVLAD readily pluggable into Convolutional Neural Network architectures for end-to-end training. Instead of uniform pooling as in NetVLAD, Attention Pyramid Pooling of Salient Visual Residuals (APPSVR) uses attention, generated from semantic segmentation, to de-prioritize task-irrelevant features. Three levels of attention, in the form of local integration, global integration, and parametric pooling, handle task-irrelevant features, contextual information, and weighting between clusters, respectively. This paper studies the effect of semantic segmentation on visual localization and evaluates NetVLAD and APPSVR as potential solutions for visual localization in an indoor location like the Nanyang Technological University (NTU) Campus. Utilizing semantic information to generate attention has been shown to help, increasing Recall@1 from 0.8381 to 0.8563.
author2	Lin Weisi
author_facet	Lin Weisi Abhinaya, Kesarimangalam Srinivasan
format	Final Year Project
author	Abhinaya, Kesarimangalam Srinivasan
author_sort	Abhinaya, Kesarimangalam Srinivasan
title	Visual localization at NTU campus
title_short	Visual localization at NTU campus
title_full	Visual localization at NTU campus
title_fullStr	Visual localization at NTU campus
title_full_unstemmed	Visual localization at NTU campus
title_sort	visual localization at ntu campus
publisher	Nanyang Technological University
publishDate	2023
url	https://hdl.handle.net/10356/165975
_version_	1764208018115788800
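
The description contrasts uniform pooling in NetVLAD with attention-weighted pooling in APPSVR. The following is a minimal NumPy sketch of soft-assignment VLAD pooling, not the project's implementation. It assumes the fixed-form soft assignment (a softmax over negative scaled squared distances to the cluster centres), whereas a trained NetVLAD layer learns these parameters. The function name `soft_assign_vlad`, the sharpness parameter `alpha`, and the optional `weights` argument are hypothetical; `weights` only stands in for a per-descriptor attention score such as one derived from semantic segmentation.

```python
import numpy as np


def soft_assign_vlad(descriptors, centroids, alpha=10.0, weights=None):
    """Pool N local descriptors (N, D) into one VLAD-style vector (K * D,).

    weights: optional per-descriptor scores (N,). With weights=None every
    descriptor counts equally, which corresponds to uniform pooling.
    """
    descriptors = np.asarray(descriptors, dtype=float)
    centroids = np.asarray(centroids, dtype=float)
    if weights is None:
        weights = np.ones(descriptors.shape[0])
    weights = np.asarray(weights, dtype=float)

    # Residual of every descriptor to every cluster centre: shape (N, K, D).
    residuals = descriptors[:, None, :] - centroids[None, :, :]
    sq_dist = (residuals ** 2).sum(axis=2)  # shape (N, K)

    # Soft assignment: softmax over clusters of -alpha * squared distance.
    logits = -alpha * sq_dist
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    assign = np.exp(logits)
    assign /= assign.sum(axis=1, keepdims=True)

    # Weighted sum of residuals per cluster: shape (K, D).
    vlad = np.einsum("n,nk,nkd->kd", weights, assign, residuals)

    # Intra-normalization per cluster, then global L2 normalization.
    vlad /= np.maximum(np.linalg.norm(vlad, axis=1, keepdims=True), 1e-12)
    vlad = vlad.ravel()
    return vlad / max(np.linalg.norm(vlad), 1e-12)
```

Because the soft assignment is differentiable, the same computation can be expressed as a network layer and trained end to end. Setting a descriptor's weight to zero removes its contribution entirely, which illustrates how task-irrelevant features could be de-prioritized.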
