Cognitive-inspired visual navigation system


Bibliographic Details
Main Author: Mukawa, Michal Akira
Other Authors: Miao Chun Yan
Format: Theses and Dissertations
Language:English
Published: 2017
Online Access:http://hdl.handle.net/10356/72658
Institution: Nanyang Technological University
Description
Summary: This thesis summarizes current research on human navigation, with emphasis on human visual memory and spatial cognition. We study how different types of navigation assistance, different interaction modalities, and external stimuli (i.e., repetition priming) affect our navigation performance. With the rapid growth of wearable computing devices, indoor navigation guidance will become as popular in the near future as GPS-based navigation tools are for drivers today. However, how indoor navigation guidance affects human memory of a novel environment has not been well studied.

This thesis investigates route memory with three types of navigation assistance: a 2D map, a wearable navigation assistant, and a human usher. The results show that participants have similar patterns in remembering visual scenes, even when using different types of assistance. These findings support previous work on scene memorability and provide the new insight that scene memorability is not affected by the type of navigation guidance. This may indicate that spatial working memory and visual memory are dissociated. We also show that scenes with navigation information are more memorable than scenes without such information. Finally, we provide some evidence that the location of a scene is linked to its memorability.

During wayfinding in a novel environment, we encounter many novel places. Some of those places are encoded by our spatial memory. But how does the human brain "decide" which locations are more important than others, and how do backtracking and repetition priming enhance memorization of these scenes? This thesis explores how backtracking improves encoding of encountered locations. We also investigate whether repetition priming helps with further memory enhancement. The results show that backtracking alone significantly improves spatial memory of visited places. Surprisingly, repetition priming does not further enhance memorization of these places.
This result may suggest that spatial reasoning imposes a significant cognitive load that thwarts further improvement of spatial memory of locations.

Next, as wearable devices with head-mounted displays become more popular, there is a need to evaluate these devices as navigation systems and to investigate their effects on our spatial cognition. Our work investigates how Google Glass, with different interaction modalities (i.e., voice, display, and voice+display), affects spatial cognition during guided navigation. The results show that the voice+display modality can guide as effectively as a human guide. However, this modality negatively affects backtracking. The participants also perceived higher system intelligence when using the voice+display modality. The results do not show any indication that the participants' visual memory is affected by any of the modalities.

Finally, we review cognitive systems that mimic human visual memory or navigation abilities. These systems play a vital role as a foundation for building cognitive-inspired, autonomous navigation systems. We then propose a novel, wearable, cognitive-inspired navigation system. The system is able to extract a topological representation of an explored environment and to provide efficient guidance through already mapped or pre-defined places. The topological structure is represented by visual scene categories (nodes of the topology) and their spatial relations (edges between nodes). Visual scene categories are detected by a CNN (Convolutional Neural Network) with an SVM (Support Vector Machine), while spatial relations are obtained from accelerometer and gyroscope data. To the best of our knowledge, this is the first design and demonstration of such a system. It provides a strong foundation for efficient, real-life navigation solutions that may be deployed in a number of scenarios (e.g., shopping malls, airports, and offices).
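The topological representation described above (scene categories as nodes, spatial relations as edges) can be sketched as a small graph structure. The sketch below is illustrative only: the class name, place labels, and the use of step counts as edge weights are assumptions, not the thesis's actual implementation; in the proposed system the node labels would come from the CNN+SVM scene classifier and the edge weights from accelerometer/gyroscope data.

```python
import heapq

class TopologicalMap:
    """Minimal sketch of a topological map: nodes are visual scene
    categories, edges carry spatial relations (here, a distance in
    steps, as one might estimate from accelerometer step counting)."""

    def __init__(self):
        self.edges = {}  # scene label -> {neighbor label: steps}

    def add_relation(self, a, b, steps):
        # Record a spatial relation between two places (undirected
        # for simplicity of the sketch).
        self.edges.setdefault(a, {})[b] = steps
        self.edges.setdefault(b, {})[a] = steps

    def guide(self, start, goal):
        """Dijkstra over step counts; returns the sequence of places
        to announce to the user, or None if no route is known."""
        dist = {start: 0}
        prev = {}
        queue = [(0, start)]
        while queue:
            d, node = heapq.heappop(queue)
            if node == goal:
                break
            if d > dist.get(node, float("inf")):
                continue
            for nbr, w in self.edges.get(node, {}).items():
                nd = d + w
                if nd < dist.get(nbr, float("inf")):
                    dist[nbr] = nd
                    prev[nbr] = node
                    heapq.heappush(queue, (nd, nbr))
        if goal != start and goal not in prev:
            return None
        path = [goal]
        while path[-1] != start:
            path.append(prev[path[-1]])
        return list(reversed(path))

# Hypothetical mapped environment
m = TopologicalMap()
m.add_relation("lobby", "corridor", 25)
m.add_relation("corridor", "cafeteria", 40)
m.add_relation("lobby", "elevator", 10)
m.add_relation("elevator", "cafeteria", 80)
print(m.guide("lobby", "cafeteria"))  # ['lobby', 'corridor', 'cafeteria']
```

In this toy map the route via the corridor (65 steps) beats the route via the elevator (90 steps), so guidance would announce lobby, corridor, cafeteria in order.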