Human tracking and path prediction for mobile robot navigation in crowded environment

Bibliographic Details
Main Author: Bhujel, Niraj
Other Authors: Wang Han
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University 2022
Online Access: https://hdl.handle.net/10356/159235
Institution: Nanyang Technological University
Description
Summary: Tracking humans and forecasting their future paths is an essential capability for a mobile robot navigating in a crowded environment. It supports high-level tasks such as human behavior analysis, human interaction modeling, collision-free path planning, and resolving the freezing robot problem. Tracking multiple humans from a robot's perspective is challenging because of appearance changes, similar-looking persons, viewpoint variations, occlusions, and pose changes. To tackle these challenges, a guided second-order attention network (GSAN) is proposed to learn fine-grained salient features of each person. The proposed GSAN is evaluated on a popular person re-identification dataset, and the learned features are used for visual multi-object tracking on popular multi-object tracking datasets.
The challenges of path prediction in a crowded environment stem from complex human-human interactions, multi-modal human behavior, uncertainty in human decisions, and various social norms. First, an interaction model based on a Message Passing Graph Convolutional Neural Network (MPGCN) is introduced. Because human interactions can be asymmetric, they are learned through an edge-wise gating mechanism between the nodes of the MPGCN. With this mechanism, an improvement of ~20 percent over state-of-the-art methods is achieved on popular trajectory prediction datasets. Second, the multi-modal behavior is addressed using a Conditional Variational Autoencoder (cVAE) approach. A novel self-critical GatedGCN (SC-GCN) is proposed to learn social behaviors such as collision avoidance and goal-reaching using the actor-critic framework. An ablation study on crowd datasets shows that SC-GCN with collision rewards can significantly reduce the number of false collisions in the predicted trajectories.
Finally, a novel disentanglement learning method is proposed to learn complex human interactions more effectively by decomposing them into spatial and temporal factors. This disentanglement approach increases the confidence of the predicted trajectories and can learn human interactions at distances of up to eight meters without degrading prediction performance.
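
The edge-wise gating idea mentioned in the summary can be illustrated with a minimal NumPy sketch. This is not the thesis's implementation: it assumes a generic GatedGCN-style layer, and the function name, the weight matrices `A`, `B`, `W_self`, `W_nbr`, and the adjacency convention are all hypothetical choices made for illustration. The point it shows is that the gate on the edge from pedestrian j to pedestrian i uses different weights for receiver and sender, so the influence of j on i need not equal the influence of i on j.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gated_message_passing(h, adj, W_self, W_nbr, A, B):
    """One illustrative layer of edge-wise gated message passing.

    h:   (N, D) node features, one row per pedestrian (hypothetical layout).
    adj: (N, N) 0/1 matrix; adj[i, j] = 1 means node j sends a message to node i.
    Returns the updated features (N, D) and the edge gates (N, N, D).
    """
    recv = h @ A  # receiver-side term, (N, D)
    send = h @ B  # sender-side term, (N, D)
    # gate[i, j] depends on (receiver i, sender j); since A != B in general,
    # gate[i, j] differs from gate[j, i], which models asymmetric interaction.
    gate = sigmoid(recv[:, None, :] + send[None, :, :])
    gate = gate * adj[:, :, None]  # zero out non-existent edges
    msgs = h @ W_nbr  # message carried by each sender, (N, D)
    agg = (gate * msgs[None, :, :]).sum(axis=1)  # gated sum over senders
    out = np.maximum(h @ W_self + agg, 0.0)  # ReLU update
    return out, gate
```

With an all-zero adjacency matrix the layer reduces to a per-node transform, which is a quick sanity check that the gates only act through existing edges.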