Partially-observable monocular autonomous navigation for UAVs through deep reinforcement learning

Bibliographic Details
Main Authors: Zhang, Yuhang; Low, Kin Huat; Chen, Lyu
Other Authors: School of Mechanical and Aerospace Engineering
Format: Conference or Workshop Item
Language: English
Published: 2023
Subjects:
Online Access:https://hdl.handle.net/10356/170079
Institution: Nanyang Technological University
Description
Summary: In recent years, the widespread application of UAVs has raised the requirements on their autonomy, and obstacle detection and avoidance (ODA) is a key technology for meeting them. Unlike traditional ground-based robots, UAV navigation is more challenging because a UAV's motion is not constrained to a well-defined ground plane. Considering the constraints that a UAV's size places on its onboard sensors, this paper proposes a monocular vision-based ODA framework. To address the environment-dependent limitations of existing vision-aided obstacle avoidance (OA) algorithms, we propose an approach that leverages deep reinforcement learning (DRL) to enhance a UAV's navigation capability in unknown and unstructured environments. Central to our approach are the concept of partial observability and an end-to-end controller that takes the RGB images captured by the monocular camera, together with the destination information, as input and directly generates a collision-free trajectory. The policy network builds on the DQN algorithm and its derivatives to approximate the nonlinear mapping from image inputs to action commands. Additionally, we build various training and validation environments with different alignment patterns in Gazebo. Experimental results show that the proposed framework successfully avoids obstacles and reaches the destination using only local observation information.
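
To make the controller structure concrete, the following is a minimal sketch, not the authors' implementation, of the kind of end-to-end DQN policy the abstract describes: a convolutional encoder processes the monocular RGB frame, the destination information (assumed here to be a 2-D relative-goal vector) is fused with the image features, and a fully connected head outputs one Q-value per discrete action command. The 84x84 input size, layer widths, five-action set, and the `select_action` helper are illustrative assumptions, since the abstract does not specify them.

```python
# Hedged sketch of an end-to-end DQN policy for monocular navigation.
# All shapes and the action set are assumptions for illustration only.
import torch
import torch.nn as nn

class MonocularDQN(nn.Module):
    def __init__(self, num_actions: int = 5, goal_dim: int = 2):
        super().__init__()
        # Convolutional encoder for an assumed 84x84 RGB observation.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
            nn.Flatten(),
        )
        # 84x84 input -> 7x7x64 = 3136 flattened features; the goal
        # vector is concatenated before the fully connected head.
        self.head = nn.Sequential(
            nn.Linear(3136 + goal_dim, 512), nn.ReLU(),
            nn.Linear(512, num_actions),  # one Q-value per action command
        )

    def forward(self, image: torch.Tensor, goal: torch.Tensor) -> torch.Tensor:
        features = self.encoder(image)              # (B, 3136)
        fused = torch.cat([features, goal], dim=1)  # fuse destination info
        return self.head(fused)                     # (B, num_actions) Q-values

def select_action(net: MonocularDQN, image: torch.Tensor,
                  goal: torch.Tensor, epsilon: float = 0.1,
                  num_actions: int = 5) -> int:
    """Standard epsilon-greedy DQN action selection (batch size 1)."""
    if torch.rand(()) < epsilon:
        return int(torch.randint(0, num_actions, ()))   # explore
    with torch.no_grad():
        return int(net(image, goal).argmax(dim=1))      # exploit greedily
```

In this sketch the "partial observability" of the framework shows up in the inputs alone: the policy sees only the current camera frame and the relative goal, with no global map, which matches the abstract's claim of navigating from local observation information.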