Learning transferable skills in complex 3D scenarios via deep reinforcement learning

Bibliographic Details
Main Author: Lim, You Rong
Other Authors: Bo An
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2022
Online Access: https://hdl.handle.net/10356/156376
Institution: Nanyang Technological University
Description
Summary: Deep Reinforcement Learning combines reinforcement learning, the framework that guides an intelligent agent towards its goal, with a deep neural network. The deep neural network acts as a black-box model, performing complex function approximation to achieve the best results by minimizing the output error through back-propagation. This process is expensive in both time and computation, as it can take millions of iterations for the agent to master complex tasks. Recent successes in Transfer Learning with Deep Reinforcement Learning have demonstrated that it can jump-start the learning process, resulting in better overall performance. Additionally, using previously attained knowledge can allow the agent to reach a minimum performance threshold in fewer training steps. These successes reduce the training steps required to master a complex task, saving computational resources and time. My contribution is therefore an investigation of Deep Reinforcement Learning and its ability to learn and apply transferable skills, through Transfer Learning, within a complex environment involving sparse rewards and domain randomization. The study includes acquiring transferable skills with Curriculum Learning and Reward Shaping to tackle the sparse-rewards problem. The popular reinforcement learning algorithms Proximal Policy Optimisation (PPO) and Soft Actor-Critic (SAC) enabled the agent to learn the policy required to pass the minimum threshold. Following that, Transfer Learning was applied and the agent was trained in new scenarios. These experiments evaluate how well the policy generalizes and encourage the agent to adapt its existing policy to the new settings, which involved inclined surfaces and changing the agent's shape from ovoid to cubic.
The results demonstrate that the agent with transfer learning outperforms the untrained model on various metrics, and that the agent successfully adapted to the changes using its observations, without external interference.