Investigating sim-to-real transfer for reinforcement learning-based robotic manipulation

In this project, model-free Deep Reinforcement Learning (DRL) algorithms were implemented to solve complex robotic environments, spanning both low-dimensional and high-dimensional robotic tasks. Low-dimensional tasks have state inputs that are scalar values such as robotic arm joint angles, positions, and velocities. High-dimensional tasks have state inputs that are images, with camera views of the environment from various angles. The low-dimensional robotic environments are CartPole Continuous, Hopper, Half-Cheetah, and Ant Bullet environments, using PyBullet (Coumans and Bai, 2016–2019) as the back-end physics engine for the robotic simulator. The high-dimensional robotic manipulation tasks are Open Box, Close Box, Pick-up Cup, and Scoop with Spatula from the RLBench (James et al., 2020) task implementations. From the results of the experiments, off-policy algorithms such as Deep Deterministic Policy Gradients (DDPG) and Twin-Delayed Deep Deterministic Policy Gradients (TD3) outperformed the other algorithms on the low-dimensional tasks because they learn from experience replay, giving them superior sample efficiency compared to on-policy algorithms such as Trust Region Policy Optimisation (TRPO) and Proximal Policy Optimisation (PPO). For the high-dimensional environments, only the Option-Critic algorithm was able to solve some of the environments, such as Open Box and Close Box. Off-policy algorithms did not perform well because storing image observations in the replay buffer imposes a heavy memory constraint, so the agent could not learn effectively from experience replay. On-policy algorithms were also unable to learn well in the high-dimensional environments, as they could not generalise from the sparse reward signals. None of the implemented algorithms solved the more complex manipulation tasks, Scoop with Spatula and Pick-up Cup, because random exploration rarely reached the sparse reward signal needed for learning.
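The sample-efficiency argument above rests on off-policy methods reusing stored transitions. As a minimal sketch of that ingredient, assuming the pybullet_envs registration of HopperBulletEnv-v0 and the classic Gym step API, the following Python illustrates a simple replay buffer filled from a low-dimensional PyBullet environment; the buffer capacity and random-action loop are illustrative choices, not the project's implementation.

    import random
    from collections import deque

    import gym
    import pybullet_envs  # noqa: F401 -- registers the *BulletEnv-v0 environments with Gym

    class ReplayBuffer:
        """Stores (state, action, reward, next_state, done) tuples for off-policy reuse."""

        def __init__(self, capacity=100_000):  # capacity is an illustrative choice
            self.buffer = deque(maxlen=capacity)

        def push(self, state, action, reward, next_state, done):
            self.buffer.append((state, action, reward, next_state, done))

        def sample(self, batch_size):
            # Uniform sampling; DDPG/TD3 would draw minibatches like this at every update.
            return random.sample(self.buffer, batch_size)

        def __len__(self):
            return len(self.buffer)

    if __name__ == "__main__":
        env = gym.make("HopperBulletEnv-v0")  # low-dimensional joint-angle/velocity state vector
        buffer = ReplayBuffer()
        state = env.reset()  # classic Gym API (pre-0.26) assumed, as in 2021-era tooling
        for _ in range(1000):
            action = env.action_space.sample()  # random exploration stands in for the policy
            next_state, reward, done, _ = env.step(action)
            buffer.push(state, action, reward, next_state, done)
            state = env.reset() if done else next_state
        print(len(buffer), "transitions stored for replay")

Because the buffer decouples data collection from gradient updates, each environment step can be reused many times, which is the sample-efficiency edge the abstract attributes to DDPG and TD3 over TRPO and PPO.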

Saved in:
Bibliographic Details
Main Author: Cheng, Jason Kuan Yong
Other Authors: Soong Boon Hee
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2021
Subjects: Engineering::Electrical and electronic engineering
Online Access: https://hdl.handle.net/10356/148803
Institution: Nanyang Technological University
Language: English
id sg-ntu-dr.10356-148803
record_format dspace
spelling sg-ntu-dr.10356-1488032023-07-07T18:30:56Z Investigating sim-to-real transfer for reinforcement learning-based robotic manipulation Cheng, Jason Kuan Yong Soong Boon Hee School of Electrical and Electronic Engineering Institute of High-Powered Computing Toh Wei Qi EBHSOONG@ntu.edu.sg Engineering::Electrical and electronic engineering In this project, model-free Deep Reinforcement Learning (DRL) algorithms were implemented to solve complex robotic environments, spanning both low-dimensional and high-dimensional robotic tasks. Low-dimensional tasks have state inputs that are scalar values such as robotic arm joint angles, positions, and velocities. High-dimensional tasks have state inputs that are images, with camera views of the environment from various angles. The low-dimensional robotic environments are CartPole Continuous, Hopper, Half-Cheetah, and Ant Bullet environments, using PyBullet (Coumans and Bai, 2016–2019) as the back-end physics engine for the robotic simulator. The high-dimensional robotic manipulation tasks are Open Box, Close Box, Pick-up Cup, and Scoop with Spatula from the RLBench (James et al., 2020) task implementations. From the results of the experiments, off-policy algorithms such as Deep Deterministic Policy Gradients (DDPG) and Twin-Delayed Deep Deterministic Policy Gradients (TD3) outperformed the other algorithms on the low-dimensional tasks because they learn from experience replay, giving them superior sample efficiency compared to on-policy algorithms such as Trust Region Policy Optimisation (TRPO) and Proximal Policy Optimisation (PPO). For the high-dimensional environments, only the Option-Critic algorithm was able to solve some of the environments, such as Open Box and Close Box. Off-policy algorithms did not perform well because storing image observations in the replay buffer imposes a heavy memory constraint, so the agent could not learn effectively from experience replay. On-policy algorithms were also unable to learn well in the high-dimensional environments, as they could not generalise from the sparse reward signals. None of the implemented algorithms solved the more complex manipulation tasks, Scoop with Spatula and Pick-up Cup, because random exploration rarely reached the sparse reward signal needed for learning. Bachelor of Engineering (Information Engineering and Media) 2021-05-17T13:06:15Z 2021-05-17T13:06:15Z 2021 Final Year Project (FYP) Cheng, J. K. Y. (2021). Investigating sim-to-real transfer for reinforcement learning-based robotic manipulation. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/148803 https://hdl.handle.net/10356/148803 en B3224-201 application/pdf Nanyang Technological University
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Engineering::Electrical and electronic engineering
spellingShingle Engineering::Electrical and electronic engineering
Cheng, Jason Kuan Yong
Investigating sim-to-real transfer for reinforcement learning-based robotic manipulation
description In this project, model-free Deep Reinforcement Learning (DRL) algorithms were implemented to solve complex robotic environments, spanning both low-dimensional and high-dimensional robotic tasks. Low-dimensional tasks have state inputs that are scalar values such as robotic arm joint angles, positions, and velocities. High-dimensional tasks have state inputs that are images, with camera views of the environment from various angles. The low-dimensional robotic environments are CartPole Continuous, Hopper, Half-Cheetah, and Ant Bullet environments, using PyBullet (Coumans and Bai, 2016–2019) as the back-end physics engine for the robotic simulator. The high-dimensional robotic manipulation tasks are Open Box, Close Box, Pick-up Cup, and Scoop with Spatula from the RLBench (James et al., 2020) task implementations. From the results of the experiments, off-policy algorithms such as Deep Deterministic Policy Gradients (DDPG) and Twin-Delayed Deep Deterministic Policy Gradients (TD3) outperformed the other algorithms on the low-dimensional tasks because they learn from experience replay, giving them superior sample efficiency compared to on-policy algorithms such as Trust Region Policy Optimisation (TRPO) and Proximal Policy Optimisation (PPO). For the high-dimensional environments, only the Option-Critic algorithm was able to solve some of the environments, such as Open Box and Close Box. Off-policy algorithms did not perform well because storing image observations in the replay buffer imposes a heavy memory constraint, so the agent could not learn effectively from experience replay. On-policy algorithms were also unable to learn well in the high-dimensional environments, as they could not generalise from the sparse reward signals. None of the implemented algorithms solved the more complex manipulation tasks, Scoop with Spatula and Pick-up Cup, because random exploration rarely reached the sparse reward signal needed for learning.
author2 Soong Boon Hee
author_facet Soong Boon Hee
Cheng, Jason Kuan Yong
format Final Year Project
author Cheng, Jason Kuan Yong
author_sort Cheng, Jason Kuan Yong
title Investigating sim-to-real transfer for reinforcement learning-based robotic manipulation
title_short Investigating sim-to-real transfer for reinforcement learning-based robotic manipulation
title_full Investigating sim-to-real transfer for reinforcement learning-based robotic manipulation
title_fullStr Investigating sim-to-real transfer for reinforcement learning-based robotic manipulation
title_full_unstemmed Investigating sim-to-real transfer for reinforcement learning-based robotic manipulation
title_sort investigating sim-to-real transfer for reinforcement learning-based robotic manipulation
publisher Nanyang Technological University
publishDate 2021
url https://hdl.handle.net/10356/148803
_version_ 1772827324986687488