Investigating sim-to-real transfer for reinforcement learning-based robotic manipulation
In this project, model-free Deep Reinforcement Learning (DRL) algorithms were implemented to solve complex robotic environments, covering both low-dimensional and high-dimensional robotic tasks. Low-dimensional tasks have state inputs that are numeric values such as robotic arm joint angles, positions, and velocities. High-dimensional tasks have state inputs that are images, with camera views of the environment from various angles. The low-dimensional robotic environments involve the CartPole Continuous, Hopper, Half-Cheetah, and Ant Bullet environments, using PyBullet (Coumans and Bai, 2016–2019) as the back-end physics engine for the robotic simulator. The high-dimensional robotic manipulation tasks involve Open Box, Close Box, Pick-up Cup, and Scoop with Spatula, from the RLBench (James et al., 2020) task implementations. From the results of the experiments, off-policy algorithms like Deep Deterministic Policy Gradients (DDPG) and Twin-Delayed Deep Deterministic Policy Gradients (TD3) outperformed the other algorithms on low-dimensional tasks because they learn from experience replay, giving them superior sample efficiency compared to on-policy algorithms like Trust Region Policy Optimisation (TRPO) and Proximal Policy Optimisation (PPO). For the high-dimensional environments, only the Option-Critic algorithm was able to solve some of the environments, such as Open Box and Close Box. Off-policy algorithms did not perform well because of the high memory cost of holding images in the experience replay buffer, so the agent could not learn effectively from replayed transitions. On-policy algorithms were also unable to learn well in high-dimensional environments, as they could not generalise from the sparse reward signals. None of the algorithms implemented was able to solve the more complex manipulation tasks like Scoop with Spatula and Pick-up Cup, as the agents could not reach the sparse reward through random exploration and therefore had no signal to learn from.
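The abstract attributes the sample efficiency of off-policy methods such as DDPG and TD3 to experience replay: stored transitions can be reused for many gradient updates instead of being discarded after one policy rollout. A minimal replay buffer of the kind these methods typically use might look like the following sketch (illustrative only, not code from the project):

```python
import random
from collections import deque


class ReplayBuffer:
    """Minimal uniform experience replay buffer, as commonly used by
    off-policy methods such as DDPG and TD3. Illustrative sketch only."""

    def __init__(self, capacity):
        # A bounded deque evicts the oldest transition once full.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        # Store one environment transition as a tuple.
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniformly sample a decorrelated minibatch for a gradient step.
        # Each stored transition can be reused many times, which is the
        # source of the sample efficiency noted in the abstract.
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)


# Usage: fill the buffer with dummy transitions, then draw a minibatch.
buf = ReplayBuffer(capacity=1000)
for t in range(100):
    buf.push(state=t, action=0, reward=0.0, next_state=t + 1, done=False)
batch = buf.sample(batch_size=8)
print(len(buf), len(batch))
```

The abstract also notes the downside observed in this project: when states are images, the same buffer must hold thousands of frames, and its memory footprint becomes the bottleneck.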
Saved in:
Main Author: | Cheng, Jason Kuan Yong |
---|---|
Other Authors: | Soong Boon Hee |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2021 |
Subjects: | Engineering::Electrical and electronic engineering |
Online Access: | https://hdl.handle.net/10356/148803 |
Institution: | Nanyang Technological University |
Language: | English |
id |
sg-ntu-dr.10356-148803 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-1488032023-07-07T18:30:56Z Investigating sim-to-real transfer for reinforcement learning-based robotic manipulation Cheng, Jason Kuan Yong Soong Boon Hee School of Electrical and Electronic Engineering Institute of High-Powered Computing Toh Wei Qi EBHSOONG@ntu.edu.sg Engineering::Electrical and electronic engineering Bachelor of Engineering (Information Engineering and Media) 2021-05-17T13:06:15Z 2021-05-17T13:06:15Z 2021 Final Year Project (FYP) Cheng, J. K. Y. (2021). Investigating sim-to-real transfer for reinforcement learning-based robotic manipulation. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/148803 https://hdl.handle.net/10356/148803 en B3224-201 application/pdf Nanyang Technological University |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Engineering::Electrical and electronic engineering |
spellingShingle |
Engineering::Electrical and electronic engineering Cheng, Jason Kuan Yong Investigating sim-to-real transfer for reinforcement learning-based robotic manipulation |
description |
In this project, model-free Deep Reinforcement Learning (DRL) algorithms were implemented to solve complex robotic environments, covering both low-dimensional and high-dimensional robotic tasks. Low-dimensional tasks have state inputs that are numeric values such as robotic arm joint angles, positions, and velocities. High-dimensional tasks have state inputs that are images, with camera views of the environment from various angles.
The low-dimensional robotic environments involve the CartPole Continuous, Hopper, Half-Cheetah, and Ant Bullet environments, using PyBullet (Coumans and Bai, 2016–2019) as the back-end physics engine for the robotic simulator. The high-dimensional robotic manipulation tasks involve Open Box, Close Box, Pick-up Cup, and Scoop with Spatula, from the RLBench (James et al., 2020) task implementations.
From the results of the experiments, off-policy algorithms like Deep Deterministic Policy Gradients (DDPG) and Twin-Delayed Deep Deterministic Policy Gradients (TD3) outperformed the other algorithms on low-dimensional tasks because they learn from experience replay, giving them superior sample efficiency compared to on-policy algorithms like Trust Region Policy Optimisation (TRPO) and Proximal Policy Optimisation (PPO).
For the high-dimensional environments, only the Option-Critic algorithm was able to solve some of the environments, such as Open Box and Close Box. Off-policy algorithms did not perform well because of the high memory cost of holding images in the experience replay buffer, so the agent could not learn effectively from replayed transitions. On-policy algorithms were also unable to learn well in high-dimensional environments, as they could not generalise from the sparse reward signals. None of the algorithms implemented was able to solve the more complex manipulation tasks like Scoop with Spatula and Pick-up Cup, as the agents could not reach the sparse reward through random exploration and therefore had no signal to learn from. |
author2 |
Soong Boon Hee |
author_facet |
Soong Boon Hee Cheng, Jason Kuan Yong |
format |
Final Year Project |
author |
Cheng, Jason Kuan Yong |
author_sort |
Cheng, Jason Kuan Yong |
title |
Investigating sim-to-real transfer for reinforcement learning-based robotic manipulation |
title_short |
Investigating sim-to-real transfer for reinforcement learning-based robotic manipulation |
title_full |
Investigating sim-to-real transfer for reinforcement learning-based robotic manipulation |
title_fullStr |
Investigating sim-to-real transfer for reinforcement learning-based robotic manipulation |
title_full_unstemmed |
Investigating sim-to-real transfer for reinforcement learning-based robotic manipulation |
title_sort |
investigating sim-to-real transfer for reinforcement learning-based robotic manipulation |
publisher |
Nanyang Technological University |
publishDate |
2021 |
url |
https://hdl.handle.net/10356/148803 |
_version_ |
1772827324986687488 |
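The abstract's conclusion is that the harder manipulation tasks went unsolved because a randomly exploring agent almost never stumbles onto the single sparse reward it needs to start learning. A toy illustration of that failure mode, using a hypothetical 1-D chain environment (not a task from the project), is sketched below: reward is given only when the agent reaches the far end of the chain, and a uniform-random policy reaches it in only a small fraction of episodes.

```python
import random


def random_episode(chain_length, horizon, rng):
    """One episode of a uniform-random agent on a 1-D chain.

    Hypothetical toy environment for illustration: the agent starts at
    state 0, moves left or right at random (reflecting at 0), and earns
    the single sparse reward only by reaching state `chain_length`.
    """
    state = 0
    for _ in range(horizon):
        state += 1 if rng.random() < 0.5 else -1  # uniform random policy
        state = max(state, 0)                      # reflect at the left wall
        if state == chain_length:
            return 1.0  # sparse reward: only at the goal
    return 0.0  # every other outcome yields no learning signal at all


rng = random.Random(0)
episodes = 10_000
hits = sum(random_episode(chain_length=12, horizon=50, rng=rng)
           for _ in range(episodes))
print(f"goal reached in {hits:.0f}/{episodes} episodes")
```

Because almost every episode returns zero, the gradient estimates of an on-policy learner carry no information, which is consistent with the abstract's observation that neither random exploration nor on-policy generalisation sufficed for Scoop with Spatula and Pick-up Cup.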