Investigating of deep reinforcement learning-based techniques for robotic manipulation
Main Author: | |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2022 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/157904 |
Institution: | Nanyang Technological University |
Summary: | This project continues earlier work on reinforcement learning. It will investigate
reinforcement learning-based techniques for high-dimensional robotic manipulation
tasks. In the earlier work, four reinforcement learning algorithms were implemented
and tested on high-dimensional robotic manipulation tasks.
The tasks were Open Box, Close Box, Pick Up Cup, and Scoop with Spatula, from the
RLBench task implementations. In the earlier results, Option-Critic performed best
and was able to solve Open Box and Close Box. However, the Option-Critic algorithm
had learnt to open and close the box by forcing the box open and hitting the lid
closed. This was possible because of a bug in the RLBench collision function that
caused the lid to ignore collisions. The function has been fixed in recent updates to
RLBench, and as a result the algorithm can no longer solve these tasks. We will
therefore proceed on the premise that none of the algorithms can solve any of the
robotic manipulation tasks. The project will focus on the Reach Target and Pick Up
Cup tasks.
Previous work concluded that the sparse reward signal and the choice of
hyperparameters were what prevented the robotic manipulation tasks from being solved.
We will therefore implement a dense reward signal to help the algorithms converge
towards the goal. We will also look into hyperparameter optimization. |
---|
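The contrast between a sparse and a dense reward signal mentioned in the summary can be illustrated with a minimal sketch for a reach-style task. This is not the project's implementation and it does not use the RLBench API. The function names, the 0.05 tolerance, and the success bonus are all hypothetical choices made for illustration, and the dense reward is assumed here to be the negative Euclidean distance between gripper and target.

```python
import math


def sparse_reward(gripper_pos, target_pos, tolerance=0.05):
    """Hypothetical sparse reward: 1.0 only once the target is reached."""
    distance = math.dist(gripper_pos, target_pos)
    return 1.0 if distance <= tolerance else 0.0


def dense_reward(gripper_pos, target_pos, tolerance=0.05, success_bonus=1.0):
    """Hypothetical dense reward: negative distance at every step,
    plus a bonus once the gripper is within `tolerance` of the target."""
    distance = math.dist(gripper_pos, target_pos)
    bonus = success_bonus if distance <= tolerance else 0.0
    return -distance + bonus


if __name__ == "__main__":
    target = (0.5, 0.0, 0.3)
    far = (0.0, 0.0, 0.3)
    near = (0.4, 0.0, 0.3)
    # The sparse signal cannot tell these two positions apart;
    # the dense signal is higher for the closer one.
    print(sparse_reward(far, target), sparse_reward(near, target))
    print(dense_reward(far, target), dense_reward(near, target))
```

Under these assumptions the sparse signal gives the agent no information until it happens to reach the target, whereas the dense signal increases with every step that moves the gripper closer, which is the property expected to help convergence.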