Curriculum learning for robotic agents
Main Author: Kurkcu, Anil
Other Authors: Domenico Campolo
School: School of Mechanical and Aerospace Engineering, Robotics Research Centre
Format: Thesis - Doctor of Philosophy
Language: English
Published: Nanyang Technological University, 2022
Subjects: Engineering::Mechanical engineering::Robots; Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
Online Access: https://hdl.handle.net/10356/160311
DOI: 10.32657/10356/160311
Citation: Kurkcu, A. (2022). Curriculum learning for robotic agents. Doctoral thesis, Nanyang Technological University, Singapore.
License: Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
Institution: Nanyang Technological University
Description:
Model-free deep RL algorithms, rooted in the concept of tabula rasa, suffer from poor sample efficiency, which is a major obstacle to applying these methods to real-world robotics problems. To improve sample efficiency, curriculum learning leverages prior knowledge so that difficult task setups can be solved with policies obtained from simpler ones. Because this is a more systematic way of learning than starting from scratch, a key question is how to order a set of learning tasks, and deciding on task difficulty is not trivial in robotic learning setups. To this end, this thesis presents three algorithmic contributions for building a curriculum over multiple tasks of varying difficulty.

Our first contribution is an approach that generates a curriculum in a parameter-space task setup with a goal task available for the agent to learn. A double-inverted pendulum setup is used, in which a two-dimensional parameter space is defined by the lengths of the two pendulum links. Parameterizing task difficulty in terms of link lengths, the algorithm starts from the pendulum with the shortest links and builds a curriculum by finding intermediate tasks in the parameter space, as in the sketch below. Results showed that transferring the start-task policy to even a single intermediate task before tackling the goal task is more sample efficient: roughly two-fold compared with a direct transfer from the start task to the goal task, and ten-fold compared with learning the goal task from scratch.
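To make the idea concrete, here is a minimal Python sketch of a parameter-space curriculum over the two link lengths. The straight-line interpolation between the start and goal parameters and the `train_policy` stub are illustrative assumptions; the thesis's algorithm selects intermediate tasks from the parameter space rather than spacing them uniformly.

```python
import numpy as np


def interpolated_curriculum(start_lengths, goal_lengths, n_intermediate=3):
    """Order tasks from easiest (shortest links) to the goal task by stepping
    through the 2-D link-length parameter space.  Intermediate tasks are placed
    on the straight line between start and goal purely for illustration."""
    start = np.asarray(start_lengths, dtype=float)
    goal = np.asarray(goal_lengths, dtype=float)
    alphas = np.linspace(0.0, 1.0, n_intermediate + 2)
    return [tuple(start + a * (goal - start)) for a in alphas]


def train_policy(link_lengths, init_policy=None):
    """Placeholder: train a model-free RL policy on the double-inverted pendulum
    whose link lengths are `link_lengths`, warm-started from `init_policy` when
    one is given.  Any off-the-shelf algorithm could sit here."""
    return {"trained_on": link_lengths, "warm_started_from": init_policy}


if __name__ == "__main__":
    policy = None
    for lengths in interpolated_curriculum((0.1, 0.1), (0.6, 0.6)):
        # Each task reuses the policy of the previous, easier task;
        # this transfer is the source of the sample-efficiency gains.
        policy = train_policy(lengths, init_policy=policy)
        print("trained on link lengths:", lengths)
```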
Our second and third contributions target environments in which the agent must learn several tasks with minimal assumptions about task difficulty, for example grasping objects of varying difficulty with a robotic manipulator. The second curriculum learning algorithm is based on selecting tasks that are neither too easy nor too difficult, i.e. of medium difficulty, so as to maintain positive learning progress. The evaluation score is the fraction of successful grasps over a number of trials. Medium-difficulty tasks are identified by comparing their evaluation scores against upper and lower difficulty thresholds and are then assigned to the agent. The agent keeps training on the same task as long as its evaluation score improves, indicating positive learning progress, until the score is high enough for the task to be marked as learned; a sketch of this loop follows. We apply this approach in a robotic grasping environment with objects of various geometric shapes and show that the manipulator learns to grasp difficult objects with our curriculum learning approach, but not when learning from scratch within a reasonable amount of time.
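The selection and training loop can be sketched as follows. The threshold values, the function names, and the `train`/`evaluate` callables are hypothetical placeholders; the evaluation score stands for the grasp success rate over a fixed number of trials.

```python
def is_medium_difficulty(score, lower=0.2, upper=0.8):
    """A task counts as 'medium difficulty' when its evaluation score lies
    between the two limits.  The 0.2 / 0.8 values are illustrative only."""
    return lower <= score < upper


def curriculum_step(tasks, scores, learned, train, evaluate, learned_threshold=0.9):
    """One round of the medium-difficulty curriculum: pick an unlearned task of
    medium difficulty, keep training on it while its score improves, and mark it
    as learned once the score is high enough."""
    candidates = [t for t in tasks
                  if t not in learned and is_medium_difficulty(scores[t])]
    if not candidates:
        return None
    task = candidates[0]
    prev = scores[task]
    while True:
        train(task)                       # e.g. a batch of grasp attempts
        scores[task] = evaluate(task)     # success rate over recent trials
        if scores[task] >= learned_threshold:
            learned.add(task)             # task mastered
            break
        if scores[task] <= prev:          # no further progress on this task
            break
        prev = scores[task]               # positive progress: keep training
    return task
```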
Our third contribution, an approach named GloCAL, is oriented towards a generalization framework whose aim is few-shot learning of tasks that are similar yet novel to the agent. The algorithm labels tasks as global or local based on clustering of their learning evaluation scores. The global tasks are responsible for transferring the policy from simple to difficult tasks, and also serve as the prior for similar yet novel tasks; a labeling sketch is given below. We compare this approach with a recent curriculum learning algorithm, ALP-GMM, and with a randomly generated curriculum in a robotic grasping environment, and show its generalization capability on similar yet novel objects that did not exist when the curriculum was generated. Results showed that GloCAL not only surpasses existing curriculum learning methods in effectiveness, grasping 100% of the objects while the others converge to 86% despite 1.5x the training time, but also performs few-shot transfer to similar yet novel objects.
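The global/local split can be illustrated with a small sketch. The abstract does not specify the clustering procedure or the transfer mechanics, so the plain 1-D two-means grouping below and the rule of marking the best-scoring task of each cluster as "global" are assumptions made purely for illustration.

```python
import numpy as np


def label_global_local(scores, n_clusters=2, n_iters=50):
    """Cluster tasks by evaluation score (naive 1-D k-means) and label the
    best-scoring task of each cluster 'global'; the rest are 'local'.
    Global tasks would carry the policy from simpler to harder clusters and
    serve as priors for similar yet novel tasks."""
    names = list(scores)
    values = np.array([scores[n] for n in names], dtype=float)

    centers = np.linspace(values.min(), values.max(), n_clusters)
    for _ in range(n_iters):
        assign = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        for k in range(n_clusters):
            if np.any(assign == k):
                centers[k] = values[assign == k].mean()

    labels = {}
    for k in range(n_clusters):
        members = [n for n, a in zip(names, assign) if a == k]
        if not members:
            continue
        best = max(members, key=lambda n: scores[n])
        for n in members:
            labels[n] = "global" if n == best else "local"
    return labels


if __name__ == "__main__":
    # Hypothetical grasp-success scores for four object shapes.
    demo_scores = {"cube": 0.9, "cylinder": 0.8, "mug": 0.3, "teapot": 0.2}
    print(label_global_local(demo_scores))
```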