Reinforcement learning and dynamic motion primitives

Multi-agent algorithms in Reinforcement Learning closely approximate real-world scenarios in which agents in an unpredictable environment engage in a complex interplay of competition and collaboration. MultiAgent POsthumous Credit Assignment (MA-POCA) is a novel algorithm...

Bibliographic Details
Main Author:	Mudgal, Saurabh
Other Authors:	Domenico Campolo
Format:	Final Year Project
Language:	English
Published:	Nanyang Technological University 2021
Subjects:	Engineering::Mechanical engineering
Online Access:	https://hdl.handle.net/10356/150858
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-150858
record_format	dspace
spelling	sg-ntu-dr.10356-1508582021-06-03T06:21:44Z Reinforcement learning and dynamic motion primitives Mudgal, Saurabh Domenico Campolo School of Mechanical and Aerospace Engineering d.campolo@ntu.edu.sg Engineering::Mechanical engineering Multi-agent algorithms in Reinforcement Learning closely approximate real-world scenarios in which agents in an unpredictable environment engage in a complex interplay of competition and collaboration. MultiAgent POsthumous Credit Assignment (MA-POCA) is a novel algorithm by Unity that has the potential to adapt the theories of multi-agent Reinforcement Learning to industrial applications. In this thesis, we study the underlying theory and literature of Reinforcement Learning that lead to such a sophisticated algorithm. Following that, we run evaluative experiments implementing the MA-POCA algorithm in simulated multi-agent environments. We discover that MA-POCA uses a fixed ratio parameter to balance collaborative and competitive self-play. This introduces problems similar to those seen in Trust Region Policy Optimization (TRPO), which can be fixed using concepts from Proximal Policy Optimization (PPO). Further work is suggested to benchmark performance improvements from such modifications. Bachelor of Engineering (Mechanical Engineering) 2021-06-03T06:21:44Z 2021-06-03T06:21:44Z 2021 Final Year Project (FYP) Mudgal, S. (2021). Reinforcement learning and dynamic motion primitives. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/150858 https://hdl.handle.net/10356/150858 en application/pdf Nanyang Technological University
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	Engineering::Mechanical engineering
spellingShingle	Engineering::Mechanical engineering Mudgal, Saurabh Reinforcement learning and dynamic motion primitives
description	Multi-agent algorithms in Reinforcement Learning closely approximate real-world scenarios in which agents in an unpredictable environment engage in a complex interplay of competition and collaboration. MultiAgent POsthumous Credit Assignment (MA-POCA) is a novel algorithm by Unity that has the potential to adapt the theories of multi-agent Reinforcement Learning to industrial applications. In this thesis, we study the underlying theory and literature of Reinforcement Learning that lead to such a sophisticated algorithm. Following that, we run evaluative experiments implementing the MA-POCA algorithm in simulated multi-agent environments. We discover that MA-POCA uses a fixed ratio parameter to balance collaborative and competitive self-play. This introduces problems similar to those seen in Trust Region Policy Optimization (TRPO), which can be fixed using concepts from Proximal Policy Optimization (PPO). Further work is suggested to benchmark performance improvements from such modifications.
author2	Domenico Campolo
author_facet	Domenico Campolo Mudgal, Saurabh
format	Final Year Project
author	Mudgal, Saurabh
author_sort	Mudgal, Saurabh
title	Reinforcement learning and dynamic motion primitives
title_short	Reinforcement learning and dynamic motion primitives
title_full	Reinforcement learning and dynamic motion primitives
title_fullStr	Reinforcement learning and dynamic motion primitives
title_full_unstemmed	Reinforcement learning and dynamic motion primitives
title_sort	reinforcement learning and dynamic motion primitives
publisher	Nanyang Technological University
publishDate	2021
url	https://hdl.handle.net/10356/150858
_version_	1702431197234200576
