Multi-agent deep deterministic policy gradient algorithm for swarm systems

This paper demonstrates the need to develop more suitable decentralized reinforcement learning methods for multi-agent and swarm systems, and consequently explores one such pre-existing algorithm, Multi-Agent Deep Deterministic Policy Gradient (MADDPG), for multi-agent domains and then extends it t...

Bibliographic Details
Main Author:	Bedi, Jannat
Other Authors:	Zinovi Rabinovich
Format:	Final Year Project
Language:	English
Published:	Nanyang Technological University 2021
Subjects:	Engineering::Computer science and engineering
Online Access:	https://hdl.handle.net/10356/148106
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-148106
record_format	dspace
spelling	sg-ntu-dr.10356-1481062021-04-23T13:57:02Z Multi-agent deep deterministic policy gradient algorithm for swarm systems Bedi, Jannat Zinovi Rabinovich School of Computer Science and Engineering zinovi@ntu.edu.sg Engineering::Computer science and engineering This paper demonstrates the need to develop more suitable decentralized reinforcement learning methods for multi-agents and swarm systems, and consequently explores one such pre-existing algorithm (Multi-Agent Deep Deterministic Policy Gradient - MADDPG) for multi-agent domains and then extends it to swarm systems. The paper begins by analyzing the difficulty of traditional algorithms in multi-agent and swarm systems: Q-learning is challenged by an inherent non-stationarity of the environment, while policy gradient suffers from a variance that increases as the number of agents grows. It presents an existing adaptation of actor-critic methods that considers action policies of other agents and is able to successfully learn policies that require complex multi-agent coordination. This adaptation is then extended to swarm systems via swarm parameter tuning and the feasibility of the new algorithm for swarm systems is analysed. The results are discussed and future improvements are suggested. MADDPG can prove to be a good algorithm for swarm systems in the chosen environment on a small scale. However, further studies need to be done to extract the full potential of MADDPG algorithms for swarm systems. Bachelor of Engineering (Computer Science) 2021-04-23T13:57:01Z 2021-04-23T13:57:01Z 2021 Final Year Project (FYP) Bedi, J. (2021). Multi-agent deep deterministic policy gradient algorithm for swarm systems. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/148106 https://hdl.handle.net/10356/148106 en SCSE20-0481 application/pdf Nanyang Technological University
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	Engineering::Computer science and engineering
spellingShingle	Engineering::Computer science and engineering Bedi, Jannat Multi-agent deep deterministic policy gradient algorithm for swarm systems
description	This paper demonstrates the need to develop more suitable decentralized reinforcement learning methods for multi-agent and swarm systems. It explores one such pre-existing algorithm for multi-agent domains, Multi-Agent Deep Deterministic Policy Gradient (MADDPG), and then extends it to swarm systems. The paper begins by analyzing why traditional algorithms struggle in multi-agent and swarm systems: Q-learning is challenged by the inherent non-stationarity of the environment, while policy gradient methods suffer from variance that increases as the number of agents grows. It presents an existing adaptation of actor-critic methods that considers the action policies of other agents and can successfully learn policies that require complex multi-agent coordination. This adaptation is then extended to swarm systems via swarm parameter tuning, and the feasibility of the new algorithm for swarm systems is analysed. The results are discussed and future improvements are suggested. The results suggest that MADDPG can be a good algorithm for swarm systems in the chosen environment on a small scale. However, further studies are needed to extract the full potential of MADDPG algorithms for swarm systems.
author2	Zinovi Rabinovich
author_facet	Zinovi Rabinovich Bedi, Jannat
format	Final Year Project
author	Bedi, Jannat
author_sort	Bedi, Jannat
title	Multi-agent deep deterministic policy gradient algorithm for swarm systems
title_short	Multi-agent deep deterministic policy gradient algorithm for swarm systems
title_full	Multi-agent deep deterministic policy gradient algorithm for swarm systems
title_fullStr	Multi-agent deep deterministic policy gradient algorithm for swarm systems
title_full_unstemmed	Multi-agent deep deterministic policy gradient algorithm for swarm systems
title_sort	multi-agent deep deterministic policy gradient algorithm for swarm systems
publisher	Nanyang Technological University
publishDate	2021
url	https://hdl.handle.net/10356/148106
_version_	1698713724610674688
