Multi-agent deep deterministic policy gradient algorithm for swarm systems

This paper demonstrates the need to develop more suitable decentralized reinforcement learning methods for multi-agents and swarm systems, and consequently explores one such pre-existing algorithm (Multi-Agent Deep Deterministic Policy Gradient - MADDPG) for multi-agent domains and then extends it t...

Bibliographic Details
Main Author: Bedi, Jannat
Other Authors: Zinovi Rabinovich
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2021
Subjects: Engineering::Computer science and engineering
Online Access: https://hdl.handle.net/10356/148106
Institution: Nanyang Technological University
id sg-ntu-dr.10356-148106
record_format dspace
spelling sg-ntu-dr.10356-148106; 2021-04-23T13:57:02Z; Multi-agent deep deterministic policy gradient algorithm for swarm systems; Bedi, Jannat; Zinovi Rabinovich; School of Computer Science and Engineering; zinovi@ntu.edu.sg; Engineering::Computer science and engineering; Bachelor of Engineering (Computer Science); 2021-04-23T13:57:01Z; 2021-04-23T13:57:01Z; 2021; Final Year Project (FYP); Bedi, J. (2021). Multi-agent deep deterministic policy gradient algorithm for swarm systems. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/148106; https://hdl.handle.net/10356/148106; en; SCSE20-0481; application/pdf; Nanyang Technological University
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Engineering::Computer science and engineering
spellingShingle Engineering::Computer science and engineering
Bedi, Jannat
Multi-agent deep deterministic policy gradient algorithm for swarm systems
description This paper argues that multi-agent and swarm systems need better-suited decentralized reinforcement learning methods. It examines one existing algorithm for multi-agent domains, Multi-Agent Deep Deterministic Policy Gradient (MADDPG), and then extends it to swarm systems. The paper begins by analyzing why traditional algorithms struggle in multi-agent and swarm settings: Q-learning is challenged by the inherent non-stationarity of the environment, while policy gradient methods suffer from variance that increases as the number of agents grows. It then presents an existing adaptation of actor-critic methods that takes the action policies of other agents into account and can learn policies requiring complex multi-agent coordination. This adaptation is extended to swarm systems through swarm parameter tuning, and the feasibility of the resulting algorithm for swarm systems is analysed. Results are discussed and future improvements are suggested. MADDPG appears to be a good algorithm for swarm systems in the chosen environment at small scale; however, further studies are needed to realize the full potential of MADDPG for swarm systems.
author2 Zinovi Rabinovich
author_facet Zinovi Rabinovich
Bedi, Jannat
format Final Year Project
author Bedi, Jannat
author_sort Bedi, Jannat
title Multi-agent deep deterministic policy gradient algorithm for swarm systems
title_short Multi-agent deep deterministic policy gradient algorithm for swarm systems
title_full Multi-agent deep deterministic policy gradient algorithm for swarm systems
title_fullStr Multi-agent deep deterministic policy gradient algorithm for swarm systems
title_full_unstemmed Multi-agent deep deterministic policy gradient algorithm for swarm systems
title_sort multi-agent deep deterministic policy gradient algorithm for swarm systems
publisher Nanyang Technological University
publishDate 2021
url https://hdl.handle.net/10356/148106
_version_ 1698713724610674688
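
The description refers to an actor-critic adaptation in which the critic considers the actions of other agents. The sketch below illustrates only that structural idea: each actor acts on its own observation, while the critic scores the joint observations and actions of all agents. It is not the project's implementation. The linear critic, the tanh actor, the dimensions OBS_DIM and ACT_DIM, and all function names are assumptions chosen for illustration; the replay buffer, target networks, and neural networks of a full MADDPG setup are omitted.

```python
import math
import random

OBS_DIM = 3  # assumed per-agent observation size (illustrative)
ACT_DIM = 2  # assumed per-agent action size (illustrative)


def make_actor(rng):
    """Random weights for a linear-tanh deterministic policy (ACT_DIM x OBS_DIM)."""
    return [[rng.uniform(-1, 1) for _ in range(OBS_DIM)] for _ in range(ACT_DIM)]


def act(actor, obs):
    """Decentralized execution: the action depends only on the agent's own observation."""
    return [math.tanh(sum(w * o for w, o in zip(row, obs))) for row in actor]


def critic_input(all_obs, all_actions):
    """Centralized training: concatenate every agent's observation and action."""
    flat = []
    for obs in all_obs:
        flat.extend(obs)
    for action in all_actions:
        flat.extend(action)
    return flat


def make_critic(n_agents, rng):
    """Linear critic weights over the joint observation-action vector (an assumption)."""
    return [rng.uniform(-1, 1) for _ in range(n_agents * (OBS_DIM + ACT_DIM))]


def q_value(critic, all_obs, all_actions):
    """Critic value for the joint observations and actions."""
    joint = critic_input(all_obs, all_actions)
    return sum(w * v for w, v in zip(critic, joint))


def actor_gradient(critic, actor, obs, agent_idx, n_agents):
    """Gradient of Q with respect to one agent's actor weights, via the chain rule.

    dQ/dW[k][j] = dQ/da_k * (1 - a_k^2) * obs[j]. Because the critic here is
    linear, dQ/da_k is simply the critic weight attached to that action slot.
    """
    action = act(actor, obs)
    offset = n_agents * OBS_DIM + agent_idx * ACT_DIM
    grad = []
    for k in range(ACT_DIM):
        dq_da = critic[offset + k]
        dtanh = 1.0 - action[k] ** 2
        grad.append([dq_da * dtanh * o for o in obs])
    return grad


def run_demo(n_agents, seed=0):
    """Build random actors and a critic, then return critic size, Q, and agent 0's gradient."""
    rng = random.Random(seed)
    actors = [make_actor(rng) for _ in range(n_agents)]
    critic = make_critic(n_agents, rng)
    all_obs = [[rng.uniform(-1, 1) for _ in range(OBS_DIM)] for _ in range(n_agents)]
    all_actions = [act(a, o) for a, o in zip(actors, all_obs)]
    q = q_value(critic, all_obs, all_actions)
    grad = actor_gradient(critic, actors[0], all_obs[0], 0, n_agents)
    return len(critic), q, grad
```

In this sketch the critic's input grows linearly with the number of agents (n_agents * (OBS_DIM + ACT_DIM)), which is one practical consideration when moving from a handful of agents to a swarm.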