Multi-agent deep deterministic policy gradient algorithm for swarm systems
Main Author: | |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2021 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/148106 |
Institution: | Nanyang Technological University |
Summary: | This paper demonstrates the need for more suitable decentralized reinforcement learning methods for multi-agent and swarm systems. It explores one existing algorithm for multi-agent domains, Multi-Agent Deep Deterministic Policy Gradient (MADDPG), and then extends it to swarm systems.
The paper begins by analyzing why traditional algorithms struggle in multi-agent and swarm systems: Q-learning is challenged by the inherent non-stationarity of the environment, while policy gradient methods suffer from variance that increases as the number of agents grows.
It then presents an existing adaptation of actor-critic methods that takes the action policies of other agents into account and can learn policies that require complex multi-agent coordination.
This adaptation is extended to swarm systems via swarm parameter tuning, and the feasibility of the resulting algorithm for swarm systems is analyzed.
The results are discussed and future improvements are suggested. They indicate that MADDPG can be a good algorithm for swarm systems in the chosen environment on a small scale. However, further studies are needed to realize the full potential of MADDPG for swarm systems. |
---|
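The summary's point that policy gradient variance grows with the number of agents can be illustrated with a toy setting. The sketch below is not taken from the paper; it assumes a hypothetical cooperative task in which each agent independently picks a binary action with probability 0.5 and the shared reward is 1 only when every agent picks action 1. A single-sample policy gradient estimate without a baseline is scaled by the reward, so it is non-zero only for rewarded samples, and those become exponentially rarer as agents are added. The function names are illustrative.

```python
import random


def exact_rate(n_agents):
    """Probability that all agents pick action 1 when each does so with p = 0.5."""
    return 0.5 ** n_agents


def informative_sample_rate(n_agents, n_samples=20000, seed=0):
    """Monte Carlo estimate of the fraction of joint actions with non-zero reward."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        # Each agent independently picks action 1 with probability 0.5.
        if all(rng.random() < 0.5 for _ in range(n_agents)):
            # Assumed toy reward: 1 only when every agent picked action 1.
            hits += 1
    return hits / n_samples


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(n, exact_rate(n), informative_sample_rate(n))
```

Under these assumptions the share of samples carrying any learning signal halves with each added agent, which is one simple way to see why per-agent gradient estimates become noisier at swarm scale.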