Successor features based multi-agent RL for event-based decentralized MDPs
Decentralized MDPs (Dec-MDPs) provide a rigorous framework for collaborative multi-agent sequential decision-making under uncertainty. However, their computational complexity limits their practical impact. To address this, we focus on a class of Dec-MDPs consisting of independent collaborating agents t...
Main Authors: | GUPTA, Tarun; KUMAR, Akshat; PARUCHURI, Praveen |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2019 |
Subjects: | Software Engineering |
Online Access: | https://ink.library.smu.edu.sg/sis_research/5057 https://ink.library.smu.edu.sg/context/sis_research/article/6060/viewcontent/4561_Article_Text_7600_1_10_20190707.pdf |
Institution: | Singapore Management University |
id |
sg-smu-ink.sis_research-6060 |
---|---|
record_format |
dspace |
spelling |
sg-smu-ink.sis_research-6060; 2020-03-12T07:58:12Z; Successor features based multi-agent RL for event-based decentralized MDPs; GUPTA, Tarun; KUMAR, Akshat; PARUCHURI, Praveen; 2019-01-02T08:00:00Z; text; application/pdf; https://ink.library.smu.edu.sg/sis_research/5057; info:doi/10.1609/aaai.v33i01.33016054; https://ink.library.smu.edu.sg/context/sis_research/article/6060/viewcontent/4561_Article_Text_7600_1_10_20190707.pdf; http://creativecommons.org/licenses/by-nc-nd/4.0/; Research Collection School Of Computing and Information Systems; eng; Institutional Knowledge at Singapore Management University; Software Engineering |
institution |
Singapore Management University |
building |
SMU Libraries |
continent |
Asia |
country |
Singapore |
content_provider |
SMU Libraries |
collection |
InK@SMU |
language |
English |
topic |
Software Engineering |
description |
Decentralized MDPs (Dec-MDPs) provide a rigorous framework for collaborative multi-agent sequential decision-making under uncertainty. However, their computational complexity limits their practical impact. To address this, we focus on a class of Dec-MDPs consisting of independent collaborating agents that are tied together through a global reward function that depends upon their entire histories of states and actions to accomplish joint tasks. To overcome the scalability barrier, our main contributions are: (a) We propose a new actor-critic based Reinforcement Learning (RL) approach for event-based Dec-MDPs using successor features (SF), a value function representation that decouples the dynamics of the environment from the rewards; (b) We then present Dec-ESR (Decentralized Event based Successor Representation), which generalizes learning for event-based Dec-MDPs using SF within an end-to-end deep RL framework; (c) We also show that Dec-ESR allows useful transfer of information across related but different tasks, and hence bootstraps learning for faster convergence on new tasks; (d) For validation, we test our approach on a large multi-agent coverage problem that models schedule coordination of agents in a real urban subway network, where it achieves better quality solutions than previous best approaches. |
format |
text |
author |
GUPTA, Tarun KUMAR, Akshat PARUCHURI, Praveen |
title |
Successor features based multi-agent RL for event-based decentralized MDPs |
publisher |
Institutional Knowledge at Singapore Management University |
publishDate |
2019 |
url |
https://ink.library.smu.edu.sg/sis_research/5057 https://ink.library.smu.edu.sg/context/sis_research/article/6060/viewcontent/4561_Article_Text_7600_1_10_20190707.pdf |
_version_ |
1770575201886535680 |
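
The description field in this record characterizes successor features (SF) as a value function representation that decouples environment dynamics from rewards, and says this enables transfer to related tasks. The sketch below illustrates that general idea only; it is not the Dec-ESR method from the paper. It assumes a single agent, a fixed policy, a three-state tabular chain, one-hot state features, and rewards that are linear in those features. Every name and number in it (P, PHI, GAMMA, W_A, W_B, and the helper functions) is hypothetical and chosen for illustration.

```python
# Tabular successor features under a fixed policy (illustrative assumptions only).
# psi(s) = phi(s) + gamma * sum_s' P[s][s'] * psi(s'), and V(s) = psi(s) . w
# whenever the reward is linear in the features: r(s) = phi(s) . w.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def successor_features(P, Phi, gamma, iters=500):
    """Fixed-point iteration psi <- Phi + gamma * P @ psi."""
    psi = [row[:] for row in Phi]
    for _ in range(iters):
        p_psi = matmul(P, psi)
        psi = [[Phi[i][j] + gamma * p_psi[i][j] for j in range(len(Phi[0]))]
               for i in range(len(Phi))]
    return psi

def values_from_sf(psi, w):
    """State values as the dot product of each SF row with reward weights w."""
    return [sum(p * wi for p, wi in zip(row, w)) for row in psi]

def values_direct(P, r, gamma, iters=500):
    """Ordinary policy evaluation, used here only as a cross-check."""
    v = r[:]
    for _ in range(iters):
        v = [r[i] + gamma * sum(P[i][j] * v[j] for j in range(len(v)))
             for i in range(len(v))]
    return v

# Hypothetical dynamics induced by some fixed policy (rows sum to 1).
P = [[0.9, 0.1, 0.0],
     [0.0, 0.9, 0.1],
     [0.1, 0.0, 0.9]]
PHI = [[1.0, 0.0, 0.0],   # one-hot features: phi(s) is the indicator of state s
       [0.0, 1.0, 0.0],
       [0.0, 0.0, 1.0]]
GAMMA = 0.9

# Dynamics are summarized once in PSI ...
PSI = successor_features(P, PHI, GAMMA)

# ... and reused for two different reward weight vectors (two "tasks").
W_A = [0.0, 0.0, 1.0]   # task A rewards being in state 2
W_B = [1.0, 0.0, 0.0]   # task B rewards being in state 0
V_A = values_from_sf(PSI, W_A)
V_B = values_from_sf(PSI, W_B)

if __name__ == "__main__":
    print("V for task A:", V_A)
    print("V for task B:", V_B)
```

In this toy setting, switching from task A to task B only changes the weight vector; PSI is computed once and reused, which is the kind of reuse the description refers to as transfer. How Dec-ESR realizes this for event-based Dec-MDPs with deep networks is described in the paper itself, not here.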