Credit assignment for collective multiagent RL with global rewards

Scaling decision-theoretic planning to large multiagent systems is challenging due to uncertainty and partial observability in the environment. We focus on a multiagent planning model subclass, relevant to urban settings, where agent interactions depend on their "collective influence"...

Bibliographic Details
Main Authors: NGUYEN, Duc Thien, KUMAR, Akshat, LAU, Hoong Chuin
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2018
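Subjects: Credit assignment methods; Decision-theoretic planning; Faster convergence; High-quality solutions; Multi-agent patrolling; Multi-agent planning; Partial observability; Real-world problem; Artificial Intelligence and Robotics; Operations Research, Systems Engineering and Industrial Engineering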
Online Access: https://ink.library.smu.edu.sg/sis_research/4287
https://ink.library.smu.edu.sg/context/sis_research/article/5290/viewcontent/NIPS_2018_Credit_Assignment_For_Collective_Multiagent_RL.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-5290
record_format dspace
spelling sg-smu-ink.sis_research-5290 2020-03-24T05:33:07Z 2018-12-01T08:00:00Z text application/pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Credit assignment methods
Decision-theoretic planning
Faster convergence
High-quality solutions
Multi-agent patrolling
Multi-agent planning
Partial observability
Real-world problem
Artificial Intelligence and Robotics
Operations Research, Systems Engineering and Industrial Engineering
description Scaling decision-theoretic planning to large multiagent systems is challenging due to uncertainty and partial observability in the environment. We focus on a multiagent planning model subclass, relevant to urban settings, where agent interactions depend on their "collective influence" on each other rather than on their identities. Unlike previous work, we address a general setting where the system reward is not decomposable among agents. We develop collective actor-critic RL approaches for this setting and address the problems of multiagent credit assignment and of computing low-variance policy gradient estimates, which result in faster convergence to high-quality solutions. We also develop difference-rewards-based credit assignment methods for the collective setting. Empirically, our new approaches provide significantly better solutions than previous methods in the presence of global rewards on two real-world problems modeling taxi fleet optimization and multiagent patrolling, and on a synthetic grid navigation domain.
format text
author NGUYEN, Duc Thien
KUMAR, Akshat
LAU, Hoong Chuin
title Credit assignment for collective multiagent RL with global rewards
publisher Institutional Knowledge at Singapore Management University
publishDate 2018
url https://ink.library.smu.edu.sg/sis_research/4287
https://ink.library.smu.edu.sg/context/sis_research/article/5290/viewcontent/NIPS_2018_Credit_Assignment_For_Collective_Multiagent_RL.pdf
_version_ 1770574600181121024