Credit assignment for collective multiagent RL with global rewards

Scaling decision theoretic planning to large multiagent systems is challenging due to uncertainty and partial observability in the environment. We focus on a multiagent planning model subclass, relevant to urban settings, where agent interactions are dependent on their collective influence'...

Bibliographic Details
Main Authors:	NGUYEN, Duc Thien, KUMAR, Akshat, LAU, Hoong Chuin
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2018
Subjects:	Credit assignment methods Decision-theoretic planning Faster convergence High-quality solutions Multi-agent patrolling Multi-agent planning Partial observability Real-world problem Artificial Intelligence and Robotics Operations Research, Systems Engineering and Industrial Engineering
Online Access:	https://ink.library.smu.edu.sg/sis_research/4287 https://ink.library.smu.edu.sg/context/sis_research/article/5290/viewcontent/NIPS_2018_Credit_Assignment_For_Collective_Multiagent_RL.pdf
Institution:	Singapore Management University

id	sg-smu-ink.sis_research-5290
record_format	dspace
spelling	sg-smu-ink.sis_research-5290 2020-03-24T05:33:07Z Credit assignment for collective multiagent RL with global rewards NGUYEN, Duc Thien KUMAR, Akshat LAU, Hoong Chuin Scaling decision theoretic planning to large multiagent systems is challenging due to uncertainty and partial observability in the environment. We focus on a multiagent planning model subclass, relevant to urban settings, where agent interactions are dependent on their "collective influence" on each other, rather than their identities. Unlike previous work, we address a general setting where system reward is not decomposable among agents. We develop collective actor-critic RL approaches for this setting, and address the problem of multiagent credit assignment, and computing low variance policy gradient estimates that result in faster convergence to high quality solutions. We also develop difference rewards based credit assignment methods for the collective setting. Empirically our new approaches provide significantly better solutions than previous methods in the presence of global rewards on two real world problems modeling taxi fleet optimization and multiagent patrolling, and a synthetic grid navigation domain. 2018-12-01T08:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/4287 https://ink.library.smu.edu.sg/context/sis_research/article/5290/viewcontent/NIPS_2018_Credit_Assignment_For_Collective_Multiagent_RL.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Credit assignment methods Decision-theoretic planning Faster convergence High-quality solutions Multi-agent patrolling Multi-agent planning Partial observability Real-world problem Artificial Intelligence and Robotics Operations Research, Systems Engineering and Industrial Engineering
institution	Singapore Management University
building	SMU Libraries
continent	Asia
country	Singapore Singapore
content_provider	SMU Libraries
collection	InK@SMU
language	English
topic	Credit assignment methods Decision-theoretic planning Faster convergence High-quality solutions Multi-agent patrolling Multi-agent planning Partial observability Real-world problem Artificial Intelligence and Robotics Operations Research, Systems Engineering and Industrial Engineering
spellingShingle	Credit assignment methods Decision-theoretic planning Faster convergence High-quality solutions Multi-agent patrolling Multi-agent planning Partial observability Real-world problem Artificial Intelligence and Robotics Operations Research, Systems Engineering and Industrial Engineering NGUYEN, Duc Thien KUMAR, Akshat LAU, Hoong Chuin Credit assignment for collective multiagent RL with global rewards
description	Scaling decision-theoretic planning to large multiagent systems is challenging due to uncertainty and partial observability in the environment. We focus on a multiagent planning model subclass, relevant to urban settings, where agent interactions are dependent on their "collective influence" on each other, rather than their identities. Unlike previous work, we address a general setting where system reward is not decomposable among agents. We develop collective actor-critic RL approaches for this setting, and address the problems of multiagent credit assignment and of computing low-variance policy gradient estimates that result in faster convergence to high-quality solutions. We also develop difference-rewards-based credit assignment methods for the collective setting. Empirically, our new approaches provide significantly better solutions than previous methods in the presence of global rewards on two real-world problems modeling taxi fleet optimization and multiagent patrolling, and a synthetic grid navigation domain.
format	text
author	NGUYEN, Duc Thien KUMAR, Akshat LAU, Hoong Chuin
author_facet	NGUYEN, Duc Thien KUMAR, Akshat LAU, Hoong Chuin
author_sort	NGUYEN, Duc Thien
title	Credit assignment for collective multiagent RL with global rewards
title_sort	credit assignment for collective multiagent rl with global rewards
publisher	Institutional Knowledge at Singapore Management University
publishDate	2018
url	https://ink.library.smu.edu.sg/sis_research/4287 https://ink.library.smu.edu.sg/context/sis_research/article/5290/viewcontent/NIPS_2018_Credit_Assignment_For_Collective_Multiagent_RL.pdf
_version_	1770574600181121024

Similar Items