Probabilistic Inference Techniques for Scalable Multiagent Decision Making

Bibliographic Details
Main Authors: KUMAR, Akshat; ZILBERSTEIN, Shlomo; TOUSSAINT, Marc
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2015
Online Access: https://ink.library.smu.edu.sg/sis_research/3076
https://ink.library.smu.edu.sg/context/sis_research/article/4076/viewcontent/10944_Article_Text_20419_1_10_20180216.pdf
Institution: Singapore Management University
Description
Summary: Decentralized POMDPs provide an expressive framework for multiagent sequential decision making. However, the complexity of these models (NEXP-complete even for two agents) has limited their scalability. We present a promising new class of approximation algorithms by developing novel connections between multiagent planning and machine learning. We show how the multiagent planning problem can be reformulated as inference in a mixture of dynamic Bayesian networks (DBNs). This planning-as-inference approach paves the way for the application of efficient inference techniques in DBNs to multiagent decision making. To further improve scalability, we identify certain conditions that are sufficient to extend the approach to multiagent systems with dozens of agents. Specifically, we show that the necessary inference within the expectation-maximization framework can be decomposed into processes that often involve a small subset of agents, thereby facilitating scalability. We further show that a number of existing multiagent planning models satisfy these conditions. Experiments on large planning benchmarks confirm the benefits of our approach in terms of runtime and scalability with respect to existing techniques.
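Note: The reformulation mentioned in the summary can be illustrated with the standard planning-as-inference construction. The sketch below is not taken from this record; the notation (policy parameters \theta, discount factor \gamma, reward bounds R_min and R_max, binary reward variable \hat{r}, horizon variable T) is assumed here for illustration only. Rewards are rescaled to act as the probability of a binary event, one DBN is built per horizon T, and the DBNs are mixed with a geometric prior over T. Under these assumptions the mixture likelihood is an affine function of the policy value, so maximizing the likelihood (for example with expectation-maximization) also maximizes the value.

```latex
% Assumed notation, for illustration only (not from the catalog record).
% Rescaled reward as the probability of a binary event:
\[
  P(\hat{r} = 1 \mid s, a) \;=\; \frac{R(s,a) - R_{\min}}{R_{\max} - R_{\min}}
\]
% Geometric prior over the horizon T of each DBN in the mixture:
\[
  P(T) \;=\; (1 - \gamma)\,\gamma^{T}, \qquad T = 0, 1, 2, \dots
\]
% Likelihood of observing \hat{r} = 1 in the mixture, under policy parameters \theta:
\[
  L(\theta) \;=\; \sum_{T=0}^{\infty} P(T)\, P(\hat{r} = 1 \mid T; \theta)
\]
% Relation to the discounted infinite-horizon value V(\theta):
\[
  V(\theta) \;=\; \frac{R_{\max} - R_{\min}}{1 - \gamma}\, L(\theta) \;+\; \frac{R_{\min}}{1 - \gamma}
\]
```

Since the coefficient of L(\theta) is positive whenever R_max > R_min and 0 <= \gamma < 1, any parameter update that increases the likelihood also increases the value.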