Dual formulations for optimizing Dec-POMDP controllers

Decentralized POMDP is an expressive model for multi-agent planning. Finite-state controllers (FSCs)---often used to represent policies for infinite-horizon problems---offer a compact, simple-to-execute policy representation. We exploit novel connections between optimizing decentralized FSCs and the...

Full description

Saved in:

Bibliographic Details
Main Authors:	Akshat KUMAR, MOSTAFA, Hala, ZILBERSTEIN, Shlomo
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2016
Subjects:	Maximum principle Message passing Multi agent systems Scheduling Solar concentrators Stochastic systems Artificial Intelligence and Robotics Computer Sciences Operations Research, Systems Engineering and Industrial Engineering
Online Access:	https://ink.library.smu.edu.sg/sis_research/3395 https://ink.library.smu.edu.sg/context/sis_research/article/4396/viewcontent/DualFormulations.pdf
Tags:	Add Tag No Tags, Be the first to tag this record!
Institution:	Singapore Management University
Language:	English

id	sg-smu-ink.sis_research-4396
record_format	dspace
spelling	sg-smu-ink.sis_research-43962018-06-27T06:04:00Z Dual formulations for optimizing Dec-POMDP controllers Akshat KUMAR, MOSTAFA, Hala ZILBERSTEIN, Shlomo Decentralized POMDP is an expressive model for multi-agent planning. Finite-state controllers (FSCs)---often used to represent policies for infinite-horizon problems---offer a compact, simple-to-execute policy representation. We exploit novel connections between optimizing decentralized FSCs and the dual linear program for MDPs. Consequently, we describe a dual mixed integer linear program (MIP) for optimizing deterministic FSCs. We exploit the Dec-POMDP structure to devise a compact MIP and formulate constraints that result in policies executable in partially-observable decentralized settings. We show analytically that the dual formulation can also be exploited within the expectation maximization (EM) framework to optimize stochastic FSCs. The resulting EM algorithm can be implemented by solving a sequence of linear programs, without requiring expensive message-passing over the Dec-POMDP DBN. We also present an efficient technique for policy improvement based on a weighted entropy measure. Compared with state-of-the-art FSC methods, our approach offers over an order-of-magnitude speedup, while producing similar or better solutions. 2016-06-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/3395 https://ink.library.smu.edu.sg/context/sis_research/article/4396/viewcontent/DualFormulations.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Maximum principle Message passing Multi agent systems Scheduling Solar concentrators Stochastic systems Artificial Intelligence and Robotics Computer Sciences Operations Research, Systems Engineering and Industrial Engineering
institution	Singapore Management University
building	SMU Libraries
continent	Asia
country	Singapore Singapore
content_provider	SMU Libraries
collection	InK@SMU
language	English
topic	Maximum principle Message passing Multi agent systems Scheduling Solar concentrators Stochastic systems Artificial Intelligence and Robotics Computer Sciences Operations Research, Systems Engineering and Industrial Engineering
spellingShingle	Maximum principle Message passing Multi agent systems Scheduling Solar concentrators Stochastic systems Artificial Intelligence and Robotics Computer Sciences Operations Research, Systems Engineering and Industrial Engineering Akshat KUMAR, MOSTAFA, Hala ZILBERSTEIN, Shlomo Dual formulations for optimizing Dec-POMDP controllers
description	Decentralized POMDP is an expressive model for multi-agent planning. Finite-state controllers (FSCs)---often used to represent policies for infinite-horizon problems---offer a compact, simple-to-execute policy representation. We exploit novel connections between optimizing decentralized FSCs and the dual linear program for MDPs. Consequently, we describe a dual mixed integer linear program (MIP) for optimizing deterministic FSCs. We exploit the Dec-POMDP structure to devise a compact MIP and formulate constraints that result in policies executable in partially-observable decentralized settings. We show analytically that the dual formulation can also be exploited within the expectation maximization (EM) framework to optimize stochastic FSCs. The resulting EM algorithm can be implemented by solving a sequence of linear programs, without requiring expensive message-passing over the Dec-POMDP DBN. We also present an efficient technique for policy improvement based on a weighted entropy measure. Compared with state-of-the-art FSC methods, our approach offers over an order-of-magnitude speedup, while producing similar or better solutions.
format	text
author	Akshat KUMAR, MOSTAFA, Hala ZILBERSTEIN, Shlomo
author_facet	Akshat KUMAR, MOSTAFA, Hala ZILBERSTEIN, Shlomo
author_sort	Akshat KUMAR,
title	Dual formulations for optimizing Dec-POMDP controllers
title_short	Dual formulations for optimizing Dec-POMDP controllers
title_full	Dual formulations for optimizing Dec-POMDP controllers
title_fullStr	Dual formulations for optimizing Dec-POMDP controllers
title_full_unstemmed	Dual formulations for optimizing Dec-POMDP controllers
title_sort	dual formulations for optimizing dec-pomdp controllers
publisher	Institutional Knowledge at Singapore Management University
publishDate	2016
url	https://ink.library.smu.edu.sg/sis_research/3395 https://ink.library.smu.edu.sg/context/sis_research/article/4396/viewcontent/DualFormulations.pdf
_version_	1770573157002903552

Dual formulations for optimizing Dec-POMDP controllers

Similar Items