History-Based Controller Design and Optimization for Partially Observable MDPs

Bibliographic Details
Main Authors: KUMAR, Akshat, ZILBERSTEIN, Shlomo
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2015
Subjects:
Online Access:https://ink.library.smu.edu.sg/sis_research/2915
https://ink.library.smu.edu.sg/context/sis_research/article/3915/viewcontent/History_Based_Controller_Design_afv.pdf
Institution: Singapore Management University
Description
Summary: Partially observable MDPs provide an elegant framework for sequential decision making. Finite-state controllers (FSCs) are often used to represent policies for infinite-horizon problems as they offer a compact representation, simple-to-execute plans, and an adjustable tradeoff between computational complexity and policy size. We develop novel connections between optimizing FSCs for POMDPs and the dual linear program for MDPs. Building on that, we present a dual mixed integer linear program (MIP) for optimizing FSCs. To assign well-defined meaning to FSC nodes as well as to aid policy search, we show how to associate history-based features with each FSC node. Using this representation, we address another challenging problem: iteratively deciding which nodes to add to the FSC to obtain a better policy. Using an efficient off-the-shelf MIP solver, we show that this new approach can find compact near-optimal FSCs for several large benchmark domains, and is competitive with previous best approaches.
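
For reference, the "dual linear program for MDPs" mentioned in the summary is commonly written in terms of discounted state-action occupancy measures. The sketch below is standard textbook background rather than the paper's own formulation; the symbols x(s,a) (occupancy measure), R (reward), P (transition probabilities), alpha (initial-state distribution), and gamma (discount factor) are assumed notation, not taken from this record.

% Background sketch: standard dual LP for a discounted infinite-horizon MDP.
% x(s,a) is the discounted occupancy measure; notation assumed, not from the paper.
\begin{align*}
\max_{x \ge 0} \quad & \sum_{s \in S} \sum_{a \in A} x(s,a)\, R(s,a) \\
\text{s.t.} \quad & \sum_{a \in A} x(s',a) \;=\; \alpha(s') \;+\; \gamma \sum_{s \in S} \sum_{a \in A} P(s' \mid s,a)\, x(s,a) \qquad \forall\, s' \in S
\end{align*}

An optimal policy can be read off the solution as pi(a | s) proportional to x(s,a); the paper's contribution, per the summary, is to connect this dual view to optimizing finite-state controllers for POMDPs via a mixed integer linear program.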