Towards Efficient Computation of Quality Bounded Solutions in POMDPs: Expected Value Approximation and Dynamic Disjunctive Beliefs

While POMDPs (partially observable markov decision problems) are a popular computational model with wide-ranging applications, the computational cost for optimal policy generation is prohibitive. Researchers are investigating ever-more efficient algorithms, yet many applications demand such algorith...

Full description

Saved in:

Bibliographic Details
Main Authors:	VARAKANTHAM, Pradeep Reddy, Maheswaran, Rajiv, GUPTA, Tapana, Tambe, Milind
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2007
Subjects:	Artificial Intelligence and Robotics Business Operations Research, Systems Engineering and Industrial Engineering
Online Access:	https://ink.library.smu.edu.sg/sis_research/956 https://ink.library.smu.edu.sg/context/sis_research/article/1955/viewcontent/IJCAI07.pdf
Tags:	Add Tag No Tags, Be the first to tag this record!
Institution:	Singapore Management University
Language:	English

id	sg-smu-ink.sis_research-1955
record_format	dspace
spelling	sg-smu-ink.sis_research-19552016-05-17T07:55:29Z Towards Efficient Computation of Quality Bounded Solutions in POMDPs: Expected Value Approximation and Dynamic Disjunctive Beliefs VARAKANTHAM, Pradeep Reddy Maheswaran, Rajiv GUPTA, Tapana Tambe, Milind While POMDPs (partially observable markov decision problems) are a popular computational model with wide-ranging applications, the computational cost for optimal policy generation is prohibitive. Researchers are investigating ever-more efficient algorithms, yet many applications demand such algorithms bound any loss in policy quality when chasing efficiency. To address this challenge, we present two new techniques. The first approximates in the value space to obtain solutions efficiently for a pre-specified error bound. Unlike existing techniques, our technique guarantees the resulting policy will meet this bound. Furthermore, it does not require costly computations to determine the quality loss of the policy. Our second technique prunes large tracts of belief space that are unreachable, allowing faster policy computation without any sacrifice in optimality. The combination of the two techniques, which are complementary to existing optimal policy generation algorithms, provides solutions with tight error bounds efficiently in domains where competing algorithms fail to provide such tight bounds. 2007-01-01T08:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/956 https://ink.library.smu.edu.sg/context/sis_research/article/1955/viewcontent/IJCAI07.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Artificial Intelligence and Robotics Business Operations Research, Systems Engineering and Industrial Engineering
institution	Singapore Management University
building	SMU Libraries
continent	Asia
country	Singapore Singapore
content_provider	SMU Libraries
collection	InK@SMU
language	English
topic	Artificial Intelligence and Robotics Business Operations Research, Systems Engineering and Industrial Engineering
spellingShingle	Artificial Intelligence and Robotics Business Operations Research, Systems Engineering and Industrial Engineering VARAKANTHAM, Pradeep Reddy Maheswaran, Rajiv GUPTA, Tapana Tambe, Milind Towards Efficient Computation of Quality Bounded Solutions in POMDPs: Expected Value Approximation and Dynamic Disjunctive Beliefs
description	While POMDPs (partially observable markov decision problems) are a popular computational model with wide-ranging applications, the computational cost for optimal policy generation is prohibitive. Researchers are investigating ever-more efficient algorithms, yet many applications demand such algorithms bound any loss in policy quality when chasing efficiency. To address this challenge, we present two new techniques. The first approximates in the value space to obtain solutions efficiently for a pre-specified error bound. Unlike existing techniques, our technique guarantees the resulting policy will meet this bound. Furthermore, it does not require costly computations to determine the quality loss of the policy. Our second technique prunes large tracts of belief space that are unreachable, allowing faster policy computation without any sacrifice in optimality. The combination of the two techniques, which are complementary to existing optimal policy generation algorithms, provides solutions with tight error bounds efficiently in domains where competing algorithms fail to provide such tight bounds.
format	text
author	VARAKANTHAM, Pradeep Reddy Maheswaran, Rajiv GUPTA, Tapana Tambe, Milind
author_facet	VARAKANTHAM, Pradeep Reddy Maheswaran, Rajiv GUPTA, Tapana Tambe, Milind
author_sort	VARAKANTHAM, Pradeep Reddy
title	Towards Efficient Computation of Quality Bounded Solutions in POMDPs: Expected Value Approximation and Dynamic Disjunctive Beliefs
title_short	Towards Efficient Computation of Quality Bounded Solutions in POMDPs: Expected Value Approximation and Dynamic Disjunctive Beliefs
title_full	Towards Efficient Computation of Quality Bounded Solutions in POMDPs: Expected Value Approximation and Dynamic Disjunctive Beliefs
title_fullStr	Towards Efficient Computation of Quality Bounded Solutions in POMDPs: Expected Value Approximation and Dynamic Disjunctive Beliefs
title_full_unstemmed	Towards Efficient Computation of Quality Bounded Solutions in POMDPs: Expected Value Approximation and Dynamic Disjunctive Beliefs
title_sort	towards efficient computation of quality bounded solutions in pomdps: expected value approximation and dynamic disjunctive beliefs
publisher	Institutional Knowledge at Singapore Management University
publishDate	2007
url	https://ink.library.smu.edu.sg/sis_research/956 https://ink.library.smu.edu.sg/context/sis_research/article/1955/viewcontent/IJCAI07.pdf
_version_	1770570793417179136

Towards Efficient Computation of Quality Bounded Solutions in POMDPs: Expected Value Approximation and Dynamic Disjunctive Beliefs

Similar Items