Towards Efficient Computation of Quality Bounded Solutions in POMDPs: Expected Value Approximation and Dynamic Disjunctive Beliefs

While POMDPs (partially observable markov decision problems) are a popular computational model with wide-ranging applications, the computational cost for optimal policy generation is prohibitive. Researchers are investigating ever-more efficient algorithms, yet many applications demand such algorith...

Full description

Saved in:
Bibliographic Details
Main Authors: VARAKANTHAM, Pradeep Reddy, Maheswaran, Rajiv, GUPTA, Tapana, Tambe, Milind
Format: text
Language:English
Published: Institutional Knowledge at Singapore Management University 2007
Subjects:
Online Access:https://ink.library.smu.edu.sg/sis_research/956
https://ink.library.smu.edu.sg/context/sis_research/article/1955/viewcontent/IJCAI07.pdf
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Singapore Management University
Language: English
id sg-smu-ink.sis_research-1955
record_format dspace
spelling sg-smu-ink.sis_research-19552016-05-17T07:55:29Z Towards Efficient Computation of Quality Bounded Solutions in POMDPs: Expected Value Approximation and Dynamic Disjunctive Beliefs VARAKANTHAM, Pradeep Reddy Maheswaran, Rajiv GUPTA, Tapana Tambe, Milind While POMDPs (partially observable markov decision problems) are a popular computational model with wide-ranging applications, the computational cost for optimal policy generation is prohibitive. Researchers are investigating ever-more efficient algorithms, yet many applications demand such algorithms bound any loss in policy quality when chasing efficiency. To address this challenge, we present two new techniques. The first approximates in the value space to obtain solutions efficiently for a pre-specified error bound. Unlike existing techniques, our technique guarantees the resulting policy will meet this bound. Furthermore, it does not require costly computations to determine the quality loss of the policy. Our second technique prunes large tracts of belief space that are unreachable, allowing faster policy computation without any sacrifice in optimality. The combination of the two techniques, which are complementary to existing optimal policy generation algorithms, provides solutions with tight error bounds efficiently in domains where competing algorithms fail to provide such tight bounds. 2007-01-01T08:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/956 https://ink.library.smu.edu.sg/context/sis_research/article/1955/viewcontent/IJCAI07.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Artificial Intelligence and Robotics Business Operations Research, Systems Engineering and Industrial Engineering
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Artificial Intelligence and Robotics
Business
Operations Research, Systems Engineering and Industrial Engineering
spellingShingle Artificial Intelligence and Robotics
Business
Operations Research, Systems Engineering and Industrial Engineering
VARAKANTHAM, Pradeep Reddy
Maheswaran, Rajiv
GUPTA, Tapana
Tambe, Milind
Towards Efficient Computation of Quality Bounded Solutions in POMDPs: Expected Value Approximation and Dynamic Disjunctive Beliefs
description While POMDPs (partially observable markov decision problems) are a popular computational model with wide-ranging applications, the computational cost for optimal policy generation is prohibitive. Researchers are investigating ever-more efficient algorithms, yet many applications demand such algorithms bound any loss in policy quality when chasing efficiency. To address this challenge, we present two new techniques. The first approximates in the value space to obtain solutions efficiently for a pre-specified error bound. Unlike existing techniques, our technique guarantees the resulting policy will meet this bound. Furthermore, it does not require costly computations to determine the quality loss of the policy. Our second technique prunes large tracts of belief space that are unreachable, allowing faster policy computation without any sacrifice in optimality. The combination of the two techniques, which are complementary to existing optimal policy generation algorithms, provides solutions with tight error bounds efficiently in domains where competing algorithms fail to provide such tight bounds.
format text
author VARAKANTHAM, Pradeep Reddy
Maheswaran, Rajiv
GUPTA, Tapana
Tambe, Milind
author_facet VARAKANTHAM, Pradeep Reddy
Maheswaran, Rajiv
GUPTA, Tapana
Tambe, Milind
author_sort VARAKANTHAM, Pradeep Reddy
title Towards Efficient Computation of Quality Bounded Solutions in POMDPs: Expected Value Approximation and Dynamic Disjunctive Beliefs
title_short Towards Efficient Computation of Quality Bounded Solutions in POMDPs: Expected Value Approximation and Dynamic Disjunctive Beliefs
title_full Towards Efficient Computation of Quality Bounded Solutions in POMDPs: Expected Value Approximation and Dynamic Disjunctive Beliefs
title_fullStr Towards Efficient Computation of Quality Bounded Solutions in POMDPs: Expected Value Approximation and Dynamic Disjunctive Beliefs
title_full_unstemmed Towards Efficient Computation of Quality Bounded Solutions in POMDPs: Expected Value Approximation and Dynamic Disjunctive Beliefs
title_sort towards efficient computation of quality bounded solutions in pomdps: expected value approximation and dynamic disjunctive beliefs
publisher Institutional Knowledge at Singapore Management University
publishDate 2007
url https://ink.library.smu.edu.sg/sis_research/956
https://ink.library.smu.edu.sg/context/sis_research/article/1955/viewcontent/IJCAI07.pdf
_version_ 1770570793417179136