Towards Efficient Computation of Quality Bounded Solutions in POMDPs: Expected Value Approximation and Dynamic Disjunctive Beliefs
While POMDPs (partially observable Markov decision problems) are a popular computational model with wide-ranging applications, the computational cost of optimal policy generation is prohibitive. Researchers are investigating ever-more efficient algorithms, yet many applications demand that such algorithms bound any loss in policy quality when chasing efficiency. To address this challenge, we present two new techniques. The first approximates in the value space to obtain solutions efficiently for a pre-specified error bound. Unlike existing techniques, ours guarantees that the resulting policy will meet this bound, and it does not require costly computations to determine the policy's quality loss. Our second technique prunes large tracts of belief space that are unreachable, allowing faster policy computation without any sacrifice in optimality. The combination of the two techniques, which are complementary to existing optimal policy generation algorithms, efficiently provides solutions with tight error bounds in domains where competing algorithms fail to provide such tight bounds.
Main Authors: VARAKANTHAM, Pradeep Reddy; Maheswaran, Rajiv; GUPTA, Tapana; Tambe, Milind
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University, 2007
Subjects: Artificial Intelligence and Robotics; Business; Operations Research, Systems Engineering and Industrial Engineering
Online Access: https://ink.library.smu.edu.sg/sis_research/956 ; https://ink.library.smu.edu.sg/context/sis_research/article/1955/viewcontent/IJCAI07.pdf
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Collection: Research Collection School Of Computing and Information Systems
Institution: Singapore Management University
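The abstract's first technique approximates in the value space while honouring a pre-specified error bound. The sketch below is not the paper's algorithm; it is a minimal, generic illustration of how such an a-priori guarantee can be enforced, using value iteration on a toy fully observable MDP and the standard Bellman-residual stopping rule: once successive value functions differ by less than epsilon*(1-gamma)/(2*gamma) in max norm, the greedy policy is within epsilon of optimal. The function name and the model arrays T and R are hypothetical.

```python
# A minimal sketch (not the paper's technique): value iteration that stops
# once a pre-specified error bound epsilon on the greedy policy is guaranteed.
import numpy as np

def eps_bounded_value_iteration(T, R, gamma, epsilon):
    """T[a,s,s'] transition probabilities, R[a,s] rewards.
    Returns a value estimate and a greedy policy within epsilon of optimal."""
    n_actions, n_states, _ = T.shape
    V = np.zeros(n_states)
    # Classic residual threshold guaranteeing an epsilon-optimal greedy policy.
    threshold = epsilon * (1 - gamma) / (2 * gamma)
    while True:
        Q = R + gamma * (T @ V)              # Q[a,s] = R[a,s] + gamma * E[V(s')]
        V_new = Q.max(axis=0)                # Bellman backup
        if np.max(np.abs(V_new - V)) < threshold:
            return V_new, Q.argmax(axis=0)   # greedy policy is epsilon-optimal
        V = V_new

# Toy 2-state, 2-action model, purely for illustration.
T = np.array([[[0.9, 0.1], [0.1, 0.9]],
              [[0.5, 0.5], [0.5, 0.5]]])
R = np.array([[1.0, 0.0],
              [0.5, 0.5]])
V, pi = eps_bounded_value_iteration(T, R, gamma=0.95, epsilon=0.01)
print("V:", V, "policy:", pi)
```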
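The abstract's second technique exploits the fact that most of the belief simplex can never be reached from the initial belief. The sketch below is not the paper's dynamic disjunctive belief method; it is a minimal illustration of the underlying idea under simple assumptions: enumerate the beliefs actually reachable within a finite horizon via Bayesian belief updates, so that policy computation can be restricted to that set without loss of optimality. The toy model arrays T, O, and b0 are hypothetical.

```python
# A minimal sketch (not the paper's method): enumerating beliefs reachable
# from the initial belief b0 within a finite horizon by forward expansion.
import numpy as np

def belief_update(b, a, o, T, O):
    """Bayesian update: b'(s') proportional to O[a,s',o] * sum_s T[a,s,s'] * b(s)."""
    bp = O[a, :, o] * (b @ T[a])
    norm = bp.sum()
    return bp / norm if norm > 0 else None   # None: observation o impossible here

def reachable_beliefs(b0, T, O, horizon, tol=1e-6):
    """Breadth-first expansion of beliefs reachable from b0 in `horizon` steps."""
    n_actions, _, n_obs = O.shape
    seen = [np.asarray(b0, float)]
    frontier = list(seen)
    for _ in range(horizon):
        nxt = []
        for b in frontier:
            for a in range(n_actions):
                for o in range(n_obs):
                    bp = belief_update(b, a, o, T, O)
                    if bp is None:
                        continue
                    # Keep only beliefs not already seen (up to tolerance).
                    if all(np.max(np.abs(bp - q)) > tol for q in seen):
                        seen.append(bp)
                        nxt.append(bp)
        frontier = nxt
    return seen

# Toy 2-state, 2-action, 2-observation model, purely for illustration.
T = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.5, 0.5]]])      # T[a, s, s']
O = np.array([[[0.8, 0.2], [0.3, 0.7]],
              [[0.6, 0.4], [0.4, 0.6]]])      # O[a, s', o]
print(len(reachable_beliefs([0.5, 0.5], T, O, horizon=3)), "reachable beliefs")
```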