Solving Uncertain MDPs with Objectives that are Separable over Instantiations of Model Uncertainty

Markov Decision Problems (MDPs) offer an effective mechanism for planning under uncertainty. However, due to unavoidable uncertainty over models, it is difficult to obtain an exact specification of an MDP. We are interested in solving MDPs where transition and reward functions are not exactly specified. Existing research has primarily focused on computing infinite-horizon stationary policies when optimizing robustness, regret, and percentile-based objectives. We focus specifically on finite-horizon problems, with special emphasis on objectives that are separable over individual instantiations of model uncertainty, i.e., objectives that can be expressed as a sum over instantiations of model uncertainty: (a) First, we identify two separable objectives for uncertain MDPs: Average Value Maximization (AVM) and Confidence Probability Maximization (CPM). (b) Second, we provide optimization-based solutions to compute policies for uncertain MDPs with such objectives. In particular, we exploit the separability of the AVM and CPM objectives by employing Lagrangian dual decomposition (LDD). (c) Finally, we demonstrate the utility of the LDD approach on a benchmark problem from the literature.
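As a concrete illustration of the separability the abstract describes, the sketch below approximates the AVM objective by a sample average: it draws a set of model instantiations and scores one fixed finite-horizon policy independently on each. This is a minimal, hypothetical sketch (the sampler, policy representation, and helper names are ours, not the paper's); it shows only why the objective decomposes into one term per instantiation, which is the structure the paper's Lagrangian dual decomposition exploits when the subproblems are coupled through a shared policy.

import numpy as np

# Hypothetical AVM sketch: sample Q instantiations of an uncertain MDP's
# transition/reward functions, then score a fixed non-stationary policy
# by its mean finite-horizon value across the samples. All names here
# (sample_mdp, evaluate_policy) are illustrative, not from the paper.

rng = np.random.default_rng(0)
S, A, H, Q = 4, 2, 5, 20  # states, actions, horizon, number of samples

def sample_mdp():
    """Draw one instantiation (T, R) of the uncertain model."""
    T = rng.dirichlet(np.ones(S), size=(S, A))  # T[s, a] is a distribution over next states
    R = rng.uniform(0.0, 1.0, size=(S, A))      # reward table for this instantiation
    return T, R

def evaluate_policy(policy, T, R):
    """Finite-horizon policy evaluation by backward induction."""
    V = np.zeros(S)
    for t in reversed(range(H)):
        V = np.array([R[s, policy[t, s]] + T[s, policy[t, s]] @ V for s in range(S)])
    return V[0]  # value from a fixed start state 0

# A non-stationary policy: policy[t, s] is the action taken at time t in state s.
policy = rng.integers(0, A, size=(H, S))

# Separability: the AVM objective is a plain sum (here, mean) over the
# instantiations, so each sample's value is computed independently.
samples = [sample_mdp() for _ in range(Q)]
avm = np.mean([evaluate_policy(policy, T, R) for T, R in samples])
print(f"Average value across {Q} instantiations: {avm:.3f}")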

Bibliographic Details
Main Authors: ADULYASAK, Yossiri, VARAKANTHAM, Pradeep, AHMED, Asrar, JAILLET, Patrick
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2015
Subjects: Markov Decision Problems (MDPs); Lagrangian Dual Decomposition; Bayesian Reinforcement Learning; Robust MDPs; Artificial Intelligence and Robotics; Computer Sciences; Numerical Analysis and Scientific Computing
Online Access:https://ink.library.smu.edu.sg/sis_research/2916
https://ink.library.smu.edu.sg/context/sis_research/article/3916/viewcontent/9843_44958_1_PB.pdf
Institution: Singapore Management University
Collection: Research Collection School Of Computing and Information Systems
License: http://creativecommons.org/licenses/by-nc-nd/4.0/