Regret based Robust Solutions for Uncertain Markov Decision Processes
In this paper, we seek robust policies for uncertain Markov Decision Processes (MDPs). Most robust optimization approaches for these problems have focussed on the computation of maximin policies which maximize the value corresponding to the worst realization of the uncertainty. Recent work has proposed minimax regret as a suitable alternative to the maximin objective for robust optimization. However, existing algorithms for handling minimax regret are restricted to models with uncertainty over rewards only. We provide algorithms that employ sampling to improve across multiple dimensions: (a) Handle uncertainties over both transition and reward models; (b) Dependence of model uncertainties across state, action pairs and decision epochs; (c) Scalability and quality bounds. Finally, to demonstrate the empirical effectiveness of our sampling approaches, we provide comparisons against benchmark algorithms on two domains from literature. We also provide a Sample Average Approximation (SAA) analysis to compute a posteriori error bounds.
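The abstract is built around the minimax regret criterion evaluated over sampled transition and reward models. As a rough, hedged illustration of that criterion only (not the paper's algorithm), the sketch below selects, from a small finite set of candidate policies, the one whose worst-case regret across a handful of sampled models is smallest; the `values` matrix, the use of the best candidate's value as a stand-in for each sampled model's optimal value, and all identifiers are illustrative assumptions.

```python
# Hedged sketch of minimax-regret selection over sampled models (illustrative
# only; not the algorithm from the paper). values[i][j] holds the value of
# candidate policy j when evaluated under sampled model i, and the best
# candidate under each model stands in for that model's optimal value.

def minimax_regret_policy(values):
    """Return (index, regret) of the candidate policy whose maximum regret
    across all sampled models is smallest."""
    num_policies = len(values[0])
    best_index, best_regret = None, float("inf")
    for j in range(num_policies):
        # Regret of policy j under model i = best value under i - value of j under i.
        max_regret = max(max(row) - row[j] for row in values)
        if max_regret < best_regret:
            best_index, best_regret = j, max_regret
    return best_index, best_regret


if __name__ == "__main__":
    # Three sampled models, two candidate policies (hypothetical numbers).
    sampled_values = [
        [10.0, 8.0],  # model 1: policy 0 is better
        [4.0, 9.0],   # model 2: policy 1 is better
        [7.0, 7.0],   # model 3: tie
    ]
    print(minimax_regret_policy(sampled_values))  # -> (1, 2.0)
```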
| Field | Value |
|---|---|
| Main Authors | AHMED, Asrar; VARAKANTHAM, Pradeep Reddy; Adulyasak, Yossiri; Jaillet, Patrick |
| Format | text |
| Language | English |
| Published | Institutional Knowledge at Singapore Management University, 2013 |
| Subjects | Artificial Intelligence and Robotics; Business; Operations Research, Systems Engineering and Industrial Engineering |
| Online Access | https://ink.library.smu.edu.sg/sis_research/1932; https://ink.library.smu.edu.sg/context/sis_research/article/2931/viewcontent/RRMDP_finalversion.pdf |
| Institution | Singapore Management University |
| Field | Value |
|---|---|
| id | sg-smu-ink.sis_research-2931 |
| record_format | dspace |
| spelling | sg-smu-ink.sis_research-2931; 2013-12-27T07:00:37Z; Regret based Robust Solutions for Uncertain Markov Decision Processes; AHMED, Asrar; VARAKANTHAM, Pradeep Reddy; Adulyasak, Yossiri; Jaillet, Patrick; (abstract as given in the description field below); 2013-01-01T08:00:00Z; text; application/pdf; https://ink.library.smu.edu.sg/sis_research/1932; https://ink.library.smu.edu.sg/context/sis_research/article/2931/viewcontent/RRMDP_finalversion.pdf; http://creativecommons.org/licenses/by-nc-nd/4.0/; Research Collection School Of Computing and Information Systems; eng; Institutional Knowledge at Singapore Management University; Artificial Intelligence and Robotics; Business; Operations Research, Systems Engineering and Industrial Engineering |
| institution | Singapore Management University |
| building | SMU Libraries |
| continent | Asia |
| country | Singapore |
| content_provider | SMU Libraries |
| collection | InK@SMU |
| language | English |
| topic | Artificial Intelligence and Robotics; Business; Operations Research, Systems Engineering and Industrial Engineering |
| spellingShingle | Artificial Intelligence and Robotics; Business; Operations Research, Systems Engineering and Industrial Engineering; AHMED, Asrar; VARAKANTHAM, Pradeep Reddy; Adulyasak, Yossiri; Jaillet, Patrick; Regret based Robust Solutions for Uncertain Markov Decision Processes |
| description | In this paper, we seek robust policies for uncertain Markov Decision Processes (MDPs). Most robust optimization approaches for these problems have focussed on the computation of maximin policies which maximize the value corresponding to the worst realization of the uncertainty. Recent work has proposed minimax regret as a suitable alternative to the maximin objective for robust optimization. However, existing algorithms for handling minimax regret are restricted to models with uncertainty over rewards only. We provide algorithms that employ sampling to improve across multiple dimensions: (a) Handle uncertainties over both transition and reward models; (b) Dependence of model uncertainties across state, action pairs and decision epochs; (c) Scalability and quality bounds. Finally, to demonstrate the empirical effectiveness of our sampling approaches, we provide comparisons against benchmark algorithms on two domains from literature. We also provide a Sample Average Approximation (SAA) analysis to compute a posteriori error bounds. |
| format | text |
| author / author_facet | AHMED, Asrar; VARAKANTHAM, Pradeep Reddy; Adulyasak, Yossiri; Jaillet, Patrick |
| author_sort | AHMED, Asrar |
| title / title_short / title_full / title_fullStr / title_full_unstemmed | Regret based Robust Solutions for Uncertain Markov Decision Processes |
| title_sort | regret based robust solutions for uncertain markov decision processes |
| publisher | Institutional Knowledge at Singapore Management University |
| publishDate | 2013 |
| url | https://ink.library.smu.edu.sg/sis_research/1932; https://ink.library.smu.edu.sg/context/sis_research/article/2931/viewcontent/RRMDP_finalversion.pdf |
| _version_ | 1770571688269840384 |