Certified policy verification and synthesis for MDPs under distributional reach-avoidance properties

Markov Decision Processes (MDPs) are a classical model for decision making in the presence of uncertainty. Often they are viewed as state transformers with planning objectives defined with respect to paths over MDP states. An increasingly popular alternative is to view them as distribution transform...

Full description

Saved in:

Bibliographic Details
Main Authors:	AKSHAY, S., CHATTERJEE, Krishnendu, MEGGENDORFER, Tobias, ZIKELIC, Dorde
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2024
Subjects:	Agent-based and Multi-agent Systems: MAS: Formal verification validation and synthesis Artificial Intelligence and Robotics
Online Access:	https://ink.library.smu.edu.sg/sis_research/9340 https://ink.library.smu.edu.sg/context/sis_research/article/10340/viewcontent/0001.pdf
Tags:	Add Tag No Tags, Be the first to tag this record!
Institution:	Singapore Management University
Language:	English

id	sg-smu-ink.sis_research-10340
record_format	dspace
spelling	sg-smu-ink.sis_research-103402024-10-08T06:55:42Z Certified policy verification and synthesis for MDPs under distributional reach-avoidance properties AKSHAY, S. CHATTERJEE, Krishnendu MEGGENDORFER, Tobias ZIKELIC, Dorde Markov Decision Processes (MDPs) are a classical model for decision making in the presence of uncertainty. Often they are viewed as state transformers with planning objectives defined with respect to paths over MDP states. An increasingly popular alternative is to view them as distribution transformers, giving rise to a sequence of probability distributions over MDP states. For instance, reachability and safety properties in modeling robot swarms or chemical reaction networks are naturally defined in terms of probability distributions over states. Verifying such distributional properties is known to be hard and often beyond the reach of classical state-based verification techniques. In this work, we consider the problems of certified policy (i.e. controller) verification and synthesis in MDPs under distributional reach-avoidance specifications. By certified we mean that, along with a policy, we also aim to synthesize a (checkable) certificate ensuring that the MDP indeed satisfies the property. Thus, given the target set of distributions and an unsafe set of distributions over MDP states, our goal is to either synthesize a certificate for a given policy or synthesize a policy along with a certificate, proving that the target distribution can be reached while avoiding unsafe distributions. To solve this problem, we introduce the novel notion of distributional reach-avoid certificates and present automated procedures for (1) synthesizing a certificate for a given policy, and (2) synthesizing a policy together with the certificate, both providing formal guarantees on certificate correctness. Our experimental evaluation demonstrates the ability of our method to solve several non-trivial examples, including a multi-agent robot-swarm model, to synthesize certified policies and to certify existing policies. 2024-08-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/9340 info:doi/10.24963/ijcai.2024/1 https://ink.library.smu.edu.sg/context/sis_research/article/10340/viewcontent/0001.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Agent-based and Multi-agent Systems: MAS: Formal verification validation and synthesis Artificial Intelligence and Robotics
institution	Singapore Management University
building	SMU Libraries
continent	Asia
country	Singapore Singapore
content_provider	SMU Libraries
collection	InK@SMU
language	English
topic	Agent-based and Multi-agent Systems: MAS: Formal verification validation and synthesis Artificial Intelligence and Robotics
spellingShingle	Agent-based and Multi-agent Systems: MAS: Formal verification validation and synthesis Artificial Intelligence and Robotics AKSHAY, S. CHATTERJEE, Krishnendu MEGGENDORFER, Tobias ZIKELIC, Dorde Certified policy verification and synthesis for MDPs under distributional reach-avoidance properties
description	Markov Decision Processes (MDPs) are a classical model for decision making in the presence of uncertainty. Often they are viewed as state transformers with planning objectives defined with respect to paths over MDP states. An increasingly popular alternative is to view them as distribution transformers, giving rise to a sequence of probability distributions over MDP states. For instance, reachability and safety properties in modeling robot swarms or chemical reaction networks are naturally defined in terms of probability distributions over states. Verifying such distributional properties is known to be hard and often beyond the reach of classical state-based verification techniques. In this work, we consider the problems of certified policy (i.e. controller) verification and synthesis in MDPs under distributional reach-avoidance specifications. By certified we mean that, along with a policy, we also aim to synthesize a (checkable) certificate ensuring that the MDP indeed satisfies the property. Thus, given the target set of distributions and an unsafe set of distributions over MDP states, our goal is to either synthesize a certificate for a given policy or synthesize a policy along with a certificate, proving that the target distribution can be reached while avoiding unsafe distributions. To solve this problem, we introduce the novel notion of distributional reach-avoid certificates and present automated procedures for (1) synthesizing a certificate for a given policy, and (2) synthesizing a policy together with the certificate, both providing formal guarantees on certificate correctness. Our experimental evaluation demonstrates the ability of our method to solve several non-trivial examples, including a multi-agent robot-swarm model, to synthesize certified policies and to certify existing policies.
format	text
author	AKSHAY, S. CHATTERJEE, Krishnendu MEGGENDORFER, Tobias ZIKELIC, Dorde
author_facet	AKSHAY, S. CHATTERJEE, Krishnendu MEGGENDORFER, Tobias ZIKELIC, Dorde
author_sort	AKSHAY, S.
title	Certified policy verification and synthesis for MDPs under distributional reach-avoidance properties
title_short	Certified policy verification and synthesis for MDPs under distributional reach-avoidance properties
title_full	Certified policy verification and synthesis for MDPs under distributional reach-avoidance properties
title_fullStr	Certified policy verification and synthesis for MDPs under distributional reach-avoidance properties
title_full_unstemmed	Certified policy verification and synthesis for MDPs under distributional reach-avoidance properties
title_sort	certified policy verification and synthesis for mdps under distributional reach-avoidance properties
publisher	Institutional Knowledge at Singapore Management University
publishDate	2024
url	https://ink.library.smu.edu.sg/sis_research/9340 https://ink.library.smu.edu.sg/context/sis_research/article/10340/viewcontent/0001.pdf
_version_	1814047914620616704

Certified policy verification and synthesis for MDPs under distributional reach-avoidance properties

Similar Items