Planning and learning for Non-Markovian negative side effects using finite state controllers

Autonomous systems are often deployed in the open world where it is hard to obtain complete specifications of objectives and constraints. Operating based on an incomplete model can produce negative side effects (NSEs), which affect the safety and reliability of the system. We focus on mitigating NSE...

Bibliographic Details
Main Authors: SRIVASTAVA, Aishwarya; Saisubramanian, Sandhya; Paruchuri, Praveen; KUMAR, Akshat; Zilberstein, Shlomo
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2023
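Subjects: Constrained Markov decision process; Finite-state controllers; Incomplete model; Non-Markovian; Artificial Intelligence and Robotics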
Online Access: https://ink.library.smu.edu.sg/sis_research/8092
https://ink.library.smu.edu.sg/context/sis_research/article/9095/viewcontent/26767_pvoa.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-9095
record_format dspace
spelling sg-smu-ink.sis_research-9095
2023-09-07T07:26:14Z
2023-02-01T08:00:00Z
text
application/pdf
info:doi/10.1609/aaai.v37i12.26767
http://creativecommons.org/licenses/by-nc-nd/4.0/
Research Collection School Of Computing and Information Systems
eng
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Constrained Markov decision process
Finite-state controllers
Incomplete model
Non-Markovian
Artificial Intelligence and Robotics
description Autonomous systems are often deployed in the open world where it is hard to obtain complete specifications of objectives and constraints. Operating based on an incomplete model can produce negative side effects (NSEs), which affect the safety and reliability of the system. We focus on mitigating NSEs in environments modeled as Markov decision processes (MDPs). First, we learn a model of NSEs using observed data that contains state-action trajectories and severity of associated NSEs. Unlike previous works that associate NSEs with state-action pairs, our framework associates NSEs with entire trajectories, which is more general and captures non-Markovian dependence on states and actions. Second, we learn finite state controllers (FSCs) that predict NSE severity for a given trajectory and generalize well to unseen data. Finally, we develop a constrained MDP model that uses information from the underlying MDP and the learned FSC for planning while avoiding NSEs. Our empirical evaluation demonstrates the effectiveness of our approach in learning and mitigating Markovian and non-Markovian NSEs.
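As a rough illustration of the idea in the description, the sketch below shows how a finite state controller could assign an NSE severity to a whole trajectory rather than to a single state-action pair. It is a minimal, hand-specified example and not the authors' learning method: the `FiniteStateController` class, the rug-and-box scenario, and the integer severity labels are all assumptions made for illustration. In the paper the controller is learned from observed trajectories; here the transitions are written out by hand.

```python
from dataclasses import dataclass
from typing import Dict, Hashable, List, Tuple

Step = Tuple[Hashable, Hashable]  # one (state, action) pair of a trajectory


@dataclass
class FiniteStateController:
    """Hypothetical FSC: the current node summarizes the trajectory so far."""
    start: int
    transitions: Dict[Tuple[int, Step], int]  # (node, step) -> next node
    severity: Dict[int, int]                  # node -> NSE severity label

    def predict(self, trajectory: List[Step]) -> int:
        node = self.start
        for step in trajectory:
            # Steps with no listed transition leave the node unchanged.
            node = self.transitions.get((node, step), node)
        return self.severity[node]


# Assumed scenario: pushing a box over a rug once is a mild NSE (1),
# doing it a second time is a severe NSE (2).
RUG = ("rug", "push_box")
fsc = FiniteStateController(
    start=0,
    transitions={(0, RUG): 1, (1, RUG): 2},
    severity={0: 0, 1: 1, 2: 2},
)

once = [("hall", "move"), RUG, ("hall", "move")]
twice = [RUG, ("hall", "move"), RUG]
print(fsc.predict(once), fsc.predict(twice))
```

The same `(state, action)` pair contributes differently depending on what came before it, which is the non-Markovian dependence a per-pair penalty cannot express. A planner could then track the controller node alongside the MDP state and constrain the expected severity, in the spirit of the constrained MDP the description mentions.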
format text
author SRIVASTAVA, Aishwarya
Saisubramanian, Sandhya
Paruchuri, Praveen
KUMAR, Akshat
Zilberstein, Shlomo
author_sort SRIVASTAVA, Aishwarya
title Planning and learning for Non-Markovian negative side effects using finite state controllers
publisher Institutional Knowledge at Singapore Management University
publishDate 2023
_version_ 1779157152471973888