Planning and learning for Non-Markovian negative side effects using finite state controllers
Autonomous systems are often deployed in the open world where it is hard to obtain complete specifications of objectives and constraints. Operating based on an incomplete model can produce negative side effects (NSEs), which affect the safety and reliability of the system. We focus on mitigating NSE...
Saved in:
Main Authors: | SRIVASTAVA, Aishwarya; Saisubramanian, Sandhya; Paruchuri, Praveen; KUMAR, Akshat; Zilberstein, Shlomo |
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2023 |
Subjects: | Constrained Markov decision process; Finite-state controllers; Incomplete model; Non-Markovian; Artificial Intelligence and Robotics |
Online Access: | https://ink.library.smu.edu.sg/sis_research/8092 https://ink.library.smu.edu.sg/context/sis_research/article/9095/viewcontent/26767_pvoa.pdf |
Institution: | Singapore Management University |
id |
sg-smu-ink.sis_research-9095 |
record_format |
dspace |
spelling |
sg-smu-ink.sis_research-9095 2023-09-07T07:26:14Z Planning and learning for Non-Markovian negative side effects using finite state controllers. SRIVASTAVA, Aishwarya; Saisubramanian, Sandhya; Paruchuri, Praveen; KUMAR, Akshat; Zilberstein, Shlomo. Autonomous systems are often deployed in the open world where it is hard to obtain complete specifications of objectives and constraints. Operating based on an incomplete model can produce negative side effects (NSEs), which affect the safety and reliability of the system. We focus on mitigating NSEs in environments modeled as Markov decision processes (MDPs). First, we learn a model of NSEs using observed data that contains state-action trajectories and severity of associated NSEs. Unlike previous works that associate NSEs with state-action pairs, our framework associates NSEs with entire trajectories, which is more general and captures non-Markovian dependence on states and actions. Second, we learn finite state controllers (FSCs) that predict NSE severity for a given trajectory and generalize well to unseen data. Finally, we develop a constrained MDP model that uses information from the underlying MDP and the learned FSC for planning while avoiding NSEs. Our empirical evaluation demonstrates the effectiveness of our approach in learning and mitigating Markovian and non-Markovian NSEs. 2023-02-01T08:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/8092 info:doi/10.1609/aaai.v37i12.26767 https://ink.library.smu.edu.sg/context/sis_research/article/9095/viewcontent/26767_pvoa.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Constrained Markov decision process; Finite-state controllers; Incomplete model; Non-Markovian; Artificial Intelligence and Robotics |
institution |
Singapore Management University |
building |
SMU Libraries |
continent |
Asia |
country |
Singapore |
content_provider |
SMU Libraries |
collection |
InK@SMU |
language |
English |
topic |
Constrained Markov decision process; Finite-state controllers; Incomplete model; Non-Markovian; Artificial Intelligence and Robotics |
description |
Autonomous systems are often deployed in the open world where it is hard to obtain complete specifications of objectives and constraints. Operating based on an incomplete model can produce negative side effects (NSEs), which affect the safety and reliability of the system. We focus on mitigating NSEs in environments modeled as Markov decision processes (MDPs). First, we learn a model of NSEs using observed data that contains state-action trajectories and severity of associated NSEs. Unlike previous works that associate NSEs with state-action pairs, our framework associates NSEs with entire trajectories, which is more general and captures non-Markovian dependence on states and actions. Second, we learn finite state controllers (FSCs) that predict NSE severity for a given trajectory and generalize well to unseen data. Finally, we develop a constrained MDP model that uses information from the underlying MDP and the learned FSC for planning while avoiding NSEs. Our empirical evaluation demonstrates the effectiveness of our approach in learning and mitigating Markovian and non-Markovian NSEs. |
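The description above outlines a three-stage pipeline: learn an NSE model from trajectory-severity data, represent it as a finite state controller (FSC) that predicts NSE severity for a whole trajectory, and plan with a constrained MDP that combines the underlying MDP with the learned FSC. As a rough illustration of the middle stage only, here is a minimal sketch of an FSC whose memory nodes summarize trajectory history; the class names, toy domain, and severity values are hypothetical and are not taken from the paper or its code.

```python
# Minimal sketch, not the authors' implementation: a deterministic finite
# state controller (FSC) whose memory nodes summarize the state-action
# history seen so far, and which outputs a predicted NSE severity for a
# whole trajectory. Domain, names, and numbers are illustrative only.
from dataclasses import dataclass
from typing import Dict, List, Tuple

Obs = Tuple[str, str]   # one (state, action) pair observed at each step
Node = int              # FSC memory node


@dataclass
class SeverityFSC:
    start: Node
    transitions: Dict[Tuple[Node, Obs], Node]   # memory update rule
    severity: Dict[Node, float]                 # predicted NSE severity per node

    def predict(self, trajectory: List[Obs]) -> float:
        """Run the FSC over the trajectory and read off the severity."""
        node = self.start
        for obs in trajectory:
            # Unlisted (node, observation) pairs leave the memory unchanged.
            node = self.transitions.get((node, obs), node)
        return self.severity.get(node, 0.0)


# Toy non-Markovian NSE: driving over a carpet tile is only harmful if the
# agent picked up a box earlier, so the severity depends on the history,
# not just on the current state-action pair.
fsc = SeverityFSC(
    start=0,
    transitions={
        (0, ("hallway", "pickup")): 1,   # remember: box is now being carried
        (1, ("carpet", "move")): 2,      # carrying + carpet tile => NSE
    },
    severity={0: 0.0, 1: 0.0, 2: 5.0},
)

trajectory = [("hallway", "pickup"), ("carpet", "move"), ("goal", "drop")]
print(fsc.predict(trajectory))   # 5.0, i.e. a high predicted NSE severity
```

For the planning stage one would, presumably, take a product of the environment MDP's states with the FSC's memory nodes and bound the expected predicted severity as a constraint in the resulting constrained MDP; the exact construction used by the authors is given in the full text linked above.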
format |
text |
author |
SRIVASTAVA, Aishwarya; Saisubramanian, Sandhya; Paruchuri, Praveen; KUMAR, Akshat; Zilberstein, Shlomo |
author_sort |
SRIVASTAVA, Aishwarya |
title |
Planning and learning for Non-Markovian negative side effects using finite state controllers |
title_sort |
planning and learning for non-markovian negative side effects using finite state controllers |
publisher |
Institutional Knowledge at Singapore Management University |
publishDate |
2023 |
url |
https://ink.library.smu.edu.sg/sis_research/8092 https://ink.library.smu.edu.sg/context/sis_research/article/9095/viewcontent/26767_pvoa.pdf |