Reinforcement learning for sequential decision making with constraints
Reinforcement learning is a widely used approach to tackle problems in sequential decision making where an agent learns from rewards or penalties. However, in decision-making problems that involve safety or limited resources, the agent's exploration is often limited by constraints. To model such problems, constrained Markov decision processes (CMDPs) and constrained decentralized partially observable Markov decision processes (constrained Dec-POMDPs) have been proposed for single-agent and multi-agent settings, respectively.
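The knowledge-compilation idea described in the abstract — treating constraints as domain knowledge and compiling them so the agent only ever samples valid actions — can be illustrated with a minimal sketch. This is not the thesis's implementation: here a hypothetical cardinality constraint ("at most k of n binary action components active", a simple resource budget) is compiled once into a layered count table standing in for a decision diagram, and actions are then drawn uniformly from the valid set instead of rejection-sampling in the full 2^n space.

```python
import random

def compile_counts(n, k):
    # counts[i][u] = number of valid completions of positions i..n-1
    # given that u components are already active.
    counts = [[0] * (k + 2) for _ in range(n + 1)]
    for u in range(k + 1):
        counts[n][u] = 1                                  # empty suffix is valid
    for i in range(n - 1, -1, -1):
        for u in range(k + 1):
            counts[i][u] = counts[i + 1][u]               # bit i set to 0
            if u + 1 <= k:
                counts[i][u] += counts[i + 1][u + 1]      # bit i set to 1
    return counts

def sample_valid(n, k, counts, rng):
    # Walk the compiled table, choosing each bit with probability
    # proportional to how many valid actions lie below that branch,
    # which yields a uniform sample over the valid action set.
    action, u = [], 0
    for i in range(n):
        zeros = counts[i + 1][u]
        ones = counts[i + 1][u + 1] if u + 1 <= k else 0
        if rng.random() < ones / (zeros + ones):
            action.append(1)
            u += 1
        else:
            action.append(0)
    return action
```

For n = 4 and k = 2 the table counts 11 valid actions (C(4,0) + C(4,1) + C(4,2)), and every sampled action satisfies the budget by construction — the property that makes this style of compilation useful for sample-efficient constrained exploration.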
Saved in:
Main Author: | LING, Jiajing |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2023 |
Subjects: | reinforcement learning; sequential decision making; neuro-symbolic AI; Artificial Intelligence and Robotics |
Online Access: | https://ink.library.smu.edu.sg/etd_coll/513 https://ink.library.smu.edu.sg/context/etd_coll/article/1511/viewcontent/GPIS_AY2018_PhD_LING_Jiajing.pdf |
Institution: | Singapore Management University |
Language: | English |
id |
sg-smu-ink.etd_coll-1511 |
---|---|
record_format |
dspace |
spelling |
sg-smu-ink.etd_coll-1511 2023-10-03T06:11:27Z Reinforcement learning for sequential decision making with constraints LING, Jiajing Reinforcement learning is a widely used approach to tackle problems in sequential decision making where an agent learns from rewards or penalties. However, in decision-making problems that involve safety or limited resources, the agent's exploration is often limited by constraints. To model such problems, constrained Markov decision processes (CMDPs) and constrained decentralized partially observable Markov decision processes (constrained Dec-POMDPs) have been proposed for single-agent and multi-agent settings, respectively. A significant challenge in solving constrained Dec-POMDPs is determining the contribution of each agent to the primary objective and to constraint violations. To address this issue, we propose a fictitious play-based method that uses Lagrangian relaxation to perform credit assignment for both primary objectives and constraints in large-scale multi-agent systems. Another major challenge in solving both CMDPs and constrained Dec-POMDPs is sample inefficiency, which results mainly from finding valid actions that satisfy all constraints and becomes even more difficult in large state and action spaces. Recent works in RL have attempted to incorporate domain knowledge from experts into the learning process through neuro-symbolic methods to address this issue. We propose a knowledge compilation framework that uses decision diagrams, treating constraints as domain knowledge and introducing neuro-symbolic methods to support effective learning in constrained RL. First, we propose a zone-based multi-agent pathfinding (ZBPF) framework motivated by drone delivery applications, together with a neuro-symbolic method that efficiently solves the ZBPF problem under several domain constraints, such as the simple-path constraint and the landmark constraint.
Second, we propose another neuro-symbolic method for action-constrained RL where the action space is discrete and combinatorial. Empirical results show that our proposed approaches achieve better performance than standard constrained RL algorithms in several real-world applications. 2023-07-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/etd_coll/513 https://ink.library.smu.edu.sg/context/etd_coll/article/1511/viewcontent/GPIS_AY2018_PhD_LING_Jiajing.pdf Dissertations and Theses Collection (Open Access) eng Institutional Knowledge at Singapore Management University reinforcement learning sequential decision making and neuro-symbolic AI Artificial Intelligence and Robotics |
institution |
Singapore Management University |
building |
SMU Libraries |
continent |
Asia |
country |
Singapore Singapore |
content_provider |
SMU Libraries |
collection |
InK@SMU |
language |
English |
topic |
reinforcement learning sequential decision making and neuro-symbolic AI Artificial Intelligence and Robotics |
spellingShingle |
reinforcement learning sequential decision making and neuro-symbolic AI Artificial Intelligence and Robotics LING, Jiajing Reinforcement learning for sequential decision making with constraints |
description |
Reinforcement learning is a widely used approach to tackle problems in sequential decision making where an agent learns from rewards or penalties. However, in decision-making problems that involve safety or limited resources, the agent's exploration is often limited by constraints. To model such problems, constrained Markov decision processes (CMDPs) and constrained decentralized partially observable Markov decision processes (constrained Dec-POMDPs) have been proposed for single-agent and multi-agent settings, respectively. A significant challenge in solving constrained Dec-POMDPs is determining the contribution of each agent to the primary objective and to constraint violations. To address this issue, we propose a fictitious play-based method that uses Lagrangian relaxation to perform credit assignment for both primary objectives and constraints in large-scale multi-agent systems. Another major challenge in solving both CMDPs and constrained Dec-POMDPs is sample inefficiency, which results mainly from finding valid actions that satisfy all constraints and becomes even more difficult in large state and action spaces. Recent works in RL have attempted to incorporate domain knowledge from experts into the learning process through neuro-symbolic methods to address this issue. We propose a knowledge compilation framework that uses decision diagrams, treating constraints as domain knowledge and introducing neuro-symbolic methods to support effective learning in constrained RL. First, we propose a zone-based multi-agent pathfinding (ZBPF) framework motivated by drone delivery applications, together with a neuro-symbolic method that efficiently solves the ZBPF problem under several domain constraints, such as the simple-path constraint and the landmark constraint. Second, we propose another neuro-symbolic method for action-constrained RL where the action space is discrete and combinatorial.
Empirical results show that our proposed approaches achieve better performance than standard constrained RL algorithms in several real-world applications. |
format |
text |
author |
LING, Jiajing |
author_facet |
LING, Jiajing |
author_sort |
LING, Jiajing |
title |
Reinforcement learning for sequential decision making with constraints |
title_short |
Reinforcement learning for sequential decision making with constraints |
title_full |
Reinforcement learning for sequential decision making with constraints |
title_fullStr |
Reinforcement learning for sequential decision making with constraints |
title_full_unstemmed |
Reinforcement learning for sequential decision making with constraints |
title_sort |
reinforcement learning for sequential decision making with constraints |
publisher |
Institutional Knowledge at Singapore Management University |
publishDate |
2023 |
url |
https://ink.library.smu.edu.sg/etd_coll/513 https://ink.library.smu.edu.sg/context/etd_coll/article/1511/viewcontent/GPIS_AY2018_PhD_LING_Jiajing.pdf |
_version_ |
1779157211262484480 |