Generalization through diversity: Improving unsupervised environment design

Agent decision making using Reinforcement Learning (RL) heavily relies on either a model or simulator of the environment (e.g., moving in an 8x8 maze with three rooms, playing Chess on an 8x8 board). Due to this dependence, small changes in the environment (e.g., positions of obstacles in the maze,...

Bibliographic Details
Main Authors:	LI, Wenjun, VARAKANTHAM, Pradeep, LI, Dexun
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2023
Subjects:	Planning and Scheduling; Search in planning and scheduling; Machine Learning; Deep reinforcement learning; Artificial Intelligence and Robotics; Operations Research, Systems Engineering and Industrial Engineering
Online Access:	https://ink.library.smu.edu.sg/sis_research/8099 https://ink.library.smu.edu.sg/context/sis_research/article/9102/viewcontent/Generalization_0601_pvoa.pdf
Institution:	Singapore Management University

id	sg-smu-ink.sis_research-9102
record_format	dspace
spelling	sg-smu-ink.sis_research-9102 2023-09-07T07:21:22Z Generalization through diversity: Improving unsupervised environment design LI, Wenjun VARAKANTHAM, Pradeep LI, Dexun 2023-08-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/8099 info:doi/10.24963/ijcai.2023/601 https://ink.library.smu.edu.sg/context/sis_research/article/9102/viewcontent/Generalization_0601_pvoa.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Planning and Scheduling; Search in planning and scheduling; Machine Learning; Deep reinforcement learning; Artificial Intelligence and Robotics; Operations Research, Systems Engineering and Industrial Engineering
institution	Singapore Management University
building	SMU Libraries
continent	Asia
country	Singapore
content_provider	SMU Libraries
collection	InK@SMU
language	English
topic	Planning and Scheduling; Search in planning and scheduling; Machine Learning; Deep reinforcement learning; Artificial Intelligence and Robotics; Operations Research, Systems Engineering and Industrial Engineering
description	Agent decision making with reinforcement learning (RL) relies heavily on a model or simulator of the environment (e.g., moving in an 8x8 maze with three rooms, playing chess on an 8x8 board). Because of this dependence, small changes in the environment (e.g., positions of obstacles in the maze, size of the board) can severely degrade the effectiveness of the policy learned by the agent. To address this, existing work has proposed training RL agents on an adaptive, automatically generated curriculum of environments to improve performance on out-of-distribution (OOD) test scenarios. Specifically, existing research uses the agent's potential to learn in an environment (captured using Generalized Advantage Estimation, GAE) as the key factor in selecting the next environment(s) on which to train the agent. However, such a mechanism can select several similar environments (each with a high potential to learn), making agent training redundant on all but one of them. To that end, we provide a principled approach to adaptively identify diverse environments based on a novel distance measure relevant to environment design. We empirically demonstrate the versatility and effectiveness of our method in comparison to multiple leading approaches for unsupervised environment design on three distinct benchmark problems used in the literature.
format	text
author	LI, Wenjun VARAKANTHAM, Pradeep LI, Dexun
title	Generalization through diversity: Improving unsupervised environment design
publisher	Institutional Knowledge at Singapore Management University
publishDate	2023
url	https://ink.library.smu.edu.sg/sis_research/8099 https://ink.library.smu.edu.sg/context/sis_research/article/9102/viewcontent/Generalization_0601_pvoa.pdf
_version_	1779157154143404032

Similar Items