Hierarchical reinforcement learning with integrated discovery of salient subgoals
Hierarchical Reinforcement Learning (HRL) is a promising approach for solving complex tasks that may be challenging for traditional reinforcement learning. HRL achieves this by decomposing a task into shorter-horizon subgoals that are simpler to achieve. Autonomous discovery of such subgoals is an important part of HRL. Recently, end-to-end HRL methods have been used to reduce the overhead of offline subgoal discovery by seeking useful subgoals while simultaneously learning optimal policies in a hierarchy. However, these methods may still suffer from slow learning when the search space used by the high-level policy to find subgoals is large. We propose LIDOSS, an end-to-end HRL method with an integrated heuristic for subgoal discovery. In LIDOSS, the search space of the high-level policy is reduced by focusing only on subgoal states that have high saliency. We evaluate LIDOSS on continuous control tasks in the MuJoCo Ant domain. The results show that LIDOSS outperforms Hierarchical Actor Critic (HAC), a state-of-the-art HRL method, on fixed-goal tasks.
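The abstract describes restricting the high-level policy's subgoal search to salient states within a two-level hierarchy. The following Python sketch is a rough illustration only, not the authors' LIDOSS implementation: the 1-D corridor environment, the saliency test, and both policies are invented for this example. It shows the general idea that the high-level policy chooses subgoals only from a reduced set of salient candidate states, while the low-level policy takes primitive actions toward the current subgoal.

```python
import random

# Toy illustration (not the authors' LIDOSS implementation): a two-level
# hierarchy in which the high-level policy proposes subgoals only from a
# set of "salient" states, shrinking its search space, while the low-level
# policy takes primitive steps toward the current subgoal.
# The environment, saliency test, and policies below are all hypothetical.

GRID = 10          # 1-D corridor of states 0 .. GRID-1
GOAL = GRID - 1    # fixed final goal


def is_salient(state):
    # Hypothetical saliency heuristic: keep only every third state (plus the
    # goal) as a candidate subgoal; stands in for a learned saliency score.
    return state % 3 == 0 or state == GOAL


SALIENT_STATES = [s for s in range(GRID) if is_salient(s)]


def high_level_policy(state):
    # Pick the nearest salient state that moves toward the goal; a learned
    # high-level policy would choose among SALIENT_STATES rather than all states.
    candidates = [s for s in SALIENT_STATES if s > state] or [GOAL]
    return min(candidates)


def low_level_policy(state, subgoal):
    # Primitive action: step +1 or -1 toward the current subgoal.
    return 1 if subgoal > state else -1


def run_episode(max_steps=50):
    state, trajectory = 0, [0]
    while state != GOAL and len(trajectory) < max_steps:
        subgoal = high_level_policy(state)
        # Low-level rollout until the subgoal (or a small step budget) is reached.
        for _ in range(5):
            state += low_level_policy(state, subgoal)
            trajectory.append(state)
            if state == subgoal:
                break
    return trajectory


if __name__ == "__main__":
    print(run_episode())
```

Because the high-level policy only ever considers the salient candidates, its effective search space is a fraction of the full state space, which is the intuition behind the reported speed-up over methods that search over all states for subgoals.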
Saved in:
Main Authors: | PATERIA, Shubham; SUBAGDJA, Budhitama; TAN, Ah-hwee |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2020 |
Subjects: | Hierarchical Reinforcement Learning; Reinforcement Learning; Subgoal discovery; Databases and Information Systems |
Online Access: | https://ink.library.smu.edu.sg/sis_research/6171 https://ink.library.smu.edu.sg/context/sis_research/article/7174/viewcontent/p1963.pdf |
Institution: | Singapore Management University |
id | sg-smu-ink.sis_research-7174 |
---|---|
record_format | dspace |
DOI | info:doi/10.5555/3398761.3399042 |
License | http://creativecommons.org/licenses/by-nc-nd/4.0/ |
Collection | Research Collection School Of Computing and Information Systems (InK@SMU, SMU Libraries) |
Published | 2020-05-01 (text, application/pdf) |