Fast reinforcement learning under uncertainties with self-organizing neural networks

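Using feedback signals from the environment, a reinforcement learning (RL) system typically discovers action policies that recommend actions effective to the states based on a Q-value function. However, uncertainties over the estimation of the Q-values can delay the convergence of RL. For fast RL convergence by accounting for such uncertainties, this paper proposes several enhancements to the estimation and learning of the Q-value using a self-organizing neural network. Specifically, a temporal difference method known as Q-learning is complemented by a Q-value Polarization procedure, which contrasts the Q-values using feedback signals on the effect of the recommended actions. The polarized Q-values are then learned by the self-organizing neural network using a Bi-directional Template Learning procedure. Furthermore, the polarized Q-values are in turn used to adapt the reward vigilance of the ART-based self-organizing neural network using a Bi-directional Adaptation procedure. The efficacy of the resultant system called Fast Learning (FL) FALCON is illustrated using two single-task problem domains with large MDPs. The experiment results from these problem domains unanimously show FL-FALCON converging faster than the compared approaches.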

Bibliographic Details
Main Authors:	TENG, Teck-Hou; TAN, Ah-hwee
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2015
Subjects:	Databases and Information Systems; OS and Networks
Online Access:	https://ink.library.smu.edu.sg/sis_research/6797 https://ink.library.smu.edu.sg/context/sis_research/article/7800/viewcontent/Fast_RL___WI_IAT_2015.pdf
Institution:	Singapore Management University

id	sg-smu-ink.sis_research-7800
record_format	dspace
spelling	sg-smu-ink.sis_research-7800 2022-01-27T08:34:42Z Fast reinforcement learning under uncertainties with self-organizing neural networks TENG, Teck-Hou TAN, Ah-hwee Using feedback signals from the environment, a reinforcement learning (RL) system typically discovers action policies that recommend actions effective to the states based on a Q-value function. However, uncertainties over the estimation of the Q-values can delay the convergence of RL. For fast RL convergence by accounting for such uncertainties, this paper proposes several enhancements to the estimation and learning of the Q-value using a self-organizing neural network. Specifically, a temporal difference method known as Q-learning is complemented by a Q-value Polarization procedure, which contrasts the Q-values using feedback signals on the effect of the recommended actions. The polarized Q-values are then learned by the self-organizing neural network using a Bi-directional Template Learning procedure. Furthermore, the polarized Q-values are in turn used to adapt the reward vigilance of the ART-based self-organizing neural network using a Bi-directional Adaptation procedure. The efficacy of the resultant system called Fast Learning (FL) FALCON is illustrated using two single-task problem domains with large MDPs. The experiment results from these problem domains unanimously show FL-FALCON converging faster than the compared approaches. 2015-12-01T08:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/6797 info:doi/10.1109/WI-IAT.2015.103 https://ink.library.smu.edu.sg/context/sis_research/article/7800/viewcontent/Fast_RL___WI_IAT_2015.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Databases and Information Systems OS and Networks
institution	Singapore Management University
building	SMU Libraries
continent	Asia
country	Singapore
content_provider	SMU Libraries
collection	InK@SMU
language	English
topic	Databases and Information Systems; OS and Networks
description	Using feedback signals from the environment, a reinforcement learning (RL) system typically discovers action policies that recommend actions effective to the states based on a Q-value function. However, uncertainties over the estimation of the Q-values can delay the convergence of RL. For fast RL convergence by accounting for such uncertainties, this paper proposes several enhancements to the estimation and learning of the Q-value using a self-organizing neural network. Specifically, a temporal difference method known as Q-learning is complemented by a Q-value Polarization procedure, which contrasts the Q-values using feedback signals on the effect of the recommended actions. The polarized Q-values are then learned by the self-organizing neural network using a Bi-directional Template Learning procedure. Furthermore, the polarized Q-values are in turn used to adapt the reward vigilance of the ART-based self-organizing neural network using a Bi-directional Adaptation procedure. The efficacy of the resultant system called Fast Learning (FL) FALCON is illustrated using two single-task problem domains with large MDPs. The experiment results from these problem domains unanimously show FL-FALCON converging faster than the compared approaches.
format	text
author	TENG, Teck-Hou; TAN, Ah-hwee
author_facet	TENG, Teck-Hou TAN, Ah-hwee
author_sort	TENG, Teck-Hou
title	Fast reinforcement learning under uncertainties with self-organizing neural networks
publisher	Institutional Knowledge at Singapore Management University
publishDate	2015
url	https://ink.library.smu.edu.sg/sis_research/6797 https://ink.library.smu.edu.sg/context/sis_research/article/7800/viewcontent/Fast_RL___WI_IAT_2015.pdf
_version_	1770576070756532224

Similar Items