Self-organizing neural architectures and cooperative learning in a multiagent environment
Temporal-Difference–Fusion Architecture for Learning, Cognition, and Navigation (TD-FALCON) is a generalization of adaptive resonance theory (a class of self-organizing neural networks) that incorporates TD methods for real-time reinforcement learning. In this paper, we investigate how a team of TD-FALCON networks may cooperate to learn and function in a dynamic multiagent environment based on minefield navigation and predator/prey pursuit tasks. Experiments on the navigation task demonstrate that TD-FALCON agent teams are able to adapt and function well in a multiagent environment without an explicit mechanism of collaboration. In comparison, traditional Q-learning agents using gradient-descent-based feedforward neural networks, trained with the standard backpropagation and the resilient-propagation (RPROP) algorithms, produce a significantly poorer level of performance. For the predator/prey pursuit task, we experiment with various cooperative strategies and find that a combination of a high-level compressed state representation and a hybrid reward function produces the best results. Using the same cooperative strategy, the TD-FALCON team also outperforms the RPROP-based reinforcement learners in terms of both task completion rate and learning efficiency.
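The abstract centers on agents that learn action values through temporal-difference (TD) updates. As a point of reference, the sketch below shows a generic tabular Q-learning TD update with epsilon-greedy action selection; it is an illustrative approximation of the kind of update that TD-FALCON agents and the feedforward-network baselines build on, not the paper's implementation. The tabular representation, parameter values, and function names are assumptions made here for readability.

```python
import random

# Illustrative hyperparameters; the paper's settings may differ.
ALPHA = 0.5    # learning rate
GAMMA = 0.9    # discount factor
EPSILON = 0.1  # exploration rate for epsilon-greedy action selection

Q = {}  # tabular action-value estimates: (state, action) -> value


def q(state, action):
    """Current estimate of Q(s, a); unseen pairs default to 0."""
    return Q.get((state, action), 0.0)


def choose_action(state, actions):
    """Epsilon-greedy selection over the current Q estimates."""
    if random.random() < EPSILON:
        return random.choice(actions)
    return max(actions, key=lambda a: q(state, a))


def td_update(state, action, reward, next_state, next_actions):
    """One temporal-difference (Q-learning) step:
    Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max((q(next_state, a) for a in next_actions), default=0.0)
    td_error = reward + GAMMA * best_next - q(state, action)
    Q[(state, action)] = q(state, action) + ALPHA * td_error
```

In a minefield-navigation-style loop, each agent would call choose_action to pick a move, observe the resulting reward and next state, and then call td_update before acting again.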
Saved in:
Main Authors: | XIAO, Dan; TAN, Ah-hwee |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2007 |
Subjects: | Multiagent cooperative learning; reinforcement learning (RL); self-organizing neural architectures; Computer and Systems Architecture; Computer Engineering; Databases and Information Systems |
Online Access: | https://ink.library.smu.edu.sg/sis_research/5221 https://ink.library.smu.edu.sg/context/sis_research/article/6224/viewcontent/MA20TSMC_B07.pdf |
Institution: | Singapore Management University |
Language: | English |
id |
sg-smu-ink.sis_research-6224 |
---|---|
record_format |
dspace |
spelling |
sg-smu-ink.sis_research-6224 2020-07-23T18:33:29Z
Self-organizing neural architectures and cooperative learning in a multiagent environment
XIAO, Dan; TAN, Ah-hwee
Temporal-Difference–Fusion Architecture for Learning, Cognition, and Navigation (TD-FALCON) is a generalization of adaptive resonance theory (a class of self-organizing neural networks) that incorporates TD methods for real-time reinforcement learning. In this paper, we investigate how a team of TD-FALCON networks may cooperate to learn and function in a dynamic multiagent environment based on minefield navigation and predator/prey pursuit tasks. Experiments on the navigation task demonstrate that TD-FALCON agent teams are able to adapt and function well in a multiagent environment without an explicit mechanism of collaboration. In comparison, traditional Q-learning agents using gradient-descent-based feedforward neural networks, trained with the standard backpropagation and the resilient-propagation (RPROP) algorithms, produce a significantly poorer level of performance. For the predator/prey pursuit task, we experiment with various cooperative strategies and find that a combination of a high-level compressed state representation and a hybrid reward function produces the best results. Using the same cooperative strategy, the TD-FALCON team also outperforms the RPROP-based reinforcement learners in terms of both task completion rate and learning efficiency.
2007-12-01T08:00:00Z text application/pdf
https://ink.library.smu.edu.sg/sis_research/5221
info:doi/10.1109/TSMCB.2007.907040
https://ink.library.smu.edu.sg/context/sis_research/article/6224/viewcontent/MA20TSMC_B07.pdf
http://creativecommons.org/licenses/by-nc-nd/4.0/
Research Collection School Of Computing and Information Systems
eng
Institutional Knowledge at Singapore Management University
Multiagent cooperative learning; reinforcement learning (RL); self-organizing neural architectures
Computer and Systems Architecture; Computer Engineering; Databases and Information Systems |
institution |
Singapore Management University |
building |
SMU Libraries |
continent |
Asia |
country |
Singapore |
content_provider |
SMU Libraries |
collection |
InK@SMU |
language |
English |
topic |
Multiagent cooperative learning; reinforcement learning (RL); self-organizing neural architectures; Computer and Systems Architecture; Computer Engineering; Databases and Information Systems |
spellingShingle |
Multiagent cooperative learning; reinforcement learning (RL); self-organizing neural architectures; Computer and Systems Architecture; Computer Engineering; Databases and Information Systems; XIAO, Dan; TAN, Ah-hwee; Self-organizing neural architectures and cooperative learning in a multiagent environment |
description |
Temporal-Difference–Fusion Architecture for Learning, Cognition, and Navigation (TD-FALCON) is a generalization of adaptive resonance theory (a class of self-organizing neural networks) that incorporates TD methods for real-time reinforcement learning. In this paper, we investigate how a team of TD-FALCON networks may cooperate to learn and function in a dynamic multiagent environment based on minefield navigation and predator/prey pursuit tasks. Experiments on the navigation task demonstrate that TD-FALCON agent teams are able to adapt and function well in a multiagent environment without an explicit mechanism of collaboration. In comparison, traditional Q-learning agents using gradient-descent-based feedforward neural networks, trained with the standard backpropagation and the resilient-propagation (RPROP) algorithms, produce a significantly poorer level of performance. For the predator/prey pursuit task, we experiment with various cooperative strategies and find that a combination of a high-level compressed state representation and a hybrid reward function produces the best results. Using the same cooperative strategy, the TD-FALCON team also outperforms the RPROP-based reinforcement learners in terms of both task completion rate and learning efficiency. |
format |
text |
author |
XIAO, Dan; TAN, Ah-hwee |
author_facet |
XIAO, Dan; TAN, Ah-hwee |
author_sort |
XIAO, Dan |
title |
Self-organizing neural architectures and cooperative learning in a multiagent environment |
title_short |
Self-organizing neural architectures and cooperative learning in a multiagent environment |
title_full |
Self-organizing neural architectures and cooperative learning in a multiagent environment |
title_fullStr |
Self-organizing neural architectures and cooperative learning in a multiagent environment |
title_full_unstemmed |
Self-organizing neural architectures and cooperative learning in a multiagent environment |
title_sort |
self-organizing neural architectures and cooperative learning in a multiagent environment |
publisher |
Institutional Knowledge at Singapore Management University |
publishDate |
2007 |
url |
https://ink.library.smu.edu.sg/sis_research/5221 https://ink.library.smu.edu.sg/context/sis_research/article/6224/viewcontent/MA20TSMC_B07.pdf |
_version_ |
1770575337607921664 |