Automatic generation of fuzzy neural networks via reinforcement learning with applications in path planning of mobile robots
Main Author: Zhou, Yi
Other Authors: Er, Meng Joo
Format: Theses and Dissertations
Language: English
Published: 2008
Subjects: DRNTU::Engineering::Electrical and electronic engineering::Control and instrumentation::Robotics; DRNTU::Engineering::Electrical and electronic engineering::Electronic systems::Signal processing
Online Access: https://hdl.handle.net/10356/13267
Institution: Nanyang Technological University
Description:
In this thesis, a novel Reinforcement Learning (RL) methodology, termed Dynamic Self-Generated Fuzzy Q-Learning (DSGFQL), is developed for generating Fuzzy Neural Networks (FNNs). In the DSGFQL system, RL is adopted for both structure identification and parameter estimation of FNNs. The structure and premise parameters can be adjusted dynamically according to reinforcement evaluations. Besides evaluation signals for overall system performance, a reinforcement sharing mechanism is adopted for evaluating the contribution of each fuzzy rule. Both the system performance and the individual contribution of each fuzzy rule can therefore be assessed through reinforcement signals: rules with good contributions are reinforced, while rules with poor contributions are penalized or eliminated. In this way, the structure and premise parts of FNNs are determined in an RL manner.
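The reinforcement sharing idea can be illustrated with a short sketch. The snippet below is only a hypothetical illustration, not the thesis's exact formulation: a global reward is split among rules in proportion to their normalized firing strengths, and each rule's accumulated share is then used to decide whether it should be reinforced, penalized, or eliminated. All function names and thresholds are assumptions.

```python
import numpy as np

def share_reinforcement(global_reward, firing_strengths):
    """Distribute a global reward among fuzzy rules in proportion to their
    normalized firing strengths (hypothetical sharing scheme)."""
    phi = np.asarray(firing_strengths, dtype=float)
    weights = phi / (phi.sum() + 1e-12)   # each rule's normalized contribution
    return global_reward * weights        # local (shared) reinforcement per rule

def rule_status(accumulated, good_thr=0.5, poor_thr=-0.5):
    """Label each rule from its accumulated shared reinforcement
    (illustrative thresholds): reinforce, penalize, or eliminate."""
    labels = []
    for c in accumulated:
        if c >= good_thr:
            labels.append("reinforce")
        elif c >= poor_thr:
            labels.append("penalize")
        else:
            labels.append("eliminate")
    return labels

# toy usage: three rules fire with different strengths on one control step
print(share_reinforcement(1.0, [0.7, 0.2, 0.05]))   # shared reinforcement per rule
print(rule_status([0.8, -0.1, -0.9]))               # ['reinforce', 'penalize', 'eliminate']
```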
The DSGFQL offers a novel view of generating FNNs. RL is applied not only to selecting optimal actions (consequent parameters) but also to determining the number of rules and to pruning and adjusting premise parameters. Just as good actions are reinforced and poor actions are penalized in conventional RL approaches, good rules are promoted while bad rules are demoted or eliminated in the DSGFQL method. Thus, instead of applying RL only to training consequent parameters (consequent generation), RL is adopted at a higher level (premise generation).
Since the structure and premise parameters of FNNs are adjusted according to reinforcement evaluations, an efficient structure can be determined through the DSGFQL method.
The novel DSGFQL methodology can automatically create, delete and adjust fuzzy rules according to evaluations of system performance as well as the contributions of individual fuzzy rules. The whole learning process is based on evaluative information and does not require instructive training data or extensive human effort, as sketched below.
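One plausible outer loop for such self-generation is sketched below. It is only an assumption-laden illustration of the create/delete behavior described above: a new Gaussian rule is created when no existing rule covers the current input well, and rules whose evaluation score stays poor are dropped. The membership form, coverage threshold and default width are all assumed for illustration.

```python
import numpy as np

def gaussian_firing(x, centers, widths):
    """Firing strength of each Gaussian fuzzy rule for input x (assumed membership form)."""
    d2 = ((np.asarray(x, float) - centers) ** 2 / (2.0 * widths ** 2)).sum(axis=1)
    return np.exp(-d2)

def generation_step(x, centers, widths, scores, cover_thr=0.14, poor_thr=-0.5):
    """Illustrative create/delete step: add a rule centred on x when no rule
    covers it well; drop rules whose evaluation score has fallen below poor_thr."""
    x = np.asarray(x, float)
    if centers.shape[0] == 0 or gaussian_firing(x, centers, widths).max() < cover_thr:
        centers = np.vstack([centers, x])               # new rule centred on the input
        widths = np.vstack([widths, np.ones_like(x)])   # assumed default width
        scores = np.append(scores, 0.0)
    keep = scores >= poor_thr                           # eliminate persistently poor rules
    return centers[keep], widths[keep], scores[keep]

# toy usage: start with an empty rule base over a 2-D sensor input
c, w, s = np.empty((0, 2)), np.empty((0, 2)), np.empty(0)
c, w, s = generation_step([0.3, 0.8], c, w, s)
print(c.shape)   # (1, 2): one rule was created
```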
Besides self-generating FNNs without an a priori structure, the DSGFQL approach can also incorporate domain knowledge from human experts or from previous training. At the premise level, initial domain knowledge about the task can be incorporated into the system as bias through If-Then fuzzy rules, and an NN structure for incorporating the bias components according to the confidence in the initial knowledge is proposed, so that rapid and safe learning can be achieved. At the consequent training level, a sharing mechanism is proposed to initialize the Q-values of newly generated rules when applying Q-learning. Instead of assigning the Q-values of new rules randomly, they are initialized from the Q-values of existing neighboring rules; previous knowledge is thus carried over from those neighboring fuzzy rules and the learning speed is increased.
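The Q-value sharing mechanism for new rules might look like the following sketch. It is not the thesis's exact formula: here a new rule's Q-values over the candidate actions are set to a distance-weighted average of the Q-values of its nearest existing rules, with the weighting and the number of neighbors chosen purely for illustration.

```python
import numpy as np

def init_q_from_neighbors(new_center, centers, q_table, n_neighbors=3):
    """Initialize the Q-values of a newly created rule as a distance-weighted
    average of the Q-values of its nearest existing rules (illustrative scheme)."""
    if len(centers) == 0:
        return np.zeros(q_table.shape[1])                # no neighbors: neutral start
    dists = np.linalg.norm(centers - np.asarray(new_center, float), axis=1)
    idx = np.argsort(dists)[:n_neighbors]                # nearest neighboring rules
    w = 1.0 / (dists[idx] + 1e-6)                        # closer rules contribute more
    return (w[:, None] * q_table[idx]).sum(axis=0) / w.sum()

# toy usage: 4 existing rules, 3 discrete candidate actions each
centers = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
q_table = np.array([[0.2, 0.5, 0.1],
                    [0.3, 0.4, 0.0],
                    [0.1, 0.6, 0.2],
                    [0.9, 0.0, 0.0]])
print(init_q_from_neighbors([0.2, 0.1], centers, q_table))
```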
Furthermore, extended studies that develop the DSGFQL algorithm further are carried out. For non-Temporal-Difference (TD)-based RL approaches, a reward-function scheme (DSGFQL-reward) is proposed as a general approach for all RL problems. Global and local rewards are adopted as evaluation criteria for system-level and local performance, respectively. As the reward function is a basic element of every RL problem, including non-TD-based approaches, the reward scheme offers a general RL methodology for generating FNNs.
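A minimal way to picture the global/local reward idea is sketched below: a global reward scores overall task performance, each rule also receives a local reward for its own behavior, and the two are blended per rule in proportion to how strongly the rule fired. The blending coefficient and the example rewards are assumptions, not the thesis's definitions.

```python
import numpy as np

def rule_rewards(global_reward, local_rewards, firing_strengths, alpha=0.5):
    """Blend a global (system-level) reward with per-rule local rewards,
    scaled by normalized firing strengths (illustrative combination)."""
    phi = np.asarray(firing_strengths, dtype=float)
    phi = phi / (phi.sum() + 1e-12)
    local = np.asarray(local_rewards, dtype=float)
    return phi * (alpha * global_reward + (1.0 - alpha) * local)

# toy usage: the robot kept a safe distance overall (global +1),
# but the second rule steered it too close to the wall (local -1)
print(rule_rewards(1.0, [0.5, -1.0, 0.2], [0.6, 0.3, 0.1]))
```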
Moreover, an enhanced version of the DSGFQL, termed Enhanced Dynamic Self-Generated Fuzzy Q-Learning (EDSGFQL), is proposed by combining the DSGFQL with an extended Self-Organizing Map (SOM) algorithm. The extended SOM is adopted to adjust the center positions of the fuzzy neurons for better feature representation. With better allocation of fuzzy neurons, the original DSGFQL is enhanced and the number of fuzzy rules can be reduced further.
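A plain SOM-style center update is sketched below to show the flavor of such an adjustment; it does not reproduce the thesis's extended SOM. Here the winning fuzzy neuron and its neighbors are pulled toward the current input, with a Gaussian neighborhood measured in center space (rather than on a map lattice) purely for simplicity; the learning rate and neighborhood width are assumed values.

```python
import numpy as np

def som_update_centers(x, centers, lr=0.1, sigma=1.0):
    """Move the winning fuzzy neuron and its neighbors toward input x,
    weighted by a Gaussian neighborhood function (illustrative update)."""
    x = np.asarray(x, float)
    winner = np.argmin(np.linalg.norm(centers - x, axis=1))      # best-matching neuron
    d = np.linalg.norm(centers - centers[winner], axis=1)        # distance to the winner
    h = np.exp(-(d ** 2) / (2.0 * sigma ** 2))                   # neighborhood strength
    return centers + lr * h[:, None] * (x - centers)

# toy usage: three rule centers drift toward a frequently visited state
centers = np.array([[0.0, 0.0], [1.0, 1.0], [3.0, 3.0]])
print(som_update_centers([0.9, 1.1], centers))
```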
Besides the extended approaches for determining premise parameters of FNNs, continuous-action Q-learning is combined with the DSGFQL for generating local continuous actions. Thus, besides applying fuzzy inference to generate continuous global actions, continuous local actions can also be obtained from each local fuzzy rule instead of discrete ones. In the resulting DSGFQL-CA approach, continuous consequent parameters are estimated instead of discrete ones.
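How a continuous global action emerges from local actions can be shown with a short, assumption-laden sketch: standard fuzzy inference blends each rule's local action using its normalized firing strength, and in a DSGFQL-CA-style variant the local actions themselves would be continuous values rather than picks from a fixed discrete set. The numbers below are purely illustrative.

```python
import numpy as np

def global_action(firing_strengths, local_actions):
    """Fuzzy-inference style blending: the global action is the
    firing-strength-weighted average of the rules' local actions."""
    phi = np.asarray(firing_strengths, dtype=float)
    a = np.asarray(local_actions, dtype=float)
    return (phi * a).sum() / (phi.sum() + 1e-12)

# discrete local actions (e.g. a fixed set of steering angles) ...
print(global_action([0.7, 0.2, 0.1], [-0.3, 0.0, 0.3]))
# ... versus continuous local actions estimated per rule (DSGFQL-CA-style, illustrative)
print(global_action([0.7, 0.2, 0.1], [-0.27, 0.05, 0.31]))
```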
The DSGFQL algorithm and its extended methodologies are applied to robot navigation tasks such as wall following and obstacle avoidance. Comparative studies with existing fuzzy RL approaches demonstrate the superiority of the proposed methods, as more efficient FNNs can be generated. A number of comparative studies are carried out to validate the viability of the proposed approaches in both static and dynamic training environments.
School: School of Electrical and Electronic Engineering
Degree: DOCTOR OF PHILOSOPHY (EEE)
Citation: Zhou, Y. (2007). Automatic generation of fuzzy neural networks via reinforcement learning with applications in path planning of mobile robots. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/13267
DOI: 10.32657/10356/13267
Extent: 173 p.