Automatic generation of fuzzy neural networks via reinforcement learning with applications in path planning of mobile robots

Bibliographic Details
Main Author: Zhou, Yi
Other Authors: Er, Meng Joo
Format: Theses and Dissertations
Language: English
Published: 2008
Subjects:
Online Access:https://hdl.handle.net/10356/13267
Institution: Nanyang Technological University
Description
Summary: In this thesis, a novel Reinforcement Learning (RL) methodology, termed Dynamic Self-Generated Fuzzy Q-Learning (DSGFQL), is developed for generating Fuzzy Neural Networks (FNNs). In the DSGFQL system, RL is adopted for both structure identification and parameter estimation of FNNs: the structure and the premise parameters are adjusted dynamically according to reinforcement evaluations. Besides evaluation signals for overall system performance, a reinforcement-sharing mechanism is adopted to evaluate the contribution of each fuzzy rule. Both the system performance and the individual contribution of each rule can therefore be assessed through reinforcement signals: rules with good contributions are reinforced, while rules with poor contributions are penalized or eliminated. In this way, the structure and the premise parts of the FNN are determined in an RL manner.

The DSGFQL offers a novel view of generating FNNs. RL is applied not only to selecting optimal actions (the consequent parameters) but also to determining the number of rules and to pruning and adjusting the premise parameters. Just as good actions are reinforced and poor actions are penalized in conventional RL approaches, good rules are promoted while poor rules are demoted or eliminated in the DSGFQL method. Instead of focusing only on training the consequent parameters (consequent generation), RL is thus adopted at a higher, premise-generation level, and an efficient structure can be determined through the DSGFQL method. The methodology automatically creates, deletes and adjusts fuzzy rules according to the evaluation of system performance as well as the contributions of individual fuzzy rules. The whole learning process is based on evaluative information and requires neither instructive training data nor much human effort.

Besides self-generating FNNs without an a priori structure, the DSGFQL approach can also incorporate domain knowledge from human experts or from previous training. At the premise level, initial domain knowledge about the task can be incorporated into the system as a bias expressed through If-Then fuzzy rules, and a neural-network structure is proposed for weighting these bias components according to the confidence in the initial knowledge, so that rapid and safe learning can be achieved. At the consequent-training level, a sharing mechanism is proposed to initialize the Q-values of newly generated rules when applying Q-learning: instead of being assigned randomly, the Q-values of a new rule are initialized from those of its existing neighboring rules, so knowledge is inherited from neighboring fuzzy rules and the learning speed is increased.
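The abstract gives no code, but the loop it describes (creating a rule when the input is poorly covered, sharing the reinforcement signal across the firing rules, initializing a new rule's Q-values from its neighbors, and pruning rules whose contribution stays poor) can be sketched briefly. The sketch below is a minimal illustration under assumed thresholds and update formulas; the class names, parameter values and credit-assignment rule are placeholders, not the DSGFQL equations from the thesis.

import numpy as np

# Illustrative sketch only: thresholds and update rules are assumptions,
# not the DSGFQL formulation from the thesis.

class FuzzyRule:
    """One Gaussian fuzzy rule: a premise (center, width) and Q-values over discrete actions."""

    def __init__(self, center, width, q_init):
        self.center = np.asarray(center, dtype=float)
        self.width = float(width)
        self.q = np.asarray(q_init, dtype=float)  # one Q-value per candidate action
        self.contribution = 0.0                   # running share of the reinforcement signal

    def firing(self, x):
        """Gaussian membership of input x in this rule's receptive field."""
        d2 = np.sum((np.asarray(x, dtype=float) - self.center) ** 2)
        return float(np.exp(-d2 / (2.0 * self.width ** 2)))


class FuzzyQLearner:
    """Toy fuzzy Q-learning with rule creation, reinforcement sharing and pruning."""

    def __init__(self, n_actions, width=0.3, cover_thresh=0.2, prune_thresh=-1.0,
                 alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.n_actions, self.width = n_actions, width
        self.cover_thresh = cover_thresh    # minimum coverage before a new rule is created
        self.prune_thresh = prune_thresh    # shared-reinforcement level below which a rule is removed
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = np.random.default_rng(seed)
        self.rules = []

    def _weights(self, x):
        """Normalized firing strengths of all rules at input x."""
        s = np.array([r.firing(x) for r in self.rules])
        return s / s.sum() if s.sum() > 0 else s

    def observe(self, x):
        """Structure identification: add a rule when no existing rule covers x well."""
        if not self.rules or max(r.firing(x) for r in self.rules) < self.cover_thresh:
            if self.rules:
                # Q-value sharing: the new rule inherits a firing-weighted blend of its
                # neighbors' Q-values instead of a random initialization.
                w = np.array([r.firing(x) for r in self.rules])
                w = w / w.sum() if w.sum() > 0 else np.full(len(self.rules), 1.0 / len(self.rules))
                q_init = w @ np.array([r.q for r in self.rules])
            else:
                q_init = np.zeros(self.n_actions)
            self.rules.append(FuzzyRule(x, self.width, q_init))

    def act(self, x):
        """Global action: epsilon-greedy over the fuzzy-blended Q-values."""
        if self.rng.random() < self.epsilon:
            return int(self.rng.integers(self.n_actions))
        q_global = self._weights(x) @ np.array([r.q for r in self.rules])
        return int(np.argmax(q_global))

    def update(self, x, a, reward, x_next):
        """TD update for the fired rules, plus reinforcement sharing per rule."""
        w = self._weights(x)
        q_next = self._weights(x_next) @ np.array([r.q for r in self.rules])
        td = reward + self.gamma * np.max(q_next) - w @ np.array([r.q[a] for r in self.rules])
        for wi, rule in zip(w, self.rules):
            rule.q[a] += self.alpha * wi * td   # credit proportional to firing strength
            rule.contribution += wi * reward    # each rule's share of the reinforcement

    def prune(self):
        """Remove rules whose shared reinforcement is poor (always keep at least one rule)."""
        kept = [r for r in self.rules if r.contribution >= self.prune_thresh]
        self.rules = kept if kept else self.rules[:1]

A training loop would call observe(x) to grow the rule base, act(x) to pick an action, update(...) after each transition, and prune() periodically, so that both the structure and the consequents evolve from evaluative feedback alone.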
Furthermore, extended studies that develop the DSGFQL algorithm further are carried out. For non-Temporal-Difference (TD)-based RL approaches, a reward-function scheme (DSGFQL-reward) is proposed as a general approach to all RL problems: global and local rewards serve as evaluation criteria for system-level and local performance, respectively. As the reward function is a basic element of every RL problem, including non-TD-based approaches, the reward scheme offers a general RL methodology for generating FNNs. Moreover, an enhanced version of the DSGFQL, termed Enhanced Dynamic Self-Generated Fuzzy Q-Learning (EDSGFQL), is proposed by combining the DSGFQL with an extended Self-Organizing Map (SOM) algorithm. The extended SOM adjusts the center positions of the fuzzy neurons for better feature representation; with better allocation of fuzzy neurons, the original DSGFQL is enhanced and the number of fuzzy rules can be further reduced.

Besides these extensions for determining the premise parameters of FNNs, continuous-action Q-learning is combined with the DSGFQL to generate local continuous actions. Thus, in addition to using fuzzy inference to generate continuous global actions, each local fuzzy rule can produce a continuous rather than a discrete action; in this DSGFQL-CA approach, continuous consequent parameters are estimated instead of discrete ones.

The DSGFQL algorithm and its extensions are applied to mobile-robot navigation tasks such as wall following and obstacle avoidance. Comparative studies with other existing fuzzy RL approaches demonstrate the superiority of the proposed methods, as more efficient FNNs can be generated, and validate the viability of the proposed approaches in both static and dynamic training environments.
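As a rough illustration of two of the extensions described above, the extended SOM that repositions rule centers and the continuous-action blending of DSGFQL-CA, the sketch below pulls every rule center toward the current input with a Gaussian neighborhood around the best-matching neuron, and defuzzifies per-rule continuous actions into one global action. The function names, learning rate, neighborhood width and example sensor values are assumptions for illustration, not the EDSGFQL or DSGFQL-CA formulations from the thesis.

import numpy as np

# Illustrative sketch only: parameters and formulas are assumptions, not the thesis's.

def som_adjust_centers(centers, x, lr=0.05, neighborhood=0.5):
    """SOM-style update: pull every rule center toward the input x, scaled by a
    Gaussian neighborhood factor around the best-matching rule."""
    centers = np.asarray(centers, dtype=float)
    x = np.asarray(x, dtype=float)
    winner = int(np.argmin(np.linalg.norm(centers - x, axis=1)))      # best-matching fuzzy neuron
    dist_to_winner = np.linalg.norm(centers - centers[winner], axis=1)
    h = np.exp(-(dist_to_winner ** 2) / (2.0 * neighborhood ** 2))    # neighborhood strengths
    return centers + lr * h[:, None] * (x - centers)


def continuous_global_action(firing, local_actions):
    """Continuous-action blending: the global action is the firing-strength-weighted
    average of each rule's local continuous action."""
    firing = np.asarray(firing, dtype=float)
    local_actions = np.asarray(local_actions, dtype=float)
    return float(firing @ local_actions / firing.sum())


# Hypothetical example: three rules over two normalized range-sensor readings;
# local_actions are each rule's steering command for a wall-following robot.
centers = [[0.2, 0.8], [0.5, 0.5], [0.9, 0.1]]
centers = som_adjust_centers(centers, x=[0.45, 0.55])
steering = continuous_global_action(firing=[0.1, 0.7, 0.2], local_actions=[-0.3, 0.0, 0.4])

The firing-weighted average plays the role of the fuzzy-inference step that turns per-rule actions into a single continuous global action, while the SOM-style pull keeps the rule centers aligned with the regions of the input space the robot actually visits.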