Stability analysis of gradient-based training algorithms of discrete-time recurrent neural network

The Recurrent Neural Network (RNN) is a powerful tool for both theoretical modelling and practical applications. To utilize the RNN as a general learning tool, an understanding of its properties, particularly its robustness and stability, is required. In this thesis, we aim at studying the robustness...


Bibliographic Details
Main Author: Wu, Yilei
Other Authors: Song Qing
Format: Theses and Dissertations
Language: English
Published: 2008
Subjects:
Online Access:https://hdl.handle.net/10356/13326
Institution: Nanyang Technological University
Language: English
id sg-ntu-dr.10356-13326
record_format dspace
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Electrical and electronic engineering::Computer hardware, software and systems
spellingShingle DRNTU::Engineering::Electrical and electronic engineering::Computer hardware, software and systems
Wu, Yilei
Stability analysis of gradient-based training algorithms of discrete-time recurrent neural network
description The Recurrent Neural Network (RNN) is a powerful tool for both theoretical modelling and practical applications. To utilize the RNN as a general learning tool, an understanding of its properties, particularly its robustness and stability, is required. In this thesis, we aim at studying the robustness of gradient-type training algorithms for the RNN via the input-output analysis methods of nonlinear system theory. The work in this thesis originates from modern concepts of control theory, especially the techniques that have been developed for the analysis of feedback systems. A number of new results are presented that effectively improve the transient response of RNN training algorithms. Further, the results lead to many new theoretical concepts and offer some practical approaches, which may be useful in a wide range of applications, for instance, signal processing and control problems. In addition to the analytic derivations, we also demonstrate how the derived criteria can be evaluated numerically. Several examples of using the RNN to learn dynamics in practical systems are given based on computer simulations. The overall thesis is organized as follows: Chapter 1 introduces the background, motivations, and major contributions of the thesis, as well as the fundamentals of neural networks. Chapter 2 briefly reviews the related mathematical preliminaries of nonlinear system theory. Specifically, Cluett's law is introduced at the end of the chapter as an extension of the conic sector stability theory of Safonov, which is used in the theoretical analysis of the proposed algorithms that follow. In Chapter 3, the shortcomings of conventional training algorithms, e.g., Real-Time Recurrent Learning (RTRL) and Normalized RTRL (N-RTRL), are first described, and then Normalized Adaptive Recurrent Learning (NARL) is proposed to overcome the slow convergence of these algorithms.
Inspired by the N-RTRL, normalization factors are used in the NARL to speed up training. In addition, two further new elements are introduced, namely an adaptive learning rate and an augmented residual error gradient, to strengthen the robustness of the training. An analytical comparison of the performance of the NARL and its competitors is given. However, as shown in the proof for the NARL, the augmented residual error gradient also induces limitations in training. To address these problems, a novel Robust Adaptive Gradient Descent (RAGD) training algorithm is proposed in Chapter 4. In addition to the adaptive learning rate, normalization factors, and augmented error gradient, a concept called hybrid learning is proposed in the RAGD to ensure the convergence of the RNN weights. The robust stability of the RAGD is proved via the Lyapunov approach and Cluett's law, respectively. In Chapter 5, numerical simulations in real-time signal processing, e.g., online adaptive filtering and time series prediction, are carried out to evaluate the proposed algorithms. Other training algorithms are also implemented with the same RNN structure to compare their performance with that of the RAGD in practice. In Chapter 6, a comprehensive case study of Fault Tolerant Control (FTC) for a biped robot tracking system is developed on the basis of the RNN and the RAGD. Three fault cases are synthesized in simulation to verify the effectiveness of the proposed schemes. Comparisons with a single PD control scheme and other training algorithms are presented. Finally, in Chapter 7, we draw conclusions and offer several suggestions for future work.
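The central mechanism the abstract describes — a normalized gradient-type online update whose effective step size is bounded by a normalization factor — can be illustrated with a minimal sketch. This is not the thesis's NARL or RAGD algorithm; it is a generic N-RTRL-flavoured normalized update applied only to the readout weights of a small random RNN, with all network sizes, constants, and signals chosen here purely for illustration:

```python
import numpy as np

# Illustrative sketch (assumed setup, not the thesis's method): a small
# RNN with fixed random input/recurrent weights and an online-trained
# linear readout. The update is a normalized gradient step: dividing by
# (eps + ||h||^2) bounds the effective learning rate regardless of the
# state magnitude, which is the kind of robustness mechanism the thesis
# analyzes.

rng = np.random.default_rng(0)
n_in, n_hid = 1, 8
W_in = rng.normal(scale=0.5, size=(n_hid, n_in))    # fixed input weights
W_rec = rng.normal(scale=0.1, size=(n_hid, n_hid))  # fixed recurrent weights
w_out = np.zeros(n_hid)                             # trained readout weights

eta, eps = 0.5, 1e-6   # base learning rate and regularizer (illustrative)
h = np.zeros(n_hid)    # recurrent state
errors = []
for t in range(500):
    u = np.array([np.sin(0.1 * t)])    # input sample
    d = np.sin(0.1 * (t + 1))          # one-step-ahead prediction target
    h = np.tanh(W_rec @ h + W_in @ u)  # recurrent state update
    y = w_out @ h                      # network output
    e = d - y                          # instantaneous error
    # Normalized gradient step on the readout weights.
    w_out += (eta / (eps + h @ h)) * e * h
    errors.append(e * e)

print(np.mean(errors[:50]), np.mean(errors[-50:]))
```

Run over a simple sinusoidal prediction task, the squared error of the later samples falls well below that of the initial transient; with 0 < eta < 2 the normalized step keeps the update stable even when the state norm varies, which an unnormalized fixed-rate step does not guarantee.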
author2 Song Qing
author_facet Song Qing
Wu, Yilei
format Theses and Dissertations
author Wu, Yilei
author_sort Wu, Yilei
title Stability analysis of gradient-based training algorithms of discrete-time recurrent neural network
title_short Stability analysis of gradient-based training algorithms of discrete-time recurrent neural network
title_full Stability analysis of gradient-based training algorithms of discrete-time recurrent neural network
title_fullStr Stability analysis of gradient-based training algorithms of discrete-time recurrent neural network
title_full_unstemmed Stability analysis of gradient-based training algorithms of discrete-time recurrent neural network
title_sort stability analysis of gradient-based training algorithms of discrete-time recurrent neural network
publishDate 2008
url https://hdl.handle.net/10356/13326
_version_ 1772826703177973760
spelling sg-ntu-dr.10356-13326 2023-07-04T17:40:05Z Stability analysis of gradient-based training algorithms of discrete-time recurrent neural network Wu, Yilei Song Qing School of Electrical and Electronic Engineering DRNTU::Engineering::Electrical and electronic engineering::Computer hardware, software and systems
DOCTOR OF PHILOSOPHY (EEE) 2008-10-20T07:24:54Z 2008-10-20T07:24:54Z 2008 2008 Thesis Wu, Y. (2008). Stability analysis of gradient-based training algorithms of discrete-time recurrent neural network. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/13326 10.32657/10356/13326 en 166 p. application/pdf