Representation learning with efficient extreme learning machine auto-encoders

Bibliographic Details
Main Author: Zhang, Guanghao
Other Authors: Huang, Guangbin
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University 2021
Online Access: https://hdl.handle.net/10356/146297
Institution: Nanyang Technological University
Summary: Extreme Learning Machine (ELM) is a ‘specialized’ Single Layer Feedforward Neural Network (SLFN). The traditional SLFN is trained by Back-Propagation (BP), which suffers from local minima and slow learning speed. In contrast, the hidden weights of ELM are randomly generated and never updated during learning, while the output weights admit an analytical solution. ELM has been successfully explored in classification and regression scenarios. The Extreme Learning Machine Auto-Encoder (ELM-AE) was proposed as a variant of the general ELM for unsupervised feature learning; specifically, it comprises linear, nonlinear, and sparse ELM-AEs. Multi-Layer Extreme Learning Machine (ML-ELM) is built by stacking multiple nonlinear ELM-AEs and shows generalization capability competitive with other multi-layer neural networks such as the Deep Boltzmann Machine (DBM) and the Deep Belief Network (DBN). Building on ML-ELM, Hierarchical Extreme Learning Machine (H-ELM) mainly shows that an $\ell_1$-regularized ELM-AE variant can improve performance in various applications. This thesis introduces a Bayesian learning scheme into ELM-AE, referred to as the Sparse Bayesian Extreme Learning Machine Auto-Encoder (SB-ELM-AE), together with a parallel training strategy that accelerates the Bayesian learning procedure. The overall neural network, structured similarly to ML-ELM and H-ELM, is referred to as the Sparse Bayesian Auto-Encoding based Extreme Learning Machine (SBAE-ELM). Experiments show that a neural network stacking SB-ELM-AEs has better generalization performance on traditional classification and face-related challenges.

Principal Component Analysis Network (PCANet), an unsupervised shallow network, demonstrates noticeable effectiveness on datasets of various volumes. It performs a two-layer convolution with PCA as the filter-learning method, followed by a block-wise histogram post-processing stage. Following the structure of PCANet, the Extreme Learning Machine Network (ELMNet) and the Hierarchical Extreme Learning Machine Network (H-ELMNet) employ ELM-AE variants in place of PCA: ELMNet emphasizes the importance of orthogonal projection, while H-ELMNet introduces a specialized ELM-AE variant with complex pre-processing steps. This thesis proposes a Regularized Extreme Learning Machine Auto-Encoder (R-ELM-AE), which combines nonlinear ELM learning with an approximately orthogonal projection, and accordingly, based on R-ELM-AE and the PCANet pipeline, a Regularized Extreme Learning Machine Network (R-ELMNet) with minimal implementation overhead. Experiments on image classification over datasets of various volumes show its effectiveness compared to unsupervised neural networks including PCANet, ELMNet, and H-ELMNet, and R-ELMNet also presents competitive performance with supervised convolutional neural networks.

Despite the success of ELM-AE variants, they are not broadly used in the traditional scenarios where PCA is commonly integrated, such as the dimension-reduction role in machine learning. Two main reasons restrict the adoption of ELM-AE variants. First, the value scale after data transformation is not bounded, so data normalization or value-scaling operations must be added. Second, PCA has only one hyper-parameter, the reduced dimension, whereas ELM-AE variants generally require additional hyper-parameters.
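To make the mechanics above concrete before turning to this hyper-parameter issue, the following is a minimal NumPy sketch of a nonlinear ELM-AE under common assumptions: a tanh activation, plain Gaussian random hidden weights (published ELM-AE variants often additionally orthogonalize them), and the standard ridge-regression closed form for the output weights. The function names and the default regularization value are illustrative, not taken from the thesis.

```python
import numpy as np

def elm_ae_fit(X, n_hidden, C=1e2, rng=None):
    """Fit a nonlinear ELM-AE on X of shape (n_samples, n_features).

    The hidden weights are randomly generated and never updated; only the
    output weights beta are learned, via a closed-form ridge regression
    that reconstructs the input: H @ beta ~= X.
    """
    rng = np.random.default_rng(rng)
    W = rng.standard_normal((X.shape[1], n_hidden))  # random input weights
    b = rng.standard_normal(n_hidden)                # random biases
    H = np.tanh(X @ W + b)                           # hidden activations
    # Analytical solution with l2-regularization term C (a hyper-parameter).
    beta = np.linalg.solve(H.T @ H + np.eye(n_hidden) / C, H.T @ X)
    return beta

def elm_ae_transform(X, beta):
    # The learned beta, transposed, projects data into the feature space;
    # stacked variants such as ML-ELM apply an activation between layers.
    return np.tanh(X @ beta.T)
```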
For example, nonlinear ELM-AE needs the $\ell_2$-regularization term (the term $C$ in the sketch above), which is commonly selected from $10^{-8}$ to $10^{8}$. The hyper-parameter space expands exponentially once the hyper-parameters of feature post-processing or of stacking multiple ELM-AEs are involved. Since PCA often plays a plug-and-play dimension-reduction role in machine learning, a simple ELM-AE variant is desirable whose adaptability to any model can be verified with minimal trials. Hence this thesis proposes a Unified Extreme Learning Machine Auto-Encoder (U-ELM-AE), which presents competitive performance with other ELM-AE variants and with PCA and, importantly, involves no additional hyper-parameters. Experiments show its effectiveness and efficiency for image dimension reduction compared with PCA and ELM-AE variants, and U-ELM-AE can be conveniently integrated into Local Receptive Fields based Extreme Learning Machine (LRF-ELM) and PCANet to yield improvements. However, U-ELM-AE is suitable only for dimension reduction, whereas nonlinear ELM-AE can also be used for dimension expansion; scenarios where the input dimension is small must therefore be handled as well. Thus, an effective multi-layer ELM is proposed: 1) if an ELM-AE layer expands the dimension, a new regularization is applied to the nonlinear ELM-AE to constrain its output value scale; 2) if an ELM-AE layer reduces the dimension, U-ELM-AE is employed. With this structure, the network achieves efficiency and competitive performance with ML-ELM, H-ELM, and SBAE-ELM.
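For contrast with PCA's single hyper-parameter, here is a hypothetical model-selection loop showing what tuning just the $\ell_2$ term of a single nonlinear ELM-AE entails, reusing elm_ae_fit and elm_ae_transform from the sketch above; the evaluate callback, which scores a validation projection on the downstream task, is an assumption for illustration.

```python
import numpy as np

def select_elm_ae_reg(X_train, X_val, n_hidden, evaluate):
    # Sweep the l2-regularization term over the range cited above,
    # 1e-8 .. 1e8 on a log grid (17 candidate values).
    best_score, best_beta = -np.inf, None
    for C in (10.0 ** k for k in range(-8, 9)):
        beta = elm_ae_fit(X_train, n_hidden, C=C)        # sketch above
        score = evaluate(elm_ae_transform(X_val, beta))  # assumed scorer
        if score > best_score:
            best_score, best_beta = score, beta
    return best_beta
```

A PCA baseline in the same pipeline would expose only the reduced dimension, and stacking several ELM-AEs or adding post-processing multiplies such loops, which is the gap the hyper-parameter-free U-ELM-AE is intended to close.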