Random vector functional link network based shallow and deep learning
Main Author: Shi, Qiushi
Other Authors: Ponnuthurai Nagaratnam Suganthan
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University, 2023
Subjects: Engineering::Electrical and electronic engineering::Computer hardware, software and systems
Online Access: https://hdl.handle.net/10356/171612
Institution: Nanyang Technological University
id: sg-ntu-dr.10356-171612
record_format: dspace
institution: Nanyang Technological University
building: NTU Library
continent: Asia
country: Singapore
content_provider: NTU Library
collection: DR-NTU
language: English
topic: Engineering::Electrical and electronic engineering::Computer hardware, software and systems
description:
Deep learning has been extremely successful in recent years. However, neural networks that rely on back-propagation (BP) for parameter training are time-consuming to train and may become trapped in local minima, yielding sub-optimal results. At the same time, another family of neural networks based on randomization is attracting significant attention because it avoids these shortcomings of BP-trained models. Among them, the Random Vector Functional Link Network (RVFL) is a typical representative with a single hidden layer. Its hidden weights and biases are randomly generated and left untrained, and its distinguishing feature is a direct link that passes information from the input layer straight to the output layer. To improve the performance, stability, and robustness of this model, two extended structures, the Deep Random Vector Functional Link Network (dRVFL) and the Ensemble Deep Random Vector Functional Link Network (edRVFL), have been proposed recently. The dRVFL network is a deep version of the RVFL network with multiple hidden layers, while the edRVFL network treats each hidden layer as a classifier and combines them into an ensemble.
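The RVFL construction described above admits a compact closed-form implementation. The following is a minimal NumPy sketch, not the thesis's actual code: hidden weights and biases are drawn at random and never trained, the direct link concatenates the raw inputs with the non-linear hidden features, and the output weights are solved by ridge regression. The ReLU activation, the uniform initialization range, and the regularization strength `lam` are illustrative assumptions.

```python
import numpy as np

def train_rvfl(X, Y, n_hidden=100, lam=1e-2, seed=0):
    """Minimal RVFL sketch: random hidden layer + direct link + ridge output.
    X: (n_samples, n_features); Y: one-hot labels (n_samples, n_classes)."""
    rng = np.random.default_rng(seed)
    W = rng.uniform(-1.0, 1.0, (X.shape[1], n_hidden))  # random, untrained hidden weights
    b = rng.uniform(-1.0, 1.0, n_hidden)                # random, untrained hidden biases
    H = np.maximum(X @ W + b, 0.0)                      # non-linear hidden features (ReLU is an assumption)
    D = np.hstack([X, H])                               # direct link: inputs concatenated with hidden features
    # Closed-form ridge solution: beta = (D^T D + lam * I)^(-1) D^T Y
    beta = np.linalg.solve(D.T @ D + lam * np.eye(D.shape[1]), D.T @ Y)
    return W, b, beta

def predict_rvfl(X, W, b, beta):
    H = np.maximum(X @ W + b, 0.0)
    return (np.hstack([X, H]) @ beta).argmax(axis=1)    # predicted class indices
```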
Although the edRVFL network achieves superior performance on multiple tasks, it suffers from one major inconvenience: the network has many hyperparameters that must be tuned during training. Typically, these hyperparameters are tuned with grid search, which tries numerous combinations and can be very time-consuming. This raises the following research questions: Is there a more efficient method that can expedite the tuning stage? Can such a method handle continuous hyperparameter values? To answer these questions, we propose two new tuning strategies. First, we use a two-stage tuning strategy for the edRVFL network (edRVFL-TS) to replace grid search. Then, we introduce the Bayesian optimization-based edRVFL network (edRVFL-BO) to handle continuous hyperparameters. Experiments on 46 UCI benchmark datasets demonstrate that the edRVFL-BO network not only achieves the highest accuracy but also requires the shortest tuning time, making it the most efficient of the compared tuning methods in terms of both performance and computational cost.
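To illustrate how Bayesian optimization can replace grid search in the tuning stage, here is a hedged sketch using scikit-optimize's Gaussian-process optimizer `gp_minimize`; the thesis's edRVFL-BO may use a different optimizer, objective, and search space. The objective below is a cheap synthetic stand-in so the sketch stays runnable: in practice it would train an edRVFL with the candidate hyperparameters and return the cross-validation error.

```python
from skopt import gp_minimize            # pip install scikit-optimize
from skopt.space import Real, Integer

def objective(params):
    log_lam, n_hidden = params
    # In practice: build an edRVFL with lam = 10**log_lam and n_hidden neurons
    # per layer, run cross-validation, and return the error to be minimized.
    # A synthetic stand-in is used here to keep the sketch self-contained.
    return (log_lam + 2.0) ** 2 + abs(n_hidden - 256) / 1024.0

search_space = [
    Real(-6.0, 2.0, name="log_lam"),      # continuous hyperparameter (log10 of ridge strength)
    Integer(32, 1024, name="n_hidden"),   # discrete hyperparameter (neurons per layer)
]

# The Gaussian-process surrogate proposes promising candidates sequentially,
# so far fewer evaluations are needed than with an exhaustive grid.
result = gp_minimize(objective, search_space, n_calls=30, random_state=0)
print("best hyperparameters:", result.x, "best objective:", result.fun)
```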
Furthermore, some datasets have a very large number of features; we refer to these as high-dimensional datasets. This raises the following research question: Can we develop an edRVFL variant that is suitable for classifying high-dimensional data? To this end, we introduce two groups of methods. First, we review the shallow sparse pre-trained RVFL network and extend it to an ensemble deep version (SP-edRVFL); these networks use a sparse pre-trained auto-encoder to learn the hidden weights. Second, we propose two double-regularized networks, the double-regularized RVFL (2R-RVFL) network and the double-regularized edRVFL (2R-edRVFL) network, which assign separate regularization parameters to the linear input features and the non-linear hidden features. Experiments on 12 high-dimensional datasets reveal that the SP-edRVFL network outperforms the other models in accuracy, but it is also the most time-consuming. Therefore, when selecting a network for high-dimensional datasets, the choice should weigh accuracy against training time for the problem at hand.
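The double-regularization idea can be made concrete with a small sketch. Assuming the 2R-RVFL solves a ridge problem with one penalty on the direct-link (linear) block and another on the random hidden block, the closed form only replaces the scalar penalty `lam * I` with a block-diagonal one; the exact objective in the thesis may differ, and the parameter values below are illustrative.

```python
import numpy as np

def train_2r_rvfl(X, Y, n_hidden=100, lam_lin=1e-2, lam_hid=1e-1, seed=0):
    """Double-regularized RVFL sketch: separate ridge penalties for the linear
    input features and the non-linear hidden features."""
    rng = np.random.default_rng(seed)
    W = rng.uniform(-1.0, 1.0, (X.shape[1], n_hidden))
    b = rng.uniform(-1.0, 1.0, n_hidden)
    H = np.maximum(X @ W + b, 0.0)
    D = np.hstack([X, H])
    # Block-diagonal penalty: lam_lin on the input block, lam_hid on the hidden block.
    penalty = np.diag(np.concatenate([np.full(X.shape[1], lam_lin),
                                      np.full(n_hidden, lam_hid)]))
    beta = np.linalg.solve(D.T @ D + penalty, D.T @ Y)
    return W, b, beta
```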
Previous research on the edRVFL network has concentrated primarily on supervised learning tasks. Consequently, we raise the following research question: Can we introduce a variant of the edRVFL network specifically tailored to semi-supervised learning tasks? To extend the edRVFL network to semi-supervised learning, we propose a Jointly Optimized learning Strategy for the edRVFL network (JOSedRVFL). The JOSedRVFL network employs an iterative procedure that alternates between computing the output weights and predicting class labels for the unlabeled training data. We also introduce another semi-supervised edRVFL network, SS-edRVFL, which incorporates manifold regularization, and briefly compare the two methods to highlight their similarities and differences. Experiments on four UCI benchmark datasets demonstrate that the JOSedRVFL network achieves the highest accuracy on all four datasets, with the SS-edRVFL network securing the second-highest accuracy on three of the four, only slightly below JOSedRVFL.
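The iterative procedure in JOSedRVFL can be pictured as an alternating scheme. The sketch below assumes a simple pseudo-labelling loop, alternating a ridge solve for the output weights with hard label predictions for the unlabeled samples; the thesis's joint optimization objective and stopping criterion may differ.

```python
import numpy as np

def semi_supervised_rvfl(D_lab, Y_lab, D_unlab, lam=1e-2, n_iter=10):
    """Alternating pseudo-labelling sketch in the spirit of JOSedRVFL.
    D_lab / D_unlab: (direct link + hidden) feature matrices; Y_lab: one-hot labels."""
    beta = np.linalg.solve(D_lab.T @ D_lab + lam * np.eye(D_lab.shape[1]),
                           D_lab.T @ Y_lab)
    for _ in range(n_iter):
        # Step 1: predict pseudo-labels for the unlabeled samples with current weights.
        scores = D_unlab @ beta
        Y_pseudo = np.zeros_like(scores)
        Y_pseudo[np.arange(len(scores)), scores.argmax(axis=1)] = 1.0
        # Step 2: re-solve the ridge problem on labeled + pseudo-labeled data jointly.
        D_all = np.vstack([D_lab, D_unlab])
        Y_all = np.vstack([Y_lab, Y_pseudo])
        beta = np.linalg.solve(D_all.T @ D_all + lam * np.eye(D_all.shape[1]),
                               D_all.T @ Y_all)
    return beta
```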
Lastly, we address the research question of how to further enhance the classification ability of the edRVFL network by integrating it with several techniques. First, we introduce batch normalization, which re-normalizes the hidden features of the network and thereby prevents divergence. Next, we propose the weighted edRVFL network (WedRVFL), which assigns different weights to training samples in different layers according to how confidently they were classified in the previous layer. We also introduce the pruning-based edRVFL network (PedRVFL), which prunes inferior neurons, judged by their importance for classification, before generating the subsequent hidden layer. We then combine the weighting and pruning techniques in the WPedRVFL network and further integrate it with double regularization to obtain the 2R-WPedRVFL and 1&2R-WPedRVFL networks. Empirical results on benchmark datasets show the superiority of these new methods.
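Two of these ingredients, per-sample weighting and neuron pruning, reduce to small linear-algebra operations. The sketch below shows a weighted ridge solve (WedRVFL-style, where the weights could be derived from the previous layer's classification confidence) and a magnitude-based pruning step (PedRVFL-style); the particular importance score is an assumption on our part, not the thesis's criterion.

```python
import numpy as np

def weighted_ridge(D, Y, sample_w, lam=1e-2):
    """Weighted ridge solve used per layer: samples classified with low
    confidence by the previous layer can be given larger weights."""
    S = np.diag(sample_w)
    return np.linalg.solve(D.T @ S @ D + lam * np.eye(D.shape[1]), D.T @ S @ Y)

def prune_neurons(H, beta_hidden, keep_ratio=0.8):
    """Pruning sketch: rank hidden neurons by the magnitude of their output
    weights (rows of beta for the hidden block) and keep the most important
    ones before the features feed the next layer. The score is illustrative."""
    importance = np.abs(beta_hidden).sum(axis=1)   # one score per hidden neuron
    k = max(1, int(keep_ratio * H.shape[1]))
    keep = np.argsort(importance)[-k:]             # indices of the top-k neurons
    return H[:, keep], keep
```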
author2: Ponnuthurai Nagaratnam Suganthan
format: Thesis-Doctor of Philosophy
author: Shi, Qiushi
title: Random vector functional link network based shallow and deep learning
publisher: Nanyang Technological University
publishDate: 2023
url: https://hdl.handle.net/10356/171612
spelling: sg-ntu-dr.10356-171612 (last updated 2023-12-01T01:52:37Z)
school: School of Electrical and Electronic Engineering
supervisor contact: EPNSugan@ntu.edu.sg
degree: Doctor of Philosophy
deposited: 2023-11-01T06:17:31Z
citation: Shi, Q. (2023). Random vector functional link network based shallow and deep learning. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/171612
doi: 10.32657/10356/171612
license: This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).
media type: application/pdf