Extreme learning machine with affine transformation inputs in an activation function
The extreme learning machine (ELM) has attracted much attention over the past decade due to its fast learning speed and convincing generalization performance. However, there still remains a practical issue to be approached when applying the ELM: the randomly generated hidden node parameters without...
Saved in:
Main Authors: | , , , , , |
---|---|
Other Authors: | |
Format: | Article |
Language: | English |
Published: |
2020
|
Subjects: | |
Online Access: | https://hdl.handle.net/10356/136684 |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Institution: | Nanyang Technological University |
Language: | English |
Summary: | The extreme learning machine (ELM) has attracted much attention over the past decade due to its fast learning speed and convincing generalization performance. However, there still remains a practical issue to be approached when applying the ELM: the randomly generated hidden node parameters without tuning can lead to the hidden node outputs being nonuniformly distributed, thus giving rise to poor generalization performance. To address this deficiency, a novel activation function with an affine transformation (AT) on its input is introduced into the ELM, which leads to an improved ELM algorithm that is referred to as an AT-ELM in this paper. The scaling and translation parameters of the AT activation function are computed based on the maximum entropy principle in such a way that the hidden layer outputs approximately obey a uniform distribution. Application of the AT-ELM algorithm in nonlinear function regression shows its robustness to the range scaling of the network inputs. Experiments on nonlinear function regression, real-world data set classification, and benchmark image recognition demonstrate better performance for the AT-ELM compared with the original ELM, the regularized ELM, and the kernel ELM. Recognition results on benchmark image data sets also reveal that the AT-ELM outperforms several other state-of-the-art algorithms in general. |
---|