Extreme learning machine for classification and clustering on multiview data

Bibliographic Details
Main Author: Chen, Jichao
Other Authors: Lin Zhiping
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University 2021
Online Access: https://hdl.handle.net/10356/153094
Institution: Nanyang Technological University
Description
Summary: Many real-world applications deal with data that have multiple feature sets. Such data are called multiview data, and each feature set is regarded as a view. Many machine learning tasks have multiview versions, which share the objectives of their single-view counterparts but operate on multiview data. For example, the two most general tasks, classification and clustering, both have multiview versions. Multiview classification aims to categorize multiview data into predefined classes. It is particularly useful for complicated objects that are presented in different formats (e.g., 3D models). Multiview clustering separates multiview data into clusters by considering the pairwise similarities in all views. It is widely used in feature learning and data labeling. Multiview spectral clustering, one of the most popular families of multiview clustering algorithms, uses the eigendecomposition of the graph Laplacian to find the clusters. Meanwhile, the extreme learning machine (ELM) is attracting increasing attention for its promising performance in various real-world applications. The thesis therefore investigates ELM for classification and clustering on multiview data. Firstly, the thesis investigates multiview 3D shape classification. It extends the idea of ELM's random neurons to octree-based operations and proposes the octree-based convolutional autoencoder extreme learning machine (OCA-ELM) for 3D shape classification. The ELM autoencoder (ELM-AE) is adopted for feature extraction and feature fusion, and an ELM classifier is utilized for the final classification. The experimental results show that octree-based input and operations can improve classification performance. Secondly, the thesis investigates spectral clustering with ELM. ELM is a training algorithm for single-hidden-layer feedforward networks and is known for its nonlinear feature mapping.
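As a rough illustration of the general ELM idea mentioned in this summary (random hidden neurons supply a nonlinear feature mapping, and only the output weights are solved for), a minimal NumPy sketch follows. It is not the thesis's OCA-ELM or UFS-ELM. The function names `elm_fit` and `elm_predict`, the tanh activation, the hidden-layer size, and the ridge term `reg` are all illustrative assumptions.

```python
import numpy as np

def elm_fit(X, y, n_hidden=50, reg=1e-3, seed=0):
    """Fit a basic ELM classifier (illustrative sketch, not the thesis's method)."""
    rng = np.random.default_rng(seed)
    # Random input weights and biases: drawn once and never trained.
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    # Nonlinear feature mapping into the hidden space (tanh is an assumption).
    H = np.tanh(X @ W + b)
    # One-hot targets for integer class labels 0..C-1.
    T = np.eye(int(y.max()) + 1)[y]
    # Output weights from a ridge-regularized least-squares solve.
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ T)
    return W, b, beta

def elm_predict(model, X):
    """Predict class labels with a model returned by elm_fit."""
    W, b, beta = model
    return np.argmax(np.tanh(X @ W + b) @ beta, axis=1)
```

The only trained quantity here is `beta`, obtained from a single linear solve; the hidden layer stays at its random initialization.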
The thesis develops a clustering algorithm called unsupervised feature selection based extreme learning machine (UFS-ELM). The proposed algorithm first maps the data into a hidden space using the nonlinear feature mapping of ELM. It then reduces the feature dimensionality to the desired number of clusters while preserving the graph structure. An unsupervised feature selection method is embedded in the formulation to eliminate uninformative hidden neurons. With a simple constraint, the method outputs the clustering result directly, without post-processing. Experimental results demonstrate the effectiveness of UFS-ELM on a wide range of datasets. Thirdly, the thesis explores spectral clustering on multiview data. The views of multiview data may differ in data type, scale, and distribution. The information from the multiple views must be fused at some stage to derive a clustering result that is consistent across views. Existing multiview spectral clustering methods fuse multiple similarity matrices or Laplacian matrices into a common matrix. This thesis instead proposes distance fusion, which fuses the information at the level of distances. Specifically, the proposed method provides a comprehensive distance measure on a unified scale. A common distance can be produced by combining multiple distance matrices without weighting. Building on distance fusion, the thesis presents a novel multiview spectral clustering method called distance fusion with adaptive neighbors (DFAN). An empirical study on several multiview datasets shows that DFAN outperforms state-of-the-art multiview spectral clustering methods. The experimental results also show that the proposed method enlarges the distances between different clusters and is robust to views with misleading information. Finally, the thesis proposes incorporating the extreme learning machine into multiview spectral clustering.
Existing multiview spectral clustering methods construct a similarity matrix from pairwise distances measured in the original data space or in a linearly projected space. However, the actual data structure might lie on a complex nonlinear manifold. In this thesis, the extreme learning machine is adopted to learn a nonlinearly mapped embedding space in which the intrinsic structure is preserved. The distances measured in the embedding space and in the original space can then be fused to derive a more robust distance measure for each view. Extensive experiments demonstrate that the proposed multiview clustering method, termed dual distance adaptive multiview clustering (DAMC), is superior to state-of-the-art multiview clustering methods.
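To make the notion of fusing per-view distances before spectral clustering more concrete, a generic sketch is given below. It is not DFAN or DAMC, and it omits the adaptive-neighbor and ELM embedding steps entirely. Dividing each view's distance matrix by its maximum to reach a common scale, the Gaussian kernel width `sigma`, and the function names are all assumptions made purely for illustration.

```python
import numpy as np

def pairwise_dist(X):
    """Euclidean distance matrix between the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.sqrt(np.maximum(d2, 0.0))  # clip tiny negatives from round-off

def fuse_distances(views):
    """Unweighted average of per-view distances, each scaled to [0, 1] (assumed scaling)."""
    scaled = []
    for V in views:
        D = pairwise_dist(V)
        scaled.append(D / D.max())
    return np.mean(scaled, axis=0)

def spectral_embedding(D, n_clusters, sigma=0.3):
    """Eigenvectors of the normalized graph Laplacian built from a distance matrix."""
    S = np.exp(-(D ** 2) / (2.0 * sigma ** 2))  # Gaussian similarity (assumption)
    np.fill_diagonal(S, 0.0)
    deg = S.sum(axis=1)
    L = np.eye(len(S)) - S / np.sqrt(np.outer(deg, deg))
    _, eigvecs = np.linalg.eigh(L)  # eigenvalues returned in ascending order
    return eigvecs[:, :n_clusters]
```

In standard spectral clustering, the rows of the returned embedding would then be grouped with k-means. For two well-separated clusters, the sign of the second column is usually enough to split the data.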