Improving deep representation learning for fuzzy clustering

Deep clustering has gained popularity over the past decade due to the superior feature representation learning capability of deep neural networks, and many works have proposed novel approaches to improve this representation learning for better clustering performance. The well-known Deep Embedded Clustering (DEC) method optimizes only its clustering loss, which may distort the feature space learned by the autoencoder with a reconstruction loss in the pre-training stage. The Improved DEC (IDEC) method preserves local structures in the feature space by keeping the decoder and regularizing the clustering loss with a reconstruction loss during clustering training, but the effect is limited because the reconstruction loss used for regularization is the same one used in pre-training. In deep fuzzy clustering, a common KL-divergence clustering loss term has been adopted by many methods, including the well-known Graph-regularized Deep Normalized Fuzzy Compactness and Separation (GrDNFCS) method. However, this loss term may have unpredictable adverse effects on representation learning and clustering when the batch size is small. Moreover, the hard pseudo-label-based graph-regularization term in GrDNFCS is strongly biased towards the initial cluster assignments, which makes it difficult to rectify the learned representation for ambiguous samples and samples near cluster boundaries.

In this thesis, we propose novel methods that improve deep feature representation learning for both crisp and fuzzy clustering. We first propose Deep Embedded Clustering with Random Projection Penalty (DEC-RPP), which adds a random projection penalty term to the loss function as a regularizer that preserves pairwise distances and local structures in the feature space during training. The feature representation learned with this penalty term improves clustering results significantly compared with state-of-the-art methods using similar network architectures. We also propose Deep Fuzzy Clustering with Powered Membership and Soft Graph-regularization (pDFC-SoftGr), which replaces the KL-divergence loss term in GrDNFCS with our powered membership loss term and replaces the hard pseudo-label-based graph-regularization term with our soft graph-regularization term. We use examples to show the limitations of the loss terms in GrDNFCS and provide a theoretical proof of the desired behavior of the powered membership term. The two new loss terms encourage the features learned by the network to produce strengthened fuzzy cluster assignments and make it easier to rectify ambiguous ones.

We conduct experiments on six benchmark datasets to compare our methods with others, along with ablation studies on the effectiveness of each of the two new loss terms. The results show that both the ablated variants and the combined algorithm outperform other deep fuzzy clustering methods using similar network architectures. Finally, we present a real-world application of deep fuzzy clustering in the e-commerce industry, where the proposed techniques for improving deep feature representation learning are shown to be effective.
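For context on the DEC/IDEC objective discussed above, the sketch below reproduces the standard formulation from the published DEC and IDEC papers (Student's-t soft assignment q, sharpened target distribution p, and a reconstruction-plus-clustering objective). It is background from the literature, not code from this thesis, and names such as alpha and gamma are illustrative defaults.

```python
import torch
import torch.nn.functional as F

def soft_assignment(z, centers, alpha=1.0):
    """Student's-t soft assignment q_ij used by DEC/IDEC.

    z: (n, d) encoder embeddings; centers: (k, d) cluster centres.
    """
    dist_sq = torch.cdist(z, centers, p=2) ** 2               # squared distances, shape (n, k)
    q = (1.0 + dist_sq / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(dim=1, keepdim=True)                      # rows sum to 1

def target_distribution(q):
    """Sharpened target p_ij = q_ij^2 / f_j, renormalised per sample."""
    weight = q ** 2 / q.sum(dim=0, keepdim=True)               # divide by soft cluster frequency f_j
    return weight / weight.sum(dim=1, keepdim=True)

def idec_loss(x, x_hat, q, p, gamma=0.1):
    """IDEC objective: reconstruction loss plus gamma-weighted KL clustering loss."""
    rec = F.mse_loss(x_hat, x)
    clu = F.kl_div(q.log(), p.detach(), reduction="batchmean")  # KL(p || q), p treated as fixed target
    return rec + gamma * clu
```

One way to read the abstract's remark about small batches: when the target p is recomputed from the current mini-batch, the soft cluster frequencies q.sum(dim=0) are estimated from only a few samples, so the sharpened targets become noisy.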
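The abstract does not spell out the exact form of the DEC-RPP random projection penalty; that definition is in the thesis itself. Purely as an illustration of the general idea it names (using a fixed random projection to preserve pairwise distances in the learned feature space), a regularizer of this flavour could look like the following. The matrix R, the use of pairwise distances, and the weight lam are assumptions, not the thesis's formulation.

```python
import torch

def random_projection_penalty(x, z, R):
    """Illustrative distance-preservation regulariser (assumed form, not the thesis's exact term).

    x: (n, D) input batch; z: (n, d) embeddings; R: fixed (D, d) random matrix
    whose entries are drawn once (e.g. from N(0, 1/d)) and never trained.
    """
    x_proj = x @ R                              # Johnson-Lindenstrauss style projection of the inputs
    d_proj = torch.pdist(x_proj)                # pairwise distances after random projection
    d_emb = torch.pdist(z)                      # pairwise distances in the learned embedding
    return torch.mean((d_emb - d_proj) ** 2)    # penalise distortion of pairwise distances

# Hypothetical use inside a training step, with lam an assumed weight:
# loss = idec_loss(x, x_hat, q, p) + lam * random_projection_penalty(x, z, R)
```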
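Similarly, the powered membership loss and soft graph-regularization terms of pDFC-SoftGr are defined in the thesis; the abstract only states that they strengthen fuzzy cluster assignments and ease their rectification. As a rough, assumed illustration of why raising memberships to a power m > 1 rewards confident assignments, consider the toy loss below (intuition only, not the thesis's term).

```python
import torch

def powered_membership_loss(u, m=2.0):
    """Toy 'powered membership' style loss (assumed form for intuition).

    u: (n, k) fuzzy memberships with rows summing to 1. For m > 1, the sum of
    u_ij**m over j is largest when one membership approaches 1, so minimising
    its negative rewards crisper (strengthened) assignments.
    """
    return -(u ** m).sum(dim=1).mean()

# Quick check of the monotone behaviour on two example membership rows:
crisp = torch.tensor([[0.9, 0.05, 0.05]])
vague = torch.tensor([[0.4, 0.3, 0.3]])
assert powered_membership_loss(crisp) < powered_membership_loss(vague)
```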


Bibliographic Details
Main Author: Song, Kang
Other Authors: Lihui Chen
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University, 2025
Subjects: Computer and Information Science; Deep fuzzy clustering
Online Access: https://hdl.handle.net/10356/182967
Institution: Nanyang Technological University
School: School of Electrical and Electronic Engineering
Degree: Doctor of Philosophy
Citation: Song, K. (2024). Improving deep representation learning for fuzzy clustering. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/182967
Rights: This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).