Improving deep representation learning for fuzzy clustering

Deep clustering has gained popularity over the past decade due to the superior feature representation learning capability of deep neural networks, and many works have proposed novel approaches to improve this representation learning for better clustering performance. The well-known Deep Embedded Clustering (DEC) method optimizes only its clustering loss, which may distort the feature space learned by the autoencoder with a reconstruction loss in the pre-training stage. The Improved DEC (IDEC) method preserves local structures in the feature space by keeping the decoder and regularizing the clustering loss with a reconstruction loss during clustering training, but the effect is limited because the reconstruction loss used for regularization is the same one used in pre-training. In deep fuzzy clustering, a common KL-divergence clustering loss term has been adopted by many methods, including the well-known Graph-regularized Deep Normalized Fuzzy Compactness and Separation (GrDNFCS) method. However, this loss term may have unpredictable adverse effects on representation learning and clustering when the batch size is small. Moreover, the hard pseudo-label-based graph-regularization term in GrDNFCS is strongly biased towards the initial cluster assignments, which makes it difficult to rectify the learned representation for ambiguous samples and samples near cluster boundaries.

In this thesis, we propose novel methods that improve deep feature representation learning for both crisp and fuzzy clustering. We first propose Deep Embedded Clustering with Random Projection Penalty (DEC-RPP), which adds a random projection penalty term to the loss function as a regularizer that preserves pairwise distances and local structures in the feature space during training. The feature representation learned with this penalty term improves clustering results significantly compared with state-of-the-art methods using similar network architectures. We also propose Deep Fuzzy Clustering with Powered Membership and Soft Graph-regularization (pDFC-SoftGr), which replaces the KL-divergence loss term in GrDNFCS with our powered membership loss term and replaces the hard pseudo-label-based graph-regularization term with our soft graph-regularization term. We use examples to show the limitations of the loss terms in GrDNFCS and provide a theoretical proof of the desired behavior of the powered membership term. The two new loss terms encourage the features learned by the network to produce strengthened fuzzy cluster assignments and make it easier to rectify ambiguous ones.

We conduct experiments on six benchmark datasets to compare our methods with others, along with ablation studies on the effectiveness of each of the two new loss terms. The results show that both the ablated variants and the combined algorithm outperform other deep fuzzy clustering methods using similar network architectures. Finally, we present a real-world application of deep fuzzy clustering in the e-commerce industry, where the proposed techniques for improving deep feature representation learning are shown to be effective.
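For context on the DEC/IDEC objective discussed above, the sketch below reproduces the standard formulation from the published DEC and IDEC papers (Student's-t soft assignment q, sharpened target distribution p, and a reconstruction-plus-clustering objective). It is background from the literature, not code from this thesis, and names such as alpha and gamma are illustrative defaults.

```python
import torch
import torch.nn.functional as F

def soft_assignment(z, centers, alpha=1.0):
    """Student's-t soft assignment q_ij used by DEC/IDEC.

    z: (n, d) encoder embeddings; centers: (k, d) cluster centres.
    """
    dist_sq = torch.cdist(z, centers, p=2) ** 2               # squared distances, shape (n, k)
    q = (1.0 + dist_sq / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(dim=1, keepdim=True)                      # rows sum to 1

def target_distribution(q):
    """Sharpened target p_ij = q_ij^2 / f_j, renormalised per sample."""
    weight = q ** 2 / q.sum(dim=0, keepdim=True)               # divide by soft cluster frequency f_j
    return weight / weight.sum(dim=1, keepdim=True)

def idec_loss(x, x_hat, q, p, gamma=0.1):
    """IDEC objective: reconstruction loss plus gamma-weighted KL clustering loss."""
    rec = F.mse_loss(x_hat, x)
    clu = F.kl_div(q.log(), p.detach(), reduction="batchmean")  # KL(p || q), p treated as fixed target
    return rec + gamma * clu
```

One way to read the abstract's remark about small batches: when the target p is recomputed from the current mini-batch, the soft cluster frequencies q.sum(dim=0) are estimated from only a few samples, so the sharpened targets become noisy.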
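The abstract does not spell out the exact form of the DEC-RPP random projection penalty; that definition is in the thesis itself. Purely as an illustration of the general idea it names (using a fixed random projection to preserve pairwise distances in the learned feature space), a regularizer of this flavour could look like the following. The matrix R, the use of pairwise distances, and the weight lam are assumptions, not the thesis's formulation.

```python
import torch

def random_projection_penalty(x, z, R):
    """Illustrative distance-preservation regulariser (assumed form, not the thesis's exact term).

    x: (n, D) input batch; z: (n, d) embeddings; R: fixed (D, d) random matrix
    whose entries are drawn once (e.g. from N(0, 1/d)) and never trained.
    """
    x_proj = x @ R                              # Johnson-Lindenstrauss style projection of the inputs
    d_proj = torch.pdist(x_proj)                # pairwise distances after random projection
    d_emb = torch.pdist(z)                      # pairwise distances in the learned embedding
    return torch.mean((d_emb - d_proj) ** 2)    # penalise distortion of pairwise distances

# Hypothetical use inside a training step, with lam an assumed weight:
# loss = idec_loss(x, x_hat, q, p) + lam * random_projection_penalty(x, z, R)
```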
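Similarly, the powered membership loss and soft graph-regularization terms of pDFC-SoftGr are defined in the thesis; the abstract only states that they strengthen fuzzy cluster assignments and ease their rectification. As a rough, assumed illustration of why raising memberships to a power m > 1 rewards confident assignments, consider the toy loss below (intuition only, not the thesis's term).

```python
import torch

def powered_membership_loss(u, m=2.0):
    """Toy 'powered membership' style loss (assumed form for intuition).

    u: (n, k) fuzzy memberships with rows summing to 1. For m > 1, the sum of
    u_ij**m over j is largest when one membership approaches 1, so minimising
    its negative rewards crisper (strengthened) assignments.
    """
    return -(u ** m).sum(dim=1).mean()

# Quick check of the monotone behaviour on two example membership rows:
crisp = torch.tensor([[0.9, 0.05, 0.05]])
vague = torch.tensor([[0.4, 0.3, 0.3]])
assert powered_membership_loss(crisp) < powered_membership_loss(vague)
```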


Bibliographic Details
Main Author: Song, Kang
Other Authors: Lihui Chen
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University, 2025
Subjects: Computer and Information Science; Deep fuzzy clustering
Online Access: https://hdl.handle.net/10356/182967
Institution: Nanyang Technological University
School: School of Electrical and Electronic Engineering
Degree: Doctor of Philosophy
Citation: Song, K. (2024). Improving deep representation learning for fuzzy clustering. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/182967
Rights: This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).