Evolving fuzzy clustering approach: An epoch clustering that enables heuristic postpruning

Bibliographic Details
Main Authors: Shirkhorshidi, Ali Seyed, Wah, Teh Ying, Shirkhorshidi, Seyed Mohammad Reza, Aghabozorgi, Saeed
Format: Article
Published: Institute of Electrical and Electronics Engineers (IEEE) 2021
Online Access: http://eprints.um.edu.my/26486/
Institution: Universiti Malaya
Description
Summary: Clustering is an unsupervised machine learning method that is used both on its own and as part of the preprocessing stage for supervised machine learning methods. Because it is unsupervised, clustering results are less accurate than those of supervised learning. This article aims to introduce a new perspective in clustering by defining an approach for data pruning. The method also enables clustering using multiple sets of prototypes instead of only one set to improve clustering accuracy. Consequently, this approach has the potential to be used independently or as part of a preprocessing step that prepares purified data for the training step of a supervised learning approach. An evolving fuzzy clustering approach (EFCA) uses the fuzzy membership concept to break clustering down into epochs instead of running the clustering on all data at once. In some cases, for supervised learning, we would rather have a smaller subset of highly accurate labeled data than a larger dataset with less accurate labels. The EFCA's "epoch cut" enables postpruning that eliminates obscure data points, which results in higher clustering accuracy. The EFCA has been applied to eight multivariate and ten time-series datasets; for example, after applying the epoch cut and eliminating obscure data (20% of the data) by automatic postpruning, it achieved 100% accuracy on the remaining 80% of the Iris data.
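The general idea of pruning obscure points by fuzzy membership can be illustrated with a minimal Python sketch. This is not the EFCA algorithm itself, whose epoch mechanism and multiple prototype sets are not specified in this record. It assumes fixed prototypes, the standard fuzzy c-means membership formula, and a hypothetical helper `prune_obscure` that drops the fraction of points with the lowest maximum membership.

```python
import numpy as np


def fuzzy_memberships(X, prototypes, m=2.0):
    """Standard fuzzy c-means membership of each point to each prototype."""
    # Pairwise Euclidean distances, shape (n_points, n_clusters).
    d = np.linalg.norm(X[:, None, :] - prototypes[None, :, :], axis=2)
    d = np.maximum(d, 1e-12)  # guard against a point lying exactly on a prototype
    inv = d ** (-2.0 / (m - 1.0))
    # Normalise so each point's memberships sum to 1.
    return inv / inv.sum(axis=1, keepdims=True)


def prune_obscure(X, prototypes, prune_fraction=0.2, m=2.0):
    """Hypothetical helper (an assumption, not from the article):
    mark the most ambiguous points for removal."""
    u = fuzzy_memberships(X, prototypes, m)
    confidence = u.max(axis=1)   # highest membership per point
    labels = u.argmax(axis=1)    # hard label from the strongest membership
    n_drop = int(round(prune_fraction * len(X)))
    order = np.argsort(confidence)  # ascending: most ambiguous first
    keep = np.ones(len(X), dtype=bool)
    keep[order[:n_drop]] = False
    return keep, labels


# Toy data: two tight groups plus one point midway between the prototypes.
X = np.array([[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.1, 4.9], [2.5, 2.5]])
prototypes = np.array([[0.0, 0.0], [5.0, 5.0]])
keep, labels = prune_obscure(X, prototypes, prune_fraction=0.2)
X_pruned, labels_pruned = X[keep], labels[keep]
```

In this toy case the midway point has equal membership in both clusters, so it is the one removed when 20% of the data is pruned. The fixed pruning fraction is a simplification chosen for illustration only.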