VDPC: variational density peak clustering algorithm

The widely applied density peak clustering (DPC) algorithm makes an intuitive cluster formation assumption that cluster centers are often surrounded by data points with lower local density and far away from other data points with higher local density. However, this assumption suffers from one limita...

Full description

Saved in:
Bibliographic Details
Main Authors: Wang, Yizhang, Wang, Di, Zhou, You, Zhang, Xiaofeng, Quek, Chai
Other Authors: School of Computer Science and Engineering
Format: Article
Language:English
Published: 2023
Subjects:
Online Access:https://hdl.handle.net/10356/169128
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Nanyang Technological University
Language: English
Description
Summary:The widely applied density peak clustering (DPC) algorithm makes an intuitive cluster formation assumption that cluster centers are often surrounded by data points with lower local density and far away from other data points with higher local density. However, this assumption suffers from one limitation that it is often problematic when identifying clusters with lower density because they might be easily merged into other clusters with higher density. As a result, DPC may not be able to identify clusters with variational density. To address this issue, we propose a variational density peak clustering (VDPC) algorithm, which is designed to systematically and autonomously perform the clustering task on datasets with various types of density distributions. Specifically, we first propose a novel method to identify the representatives among all data points and construct initial clusters based on the identified representatives for further analysis of the clusters’ property. Furthermore, we divide all data points into different levels according to their local density and propose a unified clustering framework by combining the advantages of both DPC and DBSCAN. Thus, all the identified initial clusters spreading across different density levels are systematically processed to form the final clusters. To evaluate the effectiveness of the proposed VDPC algorithm, we conduct extensive experiments using 20 datasets including eight synthetic, six real-world, and six image datasets. The experimental results show that VDPC outperforms two classical algorithms (i.e., DPC and DBSCAN) and four state-of-the-art extended DPC algorithms.