A maximal-clique-based clustering approach for multi-observer multi-view data by using k-nearest neighbor with S-pseudo-ultrametric induced by a fuzzy similarity

Bibliographic Details
Main Authors: Khameneh, Azadeh Zahedi, Ghaznavi, Mehrdad, Kilicman, Adem, Mahad, Zahari, Mardani, Abbas
Format: Article
Published: Springer Science and Business Media Deutschland GmbH 2024
Online Access: http://psasir.upm.edu.my/id/eprint/112041/
https://link.springer.com/article/10.1007/s00521-024-09560-x
Institution: Universiti Putra Malaysia
Description
Summary: Partitioning multi-view data is a recent challenge in clustering methods, which traditionally consider single-view data. In clustering techniques, finding the similarity or distance between objects, handled by metrics in R^n, plays a central role in community detection. Under this framework, different algorithms have been developed whose output relies on an exact distance calculated from the objects' features. As feature information might be qualitative data defined in an ambiguous environment, this study offers a new class of metrics, the so-called S-distance, as a dual of a fuzzy T-similarity, which produces a collective distance based on all views/observers and provides a more flexible framework for defining distance under uncertainty. Moreover, most existing approaches handle multi-view clustering either by aggregating each view's clusters or by using an iterative optimization method; both are time-consuming. Here, by transforming the multi-view clustering problem into node clustering, we suggest a new approach without iteration for multi-view and multi-observer data. Our proposed method, GMSkNN, uses an attribute-structural similarity relation between nodes to obtain more coherent clusters. To this end, we first build a directed k-nearest neighbor (kNN) graph using the proposed S-distance, then transform it into an undirected graph based on the neighborhood information of the nodes, so that the resulting graph reflects both the interactions between nodes and their initial feature information. Next, a new maximal-clique-based clustering is designed to complete the node partitioning. The proposed clustering algorithm is implemented in R and tested on synthetic data and four real-world datasets. The clustering results are analyzed using several indexes. This analysis shows the efficiency of the proposed algorithm compared with traditional clustering methods.
© The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2024.
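
The three stages named in the summary (directed kNN graph, conversion to an undirected graph, maximal-clique-based partitioning) can be illustrated with a minimal Python sketch. This is not the authors' GMSkNN implementation, which the record says was written in R. Several parts are assumptions made only for illustration: plain Euclidean distance stands in for the S-distance, the undirected graph keeps only mutual kNN edges, and nodes are assigned greedily to the largest maximal cliques first. All function names are hypothetical.

```python
from math import dist


def knn_directed(points, k):
    """Directed kNN graph: i -> j when j is among the k nearest points to i.
    Assumption: Euclidean distance replaces the paper's S-distance."""
    out = {}
    for i, p in enumerate(points):
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: (dist(p, points[j]), j),  # ties broken by index
        )
        out[i] = set(others[:k])
    return out


def mutual_undirected(out):
    """Assumed symmetrization rule: keep {i, j} only if i -> j and j -> i."""
    adj = {i: set() for i in out}
    for i, nbrs in out.items():
        for j in nbrs:
            if i in out[j]:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques with Bron-Kerbosch (no pivoting)."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            if r:  # skip the empty clique of an empty graph
                cliques.append(r)
            return
        for v in sorted(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


def clique_partition(adj):
    """Assumed greedy rule: visit cliques from largest to smallest and give
    every still-unlabeled node in a clique that clique's cluster label."""
    labels = {}
    next_label = 0
    for c in sorted(maximal_cliques(adj), key=lambda c: (-len(c), min(c))):
        free = [v for v in c if v not in labels]
        if free:
            for v in free:
                labels[v] = next_label
            next_label += 1
    return labels


# Toy data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
graph = mutual_undirected(knn_directed(points, k=2))
labels = clique_partition(graph)
print(labels)
```

On this toy input each group forms a triangle in the mutual kNN graph, so each group becomes one clique and one cluster. The sketch runs in a single pass with no iterative optimization, which mirrors the non-iterative design described in the summary, although enumerating maximal cliques can be expensive on dense graphs.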