Multi-source propagation aware network clustering

Bibliographic Details
Main Authors: He, Tiantian, Ong, Yew-Soon, Hu, Pengwei
Other Authors: School of Computer Science and Engineering
Format: Article
Language: English
Published: 2021
Subjects:
Online Access: https://hdl.handle.net/10356/148397
Institution: Nanyang Technological University
Description
Summary: Network cluster analysis is of great importance because it is closely related to diverse applications, such as social community detection, biological module identification, and document segmentation. To uncover clusters in network data effectively, a number of computational approaches have been proposed that utilize network topology, a single vector of vertex features, or both. However, most prevalent approaches cannot handle contemporary network data whose vertices are characterized by features collected from multiple sources. To address this challenge, we propose a novel framework, dubbed Multi-Source Propagation Aware Network Clustering (MSPANC), for uncovering clusters in network data that possess multiple sources of vertex features. Unlike most previous approaches, MSPANC infers the cluster preference of each vertex from both network topology and multi-source vertex features. To improve the practical significance of the discovered clusters, the learning of cluster membership is also incorporated into a model that maximizes intra-cluster propagation with respect to the multi-source features. We propose a unified objective function for MSPANC to perform the clustering task and derive a learning algorithm that optimizes the model in an alternating manner. In addition, we theoretically prove the convergence of this optimization algorithm. The proposed model has been tested on five real-world datasets, including social, biological, and document networks, and compared with several competitive baselines. The experimental results validate the effectiveness of MSPANC.