Semi-supervised co-clustering on attributed heterogeneous information networks

Node clustering on heterogeneous information networks (HINs) plays an important role in many real-world applications. While previous research mainly clusters same-type nodes independently by exploiting structural similarity search, it ignores the correlations among different-type nodes. In this paper...
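
The full abstract in this record describes two ingredients: a relevance measure that combines meta-path structure with node attributes, and co-clustering by orthogonal non-negative matrix tri-factorization. The Python sketch below is only a minimal illustration of those two ideas under stated assumptions, not the SCCAIN algorithm itself. It is unsupervised, uses a fixed mixing weight instead of a learned one, and assumes the attributes of both node types already lie in one shared space. The function names (combined_relevance, onmtf), the parameter alpha, the A-B-A-B meta-path, and the toy data are all hypothetical. The multiplicative updates follow a commonly used bi-orthogonal tri-factorization scheme.

```python
import numpy as np

EPS = 1e-9  # guards against division by zero


def combined_relevance(adj, x_a, x_b, alpha=0.5):
    """Mix a meta-path count with attribute similarity (fixed weight, not learned)."""
    # Number of path instances along the hypothetical meta-path A-B-A-B.
    structural = adj @ adj.T @ adj
    structural = structural / (structural.max() + EPS)
    # Cosine similarity of attributes, assumed to share one feature space.
    a = x_a / (np.linalg.norm(x_a, axis=1, keepdims=True) + EPS)
    b = x_b / (np.linalg.norm(x_b, axis=1, keepdims=True) + EPS)
    attributed = np.clip(a @ b.T, 0.0, None)
    return alpha * structural + (1.0 - alpha) * attributed


def onmtf(r, k, l, n_iter=200, seed=0):
    """Approximate r (n x m) by f @ s @ g.T with non-negative factors."""
    rng = np.random.default_rng(seed)
    n, m = r.shape
    f = rng.uniform(0.1, 1.0, (n, k))  # type-A cluster indicators
    g = rng.uniform(0.1, 1.0, (m, l))  # type-B cluster indicators
    s = rng.uniform(0.1, 1.0, (k, l))  # cluster-to-cluster association
    for _ in range(n_iter):
        rgs = r @ g @ s.T
        f *= np.sqrt(rgs / (f @ f.T @ rgs + EPS))
        rfs = r.T @ f @ s
        g *= np.sqrt(rfs / (g @ g.T @ rfs + EPS))
        s *= np.sqrt((f.T @ r @ g) / (f.T @ f @ s @ g.T @ g + EPS))
    return f, s, g


# Toy network: 6 type-A nodes and 8 type-B nodes forming two groups.
adj = np.zeros((6, 8))
adj[:3, :4] = 1.0
adj[3:, 4:] = 1.0
x_a = np.array([[1.0, 0.0]] * 3 + [[0.0, 1.0]] * 3)
x_b = np.array([[1.0, 0.0]] * 4 + [[0.0, 1.0]] * 4)

relevance = combined_relevance(adj, x_a, x_b)
f, s, g = onmtf(relevance, k=2, l=2)
row_labels = f.argmax(axis=1)  # cluster index for each type-A node
col_labels = g.argmax(axis=1)  # cluster index for each type-B node
print(row_labels, col_labels)
```

Each node is assigned to the cluster with the largest entry in its row of f or g. A semi-supervised variant would add constraint terms for known labels to the objective, which this sketch leaves out.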

Bibliographic Details
Main Authors: JI, Yugang, SHI, Chuan, FANG, Yuan, KONG, Xiangnan, YIN, Mingyang
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2020
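Subjects: co-clustering; heterogeneous information network; meta-paths; matrix tri-factorization; semi-supervised learning; Databases and Information Systems; OS and Networks
Online Access: https://ink.library.smu.edu.sg/sis_research/5291
https://ink.library.smu.edu.sg/context/sis_research/article/6294/viewcontent/IPM20_SCCAIN.pdf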
Institution: Singapore Management University
id sg-smu-ink.sis_research-6294
record_format dspace
spelling sg-smu-ink.sis_research-6294
2020-09-09T04:47:45Z
Semi-supervised co-clustering on attributed heterogeneous information networks
JI, Yugang
SHI, Chuan
FANG, Yuan
KONG, Xiangnan
YIN, Mingyang
Node clustering on heterogeneous information networks (HINs) plays an important role in many real-world applications. While previous research mainly clusters same-type nodes independently by exploiting structural similarity search, it ignores the correlations among different-type nodes. In this paper, we focus on the problem of co-clustering heterogeneous nodes, where the goal is to mine the latent relevance of heterogeneous nodes and simultaneously partition them into the corresponding type-aware clusters. This problem is challenging in two respects. First, the similarity or relevance of nodes is associated not only with multiple meta-path-based structures but also with numerical and categorical attributes. Second, clustering and similarity/relevance search usually reinforce each other. To address this problem, we first design a learnable overall relevance measure that integrates structural and attributed relevance by employing meta-paths and attribute projection. We then propose a novel approach, called SCCAIN, to co-cluster heterogeneous nodes based on constrained orthogonal non-negative matrix tri-factorization. Furthermore, an end-to-end framework is developed to jointly optimize the relevance measures and co-clustering. Extensive experiments on real-world datasets not only demonstrate that SCCAIN consistently outperforms state-of-the-art methods but also validate the effectiveness of integrating attributed and structural information for co-clustering.
Keywords: co-clustering, heterogeneous information network, meta-paths, matrix tri-factorization, semi-supervised learning
2020-07-01T07:00:00Z
text
application/pdf
https://ink.library.smu.edu.sg/sis_research/5291
info:doi/10.1016/j.ipm.2020.102338
https://ink.library.smu.edu.sg/context/sis_research/article/6294/viewcontent/IPM20_SCCAIN.pdf
http://creativecommons.org/licenses/by-nc-nd/4.0/
Research Collection School Of Computing and Information Systems
eng
Institutional Knowledge at Singapore Management University
co-clustering
heterogeneous information network
meta-paths
matrix tri-factorization
semi-supervised learning
Databases and Information Systems
OS and Networks
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic co-clustering
heterogeneous information network
meta-paths
matrix tri-factorization
semi-supervised learning
Databases and Information Systems
OS and Networks
description Node clustering on heterogeneous information networks (HINs) plays an important role in many real-world applications. While previous research mainly clusters same-type nodes independently by exploiting structural similarity search, it ignores the correlations among different-type nodes. In this paper, we focus on the problem of co-clustering heterogeneous nodes, where the goal is to mine the latent relevance of heterogeneous nodes and simultaneously partition them into the corresponding type-aware clusters. This problem is challenging in two respects. First, the similarity or relevance of nodes is associated not only with multiple meta-path-based structures but also with numerical and categorical attributes. Second, clustering and similarity/relevance search usually reinforce each other. To address this problem, we first design a learnable overall relevance measure that integrates structural and attributed relevance by employing meta-paths and attribute projection. We then propose a novel approach, called SCCAIN, to co-cluster heterogeneous nodes based on constrained orthogonal non-negative matrix tri-factorization. Furthermore, an end-to-end framework is developed to jointly optimize the relevance measures and co-clustering. Extensive experiments on real-world datasets not only demonstrate that SCCAIN consistently outperforms state-of-the-art methods but also validate the effectiveness of integrating attributed and structural information for co-clustering. Keywords: co-clustering, heterogeneous information network, meta-paths, matrix tri-factorization, semi-supervised learning
format text
author JI, Yugang
SHI, Chuan
FANG, Yuan
KONG, Xiangnan
YIN, Mingyang
title Semi-supervised co-clustering on attributed heterogeneous information networks
publisher Institutional Knowledge at Singapore Management University
publishDate 2020
url https://ink.library.smu.edu.sg/sis_research/5291
https://ink.library.smu.edu.sg/context/sis_research/article/6294/viewcontent/IPM20_SCCAIN.pdf
_version_ 1770575373250068480