Semi-supervised co-clustering on attributed heterogeneous information networks
Node clustering on heterogeneous information networks (HINs) plays an important role in many real-world applications. While previous research mainly clusters same-type nodes independently via exploiting structural similarity search, they ignore the correlations of different-type nodes. In this paper...
Main Authors: | JI, Yugang; SHI, Chuan; FANG, Yuan; KONG, Xiangnan; YIN, Mingyang |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2020 |
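Subjects: | co-clustering; heterogeneous information network; meta-paths; matrix tri-factorization; semi-supervised learning; Databases and Information Systems; OS and Networks |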
Online Access: | https://ink.library.smu.edu.sg/sis_research/5291 https://ink.library.smu.edu.sg/context/sis_research/article/6294/viewcontent/IPM20_SCCAIN.pdf |
Institution: | Singapore Management University |
id |
sg-smu-ink.sis_research-6294 |
---|---|
record_format |
dspace |
spelling |
sg-smu-ink.sis_research-6294 2020-09-09T04:47:45Z 2020-07-01T07:00:00Z text application/pdf info:doi/10.1016/j.ipm.2020.102338 http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng |
institution |
Singapore Management University |
building |
SMU Libraries |
continent |
Asia |
country |
Singapore |
content_provider |
SMU Libraries |
collection |
InK@SMU |
language |
English |
topic |
co-clustering; heterogeneous information network; meta-paths; matrix tri-factorization; semi-supervised learning; Databases and Information Systems; OS and Networks |
description |
Node clustering on heterogeneous information networks (HINs) plays an important role in many real-world applications. While previous research mainly clusters same-type nodes independently via exploiting structural similarity search, they ignore the correlations of different-type nodes. In this paper, we focus on the problem of co-clustering heterogeneous nodes where the goal is to mine the latent relevance of heterogeneous nodes and simultaneously partition them into the corresponding type-aware clusters. This problem is challenging in two aspects. First, the similarity or relevance of nodes is not only associated with multiple meta-path-based structures but also related to numerical and categorical attributes. Second, clusters and similarity/relevance searches usually promote each other. To address this problem, we first design a learnable overall relevance measure that integrates the structural and attributed relevance by employing meta-paths and attribute projection. We then propose a novel approach, called SCCAIN, to co-cluster heterogeneous nodes based on constrained orthogonal non-negative matrix tri-factorization. Furthermore, an end-to-end framework is developed to jointly optimize the relevance measures and co-clustering. Extensive experiments on real-world datasets not only demonstrate that SCCAIN consistently outperforms state-of-the-art methods but also validate the effectiveness of integrating attributed and structural information for co-clustering. Keywords: co-clustering, heterogeneous information network, meta-paths, matrix tri-factorization, semi-supervised learning |
format |
text |
author |
JI, Yugang; SHI, Chuan; FANG, Yuan; KONG, Xiangnan; YIN, Mingyang |
title |
Semi-supervised co-clustering on attributed heterogeneous information networks |
publisher |
Institutional Knowledge at Singapore Management University |
publishDate |
2020 |
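
The description in this record says that SCCAIN co-clusters heterogeneous nodes with constrained orthogonal non-negative matrix tri-factorization. As a rough illustration of the underlying idea only, the sketch below factorizes a non-negative relevance matrix R between two node types as R ≈ F S Gᵀ using plain multiplicative updates. It is not the authors' method: the orthogonality constraints, the semi-supervised constraints, and the learnable meta-path and attribute relevance are all omitted. The function name `nmtf`, its parameters, and the toy matrix are assumptions made for this example.

```python
import numpy as np

def nmtf(R, k, l, n_iter=200, eps=1e-9, seed=0):
    """Plain non-negative matrix tri-factorization, R ~ F @ S @ G.T.

    Illustrative only: no orthogonality or supervision constraints.
    R is (n x m) and non-negative; F is (n x k), S is (k x l), G is (m x l).
    """
    rng = np.random.default_rng(seed)
    n, m = R.shape
    # Strictly positive random start so the multiplicative updates can move.
    F = rng.random((n, k)) + eps
    S = rng.random((k, l)) + eps
    G = rng.random((m, l)) + eps
    errors = []
    for _ in range(n_iter):
        # Each factor is rescaled element-wise; eps guards against division by zero.
        F *= (R @ G @ S.T) / (F @ S @ G.T @ G @ S.T + eps)
        G *= (R.T @ F @ S) / (G @ S.T @ F.T @ F @ S + eps)
        S *= (F.T @ R @ G) / (F.T @ F @ S @ G.T @ G + eps)
        # Track the Frobenius reconstruction error after each sweep.
        errors.append(np.linalg.norm(R - F @ S @ G.T))
    return F, S, G, errors

# Hypothetical toy relevance matrix: 6 nodes of one type (rows) against
# 4 nodes of another type (columns), with two planted co-clusters.
R = np.full((6, 4), 0.05)
R[:3, :2] = 1.0
R[3:, 2:] = 1.0

F, S, G, errors = nmtf(R, k=2, l=2)
row_clusters = F.argmax(axis=1)  # cluster index for each row-type node
col_clusters = G.argmax(axis=1)  # cluster index for each column-type node
print("row clusters:", row_clusters)
print("column clusters:", col_clusters)
print("final reconstruction error:", errors[-1])
```

Reading cluster labels off the largest entry in each row of F and G is a common convention for this kind of factorization; the result depends on the random initialization, so different seeds may permute or change the labels.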