Towards distributed node similarity search on graphs

Node similarity search on graphs has wide applications in recommendation, link prediction, to name just a few. However, existing studies are insufficient due to two reasons: (i) the scale of the real-world graph is growing rapidly, and (ii) vertices are always associated with complex attributes. In...

Full description

Saved in:

Bibliographic Details
Main Authors:	ZHANG, Tianming, GAO, Yunjun, ZHENG, Baihua, CHEN, Lu, WEN, Shiting, GUO, Wei
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2020
Subjects:	Graph Node similarity search Distributed processing Algorithm Computer Engineering Theory and Algorithms
Online Access:	https://ink.library.smu.edu.sg/sis_research/5147 https://ink.library.smu.edu.sg/context/sis_research/article/6150/viewcontent/Zhang2020_Article_TowardsDistributedNodeSimilari__1_.pdf
Tags:	Add Tag No Tags, Be the first to tag this record!
Institution:	Singapore Management University
Language:	English

id	sg-smu-ink.sis_research-6150
record_format	dspace
spelling	sg-smu-ink.sis_research-61502020-07-09T04:31:36Z Towards distributed node similarity search on graphs ZHANG, Tianming GAO, Yunjun ZHENG, Baihua CHEN, Lu WEN, Shiting GUO, Wei Node similarity search on graphs has wide applications in recommendation, link prediction, to name just a few. However, existing studies are insufficient due to two reasons: (i) the scale of the real-world graph is growing rapidly, and (ii) vertices are always associated with complex attributes. In this paper, we propose an efficiently distributed framework to support node similarity search on massive graphs, which considers both graph structure correlation and node attribute similarity in metric spaces. The framework consists of preprocessing stage and query stage. In the preprocessing stage, a parallel KD-tree construction (KDC) algorithm is developed to form a newly defined graph so-called hybrid graph, in order to integrate node attribute similarity into the original graph. To equally divide graph vertices into subsets, KDC adopts the KD-tree partitioning after the pivot mapping. In addition, two metric pruning rules and an optimized allocation strategy are presented to reduce communication and computation costs. In the query stage, based on the formed hybrid graph, we develop similarity search methods using random walk with restart (RWR) to measure node similarity. To boost efficiency, we derive tight bounds to rapidly shrink the search region. Extensive experiments with three real massive graphs are conducted to verify the effectiveness, efficiency, and scalability of our proposed techniques. 2020-06-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/5147 info:doi/10.1007/s11280-020-00819-6 https://ink.library.smu.edu.sg/context/sis_research/article/6150/viewcontent/Zhang2020_Article_TowardsDistributedNodeSimilari__1_.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Graph Node similarity search Distributed processing Algorithm Computer Engineering Theory and Algorithms
institution	Singapore Management University
building	SMU Libraries
continent	Asia
country	Singapore Singapore
content_provider	SMU Libraries
collection	InK@SMU
language	English
topic	Graph Node similarity search Distributed processing Algorithm Computer Engineering Theory and Algorithms
spellingShingle	Graph Node similarity search Distributed processing Algorithm Computer Engineering Theory and Algorithms ZHANG, Tianming GAO, Yunjun ZHENG, Baihua CHEN, Lu WEN, Shiting GUO, Wei Towards distributed node similarity search on graphs
description	Node similarity search on graphs has wide applications in recommendation, link prediction, to name just a few. However, existing studies are insufficient due to two reasons: (i) the scale of the real-world graph is growing rapidly, and (ii) vertices are always associated with complex attributes. In this paper, we propose an efficiently distributed framework to support node similarity search on massive graphs, which considers both graph structure correlation and node attribute similarity in metric spaces. The framework consists of preprocessing stage and query stage. In the preprocessing stage, a parallel KD-tree construction (KDC) algorithm is developed to form a newly defined graph so-called hybrid graph, in order to integrate node attribute similarity into the original graph. To equally divide graph vertices into subsets, KDC adopts the KD-tree partitioning after the pivot mapping. In addition, two metric pruning rules and an optimized allocation strategy are presented to reduce communication and computation costs. In the query stage, based on the formed hybrid graph, we develop similarity search methods using random walk with restart (RWR) to measure node similarity. To boost efficiency, we derive tight bounds to rapidly shrink the search region. Extensive experiments with three real massive graphs are conducted to verify the effectiveness, efficiency, and scalability of our proposed techniques.
format	text
author	ZHANG, Tianming GAO, Yunjun ZHENG, Baihua CHEN, Lu WEN, Shiting GUO, Wei
author_facet	ZHANG, Tianming GAO, Yunjun ZHENG, Baihua CHEN, Lu WEN, Shiting GUO, Wei
author_sort	ZHANG, Tianming
title	Towards distributed node similarity search on graphs
title_short	Towards distributed node similarity search on graphs
title_full	Towards distributed node similarity search on graphs
title_fullStr	Towards distributed node similarity search on graphs
title_full_unstemmed	Towards distributed node similarity search on graphs
title_sort	towards distributed node similarity search on graphs
publisher	Institutional Knowledge at Singapore Management University
publishDate	2020
url	https://ink.library.smu.edu.sg/sis_research/5147 https://ink.library.smu.edu.sg/context/sis_research/article/6150/viewcontent/Zhang2020_Article_TowardsDistributedNodeSimilari__1_.pdf
_version_	1770575294476845056

Towards distributed node similarity search on graphs

Similar Items