gSWORD: GPU-accelerated sampling for subgraph counting

Subgraph counting is a fundamental component for many downstream applications such as graph representation learning and query optimization. Since obtaining the exact count is often intractable, there have been a plethora of approximation methods on graph sampling techniques. Nonetheless, the state-o...

Full description

Saved in:

Bibliographic Details
Main Authors:	YE, Chang, LI, Yuchen, SUN, Shixuan, GUO, Wentian
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2024
Subjects:	Graph sampling Subgraph counting GPU computing Databases and Information Systems Numerical Analysis and Scientific Computing
Online Access:	https://ink.library.smu.edu.sg/sis_research/9320 https://ink.library.smu.edu.sg/context/sis_research/article/10320/viewcontent/3639288_pvoa_cc_by.pdf
Tags:	Add Tag No Tags, Be the first to tag this record!
Institution:	Singapore Management University
Language:	English

id	sg-smu-ink.sis_research-10320
record_format	dspace
spelling	sg-smu-ink.sis_research-103202024-10-17T06:58:12Z gSWORD: GPU-accelerated sampling for subgraph counting YE, Chang LI, Yuchen SUN, Shixuan GUO, Wentian Subgraph counting is a fundamental component for many downstream applications such as graph representation learning and query optimization. Since obtaining the exact count is often intractable, there have been a plethora of approximation methods on graph sampling techniques. Nonetheless, the state-of-the-art sampling methods still require massive samples to produce accurate approximations on large data graphs. We propose gSWORD, a GPU framework that leverages the massive parallelism of GPUs to accelerate iterative sampling algorithms for subgraph counting. Despite the embarrassingly parallel nature of the samples, there are unique challenges in accelerating subgraph counting due to its irregular computation logic. To address these challenges, we introduce two GPU-centric optimizations: (1) sample inheritance, enabling threads to inherit samples from neighboring threads to avoid idling, and (2) warp streaming, effectively distributing workloads among threads through a streaming process. Moreover, we propose a CPU-GPU co-processing pipeline that overlaps the sampling and enumeration processes to mitigate the underestimation issue. Experimental results demonstrate that deploying state-of-the-art sampling algorithms on gSWORD can perform millions of samples per second. The co-processing pipeline substantially improves the estimation accuracy in the cases where existing methods encounter severe underestimations with negligible overhead. 2024-03-01T08:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/9320 info:doi/10.1145/3639288 https://ink.library.smu.edu.sg/context/sis_research/article/10320/viewcontent/3639288_pvoa_cc_by.pdf http://creativecommons.org/licenses/by/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Graph sampling Subgraph counting GPU computing Databases and Information Systems Numerical Analysis and Scientific Computing
institution	Singapore Management University
building	SMU Libraries
continent	Asia
country	Singapore Singapore
content_provider	SMU Libraries
collection	InK@SMU
language	English
topic	Graph sampling Subgraph counting GPU computing Databases and Information Systems Numerical Analysis and Scientific Computing
spellingShingle	Graph sampling Subgraph counting GPU computing Databases and Information Systems Numerical Analysis and Scientific Computing YE, Chang LI, Yuchen SUN, Shixuan GUO, Wentian gSWORD: GPU-accelerated sampling for subgraph counting
description	Subgraph counting is a fundamental component for many downstream applications such as graph representation learning and query optimization. Since obtaining the exact count is often intractable, there have been a plethora of approximation methods on graph sampling techniques. Nonetheless, the state-of-the-art sampling methods still require massive samples to produce accurate approximations on large data graphs. We propose gSWORD, a GPU framework that leverages the massive parallelism of GPUs to accelerate iterative sampling algorithms for subgraph counting. Despite the embarrassingly parallel nature of the samples, there are unique challenges in accelerating subgraph counting due to its irregular computation logic. To address these challenges, we introduce two GPU-centric optimizations: (1) sample inheritance, enabling threads to inherit samples from neighboring threads to avoid idling, and (2) warp streaming, effectively distributing workloads among threads through a streaming process. Moreover, we propose a CPU-GPU co-processing pipeline that overlaps the sampling and enumeration processes to mitigate the underestimation issue. Experimental results demonstrate that deploying state-of-the-art sampling algorithms on gSWORD can perform millions of samples per second. The co-processing pipeline substantially improves the estimation accuracy in the cases where existing methods encounter severe underestimations with negligible overhead.
format	text
author	YE, Chang LI, Yuchen SUN, Shixuan GUO, Wentian
author_facet	YE, Chang LI, Yuchen SUN, Shixuan GUO, Wentian
author_sort	YE, Chang
title	gSWORD: GPU-accelerated sampling for subgraph counting
title_short	gSWORD: GPU-accelerated sampling for subgraph counting
title_full	gSWORD: GPU-accelerated sampling for subgraph counting
title_fullStr	gSWORD: GPU-accelerated sampling for subgraph counting
title_full_unstemmed	gSWORD: GPU-accelerated sampling for subgraph counting
title_sort	gsword: gpu-accelerated sampling for subgraph counting
publisher	Institutional Knowledge at Singapore Management University
publishDate	2024
url	https://ink.library.smu.edu.sg/sis_research/9320 https://ink.library.smu.edu.sg/context/sis_research/article/10320/viewcontent/3639288_pvoa_cc_by.pdf
_version_	1814047929659293696

gSWORD: GPU-accelerated sampling for subgraph counting

Similar Items