HubPPR: Effective Indexing for Approximate Personalized PageRank

Personalized PageRank (PPR) computation is a fundamental operation in web search, social networks, and graph analysis. Given a graph G, a source s, and a target t, the PPR query Π(s, t) returns the probability that a random walk on G starting from s terminates at t. Unlike global PageRank which can...

Full description

Saved in:
Bibliographic Details
Main Authors: Wang, Sibo, Tang, Youze, Xiao, Xiaokui, Yang, Yin, Li, Zengxiang
Other Authors: School of Computer Science and Engineering
Format: Conference or Workshop Item
Language:English
Published: 2017
Subjects:
Online Access:https://hdl.handle.net/10356/81350
http://hdl.handle.net/10220/43455
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Nanyang Technological University
Language: English
id sg-ntu-dr.10356-81350
record_format dspace
spelling sg-ntu-dr.10356-813502020-03-07T11:48:45Z HubPPR: Effective Indexing for Approximate Personalized PageRank Wang, Sibo Tang, Youze Xiao, Xiaokui Yang, Yin Li, Zengxiang School of Computer Science and Engineering Proceedings of the VLDB Endowment Personalized PageRank Indexing scheme Personalized PageRank (PPR) computation is a fundamental operation in web search, social networks, and graph analysis. Given a graph G, a source s, and a target t, the PPR query Π(s, t) returns the probability that a random walk on G starting from s terminates at t. Unlike global PageRank which can be effectively pre-computed and materialized, the PPR result depends on both the source and the target, rendering results materialization infeasible for large graphs. Existing indexing techniques have rather limited effectiveness; in fact, the current state-of-the-art solution, BiPPR, answers individual PPR queries without pre-computation or indexing, and yet it outperforms all previous index-based solutions. Motivated by this, we propose HubPPR, an effective indexing scheme for PPR computation with controllable tradeoffs for accuracy, query time, and memory consumption. The main idea is to pre-compute and index auxiliary information for selected hub nodes that are often involved in PPR processing. Going one step further, we extend HubPPR to answer top-k PPR queries, which returns the k nodes with the highest PPR values with respect to a source s, among a given set T of target nodes. Extensive experiments demonstrate that compared to the current best solution BiPPR, HubPPR achieves up to 10x and 220x speedup for PPR and top-k PPR processing, respectively, with moderate memory consumption. Notably, with a single commodity server, HubPPR answers a top-k PPR query in seconds on graphs with billions of edges, with high accuracy and strong result quality guarantees. MOE (Min. of Education, S’pore) Published version 2017-07-27T02:34:23Z 2019-12-06T14:29:00Z 2017-07-27T02:34:23Z 2019-12-06T14:29:00Z 2016 Conference Paper Wang, S., Tang, Y., Xiao, X., Yang, Y., & Li, Z. (2016). HubPPR: Effective Indexing for Approximate Personalized PageRank. Proceedings of the VLDB Endowment, 10(3), 205-216. 2150-8097 https://hdl.handle.net/10356/81350 http://hdl.handle.net/10220/43455 10.14778/3021924.3021936 en Proceedings of the VLDB Endowment This work is licensed under the Creative Commons AttributionNonCommercial-NoDerivatives 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/4.0/. For any use beyond those covered by this license, obtain permission by emailing info@vldb.org. 12 p. application/pdf
institution Nanyang Technological University
building NTU Library
country Singapore
collection DR-NTU
language English
topic Personalized PageRank
Indexing scheme
spellingShingle Personalized PageRank
Indexing scheme
Wang, Sibo
Tang, Youze
Xiao, Xiaokui
Yang, Yin
Li, Zengxiang
HubPPR: Effective Indexing for Approximate Personalized PageRank
description Personalized PageRank (PPR) computation is a fundamental operation in web search, social networks, and graph analysis. Given a graph G, a source s, and a target t, the PPR query Π(s, t) returns the probability that a random walk on G starting from s terminates at t. Unlike global PageRank which can be effectively pre-computed and materialized, the PPR result depends on both the source and the target, rendering results materialization infeasible for large graphs. Existing indexing techniques have rather limited effectiveness; in fact, the current state-of-the-art solution, BiPPR, answers individual PPR queries without pre-computation or indexing, and yet it outperforms all previous index-based solutions. Motivated by this, we propose HubPPR, an effective indexing scheme for PPR computation with controllable tradeoffs for accuracy, query time, and memory consumption. The main idea is to pre-compute and index auxiliary information for selected hub nodes that are often involved in PPR processing. Going one step further, we extend HubPPR to answer top-k PPR queries, which returns the k nodes with the highest PPR values with respect to a source s, among a given set T of target nodes. Extensive experiments demonstrate that compared to the current best solution BiPPR, HubPPR achieves up to 10x and 220x speedup for PPR and top-k PPR processing, respectively, with moderate memory consumption. Notably, with a single commodity server, HubPPR answers a top-k PPR query in seconds on graphs with billions of edges, with high accuracy and strong result quality guarantees.
author2 School of Computer Science and Engineering
author_facet School of Computer Science and Engineering
Wang, Sibo
Tang, Youze
Xiao, Xiaokui
Yang, Yin
Li, Zengxiang
format Conference or Workshop Item
author Wang, Sibo
Tang, Youze
Xiao, Xiaokui
Yang, Yin
Li, Zengxiang
author_sort Wang, Sibo
title HubPPR: Effective Indexing for Approximate Personalized PageRank
title_short HubPPR: Effective Indexing for Approximate Personalized PageRank
title_full HubPPR: Effective Indexing for Approximate Personalized PageRank
title_fullStr HubPPR: Effective Indexing for Approximate Personalized PageRank
title_full_unstemmed HubPPR: Effective Indexing for Approximate Personalized PageRank
title_sort hubppr: effective indexing for approximate personalized pagerank
publishDate 2017
url https://hdl.handle.net/10356/81350
http://hdl.handle.net/10220/43455
_version_ 1681037614809874432