Scheduled approximation for Personalized PageRank with Utility-based hub selection
As Personalized PageRank has been widely leveraged for ranking on a graph, the efficient computation of Personalized PageRank Vector (PPV) becomes a prominent issue. In this paper, we propose FastPPV, an approximate PPV computation algorithm that is incremental and accuracy-aware. Our approach hinge...
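To make the "incremental and accuracy-aware" idea in the abstract concrete, here is a minimal sketch in Python. It is not FastPPV: it uses the standard series expansion of Personalized PageRank and schedules random-walk contributions by plain walk length rather than by the paper's hub length, and it does no hub-based reuse. The function name `ppv_incremental`, the teleport probability `alpha`, and the toy graph are illustrative assumptions, not taken from the record. The point it shows is that each iteration refines the estimate and that the remaining L1 error is known at query time and shrinks geometrically.

```python
import numpy as np

def ppv_incremental(P, q, alpha=0.15, iters=20):
    """Yield (estimate, l1_error_bound) after each iteration.

    Assumptions (hypothetical setup, not from the paper):
      P     column-stochastic matrix, P[j, i] = probability of stepping i -> j,
            with no dangling nodes (every column sums to 1)
      q     index of the query node
      alpha teleport (restart) probability
    Uses PPV = alpha * sum_k (1 - alpha)^k * P^k * e_q and truncates the sum.
    """
    n = P.shape[0]
    walk = np.zeros(n)
    walk[q] = 1.0              # distribution of walks of length 0 from q
    estimate = np.zeros(n)
    for k in range(iters):
        # Add the contribution of all walks of length exactly k.
        estimate = estimate + alpha * (1 - alpha) ** k * walk
        # Probability mass not yet accounted for: the L1 error of the estimate.
        bound = (1 - alpha) ** (k + 1)
        yield estimate.copy(), bound
        walk = P @ walk        # advance every walk by one step

if __name__ == "__main__":
    # Toy 4-node graph: 0->1, 0->2, 1->2, 2->0, 3->0, 3->2.
    P = np.array([
        [0.0, 0.0, 1.0, 0.5],
        [0.5, 0.0, 0.0, 0.0],
        [0.5, 1.0, 0.0, 0.5],
        [0.0, 0.0, 0.0, 0.0],
    ])
    for k, (est, bound) in enumerate(ppv_incremental(P, q=0, iters=5)):
        print(k, np.round(est, 4), round(bound, 4))
```

A caller can stop iterating as soon as the reported bound drops below the accuracy it needs, which is the accuracy-aware behaviour in its simplest form.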
Main Authors: | ZHU, Fanwei; FANG, Yuan; CHANG, Kevin Chen-Chuan; YING, Jing |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University 2015 |
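Subjects: | Accuracy-aware; Incremental enhancement; Hub selection; Scheduled approximation; Personalized PageRank; Databases and Information Systems; Numerical Analysis and Scientific Computing |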
Online Access: | https://ink.library.smu.edu.sg/sis_research/4070 https://ink.library.smu.edu.sg/context/sis_research/article/5073/viewcontent/fastppv.pdf |
Institution: | Singapore Management University |
id |
sg-smu-ink.sis_research-5073 |
---|---|
record_format |
dspace |
spelling |
sg-smu-ink.sis_research-5073 2018-07-20T04:57:34Z Scheduled approximation for Personalized PageRank with Utility-based hub selection ZHU, Fanwei; FANG, Yuan; CHANG, Kevin Chen-Chuan; YING, Jing As Personalized PageRank has been widely leveraged for ranking on a graph, the efficient computation of Personalized PageRank Vector (PPV) becomes a prominent issue. In this paper, we propose FastPPV, an approximate PPV computation algorithm that is incremental and accuracy-aware. Our approach hinges on a novel paradigm of scheduled approximation: the computation is partitioned and scheduled for processing in an “organized” way, such that we can gradually improve our PPV estimation in an incremental manner and quantify the accuracy of our approximation at query time. Guided by this principle, we develop an efficient hub-based realization, where we adopt the metric of hub length to partition and schedule random walk tours so that the approximation error reduces exponentially over iterations. In addition, as tours are segmented by hubs, the shared substructures between different tours (around the same hub) can be reused to speed up query processing both within and across iterations. Given the key roles played by the hubs, we further investigate the problem of hub selection. In particular, we develop a conceptual model to select hubs based on the two desirable properties of hubs (sharing and discriminating), and present several different strategies to realize the conceptual model. Finally, we evaluate FastPPV over two real-world graphs, and show that it not only significantly outperforms two state-of-the-art baselines in both online and offline phases, but also scales well on larger graphs. In particular, we are able to achieve near-constant time online query processing irrespective of graph size. 
2015-10-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/4070 info:doi/10.1007/s00778-014-0376-8 https://ink.library.smu.edu.sg/context/sis_research/article/5073/viewcontent/fastppv.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Accuracy-aware Incremental enhancement Hub selection Scheduled approximation Personalized PageRank Databases and Information Systems Numerical Analysis and Scientific Computing |
institution |
Singapore Management University |
building |
SMU Libraries |
continent |
Asia |
country |
Singapore |
content_provider |
SMU Libraries |
collection |
InK@SMU |
language |
English |
topic |
Accuracy-aware; Incremental enhancement; Hub selection; Scheduled approximation; Personalized PageRank; Databases and Information Systems; Numerical Analysis and Scientific Computing |
spellingShingle |
Accuracy-aware Incremental enhancement Hub selection Scheduled approximation Personalized PageRank Databases and Information Systems Numerical Analysis and Scientific Computing ZHU, Fanwei FANG, Yuan CHANG, Kevin Chen-Chuan YING, Jing Scheduled approximation for Personalized PageRank with Utility-based hub selection |
description |
As Personalized PageRank has been widely leveraged for ranking on a graph, the efficient computation of Personalized PageRank Vector (PPV) becomes a prominent issue. In this paper, we propose FastPPV, an approximate PPV computation algorithm that is incremental and accuracy-aware. Our approach hinges on a novel paradigm of scheduled approximation: the computation is partitioned and scheduled for processing in an “organized” way, such that we can gradually improve our PPV estimation in an incremental manner and quantify the accuracy of our approximation at query time. Guided by this principle, we develop an efficient hub-based realization, where we adopt the metric of hub length to partition and schedule random walk tours so that the approximation error reduces exponentially over iterations. In addition, as tours are segmented by hubs, the shared substructures between different tours (around the same hub) can be reused to speed up query processing both within and across iterations. Given the key roles played by the hubs, we further investigate the problem of hub selection. In particular, we develop a conceptual model to select hubs based on the two desirable properties of hubs (sharing and discriminating), and present several different strategies to realize the conceptual model. Finally, we evaluate FastPPV over two real-world graphs, and show that it not only significantly outperforms two state-of-the-art baselines in both online and offline phases, but also scales well on larger graphs. In particular, we are able to achieve near-constant time online query processing irrespective of graph size. |
format |
text |
author |
ZHU, Fanwei; FANG, Yuan; CHANG, Kevin Chen-Chuan; YING, Jing |
author_facet |
ZHU, Fanwei; FANG, Yuan; CHANG, Kevin Chen-Chuan; YING, Jing |
author_sort |
ZHU, Fanwei |
title |
Scheduled approximation for Personalized PageRank with Utility-based hub selection |
title_short |
Scheduled approximation for Personalized PageRank with Utility-based hub selection |
title_full |
Scheduled approximation for Personalized PageRank with Utility-based hub selection |
title_fullStr |
Scheduled approximation for Personalized PageRank with Utility-based hub selection |
title_full_unstemmed |
Scheduled approximation for Personalized PageRank with Utility-based hub selection |
title_sort |
scheduled approximation for personalized pagerank with utility-based hub selection |
publisher |
Institutional Knowledge at Singapore Management University |
publishDate |
2015 |
url |
https://ink.library.smu.edu.sg/sis_research/4070 https://ink.library.smu.edu.sg/context/sis_research/article/5073/viewcontent/fastppv.pdf |
_version_ |
1770574239651332096 |