GraphH: High performance big graph analytics in small clusters

It is common for real-world applications to analyze big graphs using distributed graph processing systems. Popular in-memory systems require an enormous amount of resources to handle big graphs. While several out-of-core approaches have been proposed for processing big graphs on disk, the high disk...

全面介紹

Saved in:

書目詳細資料
Main Authors:	SUN, Peng, WEN, Yonggang, TA, Nguyen Binh Duong, XIAO, Xiaokui
格式:	text
語言:	English
出版:	Institutional Knowledge at Singapore Management University 2017
主題:	Graph Processing Distributed Computing System Network Numerical Analysis and Scientific Computing Software Engineering
在線閱讀:	https://ink.library.smu.edu.sg/sis_research/4765 https://ink.library.smu.edu.sg/context/sis_research/article/5768/viewcontent/1705.05595.pdf
標簽:	添加標簽沒有標簽, 成為第一個標記此記錄!
機構:	Singapore Management University
語言:	English

實物特徵
總結:	It is common for real-world applications to analyze big graphs using distributed graph processing systems. Popular in-memory systems require an enormous amount of resources to handle big graphs. While several out-of-core approaches have been proposed for processing big graphs on disk, the high disk I/O overhead could significantly reduce performance. In this paper, we propose GraphH to enable highperformance big graph analytics in small clusters. Specifically, we design a two-stage graph partition scheme to evenly divide the input graph into partitions, and propose a GAB (GatherApply-Broadcast) computation model to make each worker process a partition in memory at a time. We use an edge cache mechanism to reduce the disk I/O overhead, and design a hybrid strategy to improve the communication performance. GraphH can efficiently process big graphs in small clusters or even a single commodity server. Extensive evaluations have shown that GraphH could be up to 7.8x faster compared to popular in-memory systems, such as Pregel+ and PowerGraph when processing generic graphs, and more than 100x faster than recently proposed out-of-core systems, such as GraphD and Chaos when processing big graphs.

GraphH: High performance big graph analytics in small clusters

相似書籍