Exploring parallelisms on large-scale graph processing frameworks

Large-scale graph-structured computation is becoming increasingly important for various data analysis applications, and has driven the development of various distributed graph processing frameworks. However, existing framework abstractions do not directly support efficient implementation for complex...

Full description

Saved in:
Bibliographic Details
Main Author: Guo, Jiachun
Other Authors: Cai Wentong
Format: Final Year Project
Language:English
Published: 2016
Subjects:
Online Access:http://hdl.handle.net/10356/66648
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Nanyang Technological University
Language: English
Description
Summary:Large-scale graph-structured computation is becoming increasingly important for various data analysis applications, and has driven the development of various distributed graph processing frameworks. However, existing framework abstractions do not directly support efficient implementation for complex algorithms including graph traversal algorithms. This project leverages the properties of small-world scale-free graphs, which apply to many real-world datasets, and implements an efficient Hybrid BFS algorithm and a Concurrent BFS algorithm on PowerGraph. Hybrid BFS algorithm combines a conventional top-down approach and a novel bottom-up approach to improve the overall performance, while concurrent BFS algorithm runs multiple BFSs on the same graph at the same time and shares explorations across them. Extensive experiments are conducted on two real-world graph datasets to evaluate the performance of Hybrid BFS and concurrent BFS algorithms. The results reveal that Hybrid BFS outperforms traditional top-down approach, and concurrent BFS algorithm has higher resource utilization, efficiency and scalability with respect to the number of concurrent BFSs.