Exploring parallelisms on large-scale graph processing frameworks
Large-scale graph-structured computation is becoming increasingly important for various data analysis applications, and has driven the development of various distributed graph processing frameworks. However, existing framework abstractions do not directly support efficient implementation for complex...
Saved in:
Main Author: | |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | English |
Published: |
2016
|
Subjects: | |
Online Access: | http://hdl.handle.net/10356/66648 |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Institution: | Nanyang Technological University |
Language: | English |
Summary: | Large-scale graph-structured computation is becoming increasingly important for various data analysis applications, and has driven the development of various distributed graph processing frameworks. However, existing framework abstractions do not directly support efficient implementation for complex algorithms including graph traversal algorithms. This project leverages the properties of small-world scale-free graphs, which apply to many real-world datasets, and implements an efficient Hybrid BFS algorithm and a Concurrent BFS algorithm on PowerGraph. Hybrid BFS algorithm combines a conventional top-down approach and a novel bottom-up approach to improve the overall performance, while concurrent BFS algorithm runs multiple BFSs on the same graph at the same time and shares explorations across them. Extensive experiments are conducted on two real-world graph datasets to evaluate the performance of Hybrid BFS and concurrent BFS algorithms. The results reveal that Hybrid BFS outperforms traditional top-down approach, and concurrent BFS algorithm has higher resource utilization, efficiency and scalability with respect to the number of concurrent BFSs. |
---|