TRIANGLE COUNTING USING HDRF GRAPH PARTITION STRATEGY IN APACHE SPARK GRAPHX
Triangle counting algorithm is one of graph algorithm that can be used to analyze a graph dataset. The graph that will be analyzed sometimes needs to be partitioned first into several nodes in a cluster of graph processing engine before applying a graph algorithm to the graph. Graph partition algori...
Saved in:
Main Author: | |
---|---|
Format: | Final Project |
Language: | Indonesia |
Online Access: | https://digilib.itb.ac.id/gdl/view/43524 |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Institution: | Institut Teknologi Bandung |
Language: | Indonesia |
id |
id-itb.:43524 |
---|---|
spelling |
id-itb.:435242019-09-27T11:12:43ZTRIANGLE COUNTING USING HDRF GRAPH PARTITION STRATEGY IN APACHE SPARK GRAPHX Novanto, Dicky Indonesia Final Project partition, HDRF, triangle counting, Spark INSTITUT TEKNOLOGI BANDUNG https://digilib.itb.ac.id/gdl/view/43524 Triangle counting algorithm is one of graph algorithm that can be used to analyze a graph dataset. The graph that will be analyzed sometimes needs to be partitioned first into several nodes in a cluster of graph processing engine before applying a graph algorithm to the graph. Graph partition algorithm can affect performance of a distributed graph algorithm and in this research, HDRF graph partitition strategy is implemented and processing it in Apache Spark GraphX. HDRF algorithm is a graph partition algorithm that can result low replication factor value so that it can minimize the number of communications needed between nodes in a cluster when executing a graph algorithm. In this research, it is done also experiment on triangle counting algorithm on a partitioned graph using HDRF and Random Vertex Cut algorithm on Apache Spark GraphX. The purpose of the experiment is execution time comparison between executing triangle counting algorithm with graph partitioned using HDRF and Random Vertex Cut algorithm in Apache Spark GraphX and then analyzing the time comparison. Based on the experiment, it shows that execution time of triangle counting algorithm on graph partitioned using HDRF is better than using graph partitioned with Random Vertex Cut in a big dataset, Autonomous Systems by Skitter, with improvement of 20.79%. text |
institution |
Institut Teknologi Bandung |
building |
Institut Teknologi Bandung Library |
continent |
Asia |
country |
Indonesia Indonesia |
content_provider |
Institut Teknologi Bandung |
collection |
Digital ITB |
language |
Indonesia |
description |
Triangle counting algorithm is one of graph algorithm that can be used to analyze a graph dataset. The graph that will be analyzed sometimes needs to be partitioned first into several nodes in a cluster of graph processing engine before applying a graph algorithm to the graph. Graph partition algorithm can affect performance of a distributed graph algorithm and in this research, HDRF graph partitition strategy is implemented and processing it in Apache Spark GraphX. HDRF algorithm is a graph partition algorithm that can result low replication factor value so that it can minimize the number of communications needed between nodes in a cluster when executing a graph algorithm. In this research, it is done also experiment on triangle counting algorithm on a partitioned graph using HDRF and Random Vertex Cut algorithm on Apache Spark GraphX. The purpose of the experiment is execution time comparison between executing triangle counting algorithm with graph partitioned using HDRF and Random Vertex Cut algorithm in Apache Spark GraphX and then analyzing the time comparison. Based on the experiment, it shows that execution time of triangle counting algorithm on graph partitioned using HDRF is better than using graph partitioned with Random Vertex Cut in a big dataset, Autonomous Systems by Skitter, with improvement of 20.79%. |
format |
Final Project |
author |
Novanto, Dicky |
spellingShingle |
Novanto, Dicky TRIANGLE COUNTING USING HDRF GRAPH PARTITION STRATEGY IN APACHE SPARK GRAPHX |
author_facet |
Novanto, Dicky |
author_sort |
Novanto, Dicky |
title |
TRIANGLE COUNTING USING HDRF GRAPH PARTITION STRATEGY IN APACHE SPARK GRAPHX |
title_short |
TRIANGLE COUNTING USING HDRF GRAPH PARTITION STRATEGY IN APACHE SPARK GRAPHX |
title_full |
TRIANGLE COUNTING USING HDRF GRAPH PARTITION STRATEGY IN APACHE SPARK GRAPHX |
title_fullStr |
TRIANGLE COUNTING USING HDRF GRAPH PARTITION STRATEGY IN APACHE SPARK GRAPHX |
title_full_unstemmed |
TRIANGLE COUNTING USING HDRF GRAPH PARTITION STRATEGY IN APACHE SPARK GRAPHX |
title_sort |
triangle counting using hdrf graph partition strategy in apache spark graphx |
url |
https://digilib.itb.ac.id/gdl/view/43524 |
_version_ |
1822270407841939456 |