TRIANGLE COUNTING USING HDRF GRAPH PARTITION STRATEGY IN APACHE SPARK GRAPHX

Triangle counting algorithm is one of graph algorithm that can be used to analyze a graph dataset. The graph that will be analyzed sometimes needs to be partitioned first into several nodes in a cluster of graph processing engine before applying a graph algorithm to the graph. Graph partition algori...

Full description

Saved in:
Bibliographic Details
Main Author: Novanto, Dicky
Format: Final Project
Language:Indonesia
Online Access:https://digilib.itb.ac.id/gdl/view/43524
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Institut Teknologi Bandung
Language: Indonesia
id id-itb.:43524
spelling id-itb.:435242019-09-27T11:12:43ZTRIANGLE COUNTING USING HDRF GRAPH PARTITION STRATEGY IN APACHE SPARK GRAPHX Novanto, Dicky Indonesia Final Project partition, HDRF, triangle counting, Spark INSTITUT TEKNOLOGI BANDUNG https://digilib.itb.ac.id/gdl/view/43524 Triangle counting algorithm is one of graph algorithm that can be used to analyze a graph dataset. The graph that will be analyzed sometimes needs to be partitioned first into several nodes in a cluster of graph processing engine before applying a graph algorithm to the graph. Graph partition algorithm can affect performance of a distributed graph algorithm and in this research, HDRF graph partitition strategy is implemented and processing it in Apache Spark GraphX. HDRF algorithm is a graph partition algorithm that can result low replication factor value so that it can minimize the number of communications needed between nodes in a cluster when executing a graph algorithm. In this research, it is done also experiment on triangle counting algorithm on a partitioned graph using HDRF and Random Vertex Cut algorithm on Apache Spark GraphX. The purpose of the experiment is execution time comparison between executing triangle counting algorithm with graph partitioned using HDRF and Random Vertex Cut algorithm in Apache Spark GraphX and then analyzing the time comparison. Based on the experiment, it shows that execution time of triangle counting algorithm on graph partitioned using HDRF is better than using graph partitioned with Random Vertex Cut in a big dataset, Autonomous Systems by Skitter, with improvement of 20.79%. text
institution Institut Teknologi Bandung
building Institut Teknologi Bandung Library
continent Asia
country Indonesia
Indonesia
content_provider Institut Teknologi Bandung
collection Digital ITB
language Indonesia
description Triangle counting algorithm is one of graph algorithm that can be used to analyze a graph dataset. The graph that will be analyzed sometimes needs to be partitioned first into several nodes in a cluster of graph processing engine before applying a graph algorithm to the graph. Graph partition algorithm can affect performance of a distributed graph algorithm and in this research, HDRF graph partitition strategy is implemented and processing it in Apache Spark GraphX. HDRF algorithm is a graph partition algorithm that can result low replication factor value so that it can minimize the number of communications needed between nodes in a cluster when executing a graph algorithm. In this research, it is done also experiment on triangle counting algorithm on a partitioned graph using HDRF and Random Vertex Cut algorithm on Apache Spark GraphX. The purpose of the experiment is execution time comparison between executing triangle counting algorithm with graph partitioned using HDRF and Random Vertex Cut algorithm in Apache Spark GraphX and then analyzing the time comparison. Based on the experiment, it shows that execution time of triangle counting algorithm on graph partitioned using HDRF is better than using graph partitioned with Random Vertex Cut in a big dataset, Autonomous Systems by Skitter, with improvement of 20.79%.
format Final Project
author Novanto, Dicky
spellingShingle Novanto, Dicky
TRIANGLE COUNTING USING HDRF GRAPH PARTITION STRATEGY IN APACHE SPARK GRAPHX
author_facet Novanto, Dicky
author_sort Novanto, Dicky
title TRIANGLE COUNTING USING HDRF GRAPH PARTITION STRATEGY IN APACHE SPARK GRAPHX
title_short TRIANGLE COUNTING USING HDRF GRAPH PARTITION STRATEGY IN APACHE SPARK GRAPHX
title_full TRIANGLE COUNTING USING HDRF GRAPH PARTITION STRATEGY IN APACHE SPARK GRAPHX
title_fullStr TRIANGLE COUNTING USING HDRF GRAPH PARTITION STRATEGY IN APACHE SPARK GRAPHX
title_full_unstemmed TRIANGLE COUNTING USING HDRF GRAPH PARTITION STRATEGY IN APACHE SPARK GRAPHX
title_sort triangle counting using hdrf graph partition strategy in apache spark graphx
url https://digilib.itb.ac.id/gdl/view/43524
_version_ 1822270407841939456