Cluster analysis on dynamic graph database

Bibliographic Details
Main Author: Meha, Deepaprakash
Other Authors: Ke Yiping, Kelly
Format: Final Year Project
Language: English
Published: 2017
Online Access: http://hdl.handle.net/10356/70325
Institution: Nanyang Technological University
Description
Summary: There is a growing trend of moving from relational databases to graph databases, because graph databases represent the relationships between nodes more directly and make it easier to infer meaning and insights from them. Clustering has also become prominent in graph visualization, as it clearly shows which objects are similar and which are dissimilar. In simple terms, clustering, or community detection, is the grouping of objects based on similarity in their characteristics: objects in the same cluster are highly similar to one another, while objects in different clusters are dissimilar. The main objective of this project is to visualize vast amounts of evolving graph data using cluster analysis. As a first step, existing research on clustering in dynamic graphs will be reviewed, along with its advantages and disadvantages. Subsequently, four clustering algorithms will be chosen and analysed: the Louvain Multi-Level, Walktrap, Fast-Greedy and Edge Betweenness clustering algorithms. These algorithms will then be implemented in R and applied to large, evolving graph networks, with the objective of analysing how the clusters evolve over time. The implementation takes snapshots of the dynamic graph dataset and applies the clustering algorithms to each snapshot to detect communities. Comparing these snapshots will help in understanding how the clusters evolve over time. The dynamic graph datasets used in this project include edge additions only.
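
The snapshot approach described in the summary can be illustrated with a small sketch. This is not the project's code: the project was implemented in R, while the example below is plain Python using only the standard library. It assumes the dynamic graph arrives as a list of hypothetical `(time, u, v)` edge additions with integer node labels, and it uses a naive, unoptimized form of greedy modularity agglomeration (the idea behind the Fast-Greedy algorithm) as a stand-in for all four algorithms. The function names `snapshots` and `greedy_modularity` are illustrative only.

```python
from collections import defaultdict


def snapshots(edge_stream, times):
    """Yield (t, edges): every undirected edge added at or before time t.

    Assumes an edge-addition-only stream of (time, u, v) tuples.
    """
    ordered = sorted(edge_stream, key=lambda e: e[0])
    seen, i = set(), 0
    for t in sorted(times):
        while i < len(ordered) and ordered[i][0] <= t:
            _, u, v = ordered[i]
            if u != v:                      # ignore self-loops
                seen.add(frozenset((u, v)))
            i += 1
        yield t, set(seen)                  # copy, so later growth is isolated


def greedy_modularity(edges):
    """Naive agglomerative modularity maximisation; returns a list of sets.

    Starts with one community per node and repeatedly merges the pair of
    connected communities with the largest positive modularity gain.
    """
    m = len(edges)
    if m == 0:
        return []
    nodes = {n for e in edges for n in e}
    comm = {n: n for n in nodes}            # node -> community label
    degree = defaultdict(int)               # community label -> total degree
    for e in edges:
        for n in e:
            degree[n] += 1
    while True:
        between = defaultdict(int)          # (label_a, label_b) -> edge count
        for e in edges:
            u, v = tuple(e)
            a, b = comm[u], comm[v]
            if a != b:
                between[(min(a, b), max(a, b))] += 1
        best_gain, best_pair = 0.0, None
        for (a, b), e_ab in sorted(between.items()):
            # Modularity change when communities a and b are merged.
            gain = e_ab / m - degree[a] * degree[b] / (2 * m * m)
            if gain > best_gain + 1e-12:
                best_gain, best_pair = gain, (a, b)
        if best_pair is None:               # no merge improves modularity
            break
        a, b = best_pair
        for n in nodes:
            if comm[n] == b:
                comm[n] = a
        degree[a] += degree.pop(b)
    groups = defaultdict(set)
    for n, c in comm.items():
        groups[c].add(n)
    return list(groups.values())


# Hypothetical stream: one triangle at time 1, a second triangle and a
# bridging edge at time 2.
stream = [(1, 1, 2), (1, 2, 3), (1, 1, 3),
          (2, 4, 5), (2, 5, 6), (2, 4, 6), (2, 3, 4)]

for t, edges in snapshots(stream, [1, 2]):
    communities = sorted(sorted(c) for c in greedy_modularity(edges))
    print(t, communities)
```

Each snapshot is clustered independently, so tracking how a community changes between snapshots would need an extra matching step (for example, comparing member overlap), which this sketch leaves out.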