Cluster analysis on dynamic graphs

As the collection of data becomes common with the big data revolution, an increasing number of methods are created for data analysis. One important field of data is graph data. Much research has been done for static graph analysis in recent years. However, graph data today is continuously evolving (...

Full description

Saved in:
Bibliographic Details
Main Author: Lee, Cecilia Si Si
Other Authors: Ke Yiping, Kelly
Format: Final Year Project
Language:English
Published: Nanyang Technological University 2022
Subjects:
Online Access:https://hdl.handle.net/10356/156636
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Nanyang Technological University
Language: English
id sg-ntu-dr.10356-156636
record_format dspace
spelling sg-ntu-dr.10356-1566362022-04-21T08:20:20Z Cluster analysis on dynamic graphs Lee, Cecilia Si Si Ke Yiping, Kelly School of Computer Science and Engineering ypke@ntu.edu.sg Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence As the collection of data becomes common with the big data revolution, an increasing number of methods are created for data analysis. One important field of data is graph data. Much research has been done for static graph analysis in recent years. However, graph data today is continuously evolving (not static), especially with the constant expansion of the Internet. For example, social networks are changing every second as users can add or remove social connections. Clustering is often used as a technique to identify the different social clusters. Traditionally, graphs are represented as adjacency lists and adjacency matrices. However, as data size increases, it becomes impractical to store graphs in adjacency lists and matrices. Thus, graphs are represented as embeddings. Embedding is a compressed representation of the graph, and it is a vector-based representation of the graph. Thus, this representation can be used as input for machine learning. There are a few research that proposes dynamic graph embedding methods. However, little work has been done to use these dynamic graph embedding methods to get embeddings and then perform clustering on them. Thus, in this project, we will specifically experiment with the different embedding methods and perform clustering on the embeddings. This project aims to study the clustering problem on dynamic graphs that evolve and change over time. It will find a method that does not require re-clustering of the graph when it is modified as well as maintain the accuracy of the clustering of the dynamic graphs. We make the following specific contributions in this project: 1. In-depth literature review of recent state-of-art dynamic graph clustering methods. 2. Propose a machine learning graph pipeline that merges different dynamic graph embedding methods with k-means clustering, to improve current dynamic graph clustering methods. 3. Perform in-depth evaluation and analysis of the different embedding models by carrying out experiments with simulated and real-life datasets; with DyEARNN being the best embedding model for dynamic clustering. Bachelor of Engineering (Computer Science) 2022-04-21T08:20:19Z 2022-04-21T08:20:19Z 2022 Final Year Project (FYP) Lee, C. S. S. (2022). Cluster analysis on dynamic graphs. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/156636 https://hdl.handle.net/10356/156636 en SCSE21-0371 application/pdf Nanyang Technological University
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
spellingShingle Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
Lee, Cecilia Si Si
Cluster analysis on dynamic graphs
description As the collection of data becomes common with the big data revolution, an increasing number of methods are created for data analysis. One important field of data is graph data. Much research has been done for static graph analysis in recent years. However, graph data today is continuously evolving (not static), especially with the constant expansion of the Internet. For example, social networks are changing every second as users can add or remove social connections. Clustering is often used as a technique to identify the different social clusters. Traditionally, graphs are represented as adjacency lists and adjacency matrices. However, as data size increases, it becomes impractical to store graphs in adjacency lists and matrices. Thus, graphs are represented as embeddings. Embedding is a compressed representation of the graph, and it is a vector-based representation of the graph. Thus, this representation can be used as input for machine learning. There are a few research that proposes dynamic graph embedding methods. However, little work has been done to use these dynamic graph embedding methods to get embeddings and then perform clustering on them. Thus, in this project, we will specifically experiment with the different embedding methods and perform clustering on the embeddings. This project aims to study the clustering problem on dynamic graphs that evolve and change over time. It will find a method that does not require re-clustering of the graph when it is modified as well as maintain the accuracy of the clustering of the dynamic graphs. We make the following specific contributions in this project: 1. In-depth literature review of recent state-of-art dynamic graph clustering methods. 2. Propose a machine learning graph pipeline that merges different dynamic graph embedding methods with k-means clustering, to improve current dynamic graph clustering methods. 3. Perform in-depth evaluation and analysis of the different embedding models by carrying out experiments with simulated and real-life datasets; with DyEARNN being the best embedding model for dynamic clustering.
author2 Ke Yiping, Kelly
author_facet Ke Yiping, Kelly
Lee, Cecilia Si Si
format Final Year Project
author Lee, Cecilia Si Si
author_sort Lee, Cecilia Si Si
title Cluster analysis on dynamic graphs
title_short Cluster analysis on dynamic graphs
title_full Cluster analysis on dynamic graphs
title_fullStr Cluster analysis on dynamic graphs
title_full_unstemmed Cluster analysis on dynamic graphs
title_sort cluster analysis on dynamic graphs
publisher Nanyang Technological University
publishDate 2022
url https://hdl.handle.net/10356/156636
_version_ 1731235806110023680