Advancement in graph data mining: applications in unsupervised, continual, and few-shot learning

Graph mining has proven to be extremely useful in analysing features and properties of real-world graphs. This enables a number of tasks including the prediction and evaluation of how information varies with changes in the link structure, generating and building models to extract properties such as...

Full description

Saved in:

Bibliographic Details
Main Author:	Rakaraddi, Appan
Other Authors:	Lam Siew Kei
Format:	Thesis-Doctor of Philosophy
Language:	English
Published:	Nanyang Technological University 2024
Subjects:	Computer and Information Science Graph neural networks Continual learning Few-shot learning
Online Access:	https://hdl.handle.net/10356/176227
Tags:	Add Tag No Tags, Be the first to tag this record!
Institution:	Nanyang Technological University
Language:	English

id	sg-ntu-dr.10356-176227
record_format	dspace
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	Computer and Information Science Graph neural networks Continual learning Few-shot learning
spellingShingle	Computer and Information Science Graph neural networks Continual learning Few-shot learning Rakaraddi, Appan Advancement in graph data mining: applications in unsupervised, continual, and few-shot learning
description	Graph mining has proven to be extremely useful in analysing features and properties of real-world graphs. This enables a number of tasks including the prediction and evaluation of how information varies with changes in the link structure, generating and building models to extract properties such as link prediction, node classification and recommendation, cluster and community detection, etc. Deep learning techniques have become the prevalent approach for graph mining as they are able to take advantage of the increasing availability of graph data. Traditional deep learning models pre-process the graph data by mapping the nodes to a real vector, which can desecrate important node relationships in the graphs. Training on such pre-processed structures with the traditional deep learning models may present extremely unstable and inaccurate results. To overcome these issues, Graph Neural Network (GNN) was introduced to handle the non-Euclidean structure of the graphs for many of the classifier and regressor tasks. However, state-of-the-art GNNs require a large amount of training data, which incur an enormous burden of labelled data annotations for supervised learning. Also, GNNs vastly suffer from degradation with the increasing number of depth layers leading to oversmoothing during neighbourhood feature aggregation. The aim of this research is to overcome the above-mentioned challenges faced by GNNs for graph mining. We propose a method for graph data mining as a regression problem for the estimation of Eigenvector Centrality in graphs with a GNN based approach in a completely unsupervised learning environment. To achieve this, we define an Encoder-Decoder based model architecture called CUL. We show that even when trained on a small number of datasets, the model performed at least on par with the supervised methods in terms of accuracy with different types of embedding schemes like Graph Convolutional Network (GCN), GraphSAGE, and Graph Attention Network (GAT). This is reinforced by the model performance based on top-$\mathcal{N}\%$ accuracy metric, on real-world and synthetically generated datasets. CUL demonstrates at least on par or even a superior performance over existing methods. Next, we focus on the applications of GNNs in the domain of continual learning. We propose a model called GCL that learns across a sequence of tasks for node-classification in graphs under task-incremental and class-incremental settings. GCL comprises of a Reinforced Controller/RLC and an expandable GNN framework called Child Network/CN, along with a memory buffer to store the node samples. We compared our method against state-of-the-art methods in regards to average accuracy and average forgetting on four datasets with different GNNs. Our model performed better than the other methods i.e., it showed higher accuracy values as well as lower forgetting rates across the tasks in both task-incremental and class-incremental settings. Finally, we propose a method for improving few-shot node classification on graphs on any generic GNN backbone framework. We propose an uncertainty-based estimator which is modeled using a GNN that maps the scalar discrete probabilities of a GNN-classifier outputs to a continuous probability distribution. We demonstrate that the few-shot node classification accuracy improves by testing on different graph datasets. We also focus on bridging gaps between different few-shot learning methods for node classification for graphs. The performance variance between the methods is analysed and the pros/cons of each of the architectures are highlighted. This analysis aids in understanding the performance differentiator and development of better architectures for few-shot learning on graph node classification.
author2	Lam Siew Kei
author_facet	Lam Siew Kei Rakaraddi, Appan
format	Thesis-Doctor of Philosophy
author	Rakaraddi, Appan
author_sort	Rakaraddi, Appan
title	Advancement in graph data mining: applications in unsupervised, continual, and few-shot learning
title_short	Advancement in graph data mining: applications in unsupervised, continual, and few-shot learning
title_full	Advancement in graph data mining: applications in unsupervised, continual, and few-shot learning
title_fullStr	Advancement in graph data mining: applications in unsupervised, continual, and few-shot learning
title_full_unstemmed	Advancement in graph data mining: applications in unsupervised, continual, and few-shot learning
title_sort	advancement in graph data mining: applications in unsupervised, continual, and few-shot learning
publisher	Nanyang Technological University
publishDate	2024
url	https://hdl.handle.net/10356/176227
_version_	1806059862309732352
spelling	sg-ntu-dr.10356-1762272024-06-03T06:51:19Z Advancement in graph data mining: applications in unsupervised, continual, and few-shot learning Rakaraddi, Appan Lam Siew Kei School of Computer Science and Engineering Mahardhika Pratama ASSKLam@ntu.edu.sg Computer and Information Science Graph neural networks Continual learning Few-shot learning Graph mining has proven to be extremely useful in analysing features and properties of real-world graphs. This enables a number of tasks including the prediction and evaluation of how information varies with changes in the link structure, generating and building models to extract properties such as link prediction, node classification and recommendation, cluster and community detection, etc. Deep learning techniques have become the prevalent approach for graph mining as they are able to take advantage of the increasing availability of graph data. Traditional deep learning models pre-process the graph data by mapping the nodes to a real vector, which can desecrate important node relationships in the graphs. Training on such pre-processed structures with the traditional deep learning models may present extremely unstable and inaccurate results. To overcome these issues, Graph Neural Network (GNN) was introduced to handle the non-Euclidean structure of the graphs for many of the classifier and regressor tasks. However, state-of-the-art GNNs require a large amount of training data, which incur an enormous burden of labelled data annotations for supervised learning. Also, GNNs vastly suffer from degradation with the increasing number of depth layers leading to oversmoothing during neighbourhood feature aggregation. The aim of this research is to overcome the above-mentioned challenges faced by GNNs for graph mining. We propose a method for graph data mining as a regression problem for the estimation of Eigenvector Centrality in graphs with a GNN based approach in a completely unsupervised learning environment. To achieve this, we define an Encoder-Decoder based model architecture called CUL. We show that even when trained on a small number of datasets, the model performed at least on par with the supervised methods in terms of accuracy with different types of embedding schemes like Graph Convolutional Network (GCN), GraphSAGE, and Graph Attention Network (GAT). This is reinforced by the model performance based on top-$\mathcal{N}\%$ accuracy metric, on real-world and synthetically generated datasets. CUL demonstrates at least on par or even a superior performance over existing methods. Next, we focus on the applications of GNNs in the domain of continual learning. We propose a model called GCL that learns across a sequence of tasks for node-classification in graphs under task-incremental and class-incremental settings. GCL comprises of a Reinforced Controller/RLC and an expandable GNN framework called Child Network/CN, along with a memory buffer to store the node samples. We compared our method against state-of-the-art methods in regards to average accuracy and average forgetting on four datasets with different GNNs. Our model performed better than the other methods i.e., it showed higher accuracy values as well as lower forgetting rates across the tasks in both task-incremental and class-incremental settings. Finally, we propose a method for improving few-shot node classification on graphs on any generic GNN backbone framework. We propose an uncertainty-based estimator which is modeled using a GNN that maps the scalar discrete probabilities of a GNN-classifier outputs to a continuous probability distribution. We demonstrate that the few-shot node classification accuracy improves by testing on different graph datasets. We also focus on bridging gaps between different few-shot learning methods for node classification for graphs. The performance variance between the methods is analysed and the pros/cons of each of the architectures are highlighted. This analysis aids in understanding the performance differentiator and development of better architectures for few-shot learning on graph node classification. Doctor of Philosophy 2024-05-14T05:29:05Z 2024-05-14T05:29:05Z 2024 Thesis-Doctor of Philosophy Rakaraddi, A. (2024). Advancement in graph data mining: applications in unsupervised, continual, and few-shot learning. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/176227 https://hdl.handle.net/10356/176227 10.32657/10356/176227 en This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). application/pdf Nanyang Technological University

Advancement in graph data mining: applications in unsupervised, continual, and few-shot learning

Similar Items