Learning feature representation for graphs

Bibliographic Details
Main Author: Luong, Quynh Kha
Other Authors: Chen Lihui
Format: Final Year Project
Language: English
Published: 2019
Online Access: http://hdl.handle.net/10356/77558
Institution: Nanyang Technological University
Description
Summary: This FYP aimed to implement and experiment with novel frameworks for learning feature representations of graphs. Many algorithms for downstream tasks require input graphs to be represented as fixed-length feature vectors. Graph2vec, developed in 2017, was reported to achieve state-of-the-art results in graph classification. Seeing potential for improved accuracy, we set out to extend graph2vec to learn better graph representations. We constructed other unsupervised graph2vec models and applied semi-supervised learning to those models. Furthermore, we built deep graph embeddings that combine information from both subgraph embeddings and frequency. In total, ten extended frameworks were built to learn graph embeddings. All approaches were evaluated on five benchmark datasets from the bioinformatics and cheminformatics domains, and the resulting graph embeddings were assessed by their accuracy on the classification task. Experiments showed that the graph2vec skip-gram bi model produces better graph embeddings than its uni counterpart (Hypothesis 1). In support of Hypothesis 2, the semi-supervised learning methods achieved better results than the unsupervised ones. Deep graph embeddings also performed better than normal graph embeddings, which supports Hypothesis 3. We further observed that the graph2vec skip-gram model outperforms the CBOW model. Comparing the accuracy obtained across these frameworks, we conclude that deep graph embeddings from the semi-supervised graph2vec skip-gram bi model performed best. These extended frameworks can be applied to other datasets to efficiently learn graph representations for graph classification and clustering.
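To make the idea of a fixed-length feature vector concrete, the following is a minimal sketch and not the implementation used in this FYP. It assumes graphs are given as adjacency dictionaries with node labels, and it uses Weisfeiler-Lehman-style relabelling to list rooted-subgraph tokens, then counts them against a shared vocabulary. The function names `wl_subgraph_tokens` and `to_feature_vector` are hypothetical, and the count vector is only a simple stand-in: graph2vec itself learns dense embeddings from such tokens rather than using raw counts.

```python
from collections import Counter


def wl_subgraph_tokens(adjacency, labels, iterations=2):
    """Count rooted-subgraph tokens for one graph (illustrative assumption).

    adjacency: dict mapping each node to a list of its neighbours.
    labels: dict mapping each node to its initial label (e.g. atom type).
    """
    # Degree-0 tokens are just the node labels.
    current = {node: str(labels[node]) for node in adjacency}
    tokens = Counter(current.values())
    for _ in range(iterations):
        updated = {}
        for node, neighbours in adjacency.items():
            # Sort neighbour labels so the token ignores neighbour order.
            neighbour_labels = sorted(current[n] for n in neighbours)
            updated[node] = current[node] + "(" + ",".join(neighbour_labels) + ")"
        current = updated
        tokens.update(current.values())
    return tokens


def to_feature_vector(tokens, vocabulary):
    """Map token counts onto a fixed vocabulary order (fixed-length vector)."""
    return [tokens.get(token, 0) for token in vocabulary]


# Toy graph: a three-node path labelled C - O - C.
path = {0: [1], 1: [0, 2], 2: [1]}
path_labels = {0: "C", 1: "O", 2: "C"}

tokens = wl_subgraph_tokens(path, path_labels, iterations=1)
vocabulary = sorted(tokens)
vector = to_feature_vector(tokens, vocabulary)
```

In practice the vocabulary would be built over all graphs in a dataset, so that every graph maps to a vector of the same length and can be passed to a standard classifier.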