On size-oriented long-tailed graph classification of graph neural networks

Bibliographic Details
Main Authors: LIU, Zemin, MAO, Qiheng, LIU, Chenghao, FANG, Yuan, SUN, Jianling
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2022
Online Access: https://ink.library.smu.edu.sg/sis_research/7489
https://ink.library.smu.edu.sg/context/sis_research/article/8492/viewcontent/TheWebConf22_SOLT.pdf
Institution: Singapore Management University
Description
Summary: The prevalence of graph structures has attracted a surge of research on graph data, enabling several downstream tasks such as multi-graph classification. However, in the multi-graph setting, graphs usually follow a long-tailed distribution in terms of their sizes, i.e., the number of nodes. In particular, a large fraction of tail graphs usually have small sizes. Though recent graph neural networks (GNNs) can learn powerful graph-level representations, they treat the graphs uniformly and marginalize the tail graphs, which suffer from a lack of distinguishable structures, resulting in inferior performance on tail graphs. To alleviate this concern, in this paper we propose a novel graph neural network named SOLT-GNN, to close the representational gap between the head and tail graphs from the perspective of knowledge transfer. In particular, SOLT-GNN exploits co-occurrence substructures to extract transferable patterns from head graphs. Furthermore, a novel relevance prediction function is proposed to memorize the pattern relevance derived from head graphs, in order to predict complements for tail graphs and thereby enrich them with more comprehensive structures. We conduct extensive experiments on five benchmark datasets and demonstrate that our proposed model can outperform the state-of-the-art baselines.