Heterogeneous graph transformer with poly-tokenization


Bibliographic Details
Main Authors: LU, Zhiyuan, FANG, Yuan, YANG, Cheng, SHI, Chuan
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2024
Online Access:https://ink.library.smu.edu.sg/sis_research/9678
https://ink.library.smu.edu.sg/context/sis_research/article/10678/viewcontent/IJCAI24_PHGT.pdf
Institution: Singapore Management University
Summary: Graph neural networks have achieved widespread success in learning on graphs, but they still face fundamental drawbacks such as limited expressive power, over-smoothing, and over-squashing. The transformer architecture offers a potential solution to these issues; however, existing graph transformers primarily cater to homogeneous graphs and cannot model the intricate semantics of heterogeneous graphs. Moreover, unlike small molecular graphs, where the entire graph can serve as a transformer's receptive field, real-world heterogeneous graphs comprise far more nodes and cannot be treated this way in their entirety. Consequently, existing graph transformers struggle to capture long-range dependencies in these complex heterogeneous graphs. To address these two limitations, we present the Poly-tokenized Heterogeneous Graph Transformer (PHGT), a novel transformer-based heterogeneous graph model. In addition to traditional node tokens, PHGT introduces a poly-token design with two further token types: semantic tokens and global tokens. Semantic tokens encapsulate high-order heterogeneous semantic relationships, while global tokens capture semantic-aware long-range interactions. We validate the effectiveness of PHGT through extensive experiments on standardized heterogeneous graph benchmarks, demonstrating significant improvements over state-of-the-art heterogeneous graph representation learning models.
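
The poly-token idea in the abstract lends itself to a compact illustration. The following is a minimal PyTorch sketch, not the authors' implementation: the class name PolyTokenTransformer, the token counts, and the assumption that semantic tokens come from pooling metapath-guided node sets and global tokens from pooling graph clusters are all illustrative choices inferred from the abstract.

import torch
import torch.nn as nn

class PolyTokenTransformer(nn.Module):
    # Illustrative poly-token encoder: the input sequence for each target
    # node concatenates node tokens, semantic tokens, and global tokens,
    # and a standard transformer encoder attends across all of them.
    def __init__(self, in_dim, hid_dim=128, n_heads=4, n_layers=2):
        super().__init__()
        self.proj = nn.Linear(in_dim, hid_dim)
        layer = nn.TransformerEncoderLayer(
            d_model=hid_dim, nhead=n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, node_tok, sem_tok, glob_tok):
        # node_tok: (B, S_n, in_dim) -- target node plus sampled neighbors
        # sem_tok:  (B, S_s, in_dim) -- e.g. features pooled per metapath (assumed)
        # glob_tok: (B, S_g, in_dim) -- e.g. features pooled per graph cluster (assumed)
        tokens = torch.cat([node_tok, sem_tok, glob_tok], dim=1)
        h = self.encoder(self.proj(tokens))
        return h[:, 0]  # read out the target node's token

# Toy usage: batch of 8 target nodes with 16-dim features,
# 5 node tokens, 3 semantic tokens, and 2 global tokens each.
model = PolyTokenTransformer(in_dim=16)
out = model(torch.randn(8, 5, 16), torch.randn(8, 3, 16), torch.randn(8, 2, 16))
print(out.shape)  # torch.Size([8, 128])

The point of the sketch is the sequence layout: pooling a fixed, small number of semantic and global tokens keeps the attention sequence short, which is how a design of this kind can expose semantic-aware, long-range context without attending over every node in a large heterogeneous graph.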