Transduction on directed graphs via absorbing random walks

Bibliographic Details
Main Authors: De, Jaydeep; Zhang, Xiaowei; Lin, Feng; Li, Cheng
Other Authors: School of Computer Science and Engineering
Format: Article
Language: English
Published: 2020
Online Access: https://hdl.handle.net/10356/139874
Institution: Nanyang Technological University
Description
Summary: In this paper we consider the problem of graph-based transductive classification, with particular interest in the directed graph scenario, which is a natural form for many real-world applications. Unlike existing research efforts that either deal only with undirected graphs or circumvent directionality by means of symmetrization, we propose a novel random walk approach on directed graphs using absorbing Markov chains, which can be regarded as maximizing the accumulated expected number of visits from the unlabeled transient states. Our algorithm is simple, easy to implement, and works with large-scale graphs on binary, multiclass, and multi-label prediction problems. Moreover, it is capable of preserving the graph structure even when the input graph is sparse and changes over time, as well as retaining weak signals present in the directed edges. We present its intimate connections to a number of existing methods, including graph kernels, graph Laplacian based methods, and spanning forests of graphs. Its computational complexity and generalization error are also studied. Empirically, our algorithm is evaluated on a wide range of applications, where it is shown to perform competitively compared with a suite of state-of-the-art methods. In particular, our algorithm is shown to work exceptionally well with large, sparse directed graphs with, e.g., millions of nodes and tens of millions of edges, where it significantly outperforms other state-of-the-art methods. In the dynamic graph setting involving insertion or deletion of nodes and edge-weight changes over time, it also allows efficient online updates that produce the same results as their batch-update counterparts.
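
The summary describes the approach only at a high level, so the following is a minimal sketch of the general idea of transduction with an absorbing Markov chain on a directed graph, not the authors' published algorithm. It assumes that labeled nodes act as absorbing states, that unlabeled nodes are transient states, and that each unlabeled node is scored by the probability of being absorbed at a node of each class. That is a standard textbook computation swapped in for illustration; the paper's own objective, based on expected numbers of visits, may differ. The function names `absorbing_walk_scores` and `predict`, the edge-list input format, and the toy graph are all hypothetical.

```python
def absorbing_walk_scores(edges, labels, num_nodes, num_classes, iters=200):
    """Score each node by its probability of absorption into each class.

    edges:  list of (src, dst, weight) directed edges (assumed input format)
    labels: dict mapping labeled node -> class index (assumed input format)
    """
    # Row-normalize outgoing edge weights into transition probabilities.
    # Edge direction is kept as given; no symmetrization is applied.
    out = [dict() for _ in range(num_nodes)]
    for src, dst, w in edges:
        out[src][dst] = out[src].get(dst, 0.0) + w
    for row in out:
        total = sum(row.values())
        for dst in row:
            row[dst] /= total

    # Labeled nodes are absorbing: their scores are fixed one-hot vectors.
    scores = [[0.0] * num_classes for _ in range(num_nodes)]
    for node, cls in labels.items():
        scores[node][cls] = 1.0

    # Fixed-point iteration: an unlabeled node's score is the
    # transition-weighted average of its out-neighbors' scores.
    for _ in range(iters):
        new_scores = [row[:] for row in scores]
        for u in range(num_nodes):
            if u in labels:
                continue
            new_scores[u] = [
                sum(p * scores[v][c] for v, p in out[u].items())
                for c in range(num_classes)
            ]
        scores = new_scores
    return scores


def predict(scores):
    """Assign each node the class with the highest score."""
    return [max(range(len(row)), key=row.__getitem__) for row in scores]


# Toy directed graph: node 0 is labeled class 0, node 3 is labeled class 1.
edges = [(1, 0, 1.0), (1, 2, 1.0), (2, 1, 1.0), (2, 3, 1.0)]
labels = {0: 0, 3: 1}
scores = absorbing_walk_scores(edges, labels, num_nodes=4, num_classes=2)
predictions = predict(scores)
```

On this toy graph, node 1 steps directly to the class-0 node with probability 1/2 and node 2 steps directly to the class-1 node with probability 1/2, so node 1 is assigned class 0 and node 2 class 1. The iteration converges only when every unlabeled node can reach some labeled node; a sparse linear solver would be the more usual choice at the scale the summary mentions.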