Generative semi-supervised graph anomaly detection

This work considers a practical semi-supervised graph anomaly detection (GAD) scenario, where part of the nodes in a graph are known to be normal, in contrast to the extensively explored unsupervised setting with a fully unlabeled graph. We reveal that having access to the normal nodes, even just a...

Bibliographic Details
Main Authors: QIAO, Hezhe, WEN, Qingsong, LI, Xiaoli, LIM, Ee-peng, PANG, Guansong
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2024
Subjects:
GAD
Online Access: https://ink.library.smu.edu.sg/sis_research/9763
https://ink.library.smu.edu.sg/context/sis_research/article/10763/viewcontent/10275_Generative_Semi_supervis__1_.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-10763
record_format dspace
spelling sg-smu-ink.sis_research-10763
2024-12-16T02:44:27Z
2024-12-01T08:00:00Z
text
application/pdf
https://ink.library.smu.edu.sg/sis_research/9763
info:doi/10.48550/ARXIV.2402.11887
https://ink.library.smu.edu.sg/context/sis_research/article/10763/viewcontent/10275_Generative_Semi_supervis__1_.pdf
http://creativecommons.org/licenses/by-nc-nd/4.0/
Research Collection School Of Computing and Information Systems
eng
Institutional Knowledge at Singapore Management University
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Graph anomaly detection
GAD
Generative GAD
Anomaly nodes
Databases and Information Systems
Information Security
description This work considers a practical semi-supervised graph anomaly detection (GAD) scenario in which some of the nodes in a graph are known to be normal, in contrast to the extensively explored unsupervised setting with a fully unlabeled graph. We reveal that access to normal nodes, even a small percentage of them, helps enhance the detection performance of existing unsupervised GAD methods when they are adapted to the semi-supervised setting. However, their utilization of these normal nodes is limited. In this paper, we propose a novel Generative GAD approach (namely GGAD) for the semi-supervised scenario to better exploit the normal nodes. The key idea is to generate pseudo anomaly nodes, referred to as outlier nodes, to provide effective negative node samples for training a discriminative one-class classifier. The main challenge here lies in the lack of ground-truth information about real anomaly nodes. To address this challenge, GGAD is designed to leverage two important priors about the anomaly nodes – asymmetric local affinity and egocentric closeness – to generate reliable outlier nodes that resemble anomaly nodes in both graph structure and feature representations. Comprehensive experiments on six real-world GAD datasets are performed to establish a benchmark for semi-supervised GAD and show that GGAD substantially outperforms state-of-the-art unsupervised and semi-supervised GAD methods with varying numbers of training normal nodes.
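The key idea in the description (train a classifier on known normal nodes plus generated outlier nodes that serve as negative samples) can be illustrated with a toy sketch. This is not the GGAD implementation and does not model the asymmetric local affinity or egocentric closeness priors. The ring graph, the generator that shifts each neighbourhood mean by a random offset, the distance feature, and every function name below are assumptions made purely for illustration.

```python
import math
import random


def make_toy_graph(n, rng):
    """Ring graph of n normal nodes with 2-D features drawn near the origin."""
    feats = [(rng.uniform(-0.5, 0.5), rng.uniform(-0.5, 0.5)) for _ in range(n)]
    nbrs = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
    return feats, nbrs


def generate_outliers(feats, nbrs, rng):
    """Hypothetical generator (NOT GGAD's): shift each neighbourhood mean 2-4 units away."""
    outliers = []
    for i in range(len(feats)):
        mx = sum(feats[j][0] for j in nbrs[i]) / len(nbrs[i])
        my = sum(feats[j][1] for j in nbrs[i]) / len(nbrs[i])
        radius = rng.uniform(2.0, 4.0)
        angle = rng.uniform(0.0, 2.0 * math.pi)
        outliers.append((mx + radius * math.cos(angle), my + radius * math.sin(angle)))
    return outliers


def squashed_distance(x, center):
    """Squared distance to the centre of the normal nodes, mapped into [0, 1)."""
    d2 = (x[0] - center[0]) ** 2 + (x[1] - center[1]) ** 2
    return d2 / (1.0 + d2)


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def train_classifier(normal, outliers, epochs=500, lr=1.0):
    """Logistic regression on the squashed distance: label 0 = normal, 1 = generated outlier."""
    center = (sum(x for x, _ in normal) / len(normal),
              sum(y for _, y in normal) / len(normal))
    data = [(squashed_distance(p, center), 0.0) for p in normal]
    data += [(squashed_distance(p, center), 1.0) for p in outliers]
    w = b = 0.0
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for z, label in data:
            err = sigmoid(w * z + b) - label  # gradient of the log loss w.r.t. the logit
            grad_w += err * z
            grad_b += err
        w -= lr * grad_w / len(data)
        b -= lr * grad_b / len(data)
    return center, w, b


def anomaly_score(x, model):
    """Higher score = more similar to the generated outliers than to the normal nodes."""
    center, w, b = model
    return sigmoid(w * squashed_distance(x, center) + b)


rng = random.Random(0)
features, neighbours = make_toy_graph(40, rng)
pseudo_anomalies = generate_outliers(features, neighbours, rng)
model = train_classifier(features, pseudo_anomalies)
print(anomaly_score((0.0, 0.0), model), anomaly_score((3.0, 3.0), model))
```

In this sketch no real anomaly labels are used at any point: the only negatives the classifier sees are the generated ones, so the quality of the detector depends entirely on how well the generator imitates real anomalies, which is the gap the two priors named in the description are meant to address.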
format text
author QIAO, Hezhe
WEN, Qingsong
LI, Xiaoli
LIM, Ee-peng
PANG, Guansong
title Generative semi-supervised graph anomaly detection
publisher Institutional Knowledge at Singapore Management University
publishDate 2024
url https://ink.library.smu.edu.sg/sis_research/9763
https://ink.library.smu.edu.sg/context/sis_research/article/10763/viewcontent/10275_Generative_Semi_supervis__1_.pdf