Adversarial learning on heterogeneous information networks

Bibliographic Details
Main Authors: HU, Binbin, FANG, Yuan, SHI, Chuan
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2019
Online Access: https://ink.library.smu.edu.sg/sis_research/4433
https://ink.library.smu.edu.sg/context/sis_research/article/5436/viewcontent/Adversarial_Learning_Heterogeneous_Information_Networks_pv.pdf
Institution: Singapore Management University
Description
Summary: Network embedding, which aims to represent network data in a low-dimensional space, has been commonly adopted for analyzing heterogeneous information networks (HIN). Although existing HIN embedding methods have achieved performance improvement to some extent, they still face a few major weaknesses. Most importantly, they usually adopt negative sampling to randomly select nodes from the network, and they do not learn the underlying distribution for more robust embedding. Inspired by generative adversarial networks (GAN), we develop a novel framework, HeGAN, for HIN embedding, which trains both a discriminator and a generator in a minimax game. Compared to existing HIN embedding methods, our generator learns the node distribution to generate better negative samples. Compared to GANs on homogeneous networks, our discriminator and generator are designed to be relation-aware in order to capture the rich semantics on HINs. Furthermore, towards more effective and efficient sampling, we propose a generalized generator, which samples “latent” nodes directly from a continuous distribution and is not confined to the nodes in the original network, as existing methods are. Finally, we conduct extensive experiments on four real-world datasets. Results show that we consistently and significantly outperform state-of-the-art baselines across all datasets and tasks.
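
The two ideas named in the summary, a relation-aware discriminator and a generator that samples "latent" nodes from a continuous distribution, can be illustrated with a minimal sketch. It is not taken from the record or from the authors' code: the bilinear score sigmoid(e_u^T M_r e_v), the Gaussian centred on e_u^T M_r, and every name below (`discriminator_score`, `generate_latent_node`, `sigma`) are assumptions chosen for illustration, and the sketch shares one set of embeddings between both players and omits training entirely.

```python
import math
import random


def matvec_left(vec, mat):
    """Return the row-vector product vec @ mat as a list."""
    return [sum(vec[i] * mat[i][j] for i in range(len(vec)))
            for j in range(len(mat[0]))]


def discriminator_score(e_u, m_r, e_v):
    """Assumed relation-aware score in (0, 1): sigmoid(e_u^T M_r e_v).

    e_u: embedding of the source node, m_r: matrix for relation r,
    e_v: embedding of a candidate (real or generated) target node.
    """
    projected = matvec_left(e_u, m_r)
    logit = sum(p * x for p, x in zip(projected, e_v))
    return 1.0 / (1.0 + math.exp(-logit))


def generate_latent_node(e_u, m_r, sigma, rng):
    """Assumed generalized generator: draw a node embedding from a Gaussian
    centred on e_u^T M_r, so the sample need not match any existing node."""
    mean = matvec_left(e_u, m_r)
    return [rng.gauss(mu, sigma) for mu in mean]


# Toy usage with random embeddings (hypothetical sizes and values).
rng = random.Random(0)
dim = 4
e_u = [rng.uniform(-1, 1) for _ in range(dim)]
m_r = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]
fake = generate_latent_node(e_u, m_r, sigma=0.1, rng=rng)
score = discriminator_score(e_u, m_r, fake)
print(len(fake), 0.0 < score < 1.0)
```

In a minimax setup along these lines, the discriminator would be trained to push `score` towards 1 for observed (u, r, v) triples and towards 0 for generated ones, while the generator would be trained to raise the score of its samples.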