Unsupervised user identity linkage via factoid embedding

Bibliographic Details
Main Authors: XIE, Wei, MU, Xin, LEE, Roy Ka Wei, ZHU, Feida, LIM, Ee-peng
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2018
Online Access: https://ink.library.smu.edu.sg/sis_research/4258
https://ink.library.smu.edu.sg/context/sis_research/article/5261/viewcontent/24._Dec06_2018___Unsupervised_User_Identity_Linkage_via_Factoid_Embedding__ICDM18_.pdf
Institution: Singapore Management University
Description
Summary: User identity linkage (UIL), the problem of matching user accounts across multiple online social networks (OSNs), is widely studied and important to many real-world applications. Most existing UIL solutions adopt a supervised or semi-supervised approach, which generally suffers from a scarcity of labeled data. In this paper, we propose Factoid Embedding, a novel framework that adopts an unsupervised approach. It is designed to cope with the different profile attributes, content types and network links of different OSNs. The key idea is that each piece of information about a user identity describes the real identity owner, and thus distinguishes the owner from other users. We represent such a piece of information as a factoid and model it as a triplet consisting of a user identity, a predicate, and an object or another user identity. By embedding these factoids, we learn latent representations of the user identities and link two user identities from different OSNs if they are close to each other in the user embedding space. Our Factoid Embedding algorithm is designed so that, as we learn the embedding space, each embedded factoid is “translated” into a motion in the user embedding space that brings similar user identities closer together and pushes different user identities further apart. Extensive experiments are conducted to evaluate Factoid Embedding on two real-world OSN data sets. The experimental results show that Factoid Embedding outperforms the state-of-the-art methods even without training data.
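Illustration: the summary describes factoids as triplets and links identities that lie close together in the embedding space. The following is a minimal sketch of those two ideas only, not the authors' implementation. The account IDs, predicate names, toy embedding vectors, the `Factoid` and `link_identities` names, and the use of cosine similarity as the closeness measure are all assumptions made for illustration; the embeddings are hand-written rather than learned.

```python
import math
from typing import NamedTuple


class Factoid(NamedTuple):
    """One piece of information about a user identity, stored as a triplet."""
    subject: str    # user identity (hypothetical ID format, e.g. "tw:alice01")
    predicate: str  # relation name (hypothetical, e.g. "has_name", "follows")
    obj: str        # an attribute value or another user identity


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def link_identities(emb_a, emb_b):
    """Map each identity in network A to its most similar identity in network B."""
    links = {}
    for user_a, vec_a in emb_a.items():
        links[user_a] = max(emb_b, key=lambda user_b: cosine(vec_a, emb_b[user_b]))
    return links


# Hypothetical factoids: two attribute factoids and one network-link factoid.
factoids = [
    Factoid("tw:alice01", "has_name", "Alice Tan"),
    Factoid("fb:alice.tan", "has_name", "Alice Tan"),
    Factoid("tw:alice01", "follows", "tw:bob_k"),
]

# Toy 2-D user embeddings, made up by hand; a real system would learn these
# from the factoids rather than hard-code them.
emb_twitter = {"tw:alice01": [0.9, 0.1], "tw:bob_k": [0.1, 0.8]}
emb_facebook = {"fb:alice.tan": [0.8, 0.2], "fb:bob.koh": [0.2, 0.9]}

links = link_identities(emb_twitter, emb_facebook)
print(links)
```

With these toy vectors, each Twitter identity is paired with the Facebook identity whose vector points in nearly the same direction. The sketch omits the learning step entirely, which is where the framework described in the summary does its work.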