Exploring node polysemy for network embedding

Bibliographic Details
Main Author: Lou, Mingqi
Other Authors: Lihui CHEN
Format: Thesis-Master by Coursework
Language: English
Published: Nanyang Technological University 2020
Online Access: https://hdl.handle.net/10356/141009
Institution: Nanyang Technological University
Description
Summary: Many real-world complex systems are represented as network-structured data. Network embedding learns a low-dimensional feature vector for each node (or edge) of a network. Existing network embedding models map node attributes, links, and other information into vectors that represent the nodes in the network. However, real-world entities have many different aspects, owing to their own characteristics and motives. A polysemous embedding method referred to as PolyPTE [8] has recently been proposed to model every aspect of a node, mapping the multiple facets of a node into a vector while maintaining the connection between the node and its facets. In this project, we therefore apply and extend the PolyPTE [8] model to study whether it can handle both multiple facets of a node and different types of links between nodes in heterogeneous networks. Following the definition of PolyPTE, we train on the different edge types with equal probability by drawing samples from the corresponding edge sets in turn. To keep negative sampling [28] correct, each negative sample must be of the same type as its positive sample. We then vary the number of facets, the total embedding dimension, and the sampling rate to compare the model's sensitivity to these hyperparameters and to further test its performance. Finally, we compare the classification results obtained by sampling from the three edge types separately with those obtained by treating all edges as a single type. The experimental results show that the former performs better. In conclusion, this project explores how polysemous embedding processes a heterogeneous information network (HIN) with multiple link types and evaluates the model's effectiveness and hyperparameter sensitivity through empirical studies.
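The sampling scheme described in the summary (visiting the edge sets in turn, with negatives restricted to the same edge type as the positive) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis implementation: the function name `typed_edge_samples`, its parameters, and the edge-type names in the usage example are all hypothetical, and negatives are drawn uniformly here, whereas the record does not say which noise distribution the thesis uses.

```python
import random
from itertools import cycle


def typed_edge_samples(edges_by_type, num_steps, num_negatives=2, seed=0):
    """Yield (edge_type, source, positive_target, negative_targets).

    Hypothetical helper: edge types are visited round-robin so each type
    contributes the same number of training samples, and negatives are
    drawn only from target nodes that occur under the same edge type.
    """
    rng = random.Random(seed)
    # Candidate targets per type: every node that appears as a target
    # in at least one edge of that type.
    targets_by_type = {
        edge_type: sorted({target for _, target in edges})
        for edge_type, edges in edges_by_type.items()
    }
    type_cycle = cycle(sorted(edges_by_type))
    for _ in range(num_steps):
        edge_type = next(type_cycle)
        source, positive = rng.choice(edges_by_type[edge_type])
        # Exclude true neighbours of the source so a negative is never
        # an observed edge of this type.
        linked = {t for s, t in edges_by_type[edge_type] if s == source}
        candidates = [t for t in targets_by_type[edge_type] if t not in linked]
        negatives = (
            [rng.choice(candidates) for _ in range(num_negatives)]
            if candidates
            else []
        )
        yield edge_type, source, positive, negatives


# Usage with made-up edge types (assumed for illustration only).
example_edges = {
    "author-paper": [("a1", "p1"), ("a2", "p2"), ("a1", "p3")],
    "paper-term": [("p1", "t1"), ("p2", "t2"), ("p3", "t3")],
    "paper-venue": [("p1", "v1"), ("p2", "v2"), ("p3", "v1")],
}
for sample in typed_edge_samples(example_edges, num_steps=6):
    print(sample)
```

The baseline the summary compares against, treating all edges as a single type, would correspond to merging the three edge sets into one before sampling, which lets a negative come from a different node type than its positive.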