From node embedding to community embedding on graphs

In the last few years, graphs have become an instinctive representative tool to better study complex structures. For example, in chemistry, it is common to represent and study the interaction between different proteins as a graph, reducing the experimental costs. Similarly, with the consolidation of...

Full description

Saved in:

Bibliographic Details
Main Author:	Cavallari, Sandro
Other Authors:	Erik Cambria
Format:	Theses and Dissertations
Language:	English
Published:	2019
Subjects:	DRNTU::Engineering::Computer science and engineering
Online Access:	https://hdl.handle.net/10356/85349 http://hdl.handle.net/10220/48265
Tags:	Add Tag No Tags, Be the first to tag this record!
Institution:	Nanyang Technological University
Language:	English

id	sg-ntu-dr.10356-85349
record_format	dspace
institution	Nanyang Technological University
building	NTU Library
country	Singapore
collection	DR-NTU
language	English
topic	DRNTU::Engineering::Computer science and engineering
spellingShingle	DRNTU::Engineering::Computer science and engineering Cavallari, Sandro From node embedding to community embedding on graphs
description	In the last few years, graphs have become an instinctive representative tool to better study complex structures. For example, in chemistry, it is common to represent and study the interaction between different proteins as a graph, reducing the experimental costs. Similarly, with the consolidation of Web 2.0 and the rise of Web 3.0, social networks have become one of the most popular sources of information, wherein users not only utilize but also provide content. On such platforms, the users’ information is explicitly provided and most of the information is implicitly contained in the links between different users. In robotics, it is possible to achieve more a “human-like” movement of an artificial skeleton if we represent the interaction of each major joint rather than adopting an independent approach. In addition to its representation capabilities, effectively extracting valuable information from graph data remains an open problem. While, in the last few years, machine learning, particularly deep learning, has achieved outstanding results in many complex applications such as image processing or natural language processing, researchers have not yet found a legitimate technique to leverage the structural information contained in the graph data type. Moreover, it is not yet possible to fully unleash the learning capabilities of deep models. Thus, in this thesis, we will introduce the concept of graph embedding, a robust set of techniques able to automatically extract meaningful information from network structures. First, we will provide an understanding of the problem setting by formally defining each component. Subsequently, we provide a review of all the major studies related to this topic. Then, we study an important yet largely under-explored setting of graph embedding, i.e., embedding communities rather than individual nodes. We identify that community embedding is not only useful for community-level applications such as graph visualization but also provides an exciting opportunity to improve community detection and node classification. To learn such embedding, our insight hinges on a closed loop among community embedding, community detection, and node embedding. On the one hand, node embedding can help improve community detection, which outputs good communities for fitting better community embedding. On the other hand, community embedding can be used to optimize the node embedding by introducing a community-aware high-order proximity. Based on this insights, we propose a novel Community Embedding (ComE) model that jointly solves the three tasks together. In practice, the number of communities can be unknown beforehand; therefore, we further proposed ComE+, a new framework that handles both uncertainty of unknown ground truth community assignments and an unknown number of communities. We further developed an efficient inference algorithm for ComE+ by maintaining the complexity as low as O(\|V\| + \|E\|). We evaluated the proposed algorithms on multiple real-world data sets and showed that they improve graph visualization and outperform the state-of-the-art baselines in various application tasks, e.g., community detection and node classification. Finally, given the proliferation of dynamic interactive structure in many socio-economic problems, we argue that a new set of embedding techniques are required. Recently, spatio-temporal graphs and interaction networks have proven to be useful in abstracting the underling interactive structure that defines the influence between the various aspects of a problem, but most of the previous studies have developed architecture that is able to only preserve manually crafted interrelation. However, we argue that it is necessary to automatically learn from such interaction structures if we want such models to fit in real-world problems. We also require that the discovered latent structure should be easily interpretable to facilitate the understanding of the modern world. Thus, we propose an efficient neural mixture model able to project a dynamic interaction in a fixed size space, preserving the influence that each node undergoes because of its neighbors. The proposed architecture is able to outperform strong baselines in various data sets ranging from the prediction of the household’s energy load to traffic congestion, and many others.
author2	Erik Cambria
author_facet	Erik Cambria Cavallari, Sandro
format	Theses and Dissertations
author	Cavallari, Sandro
author_sort	Cavallari, Sandro
title	From node embedding to community embedding on graphs
title_short	From node embedding to community embedding on graphs
title_full	From node embedding to community embedding on graphs
title_fullStr	From node embedding to community embedding on graphs
title_full_unstemmed	From node embedding to community embedding on graphs
title_sort	from node embedding to community embedding on graphs
publishDate	2019
url	https://hdl.handle.net/10356/85349 http://hdl.handle.net/10220/48265
_version_	1681059257823264768
spelling	sg-ntu-dr.10356-853492020-07-01T05:44:21Z From node embedding to community embedding on graphs Cavallari, Sandro Erik Cambria School of Computer Science and Engineering DRNTU::Engineering::Computer science and engineering In the last few years, graphs have become an instinctive representative tool to better study complex structures. For example, in chemistry, it is common to represent and study the interaction between different proteins as a graph, reducing the experimental costs. Similarly, with the consolidation of Web 2.0 and the rise of Web 3.0, social networks have become one of the most popular sources of information, wherein users not only utilize but also provide content. On such platforms, the users’ information is explicitly provided and most of the information is implicitly contained in the links between different users. In robotics, it is possible to achieve more a “human-like” movement of an artificial skeleton if we represent the interaction of each major joint rather than adopting an independent approach. In addition to its representation capabilities, effectively extracting valuable information from graph data remains an open problem. While, in the last few years, machine learning, particularly deep learning, has achieved outstanding results in many complex applications such as image processing or natural language processing, researchers have not yet found a legitimate technique to leverage the structural information contained in the graph data type. Moreover, it is not yet possible to fully unleash the learning capabilities of deep models. Thus, in this thesis, we will introduce the concept of graph embedding, a robust set of techniques able to automatically extract meaningful information from network structures. First, we will provide an understanding of the problem setting by formally defining each component. Subsequently, we provide a review of all the major studies related to this topic. Then, we study an important yet largely under-explored setting of graph embedding, i.e., embedding communities rather than individual nodes. We identify that community embedding is not only useful for community-level applications such as graph visualization but also provides an exciting opportunity to improve community detection and node classification. To learn such embedding, our insight hinges on a closed loop among community embedding, community detection, and node embedding. On the one hand, node embedding can help improve community detection, which outputs good communities for fitting better community embedding. On the other hand, community embedding can be used to optimize the node embedding by introducing a community-aware high-order proximity. Based on this insights, we propose a novel Community Embedding (ComE) model that jointly solves the three tasks together. In practice, the number of communities can be unknown beforehand; therefore, we further proposed ComE+, a new framework that handles both uncertainty of unknown ground truth community assignments and an unknown number of communities. We further developed an efficient inference algorithm for ComE+ by maintaining the complexity as low as O(\|V\| + \|E\|). We evaluated the proposed algorithms on multiple real-world data sets and showed that they improve graph visualization and outperform the state-of-the-art baselines in various application tasks, e.g., community detection and node classification. Finally, given the proliferation of dynamic interactive structure in many socio-economic problems, we argue that a new set of embedding techniques are required. Recently, spatio-temporal graphs and interaction networks have proven to be useful in abstracting the underling interactive structure that defines the influence between the various aspects of a problem, but most of the previous studies have developed architecture that is able to only preserve manually crafted interrelation. However, we argue that it is necessary to automatically learn from such interaction structures if we want such models to fit in real-world problems. We also require that the discovered latent structure should be easily interpretable to facilitate the understanding of the modern world. Thus, we propose an efficient neural mixture model able to project a dynamic interaction in a fixed size space, preserving the influence that each node undergoes because of its neighbors. The proposed architecture is able to outperform strong baselines in various data sets ranging from the prediction of the household’s energy load to traffic congestion, and many others. Doctor of Philosophy 2019-05-17T07:27:27Z 2019-12-06T16:02:10Z 2019-05-17T07:27:27Z 2019-12-06T16:02:10Z 2019 Thesis Cavallari, S. (2019). From node embedding to community embedding on graphs. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/85349 http://hdl.handle.net/10220/48265 10.32657/10220/48265 en 118 p. application/pdf

From node embedding to community embedding on graphs

Similar Items