Graph homophily unsupervised measurements

This study investigates unsupervised measures of graph homophily, a property that indicates the degree to which similar nodes are connected in a network. Traditional studies frequently use labels to quantify homophily; however, in many real-world circumstances these labels may not be accessible....

Bibliographic Details
Main Author: Nguyen, Hoang Minh
Other Authors: Lihui Chen
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2024
Subjects: Computer and Information Science; Graph Homophily
Online Access: https://hdl.handle.net/10356/176882
Institution: Nanyang Technological University
id sg-ntu-dr.10356-176882
record_format dspace
spelling sg-ntu-dr.10356-176882
2024-05-24T15:43:45Z
Graph homophily unsupervised measurements
Nguyen, Hoang Minh
Lihui Chen
School of Electrical and Electronic Engineering
Zheng Yilun
ELHCHEN@ntu.edu.sg, yilun001@e.ntu.edu.sg
Computer and Information Science
Graph Homophily
Bachelor's degree
2024-05-21T07:01:42Z
2024
Final Year Project (FYP)
Nguyen, H. M. (2024). Graph homophily unsupervised measurements. Final Year Project (FYP), Nanyang Technological University, Singapore.
https://hdl.handle.net/10356/176882
en
A3029-231
application/pdf
Nanyang Technological University
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Computer and Information Science
Graph Homophily
description This study investigates unsupervised measures of graph homophily, a property that indicates the degree to which similar nodes are connected in a network. Traditional studies frequently use labels to quantify homophily; however, in many real-world circumstances these labels may not be accessible. We therefore present several unsupervised approaches for measuring homophily that do not require labels. Our proposed methods are: (1) calculating raw feature similarity across all nodes and selecting edges based on a threshold; (2) and (3) calculating the similarity of learned representations across all nodes and selecting edges based on a threshold, where the representations are learned without labels using a Graph Auto-encoder (2) or a Graph Attention Model (3); and (4) applying unsupervised graph clustering and evaluating graph homophily based on the clustering results. To evaluate the effectiveness of these unsupervised measures, we propose two evaluation perspectives: edge homophily and node homophily. For edge homophily, we first choose all edges and then use true labels to determine whether each edge is homophilic or non-homophilic. For node homophily, we first calculate node homophily from the true labels, then the unsupervised node homophily, and lastly the correlation between the two. This comprehensive evaluation reveals the strengths and weaknesses of each unsupervised measurement and offers insights into optimal methods for evaluating graph homophily in an unsupervised setting.
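The record does not include the project's code, so the following is only a minimal sketch of how method (1) and the two evaluation perspectives might look in Python with NumPy. Everything here is an assumption made for illustration: the function names, the use of cosine similarity, the threshold of 0.5, the reading of "selecting edges" as flagging existing edges whose endpoint similarity meets the threshold, and the four-node toy graph.

```python
import numpy as np


def flags_from_similarity(edges, features, threshold=0.5):
    """Flag an edge as homophilic when the cosine similarity of its
    endpoints' raw features reaches the threshold (assumed rule)."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)  # guard against zero rows
    sims = np.sum(unit[edges[:, 0]] * unit[edges[:, 1]], axis=1)
    return sims >= threshold


def flags_from_labels(edges, labels):
    """Ground truth: an edge is homophilic when both endpoints share a label."""
    return labels[edges[:, 0]] == labels[edges[:, 1]]


def node_homophily(edges, flags, num_nodes):
    """Per-node fraction of incident (undirected) edges flagged homophilic."""
    degree = np.zeros(num_nodes)
    homophilic = np.zeros(num_nodes)
    for col in (0, 1):  # count each edge at both of its endpoints
        np.add.at(degree, edges[:, col], 1.0)
        np.add.at(homophilic, edges[:, col], flags.astype(float))
    # Isolated nodes get 0.0 instead of a division by zero.
    return np.divide(homophilic, degree, out=np.zeros(num_nodes), where=degree > 0)


# Hypothetical toy graph: two feature clusters joined by one cross edge.
features = np.array([[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]])
labels = np.array([0, 0, 1, 1])  # used only for evaluation
edges = np.array([[0, 1], [2, 3], [1, 2]])

predicted_flags = flags_from_similarity(edges, features, threshold=0.5)
true_flags = flags_from_labels(edges, labels)

# Edge perspective: homophily ratio without labels, and agreement with labels.
unsupervised_edge_homophily = float(predicted_flags.mean())
edge_agreement = float((predicted_flags == true_flags).mean())

# Node perspective: correlate label-based and unsupervised node homophily.
true_node_h = node_homophily(edges, true_flags, num_nodes=4)
unsup_node_h = node_homophily(edges, predicted_flags, num_nodes=4)
correlation = float(np.corrcoef(true_node_h, unsup_node_h)[0, 1])
```

In this sketch the labels enter only at the evaluation step, matching the abstract's description of an unsupervised measure checked against ground truth. Methods (2) to (4) could reuse the same evaluation functions by passing learned embeddings in place of `features`, or by treating cluster assignments as stand-in labels.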
author2 Lihui Chen
author_facet Lihui Chen
Nguyen, Hoang Minh
format Final Year Project
author Nguyen, Hoang Minh
author_sort Nguyen, Hoang Minh
title Graph homophily unsupervised measurements
publisher Nanyang Technological University
publishDate 2024
url https://hdl.handle.net/10356/176882