Estimating homophily in social networks using dyadic predictions

Predictions of node categories are commonly used to estimate homophily and other relational properties in networks. However, little is known about the validity of using predictions for this task. We show that estimating homophily in a network is a problem of predicting categories of dyads (edges) in...



Bibliographic Details
Main Authors:	BERRY, George; SIRIANNI, Antonio; WEBER, Ingmar; AN, Jisun; MACY, Michael
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2021
Subjects:	homophily; networks; machine learning; quantitative methodology; Digital Communications and Networking; OS and Networks
Online Access:	https://ink.library.smu.edu.sg/sis_research/6225 https://ink.library.smu.edu.sg/context/sis_research/article/7228/viewcontent/SocSci_v8_285to307.pdf
Institution:	Singapore Management University

id	sg-smu-ink.sis_research-7228
record_format	dspace
spelling	sg-smu-ink.sis_research-7228 2021-12-23T06:13:16Z 2021-08-01T07:00:00Z text application/pdf info:doi/10.15195/v8.a14 http://creativecommons.org/licenses/by/4.0/ Research Collection School Of Computing and Information Systems eng
institution	Singapore Management University
building	SMU Libraries
continent	Asia
country	Singapore
content_provider	SMU Libraries
collection	InK@SMU
language	English
topic	homophily; networks; machine learning; quantitative methodology; Digital Communications and Networking; OS and Networks
description	Predictions of node categories are commonly used to estimate homophily and other relational properties in networks. However, little is known about the validity of using predictions for this task. We show that estimating homophily in a network is a problem of predicting categories of dyads (edges) in the graph. Homophily estimates are unbiased when predictions of dyad categories are unbiased. Node-level prediction models, such as the use of names to classify ethnicity or gender, do not generally produce unbiased predictions of dyad categories and therefore produce biased homophily estimates. Bias comes from three sources: sampling bias, correlation between model errors and node degree, and correlation between node-level model errors along dyads. We examine three methods for estimating homophily: predicting node categories, predicting dyad categories, and a hybrid “ego–alter” approach. This analysis indicates that only the dyadic prediction approach is unbiased, whereas the node-level approach produces both high bias and high overall error. We find that node-level classification performance is not a reliable indicator of accuracy for homophily. Although this article focuses on a particular version of homophily, results generalize to heterophilous cases and other dyadic measures. We conclude with suggestions for research design. Code for this article is available at https://github.com/georgeberry/autocorr.
format	text
author	BERRY, George; SIRIANNI, Antonio; WEBER, Ingmar; AN, Jisun; MACY, Michael
title	Estimating homophily in social networks using dyadic predictions
publisher	Institutional Knowledge at Singapore Management University
publishDate	2021
url	https://ink.library.smu.edu.sg/sis_research/6225 https://ink.library.smu.edu.sg/context/sis_research/article/7228/viewcontent/SocSci_v8_285to307.pdf
_version_	1770575894856859648

