Going beyond accuracy: Estimating homophily in social networks using predictions

In online social networks, it is common to use predictions of node categories to estimate measures of homophily and other relational properties. However, online social network data often lacks basic demographic information about the nodes. Researchers must rely on predicted node attributes to estima...

Bibliographic Details
Main Authors:	BERRY, George; SIRIANNI, Antonio; WEBER, Ingmar; AN, Jisun; MACY, Michael
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2020
Subjects:	Artificial Intelligence and Robotics; Theory and Algorithms
Online Access:	https://ink.library.smu.edu.sg/sis_research/6045 https://ink.library.smu.edu.sg/cgi/viewcontent.cgi?article=7048&context=sis_research
Institution:	Singapore Management University

id	sg-smu-ink.sis_research-7048
record_format	dspace
spelling	sg-smu-ink.sis_research-7048; 2021-07-16T01:15:21Z; 2020-01-01T08:00:00Z; text; application/pdf; http://creativecommons.org/licenses/by/4.0/; Research Collection School Of Computing and Information Systems; eng
institution	Singapore Management University
building	SMU Libraries
continent	Asia
country	Singapore
content_provider	SMU Libraries
collection	InK@SMU
language	English
topic	Artificial Intelligence and Robotics; Theory and Algorithms
description	In online social networks, it is common to use predictions of node categories to estimate measures of homophily and other relational properties. However, online social network data often lacks basic demographic information about the nodes. Researchers must rely on predicted node attributes to estimate measures of homophily, but little is known about the validity of these measures. We show that estimating homophily in a network can be viewed as a dyadic prediction problem, and that homophily estimates are unbiased when dyad-level residuals sum to zero in the network. Node-level prediction models, such as the use of names to classify ethnicity or gender, do not generally have this property and can introduce large biases into homophily estimates. Bias occurs due to error autocorrelation along dyads. Importantly, node-level classification performance is not a reliable indicator of estimation accuracy for homophily. We compare estimation strategies that make predictions at the node and dyad levels, evaluating performance in different settings. We propose a novel “ego-alter” modeling approach that outperforms standard node and dyad classification strategies. While this paper focuses on homophily, results generalize to other relational measures which aggregate predictions along the dyads in a network. We conclude with suggestions for research designs to study homophily in online networks. Code for this paper is available at https://github.com/georgeberry/autocorr.
format	text
author	BERRY, George; SIRIANNI, Antonio; WEBER, Ingmar; AN, Jisun; MACY, Michael
author_sort	BERRY, George
title	Going beyond accuracy: Estimating homophily in social networks using predictions
title_sort	going beyond accuracy: estimating homophily in social networks using predictions
publisher	Institutional Knowledge at Singapore Management University
publishDate	2020
url	https://ink.library.smu.edu.sg/sis_research/6045 https://ink.library.smu.edu.sg/cgi/viewcontent.cgi?article=7048&context=sis_research
_version_	1712305318935920640

Similar Items