Mining communities in networks: A solution for consistency and its evaluation

Online social networks pose significant challenges to computer scientists, physicists, and sociologists alike, for their massive size, fast evolution, and uncharted potential for social computing. One particular problem that has interested us is community identification. Many algorithms based on var...

Full description

Saved in:
Bibliographic Details
Main Authors: KWAK, Haewoon, CHOI, Yoonchan, EOM, Young-Ho, JEONG, Hawoong, MOON, Sue
Format: text
Language:English
Published: Institutional Knowledge at Singapore Management University 2009
Subjects:
cnm
Online Access:https://ink.library.smu.edu.sg/sis_research/6102
https://ink.library.smu.edu.sg/context/sis_research/article/7105/viewcontent/1644893.1644930.pdf
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Singapore Management University
Language: English
id sg-smu-ink.sis_research-7105
record_format dspace
spelling sg-smu-ink.sis_research-71052021-09-29T12:39:54Z Mining communities in networks: A solution for consistency and its evaluation KWAK, Haewoon CHOI, Yoonchan EOM, Young-Ho JEONG, Hawoong MOON, Sue Online social networks pose significant challenges to computer scientists, physicists, and sociologists alike, for their massive size, fast evolution, and uncharted potential for social computing. One particular problem that has interested us is community identification. Many algorithms based on various metrics have been proposed for communities in networks [18, 24], but a few algorithms scale to very large networks. Three recent community identification algorithms, namely CNM [16], Wakita [59], and Louvain [10], stand out for their scalability to a few millions of nodes. All of them use modularity as the metric of optimization. However, all three algorithms produce inconsistent communities every time the ordering of nodes to the algorithms changes.We propose two quantitative metrics to represent the level of consistency across multiple runs of an algorithm: pairwise membership probability and consistency. Based on these two metrics, we propose a solution that improves the consistency without compromising the modularity. We demonstrate that our solution to use pairwise membership probabilities as link weights generates consistent communities within six or fewer cycles for most networks. However, our iterative, pairwise membership reinforcing approach does not deliver convergence for Flickr, Orkut, and Cyworld networks as well for the rest of the networks. Our approach is empirically driven and is yet to be shown to produce consistent output analytically. We leave further investigation into the topological structure and its impact on the consistency as future work.In order to evaluate the quality of clustering, we have looked at 3 of the 48 communities identified in the AS graph. Surprisingly, all have either hierarchical, geographical, or topological interpretations to their groupings. Our preliminary evaluation of the quality of communities is promising. We plan to conduct more thorough evaluation of the communities and study network structures and their evolutions using our approach. 2009-11-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/6102 info:doi/10.1145/1644893.1644930 https://ink.library.smu.edu.sg/context/sis_research/article/7105/viewcontent/1644893.1644930.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University community modularity cnm wakita louvain social networks consistent community identification as graph Numerical Analysis and Scientific Computing
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic community
modularity
cnm
wakita
louvain
social networks
consistent community identification
as graph
Numerical Analysis and Scientific Computing
spellingShingle community
modularity
cnm
wakita
louvain
social networks
consistent community identification
as graph
Numerical Analysis and Scientific Computing
KWAK, Haewoon
CHOI, Yoonchan
EOM, Young-Ho
JEONG, Hawoong
MOON, Sue
Mining communities in networks: A solution for consistency and its evaluation
description Online social networks pose significant challenges to computer scientists, physicists, and sociologists alike, for their massive size, fast evolution, and uncharted potential for social computing. One particular problem that has interested us is community identification. Many algorithms based on various metrics have been proposed for communities in networks [18, 24], but a few algorithms scale to very large networks. Three recent community identification algorithms, namely CNM [16], Wakita [59], and Louvain [10], stand out for their scalability to a few millions of nodes. All of them use modularity as the metric of optimization. However, all three algorithms produce inconsistent communities every time the ordering of nodes to the algorithms changes.We propose two quantitative metrics to represent the level of consistency across multiple runs of an algorithm: pairwise membership probability and consistency. Based on these two metrics, we propose a solution that improves the consistency without compromising the modularity. We demonstrate that our solution to use pairwise membership probabilities as link weights generates consistent communities within six or fewer cycles for most networks. However, our iterative, pairwise membership reinforcing approach does not deliver convergence for Flickr, Orkut, and Cyworld networks as well for the rest of the networks. Our approach is empirically driven and is yet to be shown to produce consistent output analytically. We leave further investigation into the topological structure and its impact on the consistency as future work.In order to evaluate the quality of clustering, we have looked at 3 of the 48 communities identified in the AS graph. Surprisingly, all have either hierarchical, geographical, or topological interpretations to their groupings. Our preliminary evaluation of the quality of communities is promising. We plan to conduct more thorough evaluation of the communities and study network structures and their evolutions using our approach.
format text
author KWAK, Haewoon
CHOI, Yoonchan
EOM, Young-Ho
JEONG, Hawoong
MOON, Sue
author_facet KWAK, Haewoon
CHOI, Yoonchan
EOM, Young-Ho
JEONG, Hawoong
MOON, Sue
author_sort KWAK, Haewoon
title Mining communities in networks: A solution for consistency and its evaluation
title_short Mining communities in networks: A solution for consistency and its evaluation
title_full Mining communities in networks: A solution for consistency and its evaluation
title_fullStr Mining communities in networks: A solution for consistency and its evaluation
title_full_unstemmed Mining communities in networks: A solution for consistency and its evaluation
title_sort mining communities in networks: a solution for consistency and its evaluation
publisher Institutional Knowledge at Singapore Management University
publishDate 2009
url https://ink.library.smu.edu.sg/sis_research/6102
https://ink.library.smu.edu.sg/context/sis_research/article/7105/viewcontent/1644893.1644930.pdf
_version_ 1770575822326857728