Cohesive subgraphs discovery in networks

With the proliferation of social network services (e.g., Facebook, Twitter, and Instagram), identifying cohesive subgraphs in networks is a key and fundamental problem to understand users' activity. Recently this problem has received extensive research attention. Identifying cohesive subgraph s...

Full description

Saved in:
Bibliographic Details
Main Author: Kim, Junghoon
Other Authors: Gao Cong
Format: Thesis-Doctor of Philosophy
Language:English
Published: Nanyang Technological University 2022
Subjects:
Online Access:https://hdl.handle.net/10356/156363
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Nanyang Technological University
Language: English
id sg-ntu-dr.10356-156363
record_format dspace
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Engineering::Computer science and engineering::Information systems::Database management
spellingShingle Engineering::Computer science and engineering::Information systems::Database management
Kim, Junghoon
Cohesive subgraphs discovery in networks
description With the proliferation of social network services (e.g., Facebook, Twitter, and Instagram), identifying cohesive subgraphs in networks is a key and fundamental problem to understand users' activity. Recently this problem has received extensive research attention. Identifying cohesive subgraph structures in networks has many applications such as recommendation, detecting fraudsters, event organization, and graph analysis. In this thesis, we study three important problems of finding cohesive subgraphs. Our studies include finding a query-based cohesive subgraph search in social networks and location-based social networks, and detecting all cohesive subgraphs in attributed bipartite networks. In our study of Density Modularity based Community Search (DMCS), we propose the modularity-based community search problem in networks. Modularity is a widely used measure of the structure of networks which checks the strength of partition of a network into communities. Most of the existing community search models only focus on the internal cohesiveness of a community. However, a high-quality community often has high modularity, which implies dense connections inside communities and sparse connections to the nodes outside the community. Hence, we conduct a pioneer study on searching for a community with high modularity. We point out that while modularity has been used in community detection (unrelated to query nodes), its application in community search (related to query nodes) brings in different challenges named free-rider effect and resolution limit problems. Both problems indicate that intrinsically optimizing the modularity can fail to find a small-sized community as a result for the community search problem. We mitigate these problems by designing a new graph modularity function named Density Modularity. To the best of our knowledge, this is the first work on the community search problem by using the graph modularity. The community search based on the density modularity, termed as DMCS, is to find a community in a social network that contains all the query nodes and has high density modularity. We prove that the DMCS problem is NP-hard, and thus there is no exact algorithm that is scalable to large graphs. To efficiently address DMCS, we present new algorithms that run in log-linear time to the graph size. We conduct extensive experimental studies in real-world and synthetic networks, which offer insights into the efficiency and effectiveness of our approximate algorithms. In our study of GeoSocial Community Search(GCS) problem, we propose the community search problem in location-based social networks. Specifically, we aim at finding a social community and a location cluster that are densely connected in a location-based social network. GCS can be useful for finding marketing targets and user/location recommendations. To the best of our knowledge, this is the first work to find a social community and a location cluster that are densely connected from location-based social networks. We prove that the GCS problem is NP-hard and is not in APX, unless P = NP. To solve GCS problem, we propose three effective and efficient algorithms: Core-based Basic Algorithm, Top-down Greedy Removing Algorithm, and Bottom-up Greedy Expansion Algorithms. Finally, we report extensive experimental studies that offer insights into the efficiency and effectiveness of the proposed solutions. In our study of Attributed Bipartite Co-clustering (ABC), we propose the co-clustering problem in attributed bipartite networks. Co-clustering in bipartite networks is a fundamental and important problem. Specifically, we unify two main concepts: (i) bipartite modularity optimization, and (ii) attribute cohesiveness. To the best of our knowledge, this is the first work to find a set of co-clusters while considering the attribute cohesiveness. We prove that ABC problem is NP-hard and propose three effective and efficient algorithms: (1) Top-Down Algorithm; (2) Bottom-Up Algorithm; and (3) Group-based Matching Algorithm. Finally, extensive experimental results on real-world attributed bipartite networks demonstrate the efficiency and effectiveness of our algorithms.
author2 Gao Cong
author_facet Gao Cong
Kim, Junghoon
format Thesis-Doctor of Philosophy
author Kim, Junghoon
author_sort Kim, Junghoon
title Cohesive subgraphs discovery in networks
title_short Cohesive subgraphs discovery in networks
title_full Cohesive subgraphs discovery in networks
title_fullStr Cohesive subgraphs discovery in networks
title_full_unstemmed Cohesive subgraphs discovery in networks
title_sort cohesive subgraphs discovery in networks
publisher Nanyang Technological University
publishDate 2022
url https://hdl.handle.net/10356/156363
_version_ 1734310192285220864
spelling sg-ntu-dr.10356-1563632022-05-04T10:23:15Z Cohesive subgraphs discovery in networks Kim, Junghoon Gao Cong School of Computer Science and Engineering gaocong@ntu.edu.sg Engineering::Computer science and engineering::Information systems::Database management With the proliferation of social network services (e.g., Facebook, Twitter, and Instagram), identifying cohesive subgraphs in networks is a key and fundamental problem to understand users' activity. Recently this problem has received extensive research attention. Identifying cohesive subgraph structures in networks has many applications such as recommendation, detecting fraudsters, event organization, and graph analysis. In this thesis, we study three important problems of finding cohesive subgraphs. Our studies include finding a query-based cohesive subgraph search in social networks and location-based social networks, and detecting all cohesive subgraphs in attributed bipartite networks. In our study of Density Modularity based Community Search (DMCS), we propose the modularity-based community search problem in networks. Modularity is a widely used measure of the structure of networks which checks the strength of partition of a network into communities. Most of the existing community search models only focus on the internal cohesiveness of a community. However, a high-quality community often has high modularity, which implies dense connections inside communities and sparse connections to the nodes outside the community. Hence, we conduct a pioneer study on searching for a community with high modularity. We point out that while modularity has been used in community detection (unrelated to query nodes), its application in community search (related to query nodes) brings in different challenges named free-rider effect and resolution limit problems. Both problems indicate that intrinsically optimizing the modularity can fail to find a small-sized community as a result for the community search problem. We mitigate these problems by designing a new graph modularity function named Density Modularity. To the best of our knowledge, this is the first work on the community search problem by using the graph modularity. The community search based on the density modularity, termed as DMCS, is to find a community in a social network that contains all the query nodes and has high density modularity. We prove that the DMCS problem is NP-hard, and thus there is no exact algorithm that is scalable to large graphs. To efficiently address DMCS, we present new algorithms that run in log-linear time to the graph size. We conduct extensive experimental studies in real-world and synthetic networks, which offer insights into the efficiency and effectiveness of our approximate algorithms. In our study of GeoSocial Community Search(GCS) problem, we propose the community search problem in location-based social networks. Specifically, we aim at finding a social community and a location cluster that are densely connected in a location-based social network. GCS can be useful for finding marketing targets and user/location recommendations. To the best of our knowledge, this is the first work to find a social community and a location cluster that are densely connected from location-based social networks. We prove that the GCS problem is NP-hard and is not in APX, unless P = NP. To solve GCS problem, we propose three effective and efficient algorithms: Core-based Basic Algorithm, Top-down Greedy Removing Algorithm, and Bottom-up Greedy Expansion Algorithms. Finally, we report extensive experimental studies that offer insights into the efficiency and effectiveness of the proposed solutions. In our study of Attributed Bipartite Co-clustering (ABC), we propose the co-clustering problem in attributed bipartite networks. Co-clustering in bipartite networks is a fundamental and important problem. Specifically, we unify two main concepts: (i) bipartite modularity optimization, and (ii) attribute cohesiveness. To the best of our knowledge, this is the first work to find a set of co-clusters while considering the attribute cohesiveness. We prove that ABC problem is NP-hard and propose three effective and efficient algorithms: (1) Top-Down Algorithm; (2) Bottom-Up Algorithm; and (3) Group-based Matching Algorithm. Finally, extensive experimental results on real-world attributed bipartite networks demonstrate the efficiency and effectiveness of our algorithms. Doctor of Philosophy 2022-04-13T03:08:26Z 2022-04-13T03:08:26Z 2022 Thesis-Doctor of Philosophy Kim, J. (2022). Cohesive subgraphs discovery in networks. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/156363 https://hdl.handle.net/10356/156363 10.32657/10356/156363 en This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). application/pdf Nanyang Technological University