Top-K Aggregation Queries over Large Networks

Searching and mining large graphs today is critical to a variety of application domains, ranging from personalized recommendation in social networks, to searches for functional associations in biological pathways. In these domains, there is a need to perform aggregation operations on large-scale net...

Full description

Saved in:

Bibliographic Details
Main Authors:	Yan, Xifeng, He, Bin, ZHU, Feida, Han, Jiawei
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2010
Subjects:	Databases and Information Systems Numerical Analysis and Scientific Computing
Online Access:	https://ink.library.smu.edu.sg/sis_research/508 https://ink.library.smu.edu.sg/context/sis_research/article/1507/viewcontent/ZhuFDicde10_aggregation.pdf
Tags:	Add Tag No Tags, Be the first to tag this record!
Institution:	Singapore Management University
Language:	English

id	sg-smu-ink.sis_research-1507
record_format	dspace
spelling	sg-smu-ink.sis_research-15072018-12-07T01:12:30Z Top-K Aggregation Queries over Large Networks Yan, Xifeng He, Bin ZHU, Feida Han, Jiawei Searching and mining large graphs today is critical to a variety of application domains, ranging from personalized recommendation in social networks, to searches for functional associations in biological pathways. In these domains, there is a need to perform aggregation operations on large-scale networks. Unfortunately the existing implementation of aggregation operations on relational databases does not guarantee superior performance in network space, especially when it involves edge traversals and joins of gigantic tables. In this paper, we investigate the neighborhood aggregation queries: Find nodes that have top-k highest aggregate values over their h-hop neighbors. While these basic queries are common in a wide range of search and recommendation tasks, surprisingly they have not been studied systematically. We developed a Local Neighborhood Aggregation framework, called LONA, to answer them efficiently. LONA exploits two properties unique in network space: First, the aggregate value for the neighboring nodes should be similar in most cases; Second, given the distribution of attribute values, it is possible to estimate the upper-bound value of aggregates. These two properties inspire the development of novel pruning techniques, forward pruning using differential index and backward pruning using partial distribution. Empirical results show that LONA could outperform the baseline algorithm up to 10 times in real-life large networks. 2010-03-01T08:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/508 info:doi/10.1109/ICDE.2010.5447863 https://ink.library.smu.edu.sg/context/sis_research/article/1507/viewcontent/ZhuFDicde10_aggregation.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Databases and Information Systems Numerical Analysis and Scientific Computing
institution	Singapore Management University
building	SMU Libraries
continent	Asia
country	Singapore Singapore
content_provider	SMU Libraries
collection	InK@SMU
language	English
topic	Databases and Information Systems Numerical Analysis and Scientific Computing
spellingShingle	Databases and Information Systems Numerical Analysis and Scientific Computing Yan, Xifeng He, Bin ZHU, Feida Han, Jiawei Top-K Aggregation Queries over Large Networks
description	Searching and mining large graphs today is critical to a variety of application domains, ranging from personalized recommendation in social networks, to searches for functional associations in biological pathways. In these domains, there is a need to perform aggregation operations on large-scale networks. Unfortunately the existing implementation of aggregation operations on relational databases does not guarantee superior performance in network space, especially when it involves edge traversals and joins of gigantic tables. In this paper, we investigate the neighborhood aggregation queries: Find nodes that have top-k highest aggregate values over their h-hop neighbors. While these basic queries are common in a wide range of search and recommendation tasks, surprisingly they have not been studied systematically. We developed a Local Neighborhood Aggregation framework, called LONA, to answer them efficiently. LONA exploits two properties unique in network space: First, the aggregate value for the neighboring nodes should be similar in most cases; Second, given the distribution of attribute values, it is possible to estimate the upper-bound value of aggregates. These two properties inspire the development of novel pruning techniques, forward pruning using differential index and backward pruning using partial distribution. Empirical results show that LONA could outperform the baseline algorithm up to 10 times in real-life large networks.
format	text
author	Yan, Xifeng He, Bin ZHU, Feida Han, Jiawei
author_facet	Yan, Xifeng He, Bin ZHU, Feida Han, Jiawei
author_sort	Yan, Xifeng
title	Top-K Aggregation Queries over Large Networks
title_short	Top-K Aggregation Queries over Large Networks
title_full	Top-K Aggregation Queries over Large Networks
title_fullStr	Top-K Aggregation Queries over Large Networks
title_full_unstemmed	Top-K Aggregation Queries over Large Networks
title_sort	top-k aggregation queries over large networks
publisher	Institutional Knowledge at Singapore Management University
publishDate	2010
url	https://ink.library.smu.edu.sg/sis_research/508 https://ink.library.smu.edu.sg/context/sis_research/article/1507/viewcontent/ZhuFDicde10_aggregation.pdf
_version_	1770570445285752832

Top-K Aggregation Queries over Large Networks

Similar Items