Continuous Monitoring of Top-K Queries over Sliding Windows

Given a dataset P and a preference function f, a top-k query retrieves the k tuples in P with the highest scores according to f. Even though the problem is well-studied in conventional databases, the existing methods are inapplicable to highly dynamic environments involving numerous long-running que...

Full description

Saved in:
Bibliographic Details
Main Authors: MOURATIDIS, Kyriakos, BAKIRAS, Spiridon, PAPADIAS, Dimitris
Format: text
Language:English
Published: Institutional Knowledge at Singapore Management University 2007
Subjects:
Online Access:https://ink.library.smu.edu.sg/sis_research/547
https://ink.library.smu.edu.sg/context/sis_research/article/1546/viewcontent/topk_SIGMOD06.pdf
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Singapore Management University
Language: English
id sg-smu-ink.sis_research-1546
record_format dspace
spelling sg-smu-ink.sis_research-15462016-04-29T08:38:01Z Continuous Monitoring of Top-K Queries over Sliding Windows MOURATIDIS, Kyriakos BAKIRAS, Spiridon PAPADIAS, Dimitris Given a dataset P and a preference function f, a top-k query retrieves the k tuples in P with the highest scores according to f. Even though the problem is well-studied in conventional databases, the existing methods are inapplicable to highly dynamic environments involving numerous long-running queries. This paper studies continuous monitoring of top-k queries over a fixed-size window W of the most recent data. The window size can be expressed either in terms of the number of active tuples or time units. We propose a general methodology for top-k monitoring that restricts processing to the sub-domains of the workspace that influence the result of some query. To cope with high stream rates and provide fast answers in an on-line fashion, the data in W reside in main memory. The valid records are indexed by a grid structure, which also maintains book-keeping information. We present two processing techniques: the first one computes the new answer of a query whenever some of the current top-k points expire; the second one partially pre-computes the future changes in the result, achieving better running time at the expense of slightly higher space requirements. We analyze the performance of both algorithms and evaluate their efficiency through extensive experiments. Finally, we extend the proposed framework to other query types and a different data stream model. 2007-07-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/547 info:doi/10.1145/1142473.1142544 https://ink.library.smu.edu.sg/context/sis_research/article/1546/viewcontent/topk_SIGMOD06.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University dataset continuous monitoring processing techniques algorithm performance Databases and Information Systems Numerical Analysis and Scientific Computing
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic dataset
continuous monitoring
processing techniques
algorithm performance
Databases and Information Systems
Numerical Analysis and Scientific Computing
spellingShingle dataset
continuous monitoring
processing techniques
algorithm performance
Databases and Information Systems
Numerical Analysis and Scientific Computing
MOURATIDIS, Kyriakos
BAKIRAS, Spiridon
PAPADIAS, Dimitris
Continuous Monitoring of Top-K Queries over Sliding Windows
description Given a dataset P and a preference function f, a top-k query retrieves the k tuples in P with the highest scores according to f. Even though the problem is well-studied in conventional databases, the existing methods are inapplicable to highly dynamic environments involving numerous long-running queries. This paper studies continuous monitoring of top-k queries over a fixed-size window W of the most recent data. The window size can be expressed either in terms of the number of active tuples or time units. We propose a general methodology for top-k monitoring that restricts processing to the sub-domains of the workspace that influence the result of some query. To cope with high stream rates and provide fast answers in an on-line fashion, the data in W reside in main memory. The valid records are indexed by a grid structure, which also maintains book-keeping information. We present two processing techniques: the first one computes the new answer of a query whenever some of the current top-k points expire; the second one partially pre-computes the future changes in the result, achieving better running time at the expense of slightly higher space requirements. We analyze the performance of both algorithms and evaluate their efficiency through extensive experiments. Finally, we extend the proposed framework to other query types and a different data stream model.
format text
author MOURATIDIS, Kyriakos
BAKIRAS, Spiridon
PAPADIAS, Dimitris
author_facet MOURATIDIS, Kyriakos
BAKIRAS, Spiridon
PAPADIAS, Dimitris
author_sort MOURATIDIS, Kyriakos
title Continuous Monitoring of Top-K Queries over Sliding Windows
title_short Continuous Monitoring of Top-K Queries over Sliding Windows
title_full Continuous Monitoring of Top-K Queries over Sliding Windows
title_fullStr Continuous Monitoring of Top-K Queries over Sliding Windows
title_full_unstemmed Continuous Monitoring of Top-K Queries over Sliding Windows
title_sort continuous monitoring of top-k queries over sliding windows
publisher Institutional Knowledge at Singapore Management University
publishDate 2007
url https://ink.library.smu.edu.sg/sis_research/547
https://ink.library.smu.edu.sg/context/sis_research/article/1546/viewcontent/topk_SIGMOD06.pdf
_version_ 1770570471260028928