Global Immutable Region Computation

A top-k query shortlists the k records in a dataset that best match the user's preferences. To indicate her preferences, the user typically determines a numeric weight for each data dimension (i.e., attribute). We refer to these weights collectively as the query vector. Based on this vector, ea...

Full description

Saved in:
Bibliographic Details
Main Authors: ZHANG, Jilian, MOURATIDIS, Kyriakos, PANG, Hwee Hwa
Format: text
Language:English
Published: Institutional Knowledge at Singapore Management University 2014
Subjects:
Online Access:https://ink.library.smu.edu.sg/sis_research/2223
https://ink.library.smu.edu.sg/context/sis_research/article/3223/viewcontent/SIGMOD14_GIR.pdf
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Singapore Management University
Language: English
Description
Summary:A top-k query shortlists the k records in a dataset that best match the user's preferences. To indicate her preferences, the user typically determines a numeric weight for each data dimension (i.e., attribute). We refer to these weights collectively as the query vector. Based on this vector, each data record is implicitly mapped to a score value (via a weighted sum function). The records with the k largest scores are reported as the result. In this paper we propose an auxiliary feature to standard top-k query processing. Specifically, we compute the maximal locus within which the query vector incurs no change in the current top-k result. In other words, we compute all possible query weight settings that produce exactly the same top-k result as the user's original query. We call this locus the global immutable region (GIR). The GIR can be used as a guide to query vector readjustments, as a sensitivity measure for the top-k result, as well as to enable effective result caching. We develop efficient algorithms for GIR computation, and verify their robustness using a variety of real and synthetic datasets.