Computing Immutable Regions for Subspace Top-k Queries

Given a high-dimensional dataset, a top-k query can be used to shortlist the k tuples that best match the user's preferences. Typically, these preferences regard a subset of the available dimensions (i.e., attributes) whose relative significance is expressed by user-specified weights. Along wit...

Full description

Saved in:

Bibliographic Details
Main Authors:	MOURATIDIS, Kyriakos, PANG, Hwee Hwa
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2013
Subjects:	Computer Sciences
Online Access:	https://ink.library.smu.edu.sg/sis_research_smu/39 https://ink.library.smu.edu.sg/cgi/viewcontent.cgi?article=1038&context=sis_research_smu
Tags:	Add Tag No Tags, Be the first to tag this record!
Institution:	Singapore Management University
Language:	English

id	sg-smu-ink.sis_research_smu-1038
record_format	dspace
spelling	sg-smu-ink.sis_research_smu-10382018-07-09T05:58:58Z Computing Immutable Regions for Subspace Top-k Queries MOURATIDIS, Kyriakos PANG, Hwee Hwa Given a high-dimensional dataset, a top-k query can be used to shortlist the k tuples that best match the user's preferences. Typically, these preferences regard a subset of the available dimensions (i.e., attributes) whose relative significance is expressed by user-specified weights. Along with the query result, we propose to compute for each involved dimension the maximal deviation to the corresponding weight for which the query result remains valid. The derived weight ranges, called immutable regions, are useful for performing sensitivity analysis, for finetuning the query weights, etc. In this paper, we focus on top-k queries with linear preference functions over the queried dimensions. We codify the conditions under which changes in a dimension's weight invalidate the query result, and develop algorithms to compute the immutable regions. In general, this entails the examination of numerous non-result tuples. To reduce processing time, we introduce a pruning technique and a thresholding mechanism that allow the immutable regions to be determined correctly after examining only a small number of non-result tuples. We demonstrate empirically that the two techniques combine well to form a robust and highly resource-efficient algorithm. We verify the generality of our findings using real high-dimensional data from different domains (documents, images, etc) and with different characteristics. 2013-08-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research_smu/39 https://ink.library.smu.edu.sg/cgi/viewcontent.cgi?article=1038&context=sis_research_smu http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Information Systems (SMU Access Only) eng Institutional Knowledge at Singapore Management University Computer Sciences
institution	Singapore Management University
building	SMU Libraries
continent	Asia
country	Singapore Singapore
content_provider	SMU Libraries
collection	InK@SMU
language	English
topic	Computer Sciences
spellingShingle	Computer Sciences MOURATIDIS, Kyriakos PANG, Hwee Hwa Computing Immutable Regions for Subspace Top-k Queries
description	Given a high-dimensional dataset, a top-k query can be used to shortlist the k tuples that best match the user's preferences. Typically, these preferences regard a subset of the available dimensions (i.e., attributes) whose relative significance is expressed by user-specified weights. Along with the query result, we propose to compute for each involved dimension the maximal deviation to the corresponding weight for which the query result remains valid. The derived weight ranges, called immutable regions, are useful for performing sensitivity analysis, for finetuning the query weights, etc. In this paper, we focus on top-k queries with linear preference functions over the queried dimensions. We codify the conditions under which changes in a dimension's weight invalidate the query result, and develop algorithms to compute the immutable regions. In general, this entails the examination of numerous non-result tuples. To reduce processing time, we introduce a pruning technique and a thresholding mechanism that allow the immutable regions to be determined correctly after examining only a small number of non-result tuples. We demonstrate empirically that the two techniques combine well to form a robust and highly resource-efficient algorithm. We verify the generality of our findings using real high-dimensional data from different domains (documents, images, etc) and with different characteristics.
format	text
author	MOURATIDIS, Kyriakos PANG, Hwee Hwa
author_facet	MOURATIDIS, Kyriakos PANG, Hwee Hwa
author_sort	MOURATIDIS, Kyriakos
title	Computing Immutable Regions for Subspace Top-k Queries
title_short	Computing Immutable Regions for Subspace Top-k Queries
title_full	Computing Immutable Regions for Subspace Top-k Queries
title_fullStr	Computing Immutable Regions for Subspace Top-k Queries
title_full_unstemmed	Computing Immutable Regions for Subspace Top-k Queries
title_sort	computing immutable regions for subspace top-k queries
publisher	Institutional Knowledge at Singapore Management University
publishDate	2013
url	https://ink.library.smu.edu.sg/sis_research_smu/39 https://ink.library.smu.edu.sg/cgi/viewcontent.cgi?article=1038&context=sis_research_smu
_version_	1712300666136821760

Computing Immutable Regions for Subspace Top-k Queries

Similar Items