Top-k Dominating Queries on Incomplete Data

The top-k dominating (TKD) query returns the k objects that dominate the maximum number of objects in a given dataset. It combines the advantages of skyline and top-k queries, and plays an important role in many decision support applications. Incomplete data exists in a wide spectrum of real dataset...

Full description

Saved in:
Bibliographic Details
Main Authors: MIAO, Xiaoye, GAO, Yunjun, ZHENG, Baihua, CHEN, Gang, CUI, Huiyong
Format: text
Language:English
Published: Institutional Knowledge at Singapore Management University 2016
Subjects:
Online Access:https://ink.library.smu.edu.sg/sis_research/2894
https://ink.library.smu.edu.sg/context/sis_research/article/3894/viewcontent/ZhengBH_2016_TopkDominatingQueriesIncompleteData.pdf
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Singapore Management University
Language: English
Description
Summary:The top-k dominating (TKD) query returns the k objects that dominate the maximum number of objects in a given dataset. It combines the advantages of skyline and top-k queries, and plays an important role in many decision support applications. Incomplete data exists in a wide spectrum of real datasets, due to device failure, privacy preservation, data loss, and so on. In this paper, for the first time, we carry out a systematic study of TKD queries on incomplete data, which involves the data having some missing dimensional value(s). We formalize this problem, and propose a suite of efficient algorithms for answering TKD queries over incomplete data. Our methods employ some novel techniques, such as upper bound score pruning, bitmap pruning, and partial score pruning, to boost query efficiency. Extensive experimental evaluation using both real and synthetic datasets demonstrates the effectiveness of our developed pruning heuristics and the performance of our presented algorithms.