Continuous Top-k monitoring on document streams

The efficient processing of document streams plays an important role in many information filtering systems. Emerging applications, such as news update filtering and social network notifications, demand presenting end-users with the content most relevant to their preferences. In this work, user prefe...

Bibliographic Details
Main Authors: U, Leong Hou, ZHANG, Junjie, MOURATIDIS, Kyriakos, LI, Ye
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2017
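Subjects: Top-k query; Continuous query; Document stream; Databases and Information Systems; Numerical Analysis and Scientific Computing
Online Access: https://ink.library.smu.edu.sg/sis_research/3643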
https://ink.library.smu.edu.sg/context/sis_research/article/4645/viewcontent/TKDE17_MRIO__1_.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-4645
record_format dspace
spelling sg-smu-ink.sis_research-4645 2020-01-21T08:49:43Z 2017-05-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/3643 info:doi/10.1109/TKDE.2017.2657622 https://ink.library.smu.edu.sg/context/sis_research/article/4645/viewcontent/TKDE17_MRIO__1_.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Top-k query
Continuous query
Document stream
Databases and Information Systems
Numerical Analysis and Scientific Computing
description The efficient processing of document streams plays an important role in many information filtering systems. Emerging applications, such as news update filtering and social network notifications, demand presenting end-users with the content most relevant to their preferences. In this work, user preferences are indicated by a set of keywords. A central server monitors the document stream and continuously reports to each user the top-k documents that are most relevant to her keywords. Our objective is to support large numbers of users and high stream rates, while refreshing the top-k results almost instantaneously. Our solution abandons the traditional frequency-ordered indexing approach. Instead, it follows an identifier-ordering paradigm that better suits the nature of the problem. When complemented with a novel, locally adaptive technique, our method offers (i) proven optimality w.r.t. the number of considered queries per stream event, and (ii) an order of magnitude shorter response time (i.e., time to refresh the query results) than the current state-of-the-art.
format text
author U, Leong Hou
ZHANG, Junjie
MOURATIDIS, Kyriakos
LI, Ye
title Continuous Top-k monitoring on document streams
publisher Institutional Knowledge at Singapore Management University
publishDate 2017
url https://ink.library.smu.edu.sg/sis_research/3643
https://ink.library.smu.edu.sg/context/sis_research/article/4645/viewcontent/TKDE17_MRIO__1_.pdf
_version_ 1770573369508364288
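
The problem setting described in the abstract (keyword-based user queries, a central server, and a top-k result refreshed on every stream event) can be illustrated with a small sketch. This is a naive baseline written for illustration only, not the authors' identifier-ordered method or its locally adaptive technique. The class name `TopKMonitor`, its methods, and the scoring rule (the summed term frequency of the query keywords in the document) are assumptions introduced here; the record does not specify a relevance function, and document expiration is not modelled.

```python
import heapq
from collections import Counter, defaultdict


class TopKMonitor:
    """Naive continuous top-k monitor (hypothetical baseline, not the paper's method)."""

    def __init__(self, k):
        self.k = k
        self.queries = {}                    # query_id -> set of keywords
        self.by_keyword = defaultdict(set)   # keyword -> ids of queries containing it
        self.results = defaultdict(list)     # query_id -> min-heap of (score, doc_id)

    def register(self, query_id, keywords):
        """Register a user's keyword set as a continuous query."""
        self.queries[query_id] = set(keywords)
        for word in self.queries[query_id]:
            self.by_keyword[word].add(query_id)

    def process(self, doc_id, text):
        """Handle one stream event; return ids of queries whose top-k changed."""
        counts = Counter(text.lower().split())
        # Only queries sharing at least one term with the document are considered.
        candidates = set()
        for word in counts:
            candidates |= self.by_keyword.get(word, set())
        changed = []
        for query_id in candidates:
            # Assumed relevance score: summed term frequency of the query keywords.
            score = sum(counts[w] for w in self.queries[query_id])
            heap = self.results[query_id]
            if len(heap) < self.k:
                heapq.heappush(heap, (score, doc_id))
                changed.append(query_id)
            elif score > heap[0][0]:
                # The new document beats the current k-th best result.
                heapq.heapreplace(heap, (score, doc_id))
                changed.append(query_id)
        return changed

    def top_k(self, query_id):
        """Return the current result as (score, doc_id) pairs, best first."""
        return sorted(self.results[query_id], reverse=True)


if __name__ == "__main__":
    monitor = TopKMonitor(k=2)
    monitor.register("alice", ["stream", "query"])
    monitor.register("bob", ["football"])
    for doc_id, text in enumerate(
        ["stream processing of stream data", "football results", "one query"], start=1
    ):
        print(doc_id, monitor.process(doc_id, text))
    print(monitor.top_k("alice"))
```

Keeping a size-k min-heap per query means each candidate query is refreshed with one comparison against its current k-th best score. The sketch still touches every query that shares a term with the incoming document, which is the per-event cost the abstract says the proposed method reduces.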