An Incremental Threshold Method for Continuous Text Search Queries

A text filtering system monitors a stream of incoming documents, to identify those that match the interest profiles of its users. The user interests are registered at a server as continuous text search queries. The server constantly maintains for each query a ranked result list, comprising the recen...

Bibliographic Details
Main Authors: MOURATIDIS, Kyriakos, PANG, Hwee Hwa
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2009
Subjects:
Online Access: https://ink.library.smu.edu.sg/sis_research/455
https://ink.library.smu.edu.sg/context/sis_research/article/1454/viewcontent/ICDE09_ConTextQueries.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-1454
record_format dspace
spelling sg-smu-ink.sis_research-1454
2016-05-03T02:49:58Z
An Incremental Threshold Method for Continuous Text Search Queries
MOURATIDIS, Kyriakos
PANG, Hwee Hwa
2009-04-01T07:00:00Z
text
application/pdf
https://ink.library.smu.edu.sg/sis_research/455
info:doi/10.1109/ICDE.2009.197
https://ink.library.smu.edu.sg/context/sis_research/article/1454/viewcontent/ICDE09_ConTextQueries.pdf
http://creativecommons.org/licenses/by-nc-nd/4.0/
Research Collection School Of Computing and Information Systems
eng
Institutional Knowledge at Singapore Management University
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Arrival rates
E-mail monitoring
Inverted files
Monitoring applications
Order of magnitude
Sliding Window
Structure-based
Text filtering
Text query
Text search
Threshold methods
User interests
User query
Databases and Information Systems
Numerical Analysis and Scientific Computing
description A text filtering system monitors a stream of incoming documents, to identify those that match the interest profiles of its users. The user interests are registered at a server as continuous text search queries. The server constantly maintains for each query a ranked result list, comprising the recent documents (drawn from a sliding window) with the highest similarity to the query. Such a system underlies many text monitoring applications that need to cope with heavy document traffic, such as news and email monitoring. In this paper, we propose the first solution for processing continuous text queries efficiently. Our objective is to support a large number of user queries while sustaining high document arrival rates. Our solution indexes the streamed documents with a structure based on the principles of the inverted file, and processes document arrival and expiration events with an incremental threshold-based method. Using a stream of real documents, we experimentally verify the efficiency of our approach, which is at least an order of magnitude faster than a competitor constructed from existing techniques.
format text
author MOURATIDIS, Kyriakos
PANG, Hwee Hwa
title An Incremental Threshold Method for Continuous Text Search Queries
publisher Institutional Knowledge at Singapore Management University
publishDate 2009
url https://ink.library.smu.edu.sg/sis_research/455
https://ink.library.smu.edu.sg/context/sis_research/article/1454/viewcontent/ICDE09_ConTextQueries.pdf
_version_ 1770570431422529536