Searching correlated objects in a long sequence

A sequence, which appears widely in applications such as event logs and text documents, is an ordered list of objects. Exploring correlated objects in a sequence can reveal useful relationships among the objects, e.g., event causality in event logs and word phrases in documents. In this paper, we intr...

Bibliographic Details
Main Authors: LEE, Ken C. K., LEE, Wang-chien, Peuquet, Donna, ZHENG, Baihua
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2008
Subjects: Software Engineering
Online Access: https://ink.library.smu.edu.sg/sis_research/378
https://ink.library.smu.edu.sg/context/sis_research/article/1377/viewcontent/Searching_Correlated_Objects_in_a_Long_Sequence.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-1377
record_format dspace
spelling sg-smu-ink.sis_research-1377
2018-12-03T06:22:57Z
Searching correlated objects in a long sequence
LEE, Ken C. K.
LEE, Wang-chien
Peuquet, Donna
ZHENG, Baihua
2008-07-01T07:00:00Z
text
application/pdf
https://ink.library.smu.edu.sg/sis_research/378
info:doi/10.1007/978-3-540-69497-7_28
https://ink.library.smu.edu.sg/context/sis_research/article/1377/viewcontent/Searching_Correlated_Objects_in_a_Long_Sequence.pdf
http://creativecommons.org/licenses/by-nc-nd/4.0/
Research Collection School Of Computing and Information Systems
eng
Institutional Knowledge at Singapore Management University
Software Engineering
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Software Engineering
description A sequence, which appears widely in applications such as event logs and text documents, is an ordered list of objects. Exploring correlated objects in a sequence can reveal useful relationships among the objects, e.g., event causality in event logs and word phrases in documents. In this paper, we introduce the correlation query, which finds pairs of correlated objects that often appear close to each other in a given sequence. A correlation query is specified by two control parameters: the distance bound, which sets how close the objects must be, and the correlation threshold, which sets the minimum correlation strength of result pairs. Instead of processing the query by scanning the sequence multiple times, an approach called the Multi-Scan Algorithm (MSA), we propose the One-Scan Algorithm (OSA) and the Index-Based Algorithm (IBA). OSA accesses the queried sequence once, while IBA takes the correlation threshold into account during execution and effectively eliminates unneeded candidates from detailed examination. An extensive set of experiments is conducted to evaluate all these algorithms. Among them, IBA is the most efficient, significantly outperforming the others.
format text
author LEE, Ken C. K.
LEE, Wang-chien
Peuquet, Donna
ZHENG, Baihua
title Searching correlated objects in a long sequence
publisher Institutional Knowledge at Singapore Management University
publishDate 2008
url https://ink.library.smu.edu.sg/sis_research/378
https://ink.library.smu.edu.sg/context/sis_research/article/1377/viewcontent/Searching_Correlated_Objects_in_a_Long_Sequence.pdf
_version_ 1770570401521336320
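
The correlation query described in the abstract takes a sequence, a distance bound, and a correlation threshold, and returns the object pairs that pass the threshold. The record does not define how correlation strength is measured, so the following is only a naive illustrative sketch under an assumed measure: the fraction of all occurrences of the two objects that have the other object within the distance bound on either side. It is not the paper's MSA, OSA, or IBA; the function name `correlation_query` and its parameters are hypothetical, and objects are assumed to be mutually comparable (e.g. strings).

```python
from collections import Counter, defaultdict


def correlation_query(sequence, distance_bound, threshold):
    """Return {(a, b): strength} for pairs whose strength >= threshold.

    ASSUMPTION (not from the paper): strength(a, b) is the share of all
    occurrences of a and b that have the other object within
    `distance_bound` positions, before or after.
    """
    freq = Counter(sequence)   # occurrences of each object
    near = defaultdict(int)    # occurrences that have the partner nearby
    n = len(sequence)

    for i, obj in enumerate(sequence):
        lo = max(0, i - distance_bound)
        hi = min(n, i + distance_bound + 1)
        # Distinct other objects within the distance bound of position i.
        neighbours = set(sequence[lo:hi]) - {obj}
        for other in neighbours:
            # Store each unordered pair under one canonical key.
            pair = (obj, other) if obj <= other else (other, obj)
            near[pair] += 1

    results = {}
    for (a, b), count in near.items():
        strength = count / (freq[a] + freq[b])
        if strength >= threshold:
            results[(a, b)] = strength
    return results


if __name__ == "__main__":
    events = ["a", "b", "c", "a", "b", "d"]
    print(correlation_query(events, distance_bound=1, threshold=0.9))
```

This sketch re-examines a window of `2 * distance_bound + 1` positions around every object, so it applies the threshold only after all pairs have been counted. The abstract attributes early elimination of unneeded candidates to IBA, which this sketch does not attempt.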