Mining and Ranking Generators of Sequential Pattern

Sequential pattern mining ¯rst proposed by Agrawal and Srikant has received intensive research due to its wide range applicability in many real-life domains. Various improvements have been proposed which include mining a closed set of sequential patterns. Sequential patterns supported by the same se...

Full description

Saved in:
Bibliographic Details
Main Authors: LO, David, KHOO, Siau-Cheng, LI, Jinyan
Format: text
Language:English
Published: Institutional Knowledge at Singapore Management University 2008
Subjects:
Online Access:https://ink.library.smu.edu.sg/sis_research/1026
http://www.siam.org/proceedings/datamining/2008/dm08_51_Lo.pdf
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Singapore Management University
Language: English
id sg-smu-ink.sis_research-2025
record_format dspace
spelling sg-smu-ink.sis_research-20252011-11-02T09:49:42Z Mining and Ranking Generators of Sequential Pattern LO, David KHOO, Siau-Cheng LI, Jinyan Sequential pattern mining ¯rst proposed by Agrawal and Srikant has received intensive research due to its wide range applicability in many real-life domains. Various improvements have been proposed which include mining a closed set of sequential patterns. Sequential patterns supported by the same sequences in the database can be considered as belonging to an equivalence class. Each equivalence class contains patterns partially-ordered by sub-sequence relationship and having the same support. Within an equivalence class, the set of maximal and minimal patterns are referred to as closed patterns and generators respectively. Generators used together with closed patterns can provide additional information which closed patterns alone are not able to provide. Also, as generators are the minimal members, they are preferable over closed patterns for model selection and classi¯cation based on the Minimum Description Length (MDL) principle. Several algorithms have been proposed for mining closed sequential patterns, but none so far for mining sequential generators. This paper ¯lls this research gap by investigating properties of sequential generators and proposing an algorithm to e±ciently mine sequential generators. The algorithm works on a three-step process of search space compaction, non-generator pruning and a ¯nal ¯ltering step. We also introduce ranking of mined generators and propose mining of a unique generator per equivalence class. Performance study has been conducted on various synthetic and real benchmark datasets. They show that mining generators can be as fast as mining closed patterns even at low support thresholds. 2008-04-01T07:00:00Z text https://ink.library.smu.edu.sg/sis_research/1026 http://www.siam.org/proceedings/datamining/2008/dm08_51_Lo.pdf Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Software Engineering
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Software Engineering
spellingShingle Software Engineering
LO, David
KHOO, Siau-Cheng
LI, Jinyan
Mining and Ranking Generators of Sequential Pattern
description Sequential pattern mining ¯rst proposed by Agrawal and Srikant has received intensive research due to its wide range applicability in many real-life domains. Various improvements have been proposed which include mining a closed set of sequential patterns. Sequential patterns supported by the same sequences in the database can be considered as belonging to an equivalence class. Each equivalence class contains patterns partially-ordered by sub-sequence relationship and having the same support. Within an equivalence class, the set of maximal and minimal patterns are referred to as closed patterns and generators respectively. Generators used together with closed patterns can provide additional information which closed patterns alone are not able to provide. Also, as generators are the minimal members, they are preferable over closed patterns for model selection and classi¯cation based on the Minimum Description Length (MDL) principle. Several algorithms have been proposed for mining closed sequential patterns, but none so far for mining sequential generators. This paper ¯lls this research gap by investigating properties of sequential generators and proposing an algorithm to e±ciently mine sequential generators. The algorithm works on a three-step process of search space compaction, non-generator pruning and a ¯nal ¯ltering step. We also introduce ranking of mined generators and propose mining of a unique generator per equivalence class. Performance study has been conducted on various synthetic and real benchmark datasets. They show that mining generators can be as fast as mining closed patterns even at low support thresholds.
format text
author LO, David
KHOO, Siau-Cheng
LI, Jinyan
author_facet LO, David
KHOO, Siau-Cheng
LI, Jinyan
author_sort LO, David
title Mining and Ranking Generators of Sequential Pattern
title_short Mining and Ranking Generators of Sequential Pattern
title_full Mining and Ranking Generators of Sequential Pattern
title_fullStr Mining and Ranking Generators of Sequential Pattern
title_full_unstemmed Mining and Ranking Generators of Sequential Pattern
title_sort mining and ranking generators of sequential pattern
publisher Institutional Knowledge at Singapore Management University
publishDate 2008
url https://ink.library.smu.edu.sg/sis_research/1026
http://www.siam.org/proceedings/datamining/2008/dm08_51_Lo.pdf
_version_ 1770570827715051520