Frequent pattern space maintenance : theories and algorithms

Bibliographic Details
Main Author: Feng, Mengling
Other Authors: Wong Limsoon
Format: Theses and Dissertations
Language: English
Published: 2010
Online Access: https://hdl.handle.net/10356/20922
Institution: Nanyang Technological University
Description
Summary: This thesis explores the theories and algorithms for frequent pattern space maintenance. Frequent pattern maintenance is essential for various data mining applications, ranging from database management to hypothetical query answering and interactive trend analysis. Through our survey, we observe that most existing maintenance algorithms are proposed as extensions of particular pattern discovery algorithms or of the data structures they use. However, we believe that developing effective maintenance algorithms requires an understanding of how the space of frequent patterns evolves under updates. We investigate the evolution of the frequent pattern space using the concept of equivalence classes. This space evolution analysis lays a theoretical foundation for the development of efficient algorithms. Based on this analysis, two novel "maintainers" for the frequent pattern space are proposed: the "Transaction Removal Update Maintainer" (TRUM) and the "Pattern Space Maintainer" (PSM). TRUM effectively addresses the decremental maintenance of the frequent pattern space. PSM is a "complete maintainer" that effectively maintains the space of frequent patterns under incremental updates, decremental updates, and support threshold adjustments. Experimental results demonstrate that both TRUM and PSM outperform the state-of-the-art discovery and maintenance algorithms by significant margins.