A support-ordered trie for fast frequent itemset discovery

The importance of data mining is apparent with the advent of powerful data collection and storage tools; raw data is so abundant that manual analysis is no longer possible. Unfortunately, data mining problems are difficult to solve and this prompted the introduction of several novel data structures...

Bibliographic Details
Main Authors: LIM, Ee Peng; WOON, Yew-Kwong; NG, Wee-Keong
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2004
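Subjects: Association rule mining; Data mining; Data structures; Databases and Information Systems; Numerical Analysis and Scientific Computing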
Online Access: https://ink.library.smu.edu.sg/sis_research/123
https://ink.library.smu.edu.sg/context/sis_research/article/1122/viewcontent/Support_ordered_trie_for_fast_frequent_itemset_discovery.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-1122
record_format dspace
spelling sg-smu-ink.sis_research-1122
2018-06-29T02:02:32Z
2004-07-01T07:00:00Z
text
application/pdf
info:doi/10.1109/TKDE.2004.1318569
http://creativecommons.org/licenses/by-nc-nd/4.0/
Research Collection School Of Computing and Information Systems
eng
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Association rule mining
Data mining
Data structures
Databases and Information Systems
Numerical Analysis and Scientific Computing
description The importance of data mining is apparent with the advent of powerful data collection and storage tools; raw data is so abundant that manual analysis is no longer possible. Unfortunately, data mining problems are difficult to solve, which has prompted the introduction of several novel data structures to improve mining efficiency. Here, we critically examine existing preprocessing data structures used to enhance the performance of association rule mining, in an attempt to understand their strengths and weaknesses. Our analyses culminate in a practical structure called the SOTrieIT (support-ordered trie itemset) and two synergistic algorithms that accompany it for the fast discovery of frequent itemsets. Experiments involving a wide range of synthetic data sets reveal that its algorithms outperform FP-growth, a recent association rule mining algorithm with excellent performance, by up to two orders of magnitude, thus verifying its efficiency and viability.
format text
author LIM, Ee Peng
WOON, Yew-Kwong
NG, Wee-Keong
author_sort LIM, Ee Peng
title A support-ordered trie for fast frequent itemset discovery
title_sort support-ordered trie for fast frequent itemset discovery
publisher Institutional Knowledge at Singapore Management University
publishDate 2004
url https://ink.library.smu.edu.sg/sis_research/123
https://ink.library.smu.edu.sg/context/sis_research/article/1122/viewcontent/Support_ordered_trie_for_fast_frequent_itemset_discovery.pdf
_version_ 1770568878817017856