Mining Top-K Large Structural Patterns in a Massive Network

With ever-growing popularity of social networks, web and bio-networks, mining large frequent patterns from a single huge network has become increasingly important. Yet the existing pattern mining methods cannot oﬀer the eﬃciency desirable for large pattern discovery. We propose Spider- Mine, a novel...

Full description

Saved in:

Bibliographic Details
Main Authors:	ZHU, Feida, QU, Qiang, LO, David, YAN, Xifeng, HAN, Jiawei, YU, Philip S.
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2011
Subjects:	Databases and Information Systems Numerical Analysis and Scientific Computing
Online Access:	https://ink.library.smu.edu.sg/sis_research/1394 https://ink.library.smu.edu.sg/context/sis_research/article/2393/viewcontent/p807_zhu.pdf
Tags:	Add Tag No Tags, Be the first to tag this record!
Institution:	Singapore Management University
Language:	English

id	sg-smu-ink.sis_research-2393
record_format	dspace
spelling	sg-smu-ink.sis_research-23932020-01-10T01:00:56Z Mining Top-K Large Structural Patterns in a Massive Network ZHU, Feida QU, Qiang LO, David YAN, Xifeng HAN, Jiawei YU, Philip S. With ever-growing popularity of social networks, web and bio-networks, mining large frequent patterns from a single huge network has become increasingly important. Yet the existing pattern mining methods cannot oﬀer the eﬃciency desirable for large pattern discovery. We propose Spider- Mine, a novel algorithm to eﬃciently mine top-K largest frequent patterns from a single massive network with any user-speciﬁed probability of 1 − ϵ. Deviating from the existing edge-by-edge (i.e., incremental) pattern-growth framework, SpiderMine achieves its eﬃciency by unleashing the power of small patterns of a bounded diameter, which we call “spiders”. With the spider structure, our approach adopts a probabilistic mining framework to ﬁnd the top-k largest patterns by (i) identifying an aﬀordable set of promising growth paths toward large patterns, (ii) generating large patterns with much lower combinatorial complexity, and ﬁnally (iii) greatly reducing the cost of graph isomorphism tests with a new graph pattern representation by a multi-set of spiders. Extensive experimental studies on both synthetic and real data sets show that our algorithm outperforms existing methods. 2011-09-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/1394 info:doi/10.14778/3402707.3402720 https://ink.library.smu.edu.sg/context/sis_research/article/2393/viewcontent/p807_zhu.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Databases and Information Systems Numerical Analysis and Scientific Computing
institution	Singapore Management University
building	SMU Libraries
continent	Asia
country	Singapore Singapore
content_provider	SMU Libraries
collection	InK@SMU
language	English
topic	Databases and Information Systems Numerical Analysis and Scientific Computing
spellingShingle	Databases and Information Systems Numerical Analysis and Scientific Computing ZHU, Feida QU, Qiang LO, David YAN, Xifeng HAN, Jiawei YU, Philip S. Mining Top-K Large Structural Patterns in a Massive Network
description	With ever-growing popularity of social networks, web and bio-networks, mining large frequent patterns from a single huge network has become increasingly important. Yet the existing pattern mining methods cannot oﬀer the eﬃciency desirable for large pattern discovery. We propose Spider- Mine, a novel algorithm to eﬃciently mine top-K largest frequent patterns from a single massive network with any user-speciﬁed probability of 1 − ϵ. Deviating from the existing edge-by-edge (i.e., incremental) pattern-growth framework, SpiderMine achieves its eﬃciency by unleashing the power of small patterns of a bounded diameter, which we call “spiders”. With the spider structure, our approach adopts a probabilistic mining framework to ﬁnd the top-k largest patterns by (i) identifying an aﬀordable set of promising growth paths toward large patterns, (ii) generating large patterns with much lower combinatorial complexity, and ﬁnally (iii) greatly reducing the cost of graph isomorphism tests with a new graph pattern representation by a multi-set of spiders. Extensive experimental studies on both synthetic and real data sets show that our algorithm outperforms existing methods.
format	text
author	ZHU, Feida QU, Qiang LO, David YAN, Xifeng HAN, Jiawei YU, Philip S.
author_facet	ZHU, Feida QU, Qiang LO, David YAN, Xifeng HAN, Jiawei YU, Philip S.
author_sort	ZHU, Feida
title	Mining Top-K Large Structural Patterns in a Massive Network
title_short	Mining Top-K Large Structural Patterns in a Massive Network
title_full	Mining Top-K Large Structural Patterns in a Massive Network
title_fullStr	Mining Top-K Large Structural Patterns in a Massive Network
title_full_unstemmed	Mining Top-K Large Structural Patterns in a Massive Network
title_sort	mining top-k large structural patterns in a massive network
publisher	Institutional Knowledge at Singapore Management University
publishDate	2011
url	https://ink.library.smu.edu.sg/sis_research/1394 https://ink.library.smu.edu.sg/context/sis_research/article/2393/viewcontent/p807_zhu.pdf
_version_	1770571101374513152

Mining Top-K Large Structural Patterns in a Massive Network

Similar Items