Concurrent data stream mining

Given the characteristics of streaming data---read-once only and infinitely streaming, it is desirable to perform multiple, concurrent types of mining on streaming data to the fullest extent permitted by resource constraints. However, to the best of our knowledge, conventional stream mining algorit...

Bibliographic Details
Main Author:	Wang, Wenwen.
Other Authors:	Ng Wee Keong
Format:	Final Year Project
Language:	English
Published:	2009
Subjects:	DRNTU::Engineering::Computer science and engineering::Information systems::Database management
Online Access:	http://hdl.handle.net/10356/16935
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-16935
record_format	dspace
spelling	sg-ntu-dr.10356-16935 2023-03-03T20:46:00Z Concurrent data stream mining Wang, Wenwen. Ng Wee Keong School of Computer Engineering Centre for Advanced Information Systems DRNTU::Engineering::Computer science and engineering::Information systems::Database management Bachelor of Engineering (Computer Engineering) 2009-05-29T02:06:36Z 2009-05-29T02:06:36Z 2009 2009 Final Year Project (FYP) http://hdl.handle.net/10356/16935 en Nanyang Technological University 77 p. application/pdf
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	DRNTU::Engineering::Computer science and engineering::Information systems::Database management
spellingShingle	DRNTU::Engineering::Computer science and engineering::Information systems::Database management Wang, Wenwen. Concurrent data stream mining
description	Given the characteristics of streaming data (read-once only and infinitely streaming), it is desirable to perform multiple, concurrent types of mining on streaming data to the fullest extent permitted by resource constraints. However, to the best of our knowledge, conventional stream mining algorithms have focused on single, standalone mining tasks. In this report, we attempt to achieve concurrent classification and clustering on streaming data. Our integrated framework, MM-Stream, follows the conventional online-offline approach in stream mining. We describe the framework in terms of two components, an online component and an offline component: as data streams in, the online component completes all necessary processing within constant time and then drops the data; when a mining request is issued by the user(s), the offline component performs the mining task(s) using the information collected by the online component. We implemented and evaluated the algorithm. The performance evaluation showed that the performance of MM-Stream is comparable to or better than that of existing standalone stream mining algorithms (D-Stream, On-Demand-Stream classifier). We investigated how the performance of such integrated mining compares with the purity and accuracy of standalone clustering and classification respectively. The results showed that, with concurrent mining, we can achieve almost double the throughput without any degradation in the quality of the mining results. We believe that our successful experimentation with MM-Stream's concurrent stream classification and clustering paves the way for incorporating more concurrent data mining tasks to maximize the output of stream data mining.
author2	Ng Wee Keong
author_facet	Ng Wee Keong Wang, Wenwen.
format	Final Year Project
author	Wang, Wenwen.
author_sort	Wang, Wenwen.
title	Concurrent data stream mining
title_short	Concurrent data stream mining
title_full	Concurrent data stream mining
title_fullStr	Concurrent data stream mining
title_full_unstemmed	Concurrent data stream mining
title_sort	concurrent data stream mining
publishDate	2009
url	http://hdl.handle.net/10356/16935
_version_	1759856279784783872
