On density-based data streams clustering algorithms: A survey

Clustering data streams has drawn considerable attention in the past few years due to their ever-growing presence. Data streams pose additional challenges for clustering, such as limited time and memory and the need for one-pass clustering. Furthermore, discovering clusters with arbitrary shapes is very important in data str...

Bibliographic Details
Main Author: Teh, Y.W.
Format: Conference or Workshop Item
Language: English
Published: 2017
Subjects: QA75 Electronic computers. Computer science
Online Access: http://eprints.um.edu.my/18504/1/On_density.pdf
http://eprints.um.edu.my/18504/
http://dx.doi.org/10.1007/s11390-013-1416-3
Institution: Universiti Malaya
id my.um.eprints.18504
record_format eprints
spelling my.um.eprints.18504 2018-04-20T03:34:18Z http://eprints.um.edu.my/18504/ On density-based data streams clustering algorithms: A survey Teh, Y.W. QA75 Electronic computers. Computer science
2017 Conference or Workshop Item PeerReviewed application/pdf en http://eprints.um.edu.my/18504/1/On_density.pdf Teh, Y.W. (2017) On density-based data streams clustering algorithms: A survey. In: 7th International Conference on Electronic, Communication and Networks (CECNet 2017), 24-27 November 2017, National Dong Hwa University, Hualien, Taiwan. (Submitted) http://dx.doi.org/10.1007/s11390-013-1416-3
institution Universiti Malaya
building UM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Malaya
content_source UM Research Repository
url_provider http://eprints.um.edu.my/
language English
topic QA75 Electronic computers. Computer science
description Clustering data streams has drawn considerable attention in the past few years due to their ever-growing presence. Data streams pose additional challenges for clustering, such as limited time and memory and the need for one-pass clustering. Furthermore, discovering clusters with arbitrary shapes is very important in data stream applications. Data streams are infinite and evolve over time, and the number of clusters is not known in advance. In a data stream environment, noise appears occasionally due to various factors. Density-based methods are a notable class of data stream clustering methods: they can discover clusters of arbitrary shape and detect noise, and they do not need the number of clusters in advance. Because of the characteristics of data streams, traditional density-based clustering is not directly applicable. Recently, many density-based clustering algorithms have been extended to data streams. The main idea of these algorithms is to use density-based methods in the clustering process while overcoming the constraints imposed by the nature of data streams. The purpose of this paper is to shed light on some algorithms in the literature on density-based clustering over data streams. We not only summarize the main density-based clustering algorithms on data streams and discuss their uniqueness and limitations, but also explain how they address the challenges of clustering data streams. Moreover, we investigate the evaluation metrics used to validate cluster quality and measure algorithm performance. It is hoped that this survey will serve as a stepping stone for researchers studying data stream clustering, particularly density-based algorithms.
format Conference or Workshop Item
author Teh, Y.W.
title On density-based data streams clustering algorithms: A survey
title_sort on density-based data streams clustering algorithms: a survey
publishDate 2017
url http://eprints.um.edu.my/18504/1/On_density.pdf
http://eprints.um.edu.my/18504/
http://dx.doi.org/10.1007/s11390-013-1416-3
_version_ 1643690722789949440