An empirical comparative analysis of clustering algorithms for big data applications

Big data is a loosely defined term for a dataset that is either too large or too complex to analyze with satisfactory results. Clustering algorithms are one possible approach to this problem, and they can be categorized according to one or more of three clustering objectives. Th...

Bibliographic Details
Main Author:	Delos Santos, Duke Danielle T.
Format:	text
Language:	English
Published:	Animo Repository 2017
Subjects:	Big data; Algorithms
Online Access:	https://animorepository.dlsu.edu.ph/etd_masteral/5395

id	oai:animorepository.dlsu.edu.ph:etd_masteral-12233
record_format	eprints
spelling	oai:animorepository.dlsu.edu.ph:etd_masteral-12233 2024-08-07T02:55:41Z An empirical comparative analysis of clustering algorithms for big data applications Delos Santos, Duke Danielle T. 2017-01-01T08:00:00Z text https://animorepository.dlsu.edu.ph/etd_masteral/5395 Master's Theses English Animo Repository Big data Algorithms
institution	De La Salle University
building	De La Salle University Library
continent	Asia
country	Philippines
content_provider	De La Salle University Library
collection	DLSU Institutional Repository
language	English
topic	Big data; Algorithms
description	Big data is a loosely defined term for a dataset that is either too large or too complex to analyze with satisfactory results. Clustering algorithms are one possible approach to this problem, and they can be categorized according to one or more of three clustering objectives: grouping, in which the algorithm aims to partition the dataset into meaningful groups; data summarization, in which the algorithm aims to condense the data points into a more concise form for easier analysis; and data visualization, in which the dataset is presented in a more understandable form. Although there are only three categories into which clustering algorithms can be classified, there are many clustering algorithms, and their performance differs across dataset sizes. The algorithms empirically evaluated and compared in this research are k-means, SOM, DBSCAN, BFR, and BIRCH. The study found that each algorithm has different strengths and weaknesses when clustering scaled-up datasets, and that an appropriate algorithm can be chosen on the basis of these strengths and weaknesses.
format	text
author	Delos Santos, Duke Danielle T.
title	An empirical comparative analysis of clustering algorithms for big data applications
publisher	Animo Repository
publishDate	2017
url	https://animorepository.dlsu.edu.ph/etd_masteral/5395