Tune Up Fuzzy C-Means for Big Data: Some Novel Hybrid Clustering Algorithms Based on Initial Selection and Incremental Clustering
Data are getting larger, and most of them are necessary for our businesses. Rapid explosion of data brings us a number of challenges relating to its complexity and how the most important knowledge can be captured in reasonable time. Fuzzy C-means (FCM)—one of the most efficient clustering algorithm...
Main Authors: | Le, Hoang Son; Nguyen, Dang Tien |
---|---|
Format: | Article |
Language: | English |
Published: | Taiwan Fuzzy Systems Association and Springer-Verlag Berlin Heidelberg, 2019 |
Subjects: | Big Data; Hybrid Clustering Algorithms; Initial Selection; Incremental Clustering |
Online Access: | http://repository.vnu.edu.vn/handle/VNU_123/64979 |
Institution: | Vietnam National University, Hanoi |
id |
oai:112.137.131.14:VNU_123-64979 |
---|---|
record_format |
dspace |
spelling |
oai:112.137.131.14:VNU_123-64979; 2019-07-15T04:01:37Z; Tune Up Fuzzy C-Means for Big Data: Some Novel Hybrid Clustering Algorithms Based on Initial Selection and Incremental Clustering; Le, Hoang Son; Nguyen, Dang Tien; Big Data; Hybrid Clustering Algorithms; Initial Selection; Incremental Clustering; 2019-07-15T04:01:37Z; 2019-07-15T04:01:37Z; 2017; Article; Le, H. S., & Nguyen, D. T. (2017). Tune Up Fuzzy C-Means for Big Data: Some Novel Hybrid Clustering Algorithms Based on Initial Selection and Incremental Clustering. International Journal of Fuzzy Systems, 19(5), 1585-1602; http://repository.vnu.edu.vn/handle/VNU_123/64979; 10.1007/s40815-016-0260-3; en; International Journal of Fuzzy Systems; application/pdf; Taiwan Fuzzy Systems Association and Springer-Verlag Berlin Heidelberg |
institution |
Vietnam National University, Hanoi |
building |
VNU Library & Information Center |
country |
Vietnam |
collection |
VNU Digital Repository |
language |
English |
topic |
Big Data; Hybrid Clustering Algorithms; Initial Selection; Incremental Clustering |
description |
Data are getting larger, and most of them are necessary for our businesses. The rapid explosion of data brings a number of challenges relating to its complexity and to how the most important knowledge can be captured in reasonable time. Fuzzy C-means (FCM), one of the most efficient clustering algorithms and widely used in pattern recognition, data compression, image segmentation, computer vision and many other fields, also faces the problem of processing large datasets. In this paper, we propose some novel hybrid clustering algorithms based on incremental clustering and initial selection to tune up FCM for the Big Data problem. The first algorithm takes meshes of rectangles covering the data points as the representatives, while the second takes data points that have high influence on others as the representatives. The representatives are then clustered by FCM, and the resulting centers are used as the initial centers for clustering the whole dataset. Theoretical analyses of the new algorithms are presented, including a comparison of solution quality when clustering the representative set versus the entire set. Experimental results on both simulated and real datasets show that the total computational time of the new methods, including the time to find representatives and to cluster, is lower than that of other relevant algorithms. Clustering quality is also validated. The findings of this paper are significant for research in the fields of soft computing and Big Data processing. Computing methodologies nowadays face huge amounts of diverse and complex data structures, and processing speed is the main priority when considering the effectiveness of a specific method. The findings demonstrate practical algorithms and investigate characteristics that other researchers can reference in similar applications. The usefulness and significance of this research are demonstrated in real-life applications. |
format |
Article |
author |
Le, Hoang Son; Nguyen, Dang Tien |
author_sort |
Le, Hoang Son |
title |
Tune Up Fuzzy C-Means for Big Data: Some Novel Hybrid Clustering Algorithms Based on Initial Selection and Incremental Clustering |
publisher |
Taiwan Fuzzy Systems Association and Springer-Verlag Berlin Heidelberg |
publishDate |
2019 |
url |
http://repository.vnu.edu.vn/handle/VNU_123/64979 |
_version_ |
1680966889198583808 |
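
The two-stage idea summarized in the description (cluster a small set of representatives with FCM, then reuse the resulting centers to initialize FCM on the full dataset) can be sketched roughly as follows. This is a minimal illustration, not the authors' algorithm: the record does not give the paper's procedures, so the function names (`grid_representatives`, `fcm`, `two_stage_fcm`), the choice of one unweighted mean per occupied grid cell as a stand-in for the "meshes of rectangles", and every parameter value are assumptions made here for illustration.

```python
import random
from collections import defaultdict


def grid_representatives(points, cells_per_axis=10):
    """Return one representative (the mean) per occupied cell of a uniform grid.

    Assumption: a uniform grid is only one possible reading of the
    "meshes of rectangles" mentioned in the abstract.
    """
    dim = len(points[0])
    lo = [min(p[k] for p in points) for k in range(dim)]
    hi = [max(p[k] for p in points) for k in range(dim)]
    buckets = defaultdict(list)
    for p in points:
        key = tuple(
            min(int((p[k] - lo[k]) / ((hi[k] - lo[k]) or 1.0) * cells_per_axis),
                cells_per_axis - 1)
            for k in range(dim)
        )
        buckets[key].append(p)
    return [tuple(sum(q[k] for q in cell) / len(cell) for k in range(dim))
            for cell in buckets.values()]


def fcm(points, centers, m=2.0, max_iter=100, tol=1e-6):
    """Standard fuzzy C-means updates from given initial centers.

    Returns (centers, number_of_iterations_performed).
    """
    dim = len(points[0])
    centers = [tuple(c) for c in centers]
    for iteration in range(1, max_iter + 1):
        sums = [[0.0] * dim for _ in centers]
        totals = [0.0] * len(centers)
        for x in points:
            # Squared distances to every center, floored to avoid division by zero.
            d2 = [max(sum((a - b) ** 2 for a, b in zip(x, c)), 1e-12) for c in centers]
            # Membership u_i is proportional to d_i^(-2/(m-1)) = (d_i^2)^(-1/(m-1)).
            inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
            norm = sum(inv)
            for i, v in enumerate(inv):
                w = (v / norm) ** m  # u_i^m weights the center update
                totals[i] += w
                for k in range(dim):
                    sums[i][k] += w * x[k]
        new_centers = [tuple(s / t for s in row) for row, t in zip(sums, totals)]
        shift = max(sum((a - b) ** 2 for a, b in zip(c, n)) ** 0.5
                    for c, n in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, iteration


def two_stage_fcm(points, n_clusters, cells_per_axis=10, seed=0):
    """Run FCM on the representatives, then FCM on all points from those centers."""
    reps = grid_representatives(points, cells_per_axis)
    if len(reps) < n_clusters:
        raise ValueError("fewer representatives than clusters; use a finer grid")
    start = random.Random(seed).sample(reps, n_clusters)
    rep_centers, _ = fcm(reps, start)
    return fcm(points, rep_centers)
```

Calling `two_stage_fcm(points, n_clusters)` on a list of equal-length numeric tuples returns the final centers and the number of iterations of the second FCM run. The intended benefit is that the expensive pass over the full dataset starts from centers that are already close to a solution; whether this actually reduces total time depends on the data and on the grid resolution chosen.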