Stochastic gradient descent based fuzzy clustering for large data
Data is growing at an unprecedented rate in commercial and scientific areas. Clustering algorithms for large data that use little memory and scale well are therefore increasingly important. In this paper, we propose a new clustering approach called stochastic gradie...
Main Authors: | Chen, Lihui; Wang, Yangtao; Mei, Jian-Ping |
---|---|
Other Authors: | School of Electrical and Electronic Engineering |
Format: | Conference or Workshop Item |
Language: | English |
Published: | 2015 |
Subjects: | DRNTU::Engineering::Electrical and electronic engineering::Electronic systems |
Online Access: | https://hdl.handle.net/10356/104522 http://hdl.handle.net/10220/25889 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-104522 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-104522 (2020-03-07T13:24:51Z). Stochastic gradient descent based fuzzy clustering for large data. Chen, Lihui; Wang, Yangtao; Mei, Jian-Ping. School of Electrical and Electronic Engineering. 2014 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE). DRNTU::Engineering::Electrical and electronic engineering::Electronic systems. Accepted version. 2015-06-12T03:53:14Z; 2019-12-06T21:34:27Z; 2014. Conference Paper. Wang, Y., Chen, L., & Mei, J.-P. (2014). Stochastic gradient descent based fuzzy clustering for large data. 2014 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), 2511-2518. https://hdl.handle.net/10356/104522 http://hdl.handle.net/10220/25889 10.1109/FUZZ-IEEE.2014.6891755 en © 2015 Institute of Electrical and Electronics Engineers (IEEE). application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
country |
Singapore |
collection |
DR-NTU |
language |
English |
topic |
DRNTU::Engineering::Electrical and electronic engineering::Electronic systems |
description |
Data is growing at an unprecedented rate in commercial and scientific areas. Clustering algorithms for large data that use little memory and scale well are therefore increasingly important. In this paper, we propose a new clustering approach called stochastic gradient based fuzzy clustering (SGFC), which performs its optimization by stochastic approximation in order to handle such large data. We derive an adaptive learning rate that can be updated incrementally and maintained automatically in the gradient descent approach employed in SGFC. Moreover, SGFC is extended to a mini-batch SGFC to reduce the stochastic noise. Additionally, a multi-pass SGFC is proposed to improve the clustering performance. Experiments have been conducted on synthetic data to show the effectiveness of the derived adaptive learning rate. Experimental studies have also been conducted on several large benchmark datasets, including real-world image and document datasets. Compared with existing fuzzy clustering approaches for large data, the mini-batch SGFC shows comparable or better accuracy with significantly less time consumption. These results demonstrate the great potential of SGFC for large data analysis. |
author2 |
School of Electrical and Electronic Engineering |
author_facet |
School of Electrical and Electronic Engineering; Chen, Lihui; Wang, Yangtao; Mei, Jian-Ping |
format |
Conference or Workshop Item |
author |
Chen, Lihui; Wang, Yangtao; Mei, Jian-Ping |
author_sort |
Chen, Lihui |
title |
Stochastic gradient descent based fuzzy clustering for large data |
publishDate |
2015 |
url |
https://hdl.handle.net/10356/104522 http://hdl.handle.net/10220/25889 |
_version_ |
1681037813397585920 |
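
The abstract describes SGFC only at a high level, so the following is a rough illustration rather than the authors' algorithm. It sketches a mini-batch stochastic gradient update for the standard fuzzy c-means objective in plain Python. The function names (`memberships`, `minibatch_sgfc_sketch`), the fuzzifier `m = 2`, and in particular the learning rate (the reciprocal of the accumulated membership weight per cluster, which turns each center into a running weighted mean) are assumptions made for this example; the adaptive learning rate derived in the paper is not given in this record and may differ.

```python
import random


def memberships(x, centers, m=2.0, eps=1e-12):
    """Standard fuzzy c-means memberships of point x; the values sum to 1."""
    # Squared Euclidean distance to every center (eps avoids division by zero).
    d2 = [sum((a - b) ** 2 for a, b in zip(x, c)) + eps for c in centers]
    inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
    total = sum(inv)
    return [v / total for v in inv]


def minibatch_sgfc_sketch(data, init_centers, m=2.0, batch_size=32,
                          passes=1, seed=0):
    """Mini-batch stochastic gradient sketch for fuzzy c-means centers.

    Assumption: the step size for cluster k is 1 / (accumulated u^m),
    a stand-in for the paper's derived adaptive learning rate.
    """
    rng = random.Random(seed)
    centers = [list(c) for c in init_centers]
    weight = [0.0] * len(centers)  # accumulated u^m seen so far, per cluster
    order = list(range(len(data)))
    for _ in range(passes):  # more than one pass gives a multi-pass variant
        rng.shuffle(order)
        for start in range(0, len(order), batch_size):
            batch = [data[i] for i in order[start:start + batch_size]]
            # Memberships are computed against the centers as they stand
            # at the start of the batch.
            us = [memberships(x, centers, m) for x in batch]
            for k, c in enumerate(centers):
                w = [u[k] ** m for u in us]
                weight[k] += sum(w)
                rate = 1.0 / weight[k]
                # Descent direction of sum_i u_ik^m * ||x_i - v_k||^2
                # with respect to v_k (constant factor folded into rate).
                step = [sum(wi * (x[j] - c[j]) for wi, x in zip(w, batch))
                        for j in range(len(c))]
                centers[k] = [cj + rate * sj for cj, sj in zip(c, step)]
    return centers
```

Only the current batch, the centers, and one accumulated weight per cluster are held in memory, which is the property the abstract emphasizes for large data. Averaging the update over a batch instead of a single point is what reduces the stochastic noise in this sketch.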