Stochastic gradient descent based fuzzy clustering for large data

Data is growing at an unprecedented rate in commercial and scientific areas. Clustering algorithms for large data that are scalable and consume little memory are therefore becoming increasingly important. In this paper, we propose a new clustering approach called stochastic gradie...

Bibliographic Details
Main Authors: Chen, Lihui, Wang, Yangtao, Mei, Jian-Ping
Other Authors: School of Electrical and Electronic Engineering
Format: Conference or Workshop Item
Language: English
Published: 2015
Subjects:
Online Access: https://hdl.handle.net/10356/104522
http://hdl.handle.net/10220/25889
Institution: Nanyang Technological University
id sg-ntu-dr.10356-104522
record_format dspace
spelling sg-ntu-dr.10356-1045222020-03-07T13:24:51Z Stochastic gradient descent based fuzzy clustering for large data Chen, Lihui Wang, Yangtao Mei, Jian-Ping School of Electrical and Electronic Engineering 2014 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE) DRNTU::Engineering::Electrical and electronic engineering::Electronic systems Data is growing at an unprecedented rate in commercial and scientific areas. Clustering algorithms for large data that are scalable and consume little memory are therefore becoming increasingly important. In this paper, we propose a new clustering approach called stochastic gradient based fuzzy clustering (SGFC), which performs its optimization by stochastic approximation to handle such large data. We derive an adaptive learning rate that can be updated incrementally and maintained automatically in the gradient descent approach employed in SGFC. Moreover, SGFC is extended to a mini-batch SGFC to reduce the stochastic noise, and a multi-pass SGFC is proposed to improve the clustering performance. Experiments on synthetic data show the effectiveness of the derived adaptive learning rate. Experimental studies have also been conducted on several large benchmark datasets, including real-world image and document datasets. Compared with existing fuzzy clustering approaches for large data, the mini-batch SGFC shows comparable or better accuracy with significantly less time consumption. These results demonstrate the great potential of SGFC for large data analysis. Accepted version 2015-06-12T03:53:14Z 2019-12-06T21:34:27Z 2015-06-12T03:53:14Z 2019-12-06T21:34:27Z 2014 2014 Conference Paper Wang, Y., Chen, L., & Mei, J.-P. (2014). Stochastic gradient descent based fuzzy clustering for large data. 2014 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), 2511-2518.
https://hdl.handle.net/10356/104522 http://hdl.handle.net/10220/25889 10.1109/FUZZ-IEEE.2014.6891755 en © 2015 Institute of Electrical and Electronics Engineers (IEEE). application/pdf
institution Nanyang Technological University
building NTU Library
country Singapore
collection DR-NTU
language English
topic DRNTU::Engineering::Electrical and electronic engineering::Electronic systems
spellingShingle DRNTU::Engineering::Electrical and electronic engineering::Electronic systems
Chen, Lihui
Wang, Yangtao
Mei, Jian-Ping
Stochastic gradient descent based fuzzy clustering for large data
description Data is growing at an unprecedented rate in commercial and scientific areas. Clustering algorithms for large data that are scalable and consume little memory are therefore becoming increasingly important. In this paper, we propose a new clustering approach called stochastic gradient based fuzzy clustering (SGFC), which performs its optimization by stochastic approximation to handle such large data. We derive an adaptive learning rate that can be updated incrementally and maintained automatically in the gradient descent approach employed in SGFC. Moreover, SGFC is extended to a mini-batch SGFC to reduce the stochastic noise, and a multi-pass SGFC is proposed to improve the clustering performance. Experiments on synthetic data show the effectiveness of the derived adaptive learning rate. Experimental studies have also been conducted on several large benchmark datasets, including real-world image and document datasets. Compared with existing fuzzy clustering approaches for large data, the mini-batch SGFC shows comparable or better accuracy with significantly less time consumption. These results demonstrate the great potential of SGFC for large data analysis.
author2 School of Electrical and Electronic Engineering
author_facet School of Electrical and Electronic Engineering
Chen, Lihui
Wang, Yangtao
Mei, Jian-Ping
format Conference or Workshop Item
author Chen, Lihui
Wang, Yangtao
Mei, Jian-Ping
author_sort Chen, Lihui
title Stochastic gradient descent based fuzzy clustering for large data
title_short Stochastic gradient descent based fuzzy clustering for large data
title_full Stochastic gradient descent based fuzzy clustering for large data
title_fullStr Stochastic gradient descent based fuzzy clustering for large data
title_full_unstemmed Stochastic gradient descent based fuzzy clustering for large data
title_sort stochastic gradient descent based fuzzy clustering for large data
publishDate 2015
url https://hdl.handle.net/10356/104522
http://hdl.handle.net/10220/25889
_version_ 1681037813397585920
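
Illustrative sketch (not part of the catalogued record): the description above names a mini-batch, multi-pass stochastic gradient scheme for fuzzy clustering but does not give its update rules. The Python below shows one way such a scheme could look for the standard fuzzy c-means objective. It is a sketch under stated assumptions, not the authors' SGFC: the function names, the parameters, and in particular the step size (one over the accumulated membership mass of each cluster, in the style of online k-means) are assumptions made for illustration, and the paper's derived adaptive learning rate may differ.

```python
import random


def memberships(x, centers, m=2.0, eps=1e-12):
    """Standard fuzzy c-means memberships of point x to each center (sum to 1)."""
    # Squared Euclidean distances; eps avoids division by zero when x sits on a center.
    d2 = [sum((a - b) ** 2 for a, b in zip(x, c)) + eps for c in centers]
    inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
    total = sum(inv)
    return [v / total for v in inv]


def minibatch_sgfc_sketch(data, n_clusters, m=2.0, batch_size=32, n_passes=1, seed=0):
    """Hypothetical mini-batch stochastic gradient fuzzy clustering.

    ASSUMPTION: the step size for cluster k is 1 / (accumulated u^m mass),
    which is an illustrative choice, not the rate derived in the paper.
    """
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(data, n_clusters)]
    weight = [0.0] * n_clusters  # accumulated u^m per cluster over all batches
    order = list(range(len(data)))
    dim = len(data[0])
    for _ in range(n_passes):  # more than one pass gives a multi-pass variant
        rng.shuffle(order)
        for start in range(0, len(order), batch_size):
            batch = [data[i] for i in order[start:start + batch_size]]
            step = [[0.0] * dim for _ in range(n_clusters)]
            mass = [0.0] * n_clusters
            for x in batch:
                u = memberships(x, centers, m)
                for k in range(n_clusters):
                    w = u[k] ** m
                    mass[k] += w
                    for j in range(dim):
                        # Descent direction of u^m * ||x - v_k||^2 with respect to v_k
                        # (constant factor absorbed into the step size).
                        step[k][j] += w * (x[j] - centers[k][j])
            for k in range(n_clusters):
                weight[k] += mass[k]
                lr = 1.0 / weight[k]  # shrinks as more membership mass is seen
                for j in range(dim):
                    centers[k][j] += lr * step[k][j]
    return centers


if __name__ == "__main__":
    gen = random.Random(1)
    blob_a = [(gen.gauss(0, 1), gen.gauss(0, 1)) for _ in range(200)]
    blob_b = [(gen.gauss(10, 1), gen.gauss(10, 1)) for _ in range(200)]
    print(minibatch_sgfc_sketch(blob_a + blob_b, 2, batch_size=20, n_passes=2))
```

Only one mini-batch is processed at a time and the per-cluster state is a center plus one scalar. With this step size each update is a convex combination of the old center and the points in the batch, so the centers stay inside the range of the data. Averaging the step over a batch rather than a single point is what reduces the stochastic noise mentioned in the description.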