Multiple centers based fuzzy clustering for imbalanced data

Bibliographic Details
Main Author:	Liao, Hongda
Other Authors:	Chen Lihui
Format:	Theses and Dissertations
Language:	English
Published:	2016
Subjects:	DRNTU::Engineering::Electrical and electronic engineering
Online Access:	http://hdl.handle.net/10356/68529
Institution:	Nanyang Technological University

Description
Summary:	Clustering is a useful data mining technique for identifying interesting distributions and discovering groups in the underlying data. K-means is a well-known and widely used family of clustering techniques, valued for its low computational cost; it mainly comprises the hard k-means and the fuzzy k-means clustering algorithms. Many factors can affect the performance of k-means clustering, such as high dimensionality, the scale of the data, and noise. The data distribution is another important factor that can significantly affect performance, for both hard and fuzzy k-means clustering. The problem caused by imbalanced data is known as the “uniform effect”. In this thesis, the multicenter clustering algorithm (MC) [6], which aims to solve the “uniform effect”, is studied and implemented. The MC algorithm comprises three sub-algorithms: the fast global fuzzy k-means algorithm (FGFKM), the best m-plot algorithm (BMP), and the grouping multicenter algorithm (GMC). An experimental study of MC and its three sub-algorithms is conducted, and the performance of the algorithms is evaluated. MC is also compared with related algorithms on several datasets.
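For readers unfamiliar with the baseline the summary refers to, the following is a minimal sketch of standard fuzzy k-means (also called fuzzy c-means), written with the Python standard library only. It is not the thesis's MC algorithm, nor FGFKM, BMP, or GMC; the function name `fuzzy_k_means`, its parameters (`m`, `iters`, `tol`), and the sample data are assumptions made for illustration.

```python
import math

def fuzzy_k_means(points, centers, m=2.0, iters=100, tol=1e-9):
    """Illustrative fuzzy k-means on a list of equal-length coordinate tuples.

    points  : data points, e.g. [(0.0,), (0.1,)]
    centers : initial cluster centers (hypothetical starting guesses)
    m       : fuzzifier, must be > 1
    """
    centers = [tuple(c) for c in centers]
    dim = len(points[0])
    u = []
    for _ in range(iters):
        # Membership update: u[i][j] is the degree to which point i
        # belongs to cluster j; each row sums to 1.
        u = []
        for p in points:
            d = [math.dist(p, c) for c in centers]
            if any(x == 0.0 for x in d):
                # The point coincides with a center: split full membership
                # among the coinciding centers to avoid division by zero.
                hits = [1.0 if x == 0.0 else 0.0 for x in d]
                s = sum(hits)
                u.append([h / s for h in hits])
            else:
                inv = [x ** (-2.0 / (m - 1.0)) for x in d]
                s = sum(inv)
                u.append([v / s for v in inv])
        # Center update: mean of the points weighted by membership ** m.
        new_centers = []
        for j in range(len(centers)):
            w = [row[j] ** m for row in u]
            total = sum(w)
            new_centers.append(tuple(
                sum(wi * p[k] for wi, p in zip(w, points)) / total
                for k in range(dim)
            ))
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, u

# Hypothetical imbalanced 1-D data: one large group and one small group.
large = [(0.1 * i,) for i in range(40)]
small = [(10.0,), (10.2,), (10.4,)]
centers, memberships = fuzzy_k_means(large + small, [(0.0,), (10.0,)])
```

Inspecting `centers` and `memberships` on data like this is one way to observe how a single center per cluster behaves when cluster sizes differ greatly, which is the setting the summary describes.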

Similar Items