Multiple centers based fuzzy clustering for imbalanced data
Clustering is a useful data mining technique for identifying interesting distributions and discovering groups in the underlying data. K-means is a well-known, widely used clustering technique with low computational cost, which mainly includes the hard k-mea...
Main Author: | Liao, Hongda |
---|---|
Other Authors: | Chen Lihui |
Format: | Theses and Dissertations |
Language: | English |
Published: | 2016 |
Subjects: | DRNTU::Engineering::Electrical and electronic engineering |
Online Access: | http://hdl.handle.net/10356/68529 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-68529 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-68529; 2023-07-04T16:34:40Z; Multiple centers based fuzzy clustering for imbalanced data; Liao, Hongda; Chen Lihui; School of Electrical and Electronic Engineering; DRNTU::Engineering::Electrical and electronic engineering; Clustering is a useful data mining technique for identifying interesting distributions and discovering groups in the underlying data. K-means is a well-known, widely used family of clustering techniques, valued for its low computational cost; it mainly comprises the hard k-means and the fuzzy k-means clustering algorithms. Many factors may affect the performance of k-means clustering algorithms, such as high dimensionality, the scale of the data, and noise. The data distribution is another important factor that can significantly affect performance, for both hard and fuzzy k-means clustering. The problem caused by imbalanced data is called the “uniform effect”. In this thesis, the multicenter clustering algorithm (MC) [6], which aims to overcome the “uniform effect”, is studied and implemented. The MC algorithm consists of three sub-algorithms: the fast global fuzzy k-means algorithm (FGFKM), the best m-plot algorithm (BMP), and the grouping multicenter algorithm (GMC). MC and its three sub-algorithms are studied experimentally and their performance is evaluated. MC is compared with related algorithms on several datasets.; Master of Science (Signal Processing); 2016-05-26T07:42:06Z; 2016-05-26T07:42:06Z; 2016; Thesis; http://hdl.handle.net/10356/68529; en; 79 p.; application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
DRNTU::Engineering::Electrical and electronic engineering |
spellingShingle |
DRNTU::Engineering::Electrical and electronic engineering Liao, Hongda Multiple centers based fuzzy clustering for imbalanced data |
description |
Clustering is a useful data mining technique for identifying interesting distributions and discovering groups in the underlying data. K-means is a well-known, widely used family of clustering techniques, valued for its low computational cost; it mainly comprises the hard k-means and the fuzzy k-means clustering algorithms.
Many factors may affect the performance of k-means clustering algorithms, such as high dimensionality, the scale of the data, and noise. The data distribution is another important factor that can significantly affect performance, for both hard and fuzzy k-means clustering. The problem caused by imbalanced data is called the “uniform effect”, that is, the tendency of k-means to produce clusters of relatively uniform size even when the underlying groups differ greatly in size.
In this thesis, the multicenter clustering algorithm (MC) [6], which aims to overcome the “uniform effect”, is studied and implemented. The MC algorithm consists of three sub-algorithms: the fast global fuzzy k-means algorithm (FGFKM), the best m-plot algorithm (BMP), and the grouping multicenter algorithm (GMC). MC and its three sub-algorithms are studied experimentally and their performance is evaluated. MC is compared with related algorithms on several datasets. |
author2 |
Chen Lihui |
author_facet |
Chen Lihui Liao, Hongda |
format |
Theses and Dissertations |
author |
Liao, Hongda |
author_sort |
Liao, Hongda |
title |
Multiple centers based fuzzy clustering for imbalanced data |
title_short |
Multiple centers based fuzzy clustering for imbalanced data |
title_full |
Multiple centers based fuzzy clustering for imbalanced data |
title_fullStr |
Multiple centers based fuzzy clustering for imbalanced data |
title_full_unstemmed |
Multiple centers based fuzzy clustering for imbalanced data |
title_sort |
multiple centers based fuzzy clustering for imbalanced data |
publishDate |
2016 |
url |
http://hdl.handle.net/10356/68529 |
_version_ |
1772828994992865280 |
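
The “uniform effect” named in the description can be explored with a small experiment. The sketch below is a minimal implementation of standard fuzzy k-means (also known as fuzzy c-means), not the MC algorithm or any of its sub-algorithms from the thesis. The function names `memberships` and `fuzzy_kmeans`, the fuzzifier `m = 2.0`, and the synthetic two-cluster data (1000 points versus 50 points) are all assumptions made for illustration.

```python
import numpy as np


def memberships(X, centers, m):
    """Fuzzy membership of every point in every cluster, shape (n, k)."""
    # Squared Euclidean distance from each point to each center.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    d2 = np.maximum(d2, 1e-12)  # guard against division by zero
    # u_ij is proportional to d_ij^(-2/(m-1)); rows are normalised to sum to 1.
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)


def fuzzy_kmeans(X, k, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy k-means; returns (centers, membership matrix)."""
    rng = np.random.default_rng(seed)
    # Initialise the centers with k distinct data points chosen at random.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        W = memberships(X, centers, m) ** m
        # Each center is the mean of the data weighted by u_ij^m.
        new_centers = (W.T @ X) / W.sum(axis=0)[:, None]
        shift = np.abs(new_centers - centers).max()
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships(X, centers, m)


# Hypothetical imbalanced data: one large, spread-out group and one small, tight group.
rng = np.random.default_rng(42)
big = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(1000, 2))
small = rng.normal(loc=[4.0, 0.0], scale=0.3, size=(50, 2))
X = np.vstack([big, small])

centers, U = fuzzy_kmeans(X, k=2)
labels = U.argmax(axis=1)  # harden the fuzzy partition for counting
sizes = np.bincount(labels, minlength=2)
print("true group sizes: 1000 and 50; recovered cluster sizes:", sizes)
```

Comparing the printed cluster sizes with the true sizes of 1000 and 50 gives a rough indication of how far the recovered partition is pulled toward equal-sized clusters. The outcome depends on the data, the fuzzifier, and the initialisation, so no particular result is quoted here.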