Multiple centers based fuzzy clustering for imbalanced data

Clustering for data mining is a useful technique in terms of identifying interesting distributions and discovering groups in the underlying data. K-means is a particular clustering technique that is world-renowned and widely spread for its low computational cost, which mainly includes the hard k-mea...

Full description

Saved in:
Bibliographic Details
Main Author: Liao, Hongda
Other Authors: Chen Lihui
Format: Theses and Dissertations
Language:English
Published: 2016
Subjects:
Online Access:http://hdl.handle.net/10356/68529
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Nanyang Technological University
Language: English
id sg-ntu-dr.10356-68529
record_format dspace
spelling sg-ntu-dr.10356-685292023-07-04T16:34:40Z Multiple centers based fuzzy clustering for imbalanced data Liao, Hongda Chen Lihui School of Electrical and Electronic Engineering DRNTU::Engineering::Electrical and electronic engineering Clustering for data mining is a useful technique in terms of identifying interesting distributions and discovering groups in the underlying data. K-means is a particular clustering technique that is world-renowned and widely spread for its low computational cost, which mainly includes the hard k-means clustering algorithms and the fuzzy k-means clustering algorithms. There are many factors that may affect the performance of the k-means clustering algorithms, such as high dimensionality, scales of the data, noise, etc. And the data distribution is also an important factor that can affect the performance of the k-means clustering algorithm significantly, not only for the hard k-means clustering, but also for the fuzzy k-means clustering. The problem caused by the imbalanced data is also called the “uniform effect”. In this thesis, the multicenter clustering algorithm (MC) [6] has been studied and implemented, which aims to solve “uniform effect”. The MC clustering algorithm contains three sub algorithms, which are the fast global fuzzy k-mean algorithm (FGFKM), the best m-plot algorithm (BMP) and the grouping multicenter algorithm (GMC). The experimental study of the MC, and its three sub-algorithms has been conducted, and the performance of the algorithms is evaluated. Comparisons between MC and its related algorithms have been made using several datasets. Master of Science (Signal Processing) 2016-05-26T07:42:06Z 2016-05-26T07:42:06Z 2016 Thesis http://hdl.handle.net/10356/68529 en 79 p. application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Electrical and electronic engineering
spellingShingle DRNTU::Engineering::Electrical and electronic engineering
Liao, Hongda
Multiple centers based fuzzy clustering for imbalanced data
description Clustering for data mining is a useful technique in terms of identifying interesting distributions and discovering groups in the underlying data. K-means is a particular clustering technique that is world-renowned and widely spread for its low computational cost, which mainly includes the hard k-means clustering algorithms and the fuzzy k-means clustering algorithms. There are many factors that may affect the performance of the k-means clustering algorithms, such as high dimensionality, scales of the data, noise, etc. And the data distribution is also an important factor that can affect the performance of the k-means clustering algorithm significantly, not only for the hard k-means clustering, but also for the fuzzy k-means clustering. The problem caused by the imbalanced data is also called the “uniform effect”. In this thesis, the multicenter clustering algorithm (MC) [6] has been studied and implemented, which aims to solve “uniform effect”. The MC clustering algorithm contains three sub algorithms, which are the fast global fuzzy k-mean algorithm (FGFKM), the best m-plot algorithm (BMP) and the grouping multicenter algorithm (GMC). The experimental study of the MC, and its three sub-algorithms has been conducted, and the performance of the algorithms is evaluated. Comparisons between MC and its related algorithms have been made using several datasets.
author2 Chen Lihui
author_facet Chen Lihui
Liao, Hongda
format Theses and Dissertations
author Liao, Hongda
author_sort Liao, Hongda
title Multiple centers based fuzzy clustering for imbalanced data
title_short Multiple centers based fuzzy clustering for imbalanced data
title_full Multiple centers based fuzzy clustering for imbalanced data
title_fullStr Multiple centers based fuzzy clustering for imbalanced data
title_full_unstemmed Multiple centers based fuzzy clustering for imbalanced data
title_sort multiple centers based fuzzy clustering for imbalanced data
publishDate 2016
url http://hdl.handle.net/10356/68529
_version_ 1772828994992865280