Block-Based K-Medoids Partitioning Method with Standardized Data to Improve Clustering Accuracy

Bibliographic Details
Main Authors: Kariyam, Kariyam; Abdurakhman, Abdurakhman; Subanar, Subanar; Utami, Herni; Effendie, Adhitya Ronnie
Format: Article PeerReviewed
Language: English
Published: International Information and Engineering Technology Association (IIETA) 2022
Online Access: https://repository.ugm.ac.id/282735/1/Kariyam_PA.pdf
https://repository.ugm.ac.id/282735/
http://iieta.org/journals/mmep
https://doi.org/10.18280/mmep.090622
Institution: Universitas Gadjah Mada
Description
Summary: Most existing k-medoids algorithms select the initial medoids randomly or with a specific formula based on the proximity matrix. This study proposes a block-based k-medoids partitioning method (Block-KM) for clustering objects. To obtain the initial medoids, we search for a representative object from the block of the standard deviation and the sum of the variable values. We then optimize the initial groups to update the medoids, which reduces the number of iterations needed to obtain the partitioned data. The block-based k-medoids partitioning method applies to all types of data. To improve clustering accuracy, we pre-process the data through standardization. We conducted a series of experiments on eight real data sets and three artificial data sets to evaluate the proposed method's performance in terms of clustering accuracy. The experimental results show that Block-KM is more efficient in terms of the number of iterations. The clustering accuracy of Block-KM on the eight real data sets is also comparable to that of other methods. Data standardization is effective in increasing clustering accuracy, especially for block k-medoids, k-means, simple and fast k-medoids, and the Ward method.
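
The two generic ingredients named in the summary, standardizing the variables and then updating medoids from a set of initial groups, can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the authors' Block-KM procedure: the record does not give the block construction, so `initial_groups_by_row_sum` is a hypothetical stand-in that only borrows the idea of ordering objects by the sum of their variable values, and the Manhattan distance and all function names are likewise assumptions.

```python
import numpy as np

def standardize(X):
    """Z-score each variable (column); constant columns are mapped to zero."""
    X = np.asarray(X, dtype=float)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant variables
    return (X - X.mean(axis=0)) / std

def initial_groups_by_row_sum(Z, k):
    """HYPOTHETICAL initial grouping (not the paper's block rule):
    sort objects by the sum of their variable values and cut into k blocks."""
    order = np.argsort(Z.sum(axis=1))
    labels = np.empty(len(Z), dtype=int)
    for g, block in enumerate(np.array_split(order, k)):
        labels[block] = g
    return labels

def update_medoids(Z, labels, k):
    """For each group, pick the member with the smallest total distance
    to the other members. Assumes every group is non-empty."""
    medoids = np.empty(k, dtype=int)
    for g in range(k):
        members = np.flatnonzero(labels == g)
        pts = Z[members]
        # Pairwise Manhattan distances within the group (assumed metric).
        dist = np.abs(pts[:, None, :] - pts[None, :, :]).sum(axis=2)
        medoids[g] = members[dist.sum(axis=1).argmin()]
    return medoids

def assign(Z, medoids):
    """Assign every object to its nearest medoid."""
    dist = np.abs(Z[:, None, :] - Z[medoids][None, :, :]).sum(axis=2)
    return dist.argmin(axis=1)

def k_medoids(Z, init_labels, k, max_iter=100):
    """Alternate medoid update and reassignment until the labels stop changing."""
    labels = np.asarray(init_labels)
    for n_iter in range(1, max_iter + 1):
        medoids = update_medoids(Z, labels, k)
        new_labels = assign(Z, medoids)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return medoids, labels, n_iter

# Toy data with variables on very different scales (made up for illustration).
X = np.array([[1.0, 100.0], [2.0, 110.0], [1.5, 105.0],
              [10.0, 900.0], [11.0, 910.0], [10.5, 905.0]])
Z = standardize(X)
medoids, labels, n_iter = k_medoids(Z, initial_groups_by_row_sum(Z, 2), k=2)
print(medoids, labels, n_iter)
```

Starting from groups that are already close to the final partition means the loop can stop after very few passes, which is the kind of iteration saving the summary attributes to a good initialization. Standardizing first keeps the large-scale variable from dominating the distances.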