COMPARATIVE STUDY CLUSTERING ALGORITHM PERFORMANCE FOR GROUPING EMPLOYEE MEDICAL RECORD DATA AT PT PLN INDONESIA POWER KAMOJANG

Bibliographic Details
Main Author: Elysa Risti, Vera
Format: Theses
Language: Indonesian
Online Access: https://digilib.itb.ac.id/gdl/view/86683
Institution: Institut Teknologi Bandung
Description
Summary: PT PLN Indonesia Power Kamojang is a subsidiary of PT PLN (Persero). As part of its employee welfare facilities, the company provides health benefits in the form of health claims that employees can submit. Until now, the resulting medical records have not been used specifically to support company programs. The board of directors' circular letter No. 228.K/010/IP/2022 on the Human Experience Management System (HXMS) includes an article stating the company's focus on supporting new programs covering employees' health and their mental and physical wellness, namely the Employee Wellness Program. This motivated the author to support the program through research on employee medical records from January 2018 to June 2024, as input for management in determining the IP Wellness program for employees at PT PLN Indonesia Power Kamojang. The study compares several clustering algorithms, namely K-means, K-medoids, DBSCAN, Gaussian Mixture Models (GMM), and Agglomerative clustering, together with a comparison between PCA and Autoencoder dimensionality reduction, to determine the best model for this case. The optimal K value was determined using the Elbow method. The clustering experiments were carried out in Python using Google Colab and Jupyter Notebook. Each model was evaluated with the Davies-Bouldin Index (DBI), where a lower value indicates more compact, better-separated clusters. The smallest DBI value, 0.4506, was obtained by AE-GMM (GMM on Autoencoder-reduced data). Interpreting the result by cost, disease, and age yielded two clusters: Cluster 0, with low costs and mild diseases and fewer elderly and young employees, and Cluster 1, with high costs and severe diseases and a wider spread of both elderly and young employees.
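
The model selection step described in the summary (score each candidate clustering with the Davies-Bouldin Index and keep the one with the smallest value) can be illustrated with a minimal sketch. This is not the author's code, which the record does not include. The toy points, the two candidate labelings, and the function names are hypothetical, chosen only to show how the index is computed; in practice a library routine such as scikit-learn's `davies_bouldin_score` would normally be used instead.

```python
from math import dist
from statistics import fmean


def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    return tuple(fmean(axis) for axis in zip(*points))


def davies_bouldin(points, labels):
    """Davies-Bouldin Index: for each cluster take the worst ratio
    (scatter_i + scatter_j) / distance(centroid_i, centroid_j),
    then average over clusters. Lower is better.
    Assumes no two clusters share exactly the same centroid."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("DBI needs at least two clusters")
    cents = {k: centroid(v) for k, v in clusters.items()}
    # Scatter: average distance from each member to its cluster centroid.
    scatter = {
        k: fmean([dist(p, cents[k]) for p in v]) for k, v in clusters.items()
    }
    worst = [
        max(
            (scatter[i] + scatter[j]) / dist(cents[i], cents[j])
            for j in clusters
            if j != i
        )
        for i in clusters
    ]
    return fmean(worst)


# Hypothetical toy data: two well-separated groups in a 2-D feature space
# (for example, a scaled cost and a scaled age; not the study's data).
points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
    (5.0, 5.0), (5.0, 6.0), (6.0, 5.0), (6.0, 6.0),
]

# Two hypothetical candidate labelings standing in for competing models.
candidates = {
    "model_a": [0, 0, 0, 0, 1, 1, 1, 1],  # follows the two groups
    "model_b": [0, 0, 1, 1, 0, 0, 1, 1],  # mixes the two groups
}

scores = {name: davies_bouldin(points, labels) for name, labels in candidates.items()}
best = min(scores, key=scores.get)  # smallest DBI wins

for name, score in scores.items():
    print(f"{name}: DBI = {score:.4f}")
print("best:", best)
```

The labeling that follows the natural groups receives the lower index, which is the same criterion the summary applies when it reports AE-GMM as the best model.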