ISTITAAH ASSESSMENT CLASSIFICATION SYSTEM BASED ON MACHINE LEARNING USING IMBALANCED DATA

Bibliographic Details
Main Author: Masykur Huda, Nuqson
Format: Theses
Language: Indonesia
Online Access: https://digilib.itb.ac.id/gdl/view/47061
Institution: Institut Teknologi Bandung
Description
Summary: Hajj is obligatory for every Muslim who meets the requirements. One criterion of capability is the health of the prospective pilgrim, known as istitaah. Assessing istitaah is important to ensure that prospective pilgrims receive health services appropriate to their condition. At present, doctors determine the istitaah classification from health records, analyzing each case manually, one by one. This study proposes a system that helps doctors analyze health data faster and more accurately, based on patterns in previous data. The authors designed an assessment classification system based on machine learning, in which health examination data labeled with istitaah status are used to build the model. One challenge in istitaah classification is that the training dataset is imbalanced: the classes “tidak istitaah” (not istitaah) and “tidak istitaah sementara” (temporarily not istitaah) are minority classes. Predictive analysis on imbalanced data can produce biased decisions for the minority classes. The minority classes are crucial in istitaah classification, because misclassifying them may leave prospective pilgrims unable to make the pilgrimage or may lead to a deterioration in pilgrims' health. This study explains how to handle imbalanced data to improve classifier performance. To do so, it uses the Synthetic Minority Oversampling Technique (SMOTE) with an adjustment to how neighbor values are determined for binary data. Classification performance is measured using K-fold cross-validation. In the test results, classification ability on the minority classes increased, with Area Under Curve (AUC) scores of 0.873 and 0.951.
In the expert evaluation, there was strong agreement that the classification assessment system is feasible to implement and could increase the effectiveness and efficiency of work, with a score of 13.69 on a scale of 15 (91.28%). Business process simulations indicate that the proposed system can reduce the average cycle time from 17.9 hours to 4.2 hours.
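
The oversampling step named in the summary can be illustrated with a minimal sketch of plain SMOTE, which creates synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbors. This is not the thesis's implementation: the function name `smote`, its parameters, the Euclidean distance, and the toy data are all assumptions made for illustration. In particular, plain interpolation produces fractional values for 0/1 features; the thesis's adjustment for binary data is not described in this record and is not reproduced here.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic samples built from minority-class feature vectors.

    Hypothetical helper for illustration: plain SMOTE with Euclidean distance.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    # A sample cannot have more neighbors than there are other samples.
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Find its k nearest minority neighbors (excluding itself).
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbor = rng.choice(others[:k])
        # Place the new point at a random position on the segment
        # between the base point and the chosen neighbor.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


# Toy minority class with two continuous features (made-up values).
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = smote(minority, n_new=6, k=3)
for point in new_points:
    print(point)
```

When oversampling is combined with K-fold cross-validation, a common practice is to oversample only the training folds, so that synthetic points derived from a held-out sample do not leak into its evaluation. Whether the thesis follows this order is not stated in the summary.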