PATIENT DIAGNOSIS RECOMMENDER SYSTEM BASED ON IMPORTANT FEATURES SELECTION

Research on the use of machine learning in the health sector, especially in medicine, is growing and is backed by existing regulations. Accurate and timely analysis of health-related data is essential for disease prevention and treatment. However, most research specif...

Bibliographic Details
Main Author: Berkat
Format: Theses
Language: Indonesia
Online Access: https://digilib.itb.ac.id/gdl/view/84264
Institution: Institut Teknologi Bandung
id id-itb.:84264
keywords Recommender system, CBF, multiclass, LightGBM, SHAP, K-NN, similarity/distance metric
timestamp 2024-08-14T20:41:05Z
building Institut Teknologi Bandung Library
continent Asia
country Indonesia
content_provider Institut Teknologi Bandung
collection Digital ITB
description Research on the use of machine learning in the health sector, especially in medicine, is growing and is backed by existing regulations. Accurate and timely analysis of health-related data is essential for disease prevention and treatment. However, most studies focus on using machine learning to predict a single specific disease, or use only one or two types of patient medical record data. The number of disease types is very large, and existing regulations, especially in Indonesia, state that confirming a patient's disease is the authority of medical personnel (doctors). To confirm a patient's disease, doctors need comprehensive medical record data. Therefore, it is necessary to build an algorithm model to overcome this problem. An artificial intelligence task that can address it is a recommender system with a multiclass output approach that provides a top-n list of a patient's possible diseases. Content-based filtering (CBF) is a recommender system approach that requires complete data attributes, and medical record data can meet that need. Patient medical record data has many attributes (features) and various data types, and not all of these features contribute to the patient's disease. Therefore, it is also necessary to build an algorithm model that selects the features that do contribute. The combination of the Light Gradient Boosting Machine (LightGBM) and SHapley Additive exPlanations (SHAP) is one method that can calculate the contribution value of each feature to the target class, while the K-Nearest Neighbors (K-NN) algorithm, with a similarity/distance metric chosen according to the data type, can handle the variety of feature values.
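The feature selection step described above can be illustrated with a minimal sketch. It assumes the per-feature contribution values (for example, SHAP values obtained from a trained LightGBM model) have already been computed; the feature names, the numbers, and the helper names `rank_features` and `select_top_k` are hypothetical and are not taken from the thesis. Ranking by mean absolute contribution is one common convention, not necessarily the one the author used.

```python
from statistics import mean


def rank_features(contributions, feature_names):
    """Rank features by mean absolute contribution.

    contributions[sample][class][feature] holds one contribution value
    per sample, per target class, per feature.
    """
    scores = []
    for j, name in enumerate(feature_names):
        values = [
            abs(per_class[j])
            for sample in contributions
            for per_class in sample
        ]
        scores.append((name, mean(values)))
    # Largest average contribution first.
    return sorted(scores, key=lambda item: item[1], reverse=True)


def select_top_k(contributions, feature_names, k):
    """Keep the names of the k highest-ranked features."""
    return [name for name, _ in rank_features(contributions, feature_names)[:k]]


# Toy data (made up): 2 samples, 2 classes, 3 features.
feature_names = ["age", "fever", "room_number"]
contributions = [
    [[0.20, -0.50, 0.01], [-0.20, 0.50, -0.01]],
    [[-0.10, 0.40, 0.00], [0.10, -0.40, 0.00]],
]

selected = select_top_k(contributions, feature_names, k=2)
print(selected)
```

In this toy input, `room_number` has a near-zero average contribution and is dropped, which mirrors the idea that not every medical record attribute contributes to the diagnosis.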
This study proposes a recommender system model for patient diagnosis that uses CBF and multiclass approaches, a combination of LightGBM and SHAP to calculate the contribution value of each feature, and a K-NN algorithm with Euclidean and Jaccard similarity/distance metrics to predict diseases. Overall, the proposed model performs better than the other reference models, with an accuracy of 82.19% and an F1-score of 82.38%.
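As an illustration of the prediction step, the sketch below shows one way a K-NN lookup could combine Euclidean distance on numeric features with Jaccard distance on set-valued features and return a top-n list of diagnoses. The record layout, the equal-weight sum of the two distances, the majority-vote ranking, and all data values are assumptions made for this example; the record does not state how the thesis combines the two metrics.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def jaccard_distance(a, b):
    """Jaccard distance between two sets (0.0 when both are empty)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def mixed_distance(p, q):
    # Assumption: numeric features are pre-scaled to [0, 1] and the two
    # distances are simply added with equal weight.
    return (euclidean(p["numeric"], q["numeric"])
            + jaccard_distance(p["symptoms"], q["symptoms"]))


def top_n_diagnoses(query, records, k=3, n=2):
    """Return the n most frequent diagnoses among the k nearest records."""
    neighbours = sorted(records, key=lambda r: mixed_distance(query, r))[:k]
    votes = Counter(r["diagnosis"] for r in neighbours)
    return [diagnosis for diagnosis, _ in votes.most_common(n)]


# Toy records (made up for illustration).
records = [
    {"numeric": (0.9, 0.8), "symptoms": {"fever", "cough"}, "diagnosis": "influenza"},
    {"numeric": (0.8, 0.9), "symptoms": {"fever", "cough", "fatigue"}, "diagnosis": "influenza"},
    {"numeric": (0.7, 0.7), "symptoms": {"fever", "rash"}, "diagnosis": "dengue"},
    {"numeric": (0.1, 0.2), "symptoms": {"thirst"}, "diagnosis": "diabetes"},
]
query = {"numeric": (0.85, 0.8), "symptoms": {"fever", "cough"}}

recommendations = top_n_diagnoses(query, records, k=3, n=2)
print(recommendations)
```

Returning a ranked list rather than a single label fits the stated goal of the system: it gives the doctor candidate diagnoses to review, and the confirmation stays with medical personnel.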