MACHINE LEARNING BASED DECISION SUPPORT SYSTEM FOR DIABETES CLASSIFICATION BASED ON NHANES 2013-2014 DATA

Bibliographic Details
Main Author: Margaretha Purwandari, Patricia
Format: Final Project
Language: Indonesia
Online Access: https://digilib.itb.ac.id/gdl/view/54040
Institution: Institut Teknologi Bandung
Description
Summary: Diabetes is a major public health problem and one of the four priority noncommunicable diseases that world leaders have targeted for action. The Household Health Survey (SKRT) 1995-2001 and Riskesdas 2007 showed that noncommunicable diseases such as stroke, hypertension, diabetes mellitus, tumors, and heart disease were the main causes of death in Indonesia. In 2007, noncommunicable diseases accounted for 59.5% of deaths in Indonesia. Although raising diabetes awareness is crucial, few health technologies yet help build that awareness at the individual level. Because knowledge about diabetes is low, many people do not know whether they have the disease, so early detection is essential to curbing its rising prevalence. In this final project, the author builds machine learning models, written in Python in Jupyter Notebook, that classify diabetes categories using the National Health and Nutrition Examination Survey (NHANES) 2013-2014 as the data source, aiming for the highest possible accuracy. The results can be developed into a clinical decision support system (CDSS) that supports early detection of diabetes. The multilayer perceptron (MLP), support vector machine (SVM), and Gradient Boosting models, each using 84 features, reached an accuracy of 92%. A model using only 13 features reached an accuracy of 86%.
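The comparison described in the summary can be illustrated with a minimal sketch in Python using scikit-learn. This is not the author's notebook: the data are synthetic (generated with `make_classification` rather than taken from NHANES), the label is assumed to be binary, the hyperparameters are arbitrary, and the reduction to 13 features uses `SelectKBest` with an ANOVA F-test as a stand-in, since the record does not say how the 13 features were chosen. The scores it prints therefore say nothing about the 92% and 86% reported above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumption: synthetic stand-in for the NHANES 2013-2014 table,
# with 84 features and a binary label.
X, y = make_classification(
    n_samples=600, n_features=84, n_informative=13, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)


def build_models():
    """Return the three classifier families named in the summary.

    Hyperparameters are illustrative guesses, not the thesis settings.
    """
    return {
        "MLP": MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0),
        "SVM": SVC(kernel="rbf"),
        "Gradient Boosting": GradientBoostingClassifier(n_estimators=50, random_state=0),
    }


def evaluate(k_features=None):
    """Fit each model and return its test accuracy.

    If k_features is given, keep only the k best features by ANOVA F-score
    (an assumed selection method) before fitting.
    """
    scores = {}
    for name, model in build_models().items():
        steps = [StandardScaler()]
        if k_features is not None:
            steps.append(SelectKBest(f_classif, k=k_features))
        steps.append(model)
        pipeline = make_pipeline(*steps)
        pipeline.fit(X_train, y_train)
        scores[name] = pipeline.score(X_test, y_test)  # mean accuracy
    return scores


full_scores = evaluate()                 # all 84 features
reduced_scores = evaluate(k_features=13)  # 13 selected features

for name in full_scores:
    print(f"{name}: 84 features {full_scores[name]:.2f}, "
          f"13 features {reduced_scores[name]:.2f}")
```

Wrapping scaling and feature selection in one pipeline with the classifier means both are fitted on the training split only, so the held-out accuracy is not inflated by information from the test rows.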