Risk prediction analysis for classifying type 2 diabetes occurrence using local dataset

The steep rise in cases of diabetes mellitus (DM) among the global population has encouraged extensive research on DM, which has led to a large accumulation of DM-related data. In this context, data mining and machine learning applications prove to be powerful tools in transformin...

Bibliographic Details
Main Authors: Abd Rahman, M. Hafiz Fazren, Wan Salim, Wan Wardatul Amani, Abd-Wahab, Firdaus
Format: Article
Language: English
Published: IIUM Press 2020
Subjects: QA300 Analysis; R Medicine (General)
Online Access: http://irep.iium.edu.my/83609/1/83609_Risk%20prediction%20analysis%20for%20classifying%20type%202%20diabetes%20occurrence%20using%20local%20dataset_ft.pdf
http://irep.iium.edu.my/83609/
https://journals.iium.edu.my/bnrej/index.php/bnrej/article/view/43
Institution: Universiti Islam Antarabangsa Malaysia
id my.iium.irep.83609
record_format dspace
spelling my.iium.irep.83609 2020-10-13T04:29:09Z http://irep.iium.edu.my/83609/ Risk prediction analysis for classifying type 2 diabetes occurrence using local dataset Abd Rahman, M. Hafiz Fazren Wan Salim, Wan Wardatul Amani Abd-Wahab, Firdaus QA300 Analysis R Medicine (General) The steep rise in cases of diabetes mellitus (DM) among the global population has encouraged extensive research on DM, which has led to a large accumulation of DM-related data. In this context, data mining and machine learning applications prove to be powerful tools for transforming data into meaningful knowledge. Several machine learning tools have shown great promise in diabetes classification. However, challenges remain in obtaining an accurate model suitable for real-world application. Most disease risk-prediction models are found to be specific to a local population. In addition, real-world data are likely to be complex, incomplete and unorganized, making it a challenge to develop models around them. This research aims to develop a robust prediction model for the classification of type 2 diabetes mellitus (T2DM), with a focus on a Malaysian population, using several well-known machine learning algorithms such as Decision Tree, Support Vector Machine and Naïve Bayes. To achieve this, several data pre-processing methods are implemented to improve model performance. The models use local datasets obtained from IIUM medical centre records. Each model is validated using a hold-out split and 10-fold cross-validation. Ultimately, the performance of each model is evaluated and compared based on several statistical metrics that measure accuracy, precision, sensitivity and efficiency. The final result shows that the Random Forest model provides the best overall prediction performance in terms of accuracy (0.87), sensitivity (0.9), specificity (0.8), precision (0.9), F1-score (0.9) and AUC value (0.93) (Normal).
IIUM Press 2020-05-29 Article PeerReviewed application/pdf en http://irep.iium.edu.my/83609/1/83609_Risk%20prediction%20analysis%20for%20classifying%20type%202%20diabetes%20occurrence%20using%20local%20dataset_ft.pdf Abd Rahman, M. Hafiz Fazren and Wan Salim, Wan Wardatul Amani and Abd-Wahab, Firdaus (2020) Risk prediction analysis for classifying type 2 diabetes occurrence using local dataset. Biological and Natural Resources Engineering Journal, 3 (1). E-ISSN 2637-0719 https://journals.iium.edu.my/bnrej/index.php/bnrej/article/view/43
institution Universiti Islam Antarabangsa Malaysia
building IIUM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider International Islamic University Malaysia
content_source IIUM Repository (IREP)
url_provider http://irep.iium.edu.my/
language English
topic QA300 Analysis
R Medicine (General)
spellingShingle QA300 Analysis
R Medicine (General)
Abd Rahman, M. Hafiz Fazren
Wan Salim, Wan Wardatul Amani
Abd-Wahab, Firdaus
Risk prediction analysis for classifying type 2 diabetes occurrence using local dataset
description The steep rise in cases of diabetes mellitus (DM) among the global population has encouraged extensive research on DM, which has led to a large accumulation of DM-related data. In this context, data mining and machine learning applications prove to be powerful tools for transforming data into meaningful knowledge. Several machine learning tools have shown great promise in diabetes classification. However, challenges remain in obtaining an accurate model suitable for real-world application. Most disease risk-prediction models are found to be specific to a local population. In addition, real-world data are likely to be complex, incomplete and unorganized, making it a challenge to develop models around them. This research aims to develop a robust prediction model for the classification of type 2 diabetes mellitus (T2DM), with a focus on a Malaysian population, using several well-known machine learning algorithms such as Decision Tree, Support Vector Machine and Naïve Bayes. To achieve this, several data pre-processing methods are implemented to improve model performance. The models use local datasets obtained from IIUM medical centre records. Each model is validated using a hold-out split and 10-fold cross-validation. Ultimately, the performance of each model is evaluated and compared based on several statistical metrics that measure accuracy, precision, sensitivity and efficiency. The final result shows that the Random Forest model provides the best overall prediction performance in terms of accuracy (0.87), sensitivity (0.9), specificity (0.8), precision (0.9), F1-score (0.9) and AUC value (0.93) (Normal).
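The evaluation metrics named in the description (accuracy, sensitivity, specificity, precision, F1-score) and the 10-fold validation scheme can be illustrated with a minimal, standard-library Python sketch. This is not the authors' code: the function names (`confusion_counts`, `classification_metrics`, `k_fold_indices`), the encoding of the positive class as 1, and the toy labels are all assumptions made for illustration. AUC is left out because it needs predicted scores rather than hard labels, and "efficiency" is left out because the record does not define it.

```python
from typing import Dict, Iterator, List, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn); label 1 is assumed to be the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def _safe_div(numerator: float, denominator: float) -> float:
    """Divide, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute the standard confusion-matrix metrics for binary labels."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    sensitivity = _safe_div(tp, tp + fn)   # true positive rate (recall)
    specificity = _safe_div(tn, tn + fp)   # true negative rate
    precision = _safe_div(tp, tp + fp)     # positive predictive value
    return {
        "accuracy": _safe_div(tp + tn, tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": _safe_div(2 * precision * sensitivity, precision + sensitivity),
    }


def k_fold_indices(n_samples: int, k: int = 10) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, test_idx) for k contiguous folds, without shuffling."""
    # The first (n_samples % k) folds get one extra sample each.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


if __name__ == "__main__":
    # Toy labels invented for illustration; they are not data from the study.
    y_true = [1, 1, 1, 0, 0, 1, 0, 1, 0, 1]
    y_pred = [1, 1, 0, 0, 1, 1, 0, 1, 0, 1]
    print(classification_metrics(y_true, y_pred))
    for train_idx, test_idx in k_fold_indices(len(y_true), k=5):
        print(test_idx)
```

In practice the folds would usually be shuffled and stratified by class before a model is trained and scored on each one; the contiguous split above is kept deliberately simple.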
format Article
author Abd Rahman, M. Hafiz Fazren
Wan Salim, Wan Wardatul Amani
Abd-Wahab, Firdaus
author_facet Abd Rahman, M. Hafiz Fazren
Wan Salim, Wan Wardatul Amani
Abd-Wahab, Firdaus
author_sort Abd Rahman, M. Hafiz Fazren
title Risk prediction analysis for classifying type 2 diabetes occurrence using local dataset
title_sort risk prediction analysis for classifying type 2 diabetes occurrence using local dataset
publisher IIUM Press
publishDate 2020
url http://irep.iium.edu.my/83609/1/83609_Risk%20prediction%20analysis%20for%20classifying%20type%202%20diabetes%20occurrence%20using%20local%20dataset_ft.pdf
http://irep.iium.edu.my/83609/
https://journals.iium.edu.my/bnrej/index.php/bnrej/article/view/43
_version_ 1681489296677142528