A comprehensive exploration to the machine learning techniques for diabetes identification
Diabetes mellitus, known as diabetes, is a group of metabolic disorders and has affected hundreds of millions of people. The detection of diabetes is of great importance, concerning its severe complications. There have been plenty of research studies about diabetes identification, many of which are...
Main Authors: | Wei, Sidong; Zhao, Xuejiao; Miao, Chunyan |
---|---|
Other Authors: | School of Computer Science and Engineering |
Format: | Conference or Workshop Item |
Language: | English |
Published: | 2019 |
Subjects: | Deep Neural Network; DRNTU::Engineering::Computer science and engineering; Machine Learning |
Online Access: | https://hdl.handle.net/10356/89478 http://hdl.handle.net/10220/47703 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-89478 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-89478; 2020-03-07T11:48:46Z; A comprehensive exploration to the machine learning techniques for diabetes identification; Wei, Sidong; Zhao, Xuejiao; Miao, Chunyan; School of Computer Science and Engineering; 2018 IEEE 4th World Forum on Internet of Things (WF-IoT); NTU-UBC Research Centre of Excellence in Active Living for the Elderly; Deep Neural Network; DRNTU::Engineering::Computer science and engineering; Machine Learning; Diabetes mellitus, commonly known as diabetes, is a group of metabolic disorders that has affected hundreds of millions of people. Detecting diabetes is of great importance given its severe complications. There have been many research studies on diabetes identification, many of which are based on the Pima Indian diabetes data set. It is a data set from a study of women in the Pima Indian population, started in 1965, where the onset rate of diabetes is comparatively high. Most previous studies focused on one or two particular complex techniques to test the data, while a comprehensive study across many common techniques is missing. In this paper, we make a comprehensive exploration of the most popular techniques (e.g. DNN (Deep Neural Network), SVM (Support Vector Machine), etc.) used to identify diabetes, as well as data preprocessing methods. We examine these techniques by their cross-validation accuracy on the Pima Indian data set. We compare the accuracy of each classifier across several data preprocessors, and we modify the parameters to improve their accuracy. The best technique we find has 77.86% accuracy using 10-fold cross-validation. We also analyze the relevance of each feature to the classification result.; Accepted version; 2019-02-19T06:34:52Z; 2019-12-06T17:26:36Z; 2019-02-19T06:34:52Z; 2019-12-06T17:26:36Z; 2018; Conference Paper; Wei, S., Zhao, X., & Miao, C. (2018). A comprehensive exploration to the machine learning techniques for diabetes identification. 2018 IEEE 4th World Forum on Internet of Things (WF-IoT). doi:10.1109/WF-IoT.2018.8355130; https://hdl.handle.net/10356/89478; http://hdl.handle.net/10220/47703; 10.1109/WF-IoT.2018.8355130; 208286; en; © 2018 Institute of Electrical and Electronics Engineers (IEEE). All rights reserved. This paper was published in 2018 IEEE 4th World Forum on Internet of Things (WF-IoT) and is made available with permission of Institute of Electrical and Electronics Engineers (IEEE).; 5 p.; application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
country |
Singapore |
collection |
DR-NTU |
language |
English |
topic |
Deep Neural Network; DRNTU::Engineering::Computer science and engineering; Machine Learning |
spellingShingle |
Deep Neural Network; DRNTU::Engineering::Computer science and engineering; Machine Learning; Wei, Sidong; Zhao, Xuejiao; Miao, Chunyan; A comprehensive exploration to the machine learning techniques for diabetes identification |
description |
Diabetes mellitus, commonly known as diabetes, is a group of metabolic disorders that has affected hundreds of millions of people. Detecting diabetes is of great importance given its severe complications. There have been many research studies on diabetes identification, many of which are based on the Pima Indian diabetes data set. It is a data set from a study of women in the Pima Indian population, started in 1965, where the onset rate of diabetes is comparatively high. Most previous studies focused on one or two particular complex techniques to test the data, while a comprehensive study across many common techniques is missing. In this paper, we make a comprehensive exploration of the most popular techniques (e.g. DNN (Deep Neural Network), SVM (Support Vector Machine), etc.) used to identify diabetes, as well as data preprocessing methods. We examine these techniques by their cross-validation accuracy on the Pima Indian data set. We compare the accuracy of each classifier across several data preprocessors, and we modify the parameters to improve their accuracy. The best technique we find has 77.86% accuracy using 10-fold cross-validation. We also analyze the relevance of each feature to the classification result. |
author2 |
School of Computer Science and Engineering |
author_facet |
School of Computer Science and Engineering; Wei, Sidong; Zhao, Xuejiao; Miao, Chunyan |
format |
Conference or Workshop Item |
author |
Wei, Sidong; Zhao, Xuejiao; Miao, Chunyan |
author_sort |
Wei, Sidong |
title |
A comprehensive exploration to the machine learning techniques for diabetes identification |
title_sort |
comprehensive exploration to the machine learning techniques for diabetes identification |
publishDate |
2019 |
url |
https://hdl.handle.net/10356/89478 http://hdl.handle.net/10220/47703 |
_version_ |
1681037436498477056 |
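
The description field in this record mentions comparing classifiers under several data preprocessors using 10-fold cross-validation accuracy. The following is a minimal sketch of that evaluation loop only, under stated assumptions: it uses synthetic data rather than the Pima Indian data set, a simple nearest-centroid classifier as a stand-in for the DNN and SVM models named in the abstract, and hypothetical helper names (`k_fold_indices`, `standardize`, `identity`, `cross_val_accuracy`) that do not come from the paper. It is not the authors' implementation and does not reproduce their 77.86% figure.

```python
import random
from statistics import mean, pstdev


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k folds (sizes differ by at most 1)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def identity(train, test):
    """Preprocessor that leaves the features unchanged."""
    return train, test


def standardize(train, test):
    """Z-score each feature using statistics from the training fold only."""
    cols = list(zip(*train))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns

    def scale(rows):
        return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]

    return scale(train), scale(test)


def nearest_centroid_predict(train_x, train_y, test_x):
    """Stand-in classifier (an assumption): assign each row to the closest class mean."""
    centroids = {}
    for label in set(train_y):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        centroids[label] = [mean(c) for c in zip(*rows)]

    def dist2(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    return [min(centroids, key=lambda lab: dist2(x, centroids[lab])) for x in test_x]


def cross_val_accuracy(X, y, preprocess, k=10, seed=0):
    """Mean held-out accuracy over k folds; assumes len(X) >= k."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(X)) if i not in held]
        tr_x, te_x = preprocess([X[i] for i in train_idx], [X[i] for i in held_out])
        preds = nearest_centroid_predict(tr_x, [y[i] for i in train_idx], te_x)
        correct = sum(p == y[i] for p, i in zip(preds, held_out))
        scores.append(correct / len(held_out))
    return mean(scores)


def make_synthetic(n=200, seed=42):
    """Synthetic two-class data: one informative feature, one large-scale noise feature."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([rng.gauss(label * 2.0, 1.0), rng.gauss(0.0, 100.0)])
        y.append(label)
    return X, y


X, y = make_synthetic()
results = {
    name: cross_val_accuracy(X, y, fn, k=10)
    for name, fn in [("raw", identity), ("standardized", standardize)]
}
for name, acc in results.items():
    print(f"{name}: 10-fold accuracy = {acc:.3f}")
```

Fitting the preprocessor on the training fold alone, as `standardize` does here, keeps held-out rows from influencing the scaling statistics; whether the paper follows the same convention is not stated in this record.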