A Comparative Study on the Pre-Processing and Mining of Pima Indian Diabetes Dataset

Data mining of medical data has successfully converted raw data into useful information. This information helps medical experts improve the diagnosis and treatment of diseases. In this paper, we review data mining applications that have been applied exclusively to an open-source diabetes dataset....

Bibliographic Details
Main Authors:	Amatul, Zehra; Tuty Asmawaty, Abdul Kadir; M.A. M., Aznan
Format:	Conference or Workshop Item
Language:	English
Published:	2013
Subjects:	QA75 Electronic computers. Computer science
Online Access:	http://umpir.ump.edu.my/id/eprint/5035/1/31-UMP.pdf http://umpir.ump.edu.my/id/eprint/5035/
Institution:	Universiti Malaysia Pahang

id	my.ump.umpir.5035
record_format	eprints
spelling	my.ump.umpir.5035 2018-04-26T00:56:22Z http://umpir.ump.edu.my/id/eprint/5035/ 2013-08-20 Conference or Workshop Item PeerReviewed application/pdf en http://umpir.ump.edu.my/id/eprint/5035/1/31-UMP.pdf Amatul, Zehra and Tuty Asmawaty, Abdul Kadir and M.A. M., Aznan (2013) A Comparative Study on the Pre-Processing and Mining of Pima Indian Diabetes Dataset. In: 3rd International Conference on Software Engineering & Computer Systems (ICSECS - 2013), 20-22 August 2013, Universiti Malaysia Pahang. pp. 1-10.
institution	Universiti Malaysia Pahang
building	UMP Library
collection	Institutional Repository
continent	Asia
country	Malaysia
content_provider	Universiti Malaysia Pahang
content_source	UMP Institutional Repository
url_provider	http://umpir.ump.edu.my/
language	English
topic	QA75 Electronic computers. Computer science
description	Data mining of medical data has successfully converted raw data into useful information that helps medical experts improve the diagnosis and treatment of diseases. In this paper, we review data mining applications that have been applied exclusively to an open-source diabetes dataset. Type II Diabetes Mellitus is one of the silent killer diseases worldwide. According to the World Health Organization, 346 million people worldwide suffer from diabetes. Diabetes is diagnosed or predicted with various data mining techniques such as association, classification, clustering and pattern recognition. The review raised a related open issue: the need to identify the relationships between the major factors that lead to the development of diabetes. This is possible by mining patterns between the independent and dependent variables in the dataset. This paper compares the classification accuracies obtained on non-processed and pre-processed data. The results show that the pre-processed data gives better classification accuracy.
format	Conference or Workshop Item
author	Amatul, Zehra; Tuty Asmawaty, Abdul Kadir; M.A. M., Aznan
author_sort	Amatul, Zehra
title	A Comparative Study on the Pre-Processing and Mining of Pima Indian Diabetes Dataset
title_sort	comparative study on the pre-processing and mining of pima indian diabetes dataset
publishDate	2013
url	http://umpir.ump.edu.my/id/eprint/5035/1/31-UMP.pdf http://umpir.ump.edu.my/id/eprint/5035/
_version_	1643665116398354432
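
The comparison described in the abstract (classification accuracy on non-processed versus pre-processed data) can be illustrated with a minimal Python sketch. The record does not say which pre-processing steps or classifiers the paper used, so everything here is an assumption made for illustration: zeros in selected columns are treated as missing values and replaced by the column median, features are min-max scaled, and accuracy is measured with a leave-one-out 1-nearest-neighbour classifier. The function names and the toy rows are hypothetical; the rows are invented values, not records from the Pima Indian Diabetes Dataset.

```python
from statistics import median


def impute_zeros(rows, cols):
    """Replace zeros in the given columns with the median of the non-zero values."""
    out = [list(r) for r in rows]
    for c in cols:
        nonzero = [r[c] for r in rows if r[c] != 0]
        if not nonzero:
            continue  # nothing to estimate the median from
        m = median(nonzero)
        for r in out:
            if r[c] == 0:
                r[c] = m
    return out


def min_max_scale(rows):
    """Scale every column to [0, 1]; constant columns become 0.0."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [
        [(v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(r, lo, hi)]
        for r in rows
    ]


def loo_1nn_accuracy(rows, labels):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier."""
    correct = 0
    for i, x in enumerate(rows):
        nearest = min(
            (j for j in range(len(rows)) if j != i),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(x, rows[j])),
        )
        correct += labels[nearest] == labels[i]
    return correct / len(rows)


def compare(rows, labels, zero_as_missing_cols):
    """Return (accuracy on raw data, accuracy on pre-processed data)."""
    cleaned = min_max_scale(impute_zeros(rows, zero_as_missing_cols))
    return loo_1nn_accuracy(rows, labels), loo_1nn_accuracy(cleaned, labels)


# Invented toy rows (hypothetical columns: glucose, insulin, BMI); 0 marks a missing value.
TOY_ROWS = [
    [150, 0, 34.0],
    [90, 80, 25.0],
    [170, 200, 38.0],
    [95, 0, 27.0],
    [160, 180, 0],
    [100, 90, 24.0],
]
TOY_LABELS = [1, 0, 1, 0, 1, 0]

if __name__ == "__main__":
    raw_acc, pre_acc = compare(TOY_ROWS, TOY_LABELS, zero_as_missing_cols=[1, 2])
    print(f"raw: {raw_acc:.2f}  pre-processed: {pre_acc:.2f}")
```

On a dataset this small the two accuracies say nothing about the paper's findings; the sketch only shows the shape of such a comparison, in which the same classifier and evaluation protocol are run on both versions of the data.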

Similar Items