Data Mining Classification Techniques and Performances on Medical Data

This study evaluates the performance of classification techniques using several software packages, among them Rosetta, Tanagra, Weka and Orange. The classification techniques were tested on six medical datasets from the UCI Machine Learning Repository. The study will help researcher...

Bibliographic Details
Main Author: Benyehmad, Yahyia Mohammed M. Ali
Format: Thesis
Language: English
Published: 2006
Subjects: QA76 Computer software
Online Access: http://etd.uum.edu.my/1864/1/Yahyia_Mohammed_M._Ali_Benyehmad_-_Data_mining_classification_techniques_and_performances_on_medical_data.pdf
http://etd.uum.edu.my/1864/2/Yahyia_Mohammed_M._Ali_Benyehmad_-_Data_mining_classification_techniques_and_performances_on_medical_data.pdf
http://etd.uum.edu.my/1864/
Institution: Universiti Utara Malaysia
id my.uum.etd.1864
record_format eprints
spelling my.uum.etd.1864 2013-07-24T12:13:28Z http://etd.uum.edu.my/1864/ Data Mining Classification Techniques and Performances on Medical Data Benyehmad, Yahyia Mohammed M. Ali QA76 Computer software This study evaluates the performance of classification techniques using several software packages, among them Rosetta, Tanagra, Weka and Orange. The classification techniques were tested on six medical datasets from the UCI Machine Learning Repository. The study will help researchers select the most suitable classification technique for medical datasets in terms of classification accuracy. In this thesis, sixteen classification techniques are evaluated and compared: Radial Basis Function (RBF) and Multilayer Perceptron (MLP) neural networks, Multiple Linear Regression (MLR), Logistic Regression (LR), classification trees (ID3, C4.5, J48, CART), Naive Bayes (NB), Support Vector Machines (SVM), k-Nearest Neighbors (kNN), Linear Discriminant Analysis (LDA), a rule-based classifier, standard voting, voting with object tracking, and standard tuned voting (RSES). The experiments were validated using 10-fold cross-validation. The results show that the most suitable classification technique is NB, with an average classification accuracy of 90.13% and an average error rate of 9.87%. The worst classification technique is SLR, with an average classification accuracy of 50.16% and an average error rate of 49.84%. The classification techniques are ranked from best to worst by average classification accuracy and average error rate. The ranking, from best to worst, is NB, LDA, LR, SVM, C4.5, MLP, RBF, kNN, RuleB, ID3, CART, J48, SV, RSES, V, and SLR.
2006 Thesis NonPeerReviewed application/pdf en http://etd.uum.edu.my/1864/1/Yahyia_Mohammed_M._Ali_Benyehmad_-_Data_mining_classification_techniques_and_performances_on_medical_data.pdf application/pdf en http://etd.uum.edu.my/1864/2/Yahyia_Mohammed_M._Ali_Benyehmad_-_Data_mining_classification_techniques_and_performances_on_medical_data.pdf Benyehmad, Yahyia Mohammed M. Ali (2006) Data Mining Classification Techniques and Performances on Medical Data. Masters thesis, Universiti Utara Malaysia.
institution Universiti Utara Malaysia
building UUM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Utara Malaysia
content_source UUM Electronic Theses
url_provider http://etd.uum.edu.my/
language English
English
topic QA76 Computer software
spellingShingle QA76 Computer software
Benyehmad, Yahyia Mohammed M. Ali
Data Mining Classification Techniques and Performances on Medical Data
description This study evaluates the performance of classification techniques using several software packages, among them Rosetta, Tanagra, Weka and Orange. The classification techniques were tested on six medical datasets from the UCI Machine Learning Repository. The study will help researchers select the most suitable classification technique for medical datasets in terms of classification accuracy. In this thesis, sixteen classification techniques are evaluated and compared: Radial Basis Function (RBF) and Multilayer Perceptron (MLP) neural networks, Multiple Linear Regression (MLR), Logistic Regression (LR), classification trees (ID3, C4.5, J48, CART), Naive Bayes (NB), Support Vector Machines (SVM), k-Nearest Neighbors (kNN), Linear Discriminant Analysis (LDA), a rule-based classifier, standard voting, voting with object tracking, and standard tuned voting (RSES). The experiments were validated using 10-fold cross-validation. The results show that the most suitable classification technique is NB, with an average classification accuracy of 90.13% and an average error rate of 9.87%. The worst classification technique is SLR, with an average classification accuracy of 50.16% and an average error rate of 49.84%. The classification techniques are ranked from best to worst by average classification accuracy and average error rate. The ranking, from best to worst, is NB, LDA, LR, SVM, C4.5, MLP, RBF, kNN, RuleB, ID3, CART, J48, SV, RSES, V, and SLR.
format Thesis
author Benyehmad, Yahyia Mohammed M. Ali
author_facet Benyehmad, Yahyia Mohammed M. Ali
author_sort Benyehmad, Yahyia Mohammed M. Ali
title Data Mining Classification Techniques and Performances on Medical Data
title_short Data Mining Classification Techniques and Performances on Medical Data
title_full Data Mining Classification Techniques and Performances on Medical Data
title_fullStr Data Mining Classification Techniques and Performances on Medical Data
title_full_unstemmed Data Mining Classification Techniques and Performances on Medical Data
title_sort data mining classification techniques and performances on medical data
publishDate 2006
url http://etd.uum.edu.my/1864/1/Yahyia_Mohammed_M._Ali_Benyehmad_-_Data_mining_classification_techniques_and_performances_on_medical_data.pdf
http://etd.uum.edu.my/1864/2/Yahyia_Mohammed_M._Ali_Benyehmad_-_Data_mining_classification_techniques_and_performances_on_medical_data.pdf
http://etd.uum.edu.my/1864/
_version_ 1644276538223362048
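
Illustrative note on the evaluation described in the abstract. The abstract reports 10-fold cross-validation, average accuracy with its complementary error rate, and a ranking by average accuracy. The Python sketch below is not the thesis's code (the thesis used Rosetta, Tanagra, Weka and Orange); it only shows, under stated assumptions, how those three pieces fit together. The function names `error_rate`, `rank_by_accuracy` and `k_fold_indices` are hypothetical, the folds are assumed to be contiguous and unshuffled, and the only figures used are the two averages quoted in the abstract (NB and SLR).

```python
from typing import Dict, List, Tuple

# Average accuracies (%) quoted in the abstract. Only these two values are
# reported there; the other fourteen techniques are deliberately left out.
REPORTED_ACCURACY: Dict[str, float] = {"SLR": 50.16, "NB": 90.13}


def error_rate(accuracy_pct: float) -> float:
    """Error rate as the complement of accuracy, both in percent."""
    return round(100.0 - accuracy_pct, 2)


def rank_by_accuracy(scores: Dict[str, float]) -> List[Tuple[str, float, float]]:
    """Return (name, accuracy, error) tuples sorted from best to worst."""
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [(name, acc, error_rate(acc)) for name, acc in ordered]


def k_fold_indices(n_samples: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k (train, test) index pairs.

    Assumption: contiguous, unshuffled folds. The first n_samples % k folds
    receive one extra sample so that every index is tested exactly once.
    """
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds: List[Tuple[List[int], List[int]]] = []
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


if __name__ == "__main__":
    for name, acc, err in rank_by_accuracy(REPORTED_ACCURACY):
        print(f"{name}: accuracy {acc:.2f}%, error {err:.2f}%")
    print(len(k_fold_indices(25)), "folds")
```

In a full experiment, each classifier would be trained on the `train` indices and scored on the `test` indices of every fold, the per-fold accuracies averaged per dataset, and those averages averaged again across the six datasets before ranking.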