Missing data estimation on heart disease using artificial neural network and rough set theory

The objective of this research is to implement a method for estimating the real missing data in heart disease datasets and to show how it affects the resulting knowledge. Missing data is a common problem in Knowledge Discovery from Database (KDD) processes that can lead to significant error in extracted...

Bibliographic Details
Main Authors: A.F.M., Hani, N.A., Setiawan, P.A., Venkatachalam
Format: Conference or Workshop Item
Published: 2007
Online Access: http://eprints.utp.edu.my/398/1/paper.pdf
http://www.scopus.com/inward/record.url?eid=2-s2.0-57949088688&partnerID=40&md5=75bbf75b9828358c3bcaaa97f01b7095
http://eprints.utp.edu.my/398/
Institution: Universiti Teknologi Petronas
id my.utp.eprints.398
record_format eprints
spelling my.utp.eprints.398 2017-01-19T08:26:59Z Conference or Workshop Item NonPeerReviewed application/pdf
A.F.M., Hani and N.A., Setiawan and P.A., Venkatachalam (2007) Missing data estimation on heart disease using artificial neural network and rough set theory. In: 2007 International Conference on Intelligent and Advanced Systems, ICIAS 2007, 25 November 2007 through 28 November 2007, Kuala Lumpur. http://eprints.utp.edu.my/398/
institution Universiti Teknologi Petronas
building UTP Resource Centre
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Petronas
content_source UTP Institutional Repository
url_provider http://eprints.utp.edu.my/
topic TK Electrical engineering. Electronics Nuclear engineering
description The objective of this research is to implement a method for estimating the real missing data in heart disease datasets and to show how it affects the resulting knowledge. Missing data is a common problem in Knowledge Discovery from Database (KDD) processes that can lead to significant error in extracted knowledge. We use a hybridization of Artificial Neural Network and Rough Set Theory (ANNRST) to estimate the real missing data in the heart disease datasets from UCI (University of California, Irvine) [1]. An ANN with reduced input features is used to estimate the missing data. RST is used to reduce the dimensionality of the input features and to extract knowledge, as reducts and rules, from the heart disease datasets with estimated missing data. RST, decomposition tree, Local Transfer Function Classifier (LTF-C) and k-Nearest Neighbor (k-NN) classifiers are used to calculate the accuracy. A comparative study with k-NN estimation, most common attribute value filling and deletion of missing data is made to evaluate the extracted knowledge. ANNRST can be considered the appropriate estimation method when a strong relationship between the original complete datasets and the estimated datasets is important (that is, when the estimated datasets must really represent the nature of the original complete datasets), as it gives the best accuracy and coverage for almost all the classifiers. ©2007 IEEE.
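The three comparison baselines named in the description (deletion of missing data, most common attribute value filling and k-NN estimation) can be illustrated with a minimal sketch. This is not the authors' implementation and it does not cover ANNRST itself. The toy rows, the function names, the `MISSING` marker, the numeric coding of attributes, the Euclidean distance and the majority vote among neighbours are all assumptions made for illustration.

```python
from collections import Counter
import math

MISSING = None  # hypothetical marker for a missing attribute value


def delete_missing(rows):
    """Deletion baseline: keep only rows that have no missing value."""
    return [r for r in rows if MISSING not in r]


def fill_most_common(rows):
    """Fill each missing value with the most common observed value of its column."""
    filled = [list(r) for r in rows]
    for col in range(len(rows[0])):
        observed = [r[col] for r in rows if r[col] is not MISSING]
        mode = Counter(observed).most_common(1)[0][0]
        for r in filled:
            if r[col] is MISSING:
                r[col] = mode
    return filled


def fill_knn(rows, k=3):
    """Estimate each missing value by majority vote among the k nearest complete rows."""
    complete = delete_missing(rows)
    filled = [list(r) for r in rows]
    for r in filled:
        gaps = [c for c, v in enumerate(r) if v is MISSING]
        if not gaps:
            continue
        known = [c for c, v in enumerate(r) if v is not MISSING]
        # Euclidean distance computed only over the attributes this row has.
        nearest = sorted(
            complete,
            key=lambda d: math.dist([r[c] for c in known], [d[c] for c in known]),
        )[:k]
        for c in gaps:
            r[c] = Counter(n[c] for n in nearest).most_common(1)[0][0]
    return filled


# Toy rows (invented, not taken from the UCI data): three numerically coded attributes.
rows = [
    [63, 1, 3],
    [61, 1, 3],
    [45, 0, 1],
    [44, 0, 1],
    [62, 1, MISSING],
    [46, MISSING, 1],
]

deleted = delete_missing(rows)
by_mode = fill_most_common(rows)
by_knn = fill_knn(rows, k=3)
```

On these toy rows the two filling strategies disagree: most common value filling assigns the column-wide mode to both gaps, while the k-NN estimate follows the rows most similar to each incomplete row. That difference is the kind of effect on the extracted knowledge that the comparative study evaluates.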
format Conference or Workshop Item
author A.F.M., Hani
N.A., Setiawan
P.A., Venkatachalam
title Missing data estimation on heart disease using artificial neural network and rough set theory
publishDate 2007
_version_ 1738655060961787904