Enhancing rough set theory attributes selection of KDD Cup 1999

Bibliographic Details
Main Authors: Jebur, Hamid H., Maarof, Mohd. Aizaini, Zainal, Anazida
Format: Article
Published: Asian Research Publishing Network 2015
Online Access: http://eprints.utm.my/id/eprint/55033/
Institution: Universiti Teknologi Malaysia
Description
Summary: Attribute selection (feature selection) is an important technique for data preprocessing and dimensionality reduction. Rough set theory has been used for attribute selection with great success. The optimal solution of rough set attribute selection is a subset of attributes called a reduct. Rough set theory uses approximation during the reduction process to handle inconsistent information. However, some rough set approaches to attribute selection are inadequate at finding optimal reductions, as no heuristic can guarantee optimality. Applying rough set theory to select the optimal attribute subset of KDD Cup 1999 does not guarantee finding the optimal reduct for each class of this dataset, because of the overlap between the lower and upper approximations of each class and the overlap between the reducts of all classes. This paper introduces a new approach that enhances the reducts of all classes by overcoming the overlap problem of rough sets: the union and voting attributes of all dataset classes are added as new reducts alongside the normal reduct. All reducts were evaluated using different classification algorithms. The approach produced two generic attribute sets that achieved high accuracy rates, comparable to those of the normal rough set attributes for the same dataset.
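
The idea of combining per-class reducts into union and voting attribute sets can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record does not state how votes are counted, so the `min_votes` threshold, the function names, and the reduct memberships shown here are all assumptions made for illustration.

```python
from collections import Counter


def union_attributes(class_reducts):
    """Attributes that appear in the reduct of at least one class."""
    return set().union(*class_reducts.values())


def voting_attributes(class_reducts, min_votes):
    """Attributes that appear in the reducts of at least `min_votes` classes.

    The threshold rule is an assumption; the record does not define it.
    """
    votes = Counter(
        attr for reduct in class_reducts.values() for attr in set(reduct)
    )
    return {attr for attr, n in votes.items() if n >= min_votes}


# Hypothetical per-class reducts. The memberships are invented for this
# example and are not results from the paper.
class_reducts = {
    "normal": {"service", "src_bytes", "count"},
    "dos": {"src_bytes", "count", "flag"},
    "probe": {"service", "src_bytes", "dst_host_count"},
}

union_set = union_attributes(class_reducts)
voting_set = voting_attributes(class_reducts, min_votes=2)

print(sorted(union_set))
print(sorted(voting_set))
```

Under these assumptions, the union set keeps every attribute that any class needs, while the voting set keeps only attributes shared by several classes, so it is the smaller of the two.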