Privacy preservation for associative classification

Bibliographic Details
Main Authors: Harnsamut, N., Natwichai, J., Sun, X., Li, X.
Format: Article
Published: Wiley-Blackwell 2015
Online Access: http://www.scopus.com/inward/record.url?partnerID=HzOxMe3b&scp=84911003397&origin=inward
http://cmuir.cmu.ac.th/handle/6653943832/39069
Institution: Chiang Mai University
Description
Summary: © 2014 Wiley Periodicals, Inc. Privacy preservation is becoming a critical issue for data-mining processes. In practice, a data transformation process is often needed to preserve privacy. However, data transformation introduces a data quality issue. The impact of the transformation on data quality should therefore be estimated and made clear to the user of the data transformation process. In this article, we consider the problem of k-anonymization transformation in associative classification. The privacy preservation and data quality issues are addressed in two ways. First, we propose a frequency-based data quality metric to represent the data quality for associative classification. Second, we propose a novel heuristic algorithm, namely minimum classification correction rate transformation. The algorithm is guided by the classification correction rate of the given datasets. We validate the proposed metric and algorithm with datasets from the University of California-Irvine repository. The experimental results show that the proposed metric can effectively demonstrate the data quality for associative classification. The results also show that the proposed algorithm is not only efficient but also highly effective.
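Illustrative note: the summary refers to k-anonymization without defining it. The sketch below shows only the general k-anonymity condition (every combination of quasi-identifier values must be shared by at least k records) and how generalizing a value can satisfy it. It is not the article's algorithm or metric. The function names `is_k_anonymous` and `generalize_age`, the field names, and the sample records are all hypothetical and chosen for illustration.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every quasi-identifier combination occurs at least k times."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())


def generalize_age(record, width=10):
    """Replace an exact age with a coarser range (hypothetical generalization step)."""
    low = (record["age"] // width) * width
    return {**record, "age": f"{low}-{low + width - 1}"}


# Made-up sample data: "age" and "zip" act as quasi-identifiers, "label" is the class.
records = [
    {"age": 23, "zip": "50200", "label": "yes"},
    {"age": 27, "zip": "50200", "label": "no"},
    {"age": 34, "zip": "50200", "label": "yes"},
    {"age": 38, "zip": "50200", "label": "yes"},
]

# Before generalization every (age, zip) pair is unique, so 2-anonymity fails.
raw_ok = is_k_anonymous(records, ["age", "zip"], k=2)

# After mapping ages to 10-year ranges, each (age, zip) group holds two records.
generalized = [generalize_age(r) for r in records]
generalized_ok = is_k_anonymous(generalized, ["age", "zip"], k=2)
```

The generalization trades precision for privacy: the coarser age ranges are what a data quality metric, such as the frequency-based one the summary mentions, would need to account for.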