An efficient algorithm for incremental privacy breach on (k, e)-anonymous model

Bibliographic Details
Main Authors: Bowonsak Srisungsittisunti, Juggapong Natwichai
Format: Conference Proceeding
Published: 2018
Online Access: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=84893299608&origin=inward
http://cmuir.cmu.ac.th/jspui/handle/6653943832/52452
Institution: Chiang Mai University
Description
Summary: Collaboration between business partners has become crucial these days, and data privacy is an important issue to be addressed. In this paper, we address a data privacy problem based on a prominent privacy model, (k, e)-Anonymous, in which a new dataset is to be released while existing datasets may already have been released elsewhere. Attackers might obtain multiple versions of the datasets and compare them with the newly released dataset. Although the privacy of each dataset has been well preserved individually, such a comparison can lead to a privacy breach. We study the effects of releasing multiple datasets theoretically. We find that a privacy breach caused by the increment occurs when any partition of the new dataset overlaps with any partition of any existing dataset. Based on this study, a polynomial-time algorithm is proposed. It needs to consider only one previous version of the dataset, and it can also skip computing the overlapping partitions. Thus, the computational complexity of the proposed algorithm is only O(pn^3), where p is the number of partitions and n is the number of tuples, while the privacy of all released datasets and the optimality of the solution are always guaranteed. In addition, experimental results on a real-world dataset, which illustrate the efficiency of our algorithm, are presented. © 2013 IEEE.
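
The overlap condition described in the summary can be illustrated with a small sketch. This is not the paper's algorithm, which by its own account avoids computing the overlapping partitions; it is a naive check written under stated assumptions. It assumes the common reading of (k, e)-anonymity, in which each partition must contain at least k distinct sensitive values whose range is at least e, and it assumes that a breach means the tuples shared by an old partition and a new partition fail that condition. The names `is_ke_anonymous` and `find_breaches`, and the sample data, are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Partition = Set[int]  # a partition is modelled as a set of tuple ids (assumption)


def is_ke_anonymous(values: List[float], k: int, e: float) -> bool:
    """Assumed condition: at least k distinct sensitive values with range >= e."""
    distinct = set(values)
    if not distinct:
        return False
    return len(distinct) >= k and (max(distinct) - min(distinct)) >= e


def find_breaches(
    old_parts: List[Partition],
    new_parts: List[Partition],
    sensitive: Dict[int, float],
    k: int,
    e: float,
) -> List[Tuple[int, int, Partition]]:
    """Naively intersect every old partition with every new partition and
    report the non-empty overlaps that are not (k, e)-anonymous."""
    breaches = []
    for i, old in enumerate(old_parts):
        for j, new in enumerate(new_parts):
            overlap = old & new
            if overlap and not is_ke_anonymous(
                [sensitive[t] for t in overlap], k, e
            ):
                breaches.append((i, j, overlap))
    return breaches


# Hypothetical data: tuple id -> sensitive value (for example, a salary).
sensitive = {1: 1000.0, 2: 2000.0, 3: 3000.0, 4: 4000.0}
old_release = [{1, 2}, {3, 4}]
new_release = [{1, 3}, {2, 4}]

breaches = find_breaches(old_release, new_release, sensitive, k=2, e=1000.0)
```

In this toy data every partition of both releases satisfies the assumed condition on its own, yet each old/new pair shares exactly one tuple, so an attacker holding both releases could pin down that tuple's sensitive value. The sketch flags all four pairs.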