Differentially private histogram publication

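Differential privacy (DP) is a promising scheme for releasing the results of statistical queries on sensitive data, with strong privacy guarantees against adversaries with arbitrary background knowledge. Existing studies on DP mostly focus on simple aggregations such as counts. This paper investigates the publication of DP-compliant histograms, which are an important analytical tool for showing the distribution of a random variable, e.g., hospital bill size for certain patients.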

Bibliographic Details
Main Authors: Xu, Jia, Zhang, Zhenjie, Xiao, Xiaokui, Yang, Yin, Yu, Ge
Other Authors: School of Computer Engineering
Format: Conference or Workshop Item
Language: English
Published: 2013
Subjects: DRNTU::Engineering::Computer science and engineering
Online Access: https://hdl.handle.net/10356/99437
http://hdl.handle.net/10220/13029
Institution: Nanyang Technological University
id sg-ntu-dr.10356-99437
record_format dspace
spelling sg-ntu-dr.10356-99437 2020-05-28T07:17:31Z Differentially private histogram publication Xu, Jia Zhang, Zhenjie Xiao, Xiaokui Yang, Yin Yu, Ge School of Computer Engineering IEEE International Conference on Data Engineering (28th : 2012 : Washington, D. C., US) DRNTU::Engineering::Computer science and engineering
ASTAR (Agency for Sci., Tech. and Research, S’pore) 2013-08-06T03:07:00Z 2019-12-06T20:07:15Z 2012 Conference Paper https://hdl.handle.net/10356/99437 http://hdl.handle.net/10220/13029 10.1109/ICDE.2012.48 en
institution Nanyang Technological University
building NTU Library
country Singapore
collection DR-NTU
language English
topic DRNTU::Engineering::Computer science and engineering
description Differential privacy (DP) is a promising scheme for releasing the results of statistical queries on sensitive data, with strong privacy guarantees against adversaries with arbitrary background knowledge. Existing studies on DP mostly focus on simple aggregations such as counts. This paper investigates the publication of DP-compliant histograms, which are an important analytical tool for showing the distribution of a random variable, e.g., hospital bill size for certain patients. Compared to simple aggregations whose results are purely numerical, a histogram query is inherently more complex, since it must also determine its structure, i.e., the ranges of the bins. As we demonstrate in the paper, a DP-compliant histogram with finer bins may actually lead to significantly lower accuracy than a coarser one, since the former requires stronger perturbations in order to satisfy DP. Moreover, the histogram structure itself may reveal sensitive information, which further complicates the problem. Motivated by this, we propose two novel algorithms, namely Noise First and Structure First, for computing DP-compliant histograms. Their main difference lies in the relative order of the noise injection and the histogram structure computation steps. Noise First has the additional benefit that it can improve the accuracy of an already published DP-compliant histogram computed using a naive method. Going one step further, we extend both solutions to answer arbitrary range queries. Extensive experiments, using several real data sets, confirm that the proposed methods output highly accurate query answers, and consistently outperform existing competitors.
author2 School of Computer Engineering
format Conference or Workshop Item
author Xu, Jia
Zhang, Zhenjie
Xiao, Xiaokui
Yang, Yin
Yu, Ge
author_sort Xu, Jia
title Differentially private histogram publication
publishDate 2013
url https://hdl.handle.net/10356/99437
http://hdl.handle.net/10220/13029
_version_ 1681056141867483136
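
The description above contrasts a naive DP histogram with one whose structure is chosen after noise injection. The following is a minimal sketch of that idea only, not the paper's Noise First algorithm: it assumes a per-bin sensitivity of 1, adds Laplace noise with scale 1/epsilon to each unit bin, and then merges adjacent bins at a fixed width (the paper selects the structure rather than fixing it). All function names and the sample counts are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_histogram(counts, epsilon, rng):
    # Naive mechanism: perturb every unit-bin count independently.
    # Assumption: adding or removing one record changes one bin by 1.
    scale = 1.0 / epsilon
    return [c + laplace_noise(scale, rng) for c in counts]


def merge_bins(noisy_counts, width):
    # Post-processing of already-noisy counts: replace each run of
    # `width` adjacent bins by the run's average. Fixed-width merging
    # is a simplification chosen for this sketch.
    merged = []
    for start in range(0, len(noisy_counts), width):
        group = noisy_counts[start:start + width]
        avg = sum(group) / len(group)
        merged.extend([avg] * len(group))
    return merged


def mean_squared_error(estimate, truth):
    return sum((e - t) ** 2 for e, t in zip(estimate, truth)) / len(truth)


if __name__ == "__main__":
    rng = random.Random(0)
    true_counts = [10, 11, 9, 10, 50, 52, 49, 51]  # hypothetical data
    noisy = noisy_histogram(true_counts, epsilon=0.1, rng=rng)
    merged = merge_bins(noisy, width=4)
    print("MSE, unit bins:  ", mean_squared_error(noisy, true_counts))
    print("MSE, merged bins:", mean_squared_error(merged, true_counts))
```

Because the merge step only reads the noisy counts, it is post-processing and does not consume additional privacy budget. Averaging tends to reduce noise when neighbouring true counts are similar, at the cost of approximation error when they are not.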