Achieving Both Valid and Secure Logistic Regression Analysis on Aggregated Data from Different Private Sources

Preserving the privacy of individual databases when carrying out statistical calculations has a relatively long history in statistics and has been the focus of much recent attention in machine learning. In this paper, we present a protocol for fitting a logistic regression when the data are held by...

Bibliographic Details
Main Authors: Nardi, Yuval; Fienberg, Stephen; Hall, Robert J.
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2012
Subjects: Information Security; Numerical Analysis and Scientific Computing
Online Access: https://ink.library.smu.edu.sg/larc/2
https://ink.library.smu.edu.sg/cgi/viewcontent.cgi?article=1001&context=larc
Institution: Singapore Management University
id sg-smu-ink.larc-1001
record_format dspace
spelling sg-smu-ink.larc-1001 2018-07-09T06:03:47Z 2012-01-01T08:00:00Z text application/pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ LARC Research Publications eng
institution Singapore Management University
building SMU Libraries
country Singapore
collection InK@SMU
language English
topic Information Security
Numerical Analysis and Scientific Computing
description Preserving the privacy of individual databases when carrying out statistical calculations has a relatively long history in statistics and has been the focus of much recent attention in machine learning. In this paper, we present a protocol for fitting a logistic regression when the data are held by separate parties, without actually combining the information sources, by exploiting results from the literature on multi-party secure computation. Our protocol reveals only the final result of the calculation, in contrast to other methods that share intermediate values and thus present an opportunity for compromise of values in the individual databases. Our paper has two themes: (1) the development of a secure protocol for computing the logistic parameters, with a demonstration of its performance in practice, and (2) the presentation of an amended protocol that speeds up the computation of the logistic function. We illustrate the nature of the calculations and their accuracy using an extract of data from the Current Population Survey divided between two parties. Throughout, we build our protocol from existing cryptographic primitives, so the novelty lies in designing a concrete procedure for private computation of the logistic regression MLE rather than in proposing new cryptographic constructions.
format text
author Nardi, Yuval
Fienberg, Stephen
Hall, Robert J.
author_sort Nardi, Yuval
title Achieving Both Valid and Secure Logistic Regression Analysis on Aggregated Data from Different Private Sources
publisher Institutional Knowledge at Singapore Management University
publishDate 2012
url https://ink.library.smu.edu.sg/larc/2
https://ink.library.smu.edu.sg/cgi/viewcontent.cgi?article=1001&context=larc