Embedded caching framework for privacy preserving data mining

Most existing work on Privacy-Preserving Data Mining (PPDM) focus on enabling conventional data mining algorithms with the ability to run in a secure manner in a multi-party setting. Although various algorithms in data mining have been enhanced to incorporate secure mechanisms for data privacy prese...

Full description

Saved in:
Bibliographic Details
Main Author: Zhai, Ke
Other Authors: Ng Wee Keong
Format: Final Year Project
Language:English
Published: 2009
Subjects:
Online Access:http://hdl.handle.net/10356/16947
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Nanyang Technological University
Language: English
id sg-ntu-dr.10356-16947
record_format dspace
spelling sg-ntu-dr.10356-169472023-03-03T20:39:17Z Embedded caching framework for privacy preserving data mining Zhai, Ke Ng Wee Keong School of Computer Engineering Centre for Advanced Information Systems DRNTU::Engineering::Computer science and engineering::Information systems::Information systems applications Most existing work on Privacy-Preserving Data Mining (PPDM) focus on enabling conventional data mining algorithms with the ability to run in a secure manner in a multi-party setting. Although various algorithms in data mining have been enhanced to incorporate secure mechanisms for data privacy preservation, their computation performance is far too high to allow them to be practically useful. This is especially true for those algorithms that make use of common cryptosystems and/or linear algebra transformation. Such cryptographic and/or transformation operations are indeed costly, particularly when they are performed repeatedly in PPDM algorithms. High computational overheads would render these algorithms less practically applicable. In this report, we address the efficiency issue of several PPDM algorithms by proposing to cache result data that are used more than once by Secure Multi-party Computation (SMC) protocols. For this to be possible, we carefully examine the micro steps of many SMC procedures to identify the repetitive or iterative portions. We then establish a general framework for SMC protocols and generalize intrinsic properties from this investigation so as to be able to evaluate their feasibility and capability for intermediate result caching. We design proper intermediate result caching interfaces for various SMC protocols, then apply them to many PPDM algorithms, thus reduce the overall computational cost by caching intermediate results/data. Our experiments show that the caching technique is generalizable to common data mining algorithms and the efficiency of PPDM algorithms can be greatly improved without compromising data privacy. Besides, we also proposed a novel Secure Scalar Product (SSP) protocol to facilitate our caching concept on SMC protocol family. This protocol is built on an efficient cryptosystem and would be able to incorporate intermediate result caching capability inside. In future iterations, these results could be utilized to derive relevant system input/output without costly cryptographic operations, hence, to further reduce overall computational overheads. Our empirical experiments showed that our proposed protocol is able to significantly reduce cryptographic overheads in SSP operation. Bachelor of Engineering (Computer Engineering) 2009-05-29T02:24:40Z 2009-05-29T02:24:40Z 2009 2009 Final Year Project (FYP) http://hdl.handle.net/10356/16947 en Nanyang Technological University 85 p. application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Computer science and engineering::Information systems::Information systems applications
spellingShingle DRNTU::Engineering::Computer science and engineering::Information systems::Information systems applications
Zhai, Ke
Embedded caching framework for privacy preserving data mining
description Most existing work on Privacy-Preserving Data Mining (PPDM) focus on enabling conventional data mining algorithms with the ability to run in a secure manner in a multi-party setting. Although various algorithms in data mining have been enhanced to incorporate secure mechanisms for data privacy preservation, their computation performance is far too high to allow them to be practically useful. This is especially true for those algorithms that make use of common cryptosystems and/or linear algebra transformation. Such cryptographic and/or transformation operations are indeed costly, particularly when they are performed repeatedly in PPDM algorithms. High computational overheads would render these algorithms less practically applicable. In this report, we address the efficiency issue of several PPDM algorithms by proposing to cache result data that are used more than once by Secure Multi-party Computation (SMC) protocols. For this to be possible, we carefully examine the micro steps of many SMC procedures to identify the repetitive or iterative portions. We then establish a general framework for SMC protocols and generalize intrinsic properties from this investigation so as to be able to evaluate their feasibility and capability for intermediate result caching. We design proper intermediate result caching interfaces for various SMC protocols, then apply them to many PPDM algorithms, thus reduce the overall computational cost by caching intermediate results/data. Our experiments show that the caching technique is generalizable to common data mining algorithms and the efficiency of PPDM algorithms can be greatly improved without compromising data privacy. Besides, we also proposed a novel Secure Scalar Product (SSP) protocol to facilitate our caching concept on SMC protocol family. This protocol is built on an efficient cryptosystem and would be able to incorporate intermediate result caching capability inside. In future iterations, these results could be utilized to derive relevant system input/output without costly cryptographic operations, hence, to further reduce overall computational overheads. Our empirical experiments showed that our proposed protocol is able to significantly reduce cryptographic overheads in SSP operation.
author2 Ng Wee Keong
author_facet Ng Wee Keong
Zhai, Ke
format Final Year Project
author Zhai, Ke
author_sort Zhai, Ke
title Embedded caching framework for privacy preserving data mining
title_short Embedded caching framework for privacy preserving data mining
title_full Embedded caching framework for privacy preserving data mining
title_fullStr Embedded caching framework for privacy preserving data mining
title_full_unstemmed Embedded caching framework for privacy preserving data mining
title_sort embedded caching framework for privacy preserving data mining
publishDate 2009
url http://hdl.handle.net/10356/16947
_version_ 1759857112289116160