Fault tolerant privacy preserving decision tree induction
Main Author: | |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | English |
Published: | 2010 |
Subjects: | |
Online Access: | http://hdl.handle.net/10356/39732 |
Institution: | Nanyang Technological University |
Summary: | Privacy-Preserving Data Mining (PPDM) is an emerging technology that allows multiple
parties to extract knowledge from their combined data. However, this data
usually contains private information that cannot be disclosed to any party. Various
techniques and algorithms have been proposed and developed to achieve this goal
without compromising individual privacy. These techniques usually depend heavily on
Secure Multi-Party Computation (SMC) protocols, which make use of complex cryptographic
protocols. These cryptographic protocols alone are very expensive and usually have
high time complexity, especially on large, high-dimensional datasets.
Given the iterative nature of data mining algorithms and
a network infrastructure that is considerably slower than current
computer processing speeds, PPDM is an extremely expensive process.
Exchanging data between parties in a PPDM algorithm requires network
infrastructure. Current network infrastructure is not reliable enough
to guarantee uninterrupted service. As a result, a network failure might
occur in the middle of the algorithm's execution. Considering that a PPDM algorithm can
take days or months to complete, it would be very expensive to re-execute
the algorithm each time a network failure occurs. In this paper, we propose
a system that can tolerate a certain level of network failure, so that the
algorithm need not be re-executed from the beginning each time. We examine the algorithm and its secure
protocol step by step and suggest techniques for handling each
network failure scenario that might occur at any point in the process. |
---|