Privacy preserving association rule mining

With the growing advancement in technology, the amount of data generated is constantly increasing, leading to the need for data mining technologies to mine valid patterns and relationships in large data sets. In connection with this dramatic increase in data and the popularity of data mining, issues...

Bibliographic Details
Main Author:	Suruchi Sharma.
Other Authors:	Ng Wee Keong
Format:	Final Year Project
Language:	English
Published:	2009
Subjects:	DRNTU::Engineering::Computer science and engineering::Information systems
Online Access:	http://hdl.handle.net/10356/16919
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-16919
record_format	dspace
spelling	sg-ntu-dr.10356-16919 2023-03-03T20:28:50Z Privacy preserving association rule mining Suruchi Sharma. Ng Wee Keong School of Computer Engineering Centre for Advanced Information Systems DRNTU::Engineering::Computer science and engineering::Information systems With the growing advancement in technology, the amount of data generated is constantly increasing, leading to the need for data mining technologies to mine valid patterns and relationships in large data sets. In connection with this dramatic increase in data and the popularity of data mining, issues about privacy preservation have become a great concern. Through this report, I intend to understand privacy preserving mining of association rules and to compare and contrast two randomization approaches to privacy preservation, namely cut‐and‐paste randomization and MASK. Firstly, I looked at the process of data mining and its various classes, such as clustering, classification, prediction and association rule mining. I then looked at association rule mining in greater detail and described the Apriori algorithm for finding frequent itemsets. Following this, I looked at the techniques used by the cut‐and‐paste randomization operator and the MASK scheme to ensure privacy of the data being used while accurately mining frequent itemsets from a set of randomized transactions. I implemented cut‐and‐paste and MASK in Java using a client‐server architecture for communication in order to investigate their performance in terms of accuracy while maintaining privacy. I conducted several experiments on the two schemes and found that at 50% privacy levels, cut‐and‐paste randomization performed slightly better than MASK. However, since the difference in the results was not that big, I concluded that both schemes performed equally well. I then pointed out certain limitations of the two schemes and explained the conditions under which these schemes were able to perform well.
Bachelor of Engineering (Computer Engineering) 2009-05-29T01:43:47Z 2009-05-29T01:43:47Z 2009 2009 Final Year Project (FYP) http://hdl.handle.net/10356/16919 en Nanyang Technological University 63 pages 62 p. application/pdf
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	DRNTU::Engineering::Computer science and engineering::Information systems
spellingShingle	DRNTU::Engineering::Computer science and engineering::Information systems Suruchi Sharma. Privacy preserving association rule mining
description	With the growing advancement in technology, the amount of data generated is constantly increasing, leading to the need for data mining technologies to mine valid patterns and relationships in large data sets. In connection with this dramatic increase in data and the popularity of data mining, issues about privacy preservation have become a great concern. Through this report, I intend to understand privacy preserving mining of association rules and to compare and contrast two randomization approaches to privacy preservation, namely cut‐and‐paste randomization and MASK. Firstly, I looked at the process of data mining and its various classes, such as clustering, classification, prediction and association rule mining. I then looked at association rule mining in greater detail and described the Apriori algorithm for finding frequent itemsets. Following this, I looked at the techniques used by the cut‐and‐paste randomization operator and the MASK scheme to ensure privacy of the data being used while accurately mining frequent itemsets from a set of randomized transactions. I implemented cut‐and‐paste and MASK in Java using a client‐server architecture for communication in order to investigate their performance in terms of accuracy while maintaining privacy. I conducted several experiments on the two schemes and found that at 50% privacy levels, cut‐and‐paste randomization performed slightly better than MASK. However, since the difference in the results was not that big, I concluded that both schemes performed equally well. I then pointed out certain limitations of the two schemes and explained the conditions under which these schemes were able to perform well.
author2	Ng Wee Keong
author_facet	Ng Wee Keong Suruchi Sharma.
format	Final Year Project
author	Suruchi Sharma.
author_sort	Suruchi Sharma.
title	Privacy preserving association rule mining
title_short	Privacy preserving association rule mining
title_full	Privacy preserving association rule mining
title_fullStr	Privacy preserving association rule mining
title_full_unstemmed	Privacy preserving association rule mining
title_sort	privacy preserving association rule mining
publishDate	2009
url	http://hdl.handle.net/10356/16919
_version_	1759853363361480704

Similar Items