Privacy preserving association rule mining
With the growing advancement in technology, the amount of data generated is constantly increasing, leading to the need for data mining technologies to mine valid patterns and relationships in large data sets. In connection with this dramatic increase in data and the popularity of data mining, issues...
Main Author: | Suruchi Sharma |
---|---|
Other Authors: | Ng Wee Keong |
Format: | Final Year Project |
Language: | English |
Published: | 2009 |
Subjects: | DRNTU::Engineering::Computer science and engineering::Information systems |
Online Access: | http://hdl.handle.net/10356/16919 |
Institution: | Nanyang Technological University |
id | sg-ntu-dr.10356-16919 |
---|---|
record_format | dspace |
spelling | sg-ntu-dr.10356-16919 2023-03-03T20:28:50Z Privacy preserving association rule mining Suruchi Sharma. Ng Wee Keong School of Computer Engineering Centre for Advanced Information Systems DRNTU::Engineering::Computer science and engineering::Information systems Bachelor of Engineering (Computer Engineering) 2009-05-29T01:43:47Z 2009-05-29T01:43:47Z 2009 2009 Final Year Project (FYP) http://hdl.handle.net/10356/16919 en Nanyang Technological University 63 pages 62 p. application/pdf |
institution | Nanyang Technological University |
building | NTU Library |
continent | Asia |
country | Singapore |
content_provider | NTU Library |
collection | DR-NTU |
language | English |
topic | DRNTU::Engineering::Computer science and engineering::Information systems |
spellingShingle | DRNTU::Engineering::Computer science and engineering::Information systems Suruchi Sharma. Privacy preserving association rule mining |
description | With the growing advancement in technology, the amount of data generated is constantly increasing, leading to the need for data mining technologies to mine valid patterns and relationships in large data sets. With this dramatic increase in data and the popularity of data mining, privacy preservation has become a great concern. Through this report, I intend to understand privacy‐preserving mining of association rules and to compare and contrast two randomization approaches to privacy preservation, namely cut‐and‐paste randomization and MASK. First, I looked at the process of data mining and its various classes, such as clustering, classification, prediction and association rule mining. I then looked at association rule mining in greater detail and described the Apriori algorithm for finding frequent itemsets. Following this, I looked at the techniques used by the cut‐and‐paste randomization operator and the MASK scheme to ensure the privacy of the data being used while accurately mining frequent itemsets from a set of randomized transactions. I implemented cut‐and‐paste and MASK in Java, using a client‐server architecture for communication, in order to investigate their accuracy while maintaining privacy. I conducted several experiments on the two schemes and found that at 50% privacy levels, cut‐and‐paste randomization performed slightly better than MASK. However, since the difference in the results was not large, I concluded that both schemes performed equally well. I then pointed out certain limitations of the two schemes and explained the conditions under which these schemes perform well. |
author2 | Ng Wee Keong |
author_facet | Ng Wee Keong Suruchi Sharma. |
format | Final Year Project |
author | Suruchi Sharma. |
author_sort | Suruchi Sharma. |
title | Privacy preserving association rule mining |
title_short | Privacy preserving association rule mining |
title_full | Privacy preserving association rule mining |
title_fullStr | Privacy preserving association rule mining |
title_full_unstemmed | Privacy preserving association rule mining |
title_sort | privacy preserving association rule mining |
publishDate | 2009 |
url | http://hdl.handle.net/10356/16919 |
_version_ | 1759853363361480704 |