Balancing data utility versus information loss in data-privacy protection using k-Anonymity

Data privacy has been an important area of research in recent years. Datasets often contain sensitive data fields, exposure of which may jeopardize the interests of the individuals associated with the data. To address this issue, privacy techniques can be used to hinder the identification of a p...

Bibliographic Details
Main Authors:	Esmeel, Thamer Khalil; Hasan, Md Munirul; Kabir, Muhammad Nomani; Ahmad, Firdaus
Format:	Conference or Workshop Item
Language:	English
Published:	IEEE
Subjects:	QA75 Electronic computers. Computer science; QA76 Computer software
Online Access:	http://umpir.ump.edu.my/id/eprint/31545/1/Balancing%20Data%20Utility%20versus%20Information%20Loss%20in.pdf http://umpir.ump.edu.my/id/eprint/31545/ http://10.1109/ICSPC50992.2020.9305776
Institution:	Universiti Malaysia Pahang

id	my.ump.umpir.31545
record_format	eprints
spelling	my.ump.umpir.31545 2021-08-17T08:37:35Z http://umpir.ump.edu.my/id/eprint/31545/ IEEE Conference or Workshop Item PeerReviewed pdf en Esmeel, Thamer Khalil and Hasan, Md Munirul and Kabir, Muhammad Nomani and Ahmad, Firdaus. Balancing data utility versus information loss in data-privacy protection using k-Anonymity. In: IEEE 8th Conference on Systems, Process and Control (ICSPC), 11–12 December 2020, Melaka, Malaysia. pp. 158-161. ISBN 978-1-7281-8861-4/20 http://10.1109/ICSPC50992.2020.9305776
institution	Universiti Malaysia Pahang
building	UMP Library
collection	Institutional Repository
continent	Asia
country	Malaysia
content_provider	Universiti Malaysia Pahang
content_source	UMP Institutional Repository
url_provider	http://umpir.ump.edu.my/
language	English
topic	QA75 Electronic computers. Computer science QA76 Computer software
description	Data privacy has been an important area of research in recent years. Datasets often contain sensitive data fields, exposure of which may jeopardize the interests of the individuals associated with the data. To address this issue, privacy techniques can be used to hinder the identification of a person by anonymizing the sensitive data in the dataset, so that sensitive information is protected while third parties can still use the anonymized dataset for analysis. In this research, we investigated one such privacy technique, k-anonymity, for different values of k applied to different numbers c of columns of the dataset. Next, the information loss due to k-anonymity is computed. The anonymized files are then classified by several machine-learning algorithms, namely Naive Bayes, J48 and a neural network, in order to check the balance between data anonymity and data utility. Based on the classification accuracy, the optimal values of k and c are obtained; these can then be used by the k-anonymity algorithm to anonymize the optimal number of columns of the dataset.
format	Conference or Workshop Item
author	Esmeel, Thamer Khalil Hasan, Md Munirul Kabir, Muhammad Nomani Ahmad, Firdaus
author_facet	Esmeel, Thamer Khalil Hasan, Md Munirul Kabir, Muhammad Nomani Ahmad, Firdaus
author_sort	Esmeel, Thamer Khalil
title	Balancing data utility versus information loss in data-privacy protection using k-Anonymity
title_sort	balancing data utility versus information loss in data-privacy protection using k-anonymity
publisher	IEEE
url	http://umpir.ump.edu.my/id/eprint/31545/1/Balancing%20Data%20Utility%20versus%20Information%20Loss%20in.pdf http://umpir.ump.edu.my/id/eprint/31545/ http://10.1109/ICSPC50992.2020.9305776
_version_	1709667682225225728

Similar Items