Utility-driven k-anonymization of public transport user data

In this paper, we propose a k-anonymity approach that prioritizes the generalization of attributes based on their utility. We focus on transport data, which we consider a special case in which many or all attributes are quasi-identifiers (e.g., origin, destination, ride start time), as they allow co...

Bibliographic Details
Main Authors:	Bhati, Bhawani Shanker; Ivanchev, Jordan; Bojic, Iva; Datta, Anwitaman; Eckhoff, David
Other Authors:	School of Computer Science and Engineering
Format:	Article
Language:	English
Published:	2021
Subjects:	Engineering::Computer science and engineering; Clustering; K-anonymity
Online Access:	https://hdl.handle.net/10356/146652
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-146652
record_format	dspace
spelling	sg-ntu-dr.10356-1466522021-03-04T05:50:55Z Utility-driven k-anonymization of public transport user data Bhati, Bhawani Shanker Ivanchev, Jordan Bojic, Iva Datta, Anwitaman Eckhoff, David School of Computer Science and Engineering Engineering::Computer science and engineering Clustering K-anonymity In this paper, we propose a k-anonymity approach that prioritizes the generalization of attributes based on their utility. We focus on transport data, which we consider a special case in which many or all attributes are quasi-identifiers (e.g., origin, destination, ride start time), as they allow correlation with easily observable auxiliary data. The novelty in our approach lies in introducing normalization techniques as well as distance and utility metrics that allow the consideration of not only numerical attributes but also categorical attributes by representing them in tree or graph form. The prioritization of the attributes in the generalization process is based on the attributes’ utility and can further be influenced by either automatically or manually assigned attribute weights. We evaluate and compare different options for all components of our mechanism as well as present an extensive performance evaluation of our approach using real-world data. Lastly, we show in which cases suppression of records can counter-intuitively lead to higher data utility. National Research Foundation (NRF) Published version This work was supported by the Singapore National Research Foundation through the Campus for Research Excellence and Technological Enterprise (CREATE) Programme. 2021-03-04T05:50:55Z 2021-03-04T05:50:55Z 2021 Journal Article Bhati, B. S., Ivanchev, J., Bojic, I., Datta, A., & Eckhoff, D. (2021). Utility-driven k-anonymization of public transport user data. IEEE Access, 9, 23608-23623. doi:10.1109/ACCESS.2021.3055505 2169-3536 https://hdl.handle.net/10356/146652 10.1109/ACCESS.2021.3055505 2-s2.0-85100468873 9 23608 23623 en IEEE Access © 2021 IEEE. 
This journal is 100% open access, which means that all content is freely available without charge to users or their institutions. All articles accepted after 12 June 2019 are published under a CC BY 4.0 license, and the author retains copyright. Users are allowed to read, download, copy, distribute, print, search, or link to the full texts of the articles, or use them for any other lawful purpose, as long as proper attribution is given. application/pdf
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	Engineering::Computer science and engineering Clustering K-anonymity
spellingShingle	Engineering::Computer science and engineering Clustering K-anonymity Bhati, Bhawani Shanker Ivanchev, Jordan Bojic, Iva Datta, Anwitaman Eckhoff, David Utility-driven k-anonymization of public transport user data
description	In this paper, we propose a k-anonymity approach that prioritizes the generalization of attributes based on their utility. We focus on transport data, which we consider a special case in which many or all attributes are quasi-identifiers (e.g., origin, destination, ride start time), as they allow correlation with easily observable auxiliary data. The novelty in our approach lies in introducing normalization techniques as well as distance and utility metrics that allow the consideration of not only numerical attributes but also categorical attributes by representing them in tree or graph form. The prioritization of the attributes in the generalization process is based on the attributes’ utility and can further be influenced by either automatically or manually assigned attribute weights. We evaluate and compare different options for all components of our mechanism as well as present an extensive performance evaluation of our approach using real-world data. Lastly, we show in which cases suppression of records can counter-intuitively lead to higher data utility.
author2	School of Computer Science and Engineering
author_facet	School of Computer Science and Engineering Bhati, Bhawani Shanker Ivanchev, Jordan Bojic, Iva Datta, Anwitaman Eckhoff, David
format	Article
author	Bhati, Bhawani Shanker Ivanchev, Jordan Bojic, Iva Datta, Anwitaman Eckhoff, David
author_sort	Bhati, Bhawani Shanker
title	Utility-driven k-anonymization of public transport user data
title_sort	utility-driven k-anonymization of public transport user data
publishDate	2021
url	https://hdl.handle.net/10356/146652
_version_	1695706215821082624
