Inferring sensitive user information from tap-on tap-off public transport data

The EZ-Link card is a contactless stored-value card used for public transport fare payment in Singapore. This project aims to infer sensitive information about users from mobility data and identify the privacy risks involved. Examples of sensitive information inferred are duration and probable area of t...

Bibliographic Details
Main Author: Cheng, Kelly Wen Xin
Other Authors: Cai Wentong
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2020
Subjects: Engineering::Computer science and engineering
Online Access: https://hdl.handle.net/10356/137829
Institution: Nanyang Technological University
id sg-ntu-dr.10356-137829
record_format dspace
spelling sg-ntu-dr.10356-137829
2020-04-15T11:56:14Z
Inferring sensitive user information from tap-on tap-off public transport data
Cheng, Kelly Wen Xin
Cai Wentong
School of Computer Science and Engineering
TUMCREATE
aswtcai@ntu.edu.sg
Engineering::Computer science and engineering
Bachelor of Engineering (Computer Science)
2020-04-15T11:56:13Z
2020
Final Year Project (FYP)
https://hdl.handle.net/10356/137829
en
SCSE19-0444
application/pdf
Nanyang Technological University
institution Nanyang Technological University
building NTU Library
country Singapore
collection DR-NTU
language English
topic Engineering::Computer science and engineering
description The EZ-Link card is a contactless stored-value card used for public transport fare payment in Singapore. This project aims to infer sensitive information about users from mobility data and to identify the privacy risks involved. Examples of the sensitive information inferred are the duration and probable area of a user's home of residence, work, and activity, as well as the social relationship between pairs of users. As a proof of concept, a stratified sample of 50,000 card IDs was drawn based on passenger type. Only 4 journey records, exact to the second, are required for a user to be unique. Hence, targeted de-anonymisation could be performed easily with 4 or fewer known data points. A rule-based approach was implemented to estimate the home of residence, the activity location, and the purpose of the activity. The home of residence could be estimated for 28.47% of the users with the rule-based approach. The social relationship between pairs of users is calculated using cosine similarity, which serves as an indicator of the closeness of the relationship. Use cases are plotted for pairs of users with various degrees of closeness, and these pairs were found to have the same estimated home of residence. This implies that a family or household tends to commute together. The privacy risks involve de-anonymisation of mobility data using auxiliary information. De-anonymisation exposes users’ sensitive information. It also allows the identification of vulnerable groups, such as Child/Student card holders travelling unaccompanied. It is possible to further correlate the de-anonymised mobility data against other leaked databases for malicious intent.
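The description names cosine similarity as the closeness indicator but does not say how each user is represented as a vector. The following is a minimal sketch under one assumption: each card is summarised as counts of (stop, hour-of-day) visits. The function names, the stop IDs, and the tap records are hypothetical and are not taken from the project.

```python
import math
from collections import Counter


def visit_vector(taps):
    """Count how often a card is seen at each (stop, hour-of-day) pair.

    Assumption: this sparse count vector stands in for whatever
    feature vector the project actually used.
    """
    return Counter(taps)


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    # Only keys present in both vectors contribute to the dot product.
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty travel history is similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical tap records as (stop_id, hour_of_day) pairs.
card_a = visit_vector([("S1", 7), ("S2", 18), ("S1", 7)])
card_b = visit_vector([("S1", 7), ("S2", 18)])
card_c = visit_vector([("S9", 12)])

sim_ab = cosine_similarity(card_a, card_b)  # overlapping stops and hours
sim_ac = cosine_similarity(card_a, card_c)  # no shared stop-hour pairs
```

In this sketch a score near 1 means two cards appear at the same stops in the same hours, and a score of 0 means they never overlap. Any threshold for calling a pair "close" would be a separate choice that the description does not state.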
author2 Cai Wentong
author_facet Cai Wentong
Cheng, Kelly Wen Xin
format Final Year Project
author Cheng, Kelly Wen Xin
author_sort Cheng, Kelly Wen Xin
title Inferring sensitive user information from tap-on tap-off public transport data
publisher Nanyang Technological University
publishDate 2020
url https://hdl.handle.net/10356/137829
_version_ 1681056853171109888