Inferring sensitive user information from tap-on tap-off public transport data

Bibliographic Details
Main Author: Cheng, Kelly Wen Xin
Other Authors: Cai Wentong
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2020
Online Access: https://hdl.handle.net/10356/137829
Institution: Nanyang Technological University
Description
Summary: The EZ-Link card is a contactless stored-value card used to pay public transport fares in Singapore. This project aims to infer sensitive information about users from mobility data and to identify the privacy risks involved. Examples of the sensitive information inferred are the duration and probable area of users' home of residence, work, and activities, as well as the social relationship between pairs of users. As a proof of concept, a stratified sample of 50,000 card IDs was drawn by passenger type. Only four journey records, exact to the second, are required for a user to be unique. Hence, targeted de-anonymisation can be performed easily with four or fewer known data points. A rule-based approach was implemented to estimate the home of residence, the activity location, and the purpose of the activity; it could estimate the home of residence for 28.47% of users. The social relationship between a pair of users is measured using cosine similarity, which serves as an indicator of how close the relationship is. Use cases were plotted for pairs of users with varying degrees of closeness, and these pairs were found to share the same estimated home of residence, which suggests that members of a family or household tend to commute together. The privacy risks involve de-anonymisation of mobility data using auxiliary information. De-anonymisation exposes users' sensitive information and also allows vulnerable groups, such as Child/Student passengers travelling unaccompanied, to be identified. The de-anonymised mobility data could further be correlated with other leaked databases for malicious purposes.
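
The summary names cosine similarity as the closeness indicator but does not say how each user is represented as a vector. The following is a minimal sketch under an assumed representation: each card is described by counts of its tap-on events per (stop, hour) pair. The function names, the feature choice, and the sample records are all hypothetical and are not taken from the project.

```python
from collections import Counter
from math import sqrt


def visit_vector(records):
    """Count how often a card taps on at each (stop, hour) pair.

    The (stop, hour) feature is an assumption made for illustration.
    """
    return Counter(records)


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[key] for key, count in a.items() if key in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty travel history has no defined direction
    return dot / (norm_a * norm_b)


# Hypothetical tap-on records: (stop name, hour of day).
card_a = visit_vector([("Stop 1", 7), ("Stop 2", 18), ("Stop 1", 7)])
card_b = visit_vector([("Stop 1", 7), ("Stop 2", 18)])
card_c = visit_vector([("Stop 9", 12)])

sim_ab = cosine_similarity(card_a, card_b)  # overlapping travel patterns
sim_ac = cosine_similarity(card_a, card_c)  # no shared (stop, hour) pairs
```

With non-negative counts the score lies between 0 and 1. Cards that share stops and times score closer to 1, and cards with no overlap score 0, which is why the value can act as a rough indicator of co-travel.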