Inferring sensitive user information from tap-on tap-off public transport data
The EZ-Link card is a contactless stored-value card used for public transport fare payment in Singapore. This project aims to infer sensitive information about users from mobility data and identify the privacy risks involved. Examples of sensitive information inferred are duration and probable area of t...
Main Author: | Cheng, Kelly Wen Xin |
---|---|
Other Authors: | Cai Wentong |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2020 |
Subjects: | Engineering::Computer science and engineering |
Online Access: | https://hdl.handle.net/10356/137829 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-137829 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-137829 2020-04-15T11:56:14Z Inferring sensitive user information from tap-on tap-off public transport data Cheng, Kelly Wen Xin Cai Wentong School of Computer Science and Engineering TUMCREATE aswtcai@ntu.edu.sg Engineering::Computer science and engineering
Bachelor of Engineering (Computer Science) 2020-04-15T11:56:13Z 2020-04-15T11:56:13Z 2020 Final Year Project (FYP) https://hdl.handle.net/10356/137829 en SCSE19-0444 application/pdf Nanyang Technological University |
institution |
Nanyang Technological University |
building |
NTU Library |
country |
Singapore |
collection |
DR-NTU |
language |
English |
topic |
Engineering::Computer science and engineering |
spellingShingle |
Engineering::Computer science and engineering Cheng, Kelly Wen Xin Inferring sensitive user information from tap-on tap-off public transport data |
description |
The EZ-Link card is a contactless stored-value card used for public transport fare payment in Singapore. This project aims to infer sensitive information about users from mobility data and identify the privacy risks involved. Examples of sensitive information inferred are duration and probable area of the home of residence, work, and activity, as well as the social relationship between pairs of users.
As a proof of concept, a stratified sample of 50,000 Card IDs was drawn based on passenger type. Only 4 journey records, exact to the second, are required for a user to be unique. Hence, targeted de-anonymisation could be performed easily with 4 or fewer known data points. A rule-based approach was implemented to estimate the home of residence, activity location and purpose of the activity. The home of residence could be estimated for 28.47% of users with the rule-based approach. The closeness of the social relationship between pairs of users is measured using cosine similarity. Use cases are plotted for pairs of users with various degrees of closeness, and these pairs were discovered to have the same estimated home of residence. This implies that a family or household tends to commute together.
The privacy risks involve de-anonymisation of mobility data using auxiliary information. De-anonymisation exposes users' sensitive information. It also allows the identification of vulnerable groups, such as Child/Student passengers travelling unaccompanied. It is possible to further correlate the de-anonymised mobility data against other leaked databases for malicious intent. |
author2 |
Cai Wentong |
author_facet |
Cai Wentong Cheng, Kelly Wen Xin |
format |
Final Year Project |
author |
Cheng, Kelly Wen Xin |
author_sort |
Cheng, Kelly Wen Xin |
title |
Inferring sensitive user information from tap-on tap-off public transport data |
title_short |
Inferring sensitive user information from tap-on tap-off public transport data |
title_full |
Inferring sensitive user information from tap-on tap-off public transport data |
title_fullStr |
Inferring sensitive user information from tap-on tap-off public transport data |
title_full_unstemmed |
Inferring sensitive user information from tap-on tap-off public transport data |
title_sort |
inferring sensitive user information from tap-on tap-off public transport data |
publisher |
Nanyang Technological University |
publishDate |
2020 |
url |
https://hdl.handle.net/10356/137829 |
_version_ |
1681056853171109888 |
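
Two of the techniques named in the description, matching a card against a handful of known tap records and scoring a pair of cards by cosine similarity, can be illustrated with a minimal Python sketch. This is not the project's implementation: the record layout, the sample data, the function names and the choice of (stop, hour) visit counts as the similarity vector are all assumptions made for illustration, since the record does not state how the thesis represents a user's travel pattern.

```python
from collections import Counter
from math import sqrt

# Hypothetical tap records: (card_id, stop, timestamp exact to the second).
# All values are invented for illustration; they are not from the thesis data.
RECORDS = [
    ("A", "Stop1", "2020-01-06 08:00:05"),
    ("A", "Stop9", "2020-01-06 18:10:40"),
    ("B", "Stop1", "2020-01-06 08:00:05"),
    ("B", "Stop9", "2020-01-06 18:10:40"),
    ("B", "Stop4", "2020-01-07 12:30:00"),
    ("C", "Stop7", "2020-01-06 09:15:20"),
]


def traces(records):
    """Group records into a set of (stop, timestamp) points per card."""
    out = {}
    for card, stop, ts in records:
        out.setdefault(card, set()).add((stop, ts))
    return out


def matching_cards(known_points, records):
    """Return the cards whose trace contains every known (stop, timestamp) point.

    A single match means the known points are enough to single out one card.
    """
    known = set(known_points)
    return sorted(c for c, pts in traces(records).items() if known <= pts)


def cosine_similarity(card_a, card_b, records):
    """Cosine similarity of two cards' visit-count vectors.

    Assumption: each vector counts visits per (stop, date-and-hour) bucket.
    """
    def vector(card):
        # ts[:13] keeps "YYYY-MM-DD HH", i.e. the date and hour of the tap.
        return Counter((stop, ts[:13]) for c, stop, ts in records if c == card)

    va, vb = vector(card_a), vector(card_b)
    dot = sum(va[k] * vb[k] for k in va.keys() & vb.keys())
    norm = sqrt(sum(v * v for v in va.values())) * sqrt(sum(v * v for v in vb.values()))
    return dot / norm if norm else 0.0
```

In this toy data, one shared morning tap matches both cards A and B, while B's extra midday tap singles B out. Cards A and B, which tap together twice, score well above the unrelated pair A and C.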