Missing traffic data imputation with a linear generative model based on probabilistic principal component analysis

Despite the ubiquity of sensing data in intelligent transportation systems, such as mobile sensing of vehicle trajectories, traffic estimation still suffers from missing data caused by detector faults or a limited number of probe vehicles acting as mobile sensors. This missing-data issue p...

Bibliographic Details
Main Authors: Huang, Liping, Li, Zhenghuan, Luo, Ruikang, Su, Rong
Other Authors: School of Electrical and Electronic Engineering
Format: Article
Language: English
Published: 2023
Subjects: Engineering::Electrical and electronic engineering; Missing Data; Urban Traffic Sensing
Online Access: https://hdl.handle.net/10356/165598
Institution: Nanyang Technological University
id sg-ntu-dr.10356-165598
record_format dspace
spelling sg-ntu-dr.10356-165598
Record dates: 2023-04-07T15:44:13Z; 2023-04-03T06:39:53Z
Type: Journal Article (published version, application/pdf)
Funding: Agency for Science, Technology and Research (A*STAR). This study is supported under the RIE2020 Industry Alignment Fund – Industry Collaboration Projects (IAF-ICP) Funding Initiative, as well as cash and in-kind contribution from the industry partner(s), and A*STAR under its Industry Alignment Fund (LOA Award I1901E0046).
Citation: Huang, L., Li, Z., Luo, R. & Su, R. (2023). Missing traffic data imputation with a linear generative model based on probabilistic principal component analysis. Sensors, 23(1), 204. https://dx.doi.org/10.3390/s23010204
ISSN: 1424-8220
Handle: https://hdl.handle.net/10356/165598
DOI: 10.3390/s23010204
Other identifiers: 36616802; 2-s2.0-85145965506
Rights: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
topic Engineering::Electrical and electronic engineering
Missing Data
Urban Traffic Sensing
description Despite the ubiquity of sensing data in intelligent transportation systems, such as mobile sensing of vehicle trajectories, traffic estimation still suffers from missing data caused by detector faults or a limited number of probe vehicles acting as mobile sensors. Missing data poses an obstacle to many further explorations, e.g., link-based traffic status modeling. Although many studies have tackled this problem, they mainly address the situation in which data are missing at random and ignore the distinction between links with missing data. In practical scenarios, traffic speed data are always missing not at random (MNAR). How the recovery of missing data differs across links has not yet been studied. In this paper, we propose a general linear model based on probabilistic principal component analysis (PPCA) for MNAR traffic speed data imputation. Furthermore, we propose a metric, the Pearson score (p-score), for distinguishing links and investigate how the model performs on links with different p-score values. Experimental results show that the new model outperforms the typically used PPCA model, and that missing data on links with higher p-score values can be recovered better.
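
As a rough illustration of the baseline the abstract compares against, the sketch below fills missing speed values with a plain PPCA model. It is not the authors' MNAR model and it does not compute their p-score. It assumes a matrix whose rows are time samples and whose columns are road links, with NaN marking missing entries. The function name `ppca_impute`, its parameters, and the impute-then-refit loop (a common simplification of EM with missing values) are choices made for this example only.

```python
import numpy as np


def ppca_impute(X, q=2, n_iter=200, tol=1e-6):
    """Fill NaN entries of X (samples x links) by iterating a PPCA fit.

    Hypothetical helper for illustration: each pass refits PPCA on the
    currently filled matrix and overwrites only the missing entries.
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    n, d = X.shape
    if not 0 < q < d:
        raise ValueError("q must satisfy 0 < q < number of columns")

    # Initial guess: per-link mean of the observed values.
    filled = np.where(missing, np.nanmean(X, axis=0), X)

    for _ in range(n_iter):
        mu = filled.mean(axis=0)
        Xc = filled - mu

        # Closed-form maximum-likelihood PPCA from the sample covariance.
        eigval, eigvec = np.linalg.eigh(Xc.T @ Xc / n)
        eigval, eigvec = eigval[::-1], eigvec[:, ::-1]  # descending order
        sigma2 = max(eigval[q:].mean(), 1e-12)          # noise variance
        W = eigvec[:, :q] * np.sqrt(np.maximum(eigval[:q] - sigma2, 0.0))

        # Posterior mean of the latent variables: M^{-1} W^T (x - mu).
        M = W.T @ W + sigma2 * np.eye(q)
        Z = np.linalg.solve(M, W.T @ Xc.T).T

        # Reconstruct and overwrite only the missing positions.
        updated = np.where(missing, Z @ W.T + mu, X)
        shift = np.max(np.abs(updated - filled))
        filled = updated
        if shift < tol:
            break
    return filled
```

Calling `ppca_impute(speeds, q=2)` on a NaN-marked speed matrix returns a matrix of the same shape with the observed entries unchanged. The latent dimension `q` is a tuning choice, and this sketch makes no attempt to model the non-random missingness pattern that the paper targets.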