Missing traffic data imputation with a linear generative model based on probabilistic principal component analysis

Bibliographic Details
Main Authors:	Huang, Liping; Li, Zhenghuan; Luo, Ruikang; Su, Rong
Other Authors:	School of Electrical and Electronic Engineering
Format:	Article
Language:	English
Published:	2023
Subjects:	Engineering::Electrical and electronic engineering; Missing Data; Urban Traffic Sensing
Online Access:	https://hdl.handle.net/10356/165598
Institution:	Nanyang Technological University

Description
Summary:	Even with ubiquitous sensing data in intelligent transportation systems, such as mobile sensing of vehicle trajectories, traffic estimation still faces a missing-data problem caused by detector faults or by the limited number of probe vehicles acting as mobile sensors. This missing-data issue is an obstacle to further work, e.g., link-based traffic status modeling. Although many studies have tackled this problem, they mainly address the case in which data are missing at random and ignore differences among the links with missing data. In practical scenarios, traffic speed data are always missing not at random (MNAR). How the recovery of missing data differs across links has not yet been studied. In this paper, we propose a general linear model based on probabilistic principal component analysis (PPCA) for imputing MNAR traffic speed data. Furthermore, we propose a metric, the Pearson score (p-score), for distinguishing links, and investigate how the model performs on links with different p-score values. Experimental results show that the new model outperforms the commonly used PPCA model, and that missing data on links with higher p-score values are recovered more accurately.

Similar Items