Differentiable generative models for trajectory data analytics

With the proliferation of GPS-enabled devices, trajectory data is being generated at an unprecedented speed. The trajectories are typically represented as sequences of discrete sample points, which carry rich spatiotemporal information. Mining patterns and distilling knowledge from such a large amou...

Bibliographic Details
Main Author: Li, Xiucheng
Other Authors: Cong Gao
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University 2020
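Subjects: Engineering::Computer science and engineering::Mathematics of computing::Probability and statistics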
Online Access: https://hdl.handle.net/10356/137159
Institution: Nanyang Technological University
id sg-ntu-dr.10356-137159
record_format dspace
spelling sg-ntu-dr.10356-137159 2020-10-28T08:40:56Z Differentiable generative models for trajectory data analytics Li, Xiucheng Cong Gao School of Computer Science and Engineering gaocong@ntu.edu.sg Engineering::Computer science and engineering::Mathematics of computing::Probability and statistics Doctor of Philosophy 2020-03-04T04:25:53Z 2020-03-04T04:25:53Z 2019 Thesis-Doctor of Philosophy Li, X. (2019). Differentiable generative models for trajectory data analytics. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/137159 10.32657/10356/137159 en This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). application/pdf Nanyang Technological University
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Engineering::Computer science and engineering::Mathematics of computing::Probability and statistics
spellingShingle Engineering::Computer science and engineering::Mathematics of computing::Probability and statistics
Li, Xiucheng
Differentiable generative models for trajectory data analytics
description With the proliferation of GPS-enabled devices, trajectory data is being generated at an unprecedented speed. Trajectories are typically represented as sequences of discrete sample points, which carry rich spatiotemporal information. Mining patterns and distilling knowledge from such a large amount of trajectory data could help address many real-world problems and improve our daily life experience. For instance, accurately and efficiently quantifying the similarity between two trajectories is a foundation for many trajectory-based applications, such as tracking migration patterns of animals, mining hot routes in cities, trajectory clustering and moving group discovery. In this dissertation, we seek to effectively and efficiently distill knowledge from trajectory data with differentiable generative models. We develop three flexible generative models with efficiency in mind. The resulting models are not only capable of revealing the useful patterns underlying the data, but also admit end-to-end training. Moreover, our methods scale easily to large real-world trajectory datasets. Specifically, we explore three important research problems arising in big trajectory data analytics: 1) learning representations for trajectory similarity computation; 2) learning the travel time distribution for any route on the road network; 3) spatial transition learning on the road network. In the study of trajectory representation learning, we propose the first deep learning approach, t2vec, to learning trajectory representations that are robust to low data quality, thus supporting accurate and efficient trajectory similarity computation and search. Experiments show that our method achieves higher accuracy and is at least one order of magnitude faster than the state-of-the-art methods for k-nearest trajectory search. In the study of travel time distribution learning, we develop a novel deep generative model, DeepGTT, to learn the travel time distribution for any route on the road network by conditioning on the real-time traffic. DeepGTT interprets the generation of travel time using a three-layer hierarchical probabilistic model that describes the generation process in a reasonable manner rather than simply learning by brute force; as a result, it not only produces more accurate results but is also quite data-efficient. A variational loss is further derived and the entire model is fully differentiable, which allows the model to scale easily to large datasets. In the study of spatial transition learning on the road network, we present a novel deep probabilistic model, DeepST, which unifies three explanatory factors for the route decision: the past traveled route, the impact of the destination, and real-time traffic. DeepST explains the generation of the next road link by conditioning on the representations of the three explanatory factors. To enable effective sharing of statistical strength, we propose to learn representations of k-destination proxies with an adjoint generative model. To incorporate the impact of real-time traffic, we introduce a high-dimensional latent variable as its representation, whose posterior distribution can then be inferred from observations. An efficient inference method is developed within the Variational Auto-Encoder framework to scale DeepST to large-scale datasets.
author2 Cong Gao
author_facet Cong Gao
Li, Xiucheng
format Thesis-Doctor of Philosophy
author Li, Xiucheng
author_sort Li, Xiucheng
title Differentiable generative models for trajectory data analytics
title_short Differentiable generative models for trajectory data analytics
title_full Differentiable generative models for trajectory data analytics
title_fullStr Differentiable generative models for trajectory data analytics
title_full_unstemmed Differentiable generative models for trajectory data analytics
title_sort differentiable generative models for trajectory data analytics
publisher Nanyang Technological University
publishDate 2020
url https://hdl.handle.net/10356/137159
_version_ 1683494345867526144