Latent representation models for mining geo-spatial data

Bibliographic Details
Main Author: Liu, Yiding
Other Authors: Cong Gao
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University 2020
Online Access: https://hdl.handle.net/10356/137358
Institution: Nanyang Technological University
Description
Summary: With the proliferation of mobile devices and location-based services (LBS), huge amounts of geo-spatial data (e.g., check-ins, points of interest, GPS traces) are becoming available from numerous sources. These data provide unprecedented opportunities to uncover the complex semantics and dynamics of geographical space and human mobility. It is promising to develop data-driven approaches that harness geo-spatial data and discover pivotal knowledge to support a wide spectrum of real-world applications, such as business analytics, spatial information retrieval, and traffic management.

In this dissertation, we design latent representation models for mining geo-spatial data. In particular, we aim at discovering knowledge and learning representations for four types of geo-spatial entities: point-of-interest (POI), trajectory, region, and spatiotemporal activity. These respectively support four location-based applications: POI recommendation, anomalous trajectory detection, similar region search, and spatiotemporal activity modeling. Correspondingly, our solutions include four types of latent representation models: Matrix Factorization (MF), Variational Sequence Auto-Encoder, Convolutional Neural Network (CNN), and Multimodal Embedding (ME). More importantly, we investigate and improve existing latent representation models by considering the unique properties of different types of geo-spatial data. The detailed contributions are listed as follows:

- Learning POI/user representation for POI recommendation. We provide an all-around evaluation of 11 state-of-the-art POI recommendation models. From the evaluation, we obtain several important findings, such as the superiority of modeling check-ins as implicit feedback with ranking-based/weighted Matrix Factorization, and how to model the geographical influence for different types of users. Based on these findings, we can better understand and utilize POI recommendation models in various scenarios. We provide an overall picture of the cutting-edge research on POI recommendation.

- Learning route representation for anomalous trajectory detection. We study the problem of anomalous trajectory detection and propose a novel VAE-based model, namely the Gaussian Mixture Variational Sequence AutoEncoder (GM-VSAE). Our GM-VSAE model is able to (1) capture complex sequential information enclosed in trajectories, (2) discover different types of normal routes from trajectories and represent them in a continuous latent space, and (3) support efficient online detection via trajectory generation. Our experiments on two real-world datasets demonstrate that GM-VSAE is more effective than the state-of-the-art baselines and is efficient for online anomalous trajectory detection.

- Learning region representation for similar region search. We study the problem of similar region search in geographical space. We propose a novel solution that is equipped with (1) a deep metric learning approach to learn a similarity that considers the relative locations among the geo-spatial objects within each region; and (2) an efficient branch-and-bound search algorithm for finding the top-N similar regions. Moreover, we propose an approximation method that further improves efficiency at a slight cost in accuracy. Our experiments on three real-world datasets demonstrate that our solution improves both accuracy and search efficiency by a significant margin compared with the state-of-the-art methods.

- Learning multimodal representation for spatiotemporal activity modeling. We study the problem of spatiotemporal activity modeling, which aims at jointly learning representations of different regions, time units (e.g., hours), and activities (i.e., keywords). We propose a Meta Multimodal Embedding (MME) method that can capture rich semantics of regions, time units, and activities from geo-textual data. More importantly, it can explicitly learn temporally transferable priors for different regions in a geographical space, which create specialized activity models for different time periods. We further extend the MME model to a Spatial-Aware MME model that incorporates geographical influence between nearby regions. Our experiments on two real-world datasets demonstrate that our proposed methods can better answer region prediction queries on nonstationary geo-textual data.
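
The summary singles out weighted Matrix Factorization on implicit check-in feedback as a strong baseline for POI recommendation. As a rough illustration of that general idea only, the following minimal sketch fits user and POI factors by stochastic gradient descent on a confidence-weighted squared error. It is not the implementation evaluated in the thesis: the toy check-in counts, the confidence rule `1 + alpha * count`, the function names, and all hyperparameters are assumptions chosen for illustration.

```python
import random

# Hypothetical toy data: (user, poi) -> number of check-ins.
TOY_CHECKINS = {(0, 0): 3, (0, 1): 1, (1, 0): 2, (1, 1): 2, (2, 2): 1, (2, 3): 3}


def init_factors(n_rows, k, rng):
    """Small random latent vectors, one row per user or POI."""
    return [[rng.gauss(0.0, 0.1) for _ in range(k)] for _ in range(n_rows)]


def predict(P, Q, u, i):
    """Predicted preference of user u for POI i (dot product of factors)."""
    return sum(pu * qi for pu, qi in zip(P[u], Q[i]))


def weighted_loss(checkins, P, Q, alpha):
    """Confidence-weighted squared error over all (user, POI) pairs."""
    total = 0.0
    for u in range(len(P)):
        for i in range(len(Q)):
            count = checkins.get((u, i), 0)
            pref = 1.0 if count > 0 else 0.0   # binary implicit preference
            conf = 1.0 + alpha * count         # assumed confidence weighting
            total += conf * (pref - predict(P, Q, u, i)) ** 2
    return total


def train_weighted_mf(checkins, n_users, n_pois, k=2, alpha=5.0,
                      reg=0.1, lr=0.01, epochs=300, seed=0):
    """Fit user factors P and POI factors Q with plain SGD."""
    rng = random.Random(seed)
    P = init_factors(n_users, k, rng)
    Q = init_factors(n_pois, k, rng)
    for _ in range(epochs):
        for u in range(n_users):
            for i in range(n_pois):
                count = checkins.get((u, i), 0)
                pref = 1.0 if count > 0 else 0.0
                conf = 1.0 + alpha * count
                err = pref - predict(P, Q, u, i)
                for f in range(k):
                    pu, qi = P[u][f], Q[i][f]
                    # Gradient step on the weighted error plus L2 regularization.
                    P[u][f] += lr * (conf * err * qi - reg * pu)
                    Q[i][f] += lr * (conf * err * pu - reg * qi)
    return P, Q


if __name__ == "__main__":
    P, Q = train_weighted_mf(TOY_CHECKINS, n_users=3, n_pois=4)
    print(weighted_loss(TOY_CHECKINS, P, Q, alpha=5.0))
```

To recommend POIs for a user under this sketch, one would rank the POIs that user has not visited by `predict(P, Q, u, i)`. Unvisited pairs are treated as weak negative evidence (confidence 1) rather than as missing values, which is the usual reason for calling the feedback implicit.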