Latent representation models for mining geo-spatial data
With the proliferation of mobile devices and location-based services (LBS), huge amounts of geo-spatial data (e.g., check-ins, points of interest, GPS traces) are becoming available from numerous sources. These data provide us with unprecedented opportunities to uncover the complex semantics and dynamic...
Main Author: | Liu, Yiding |
---|---|
Other Authors: | Cong Gao |
Format: | Thesis-Doctor of Philosophy |
Language: | English |
Published: | Nanyang Technological University, 2020 |
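Subjects: | Engineering::Computer science and engineering::Mathematics of computing::Probability and statistics |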
Online Access: | https://hdl.handle.net/10356/137358 |
Institution: | Nanyang Technological University |
id | sg-ntu-dr.10356-137358 |
---|---|
record_format | dspace |
institution | Nanyang Technological University |
building | NTU Library |
continent | Asia |
country | Singapore |
content_provider | NTU Library |
collection | DR-NTU |
language | English |
topic | Engineering::Computer science and engineering::Mathematics of computing::Probability and statistics |
description | With the proliferation of mobile devices and location-based services (LBS), huge amounts of geo-spatial data (e.g., check-ins, points of interest, GPS traces) are becoming available from numerous sources. These data provide us with unprecedented opportunities to uncover the complex semantics and dynamics of geographical space and human mobility. It is promising to develop data-driven approaches for harnessing geo-spatial data and discovering pivotal knowledge to support a wide spectrum of real-world applications, such as business analytics, spatial information retrieval, and traffic management.
In this dissertation, we design latent representation models for mining geo-spatial data. In particular, we aim at discovering knowledge and learning representations for four types of geo-spatial entities: point-of-interest (POI), trajectory, region and spatiotemporal activity. These respectively support four location-based applications: POI recommendation, anomalous trajectory detection, similar region search, and spatiotemporal activity modeling. Correspondingly, our solutions include four types of latent representation models: Matrix Factorization (MF), Variational Sequence Auto-Encoder, Convolutional Neural Network (CNN) and Multimodal Embedding (ME). More importantly, we investigate and improve existing latent representation models by considering the unique properties of different types of geo-spatial data. The detailed contributions are listed as follows:
- Learning POI/user representation for POI recommendation. We provide an all-around evaluation of 11 state-of-the-art POI recommendation models. From the evaluation, we obtain several important findings, such as the superiority of modeling check-ins as implicit feedback with ranking-based/weighted Matrix Factorization, and how to model the geographical influence for different types of users. Based on these findings, we can better understand and utilize POI recommendation models in various scenarios. We also provide an overall picture of the cutting-edge research on POI recommendation.
- Learning route representation for anomalous trajectory detection. We study the problem of anomalous trajectory detection and propose a novel VAE-based model, namely Gaussian Mixture Variational Sequence AutoEncoder (GM-VSAE). Our GM-VSAE model is able to (1) capture complex sequential information enclosed in trajectories, (2) discover different types of normal routes from trajectories and represent them in a continuous latent space, and (3) support efficient online detection via trajectory generation. Our experiments on two real-world datasets demonstrate that GM-VSAE is more effective than the state-of-the-art baselines and is efficient for online anomalous trajectory detection.
- Learning region representation for similar region search. We study the problem of similar region search in geographical space. We propose a novel solution that is equipped with (1) a deep metric learning approach to learn a similarity that considers the relative locations among the geo-spatial objects within each of the regions; and (2) an efficient branch-and-bound search algorithm for finding top-N similar regions. Moreover, we propose an approximation method that further improves the efficiency at a slight cost in accuracy. Our experiments on three real-world datasets demonstrate that our solution improves both the accuracy and search efficiency by a significant margin compared with the state-of-the-art methods.
- Learning multimodal representation for spatiotemporal activity modeling. We study the problem of spatiotemporal activity modeling, which aims at jointly learning representations of different regions, time units (e.g., hours) and activities (i.e., keywords). We propose a Meta Multimodal Embedding (MME) method that can capture rich semantics of regions, time units and activities from geo-textual data. More importantly, it can explicitly learn temporally transferable priors for different regions in a geographical space, which create specialized activity models for different time periods. We further extend the MME model to a Spatial-Aware MME model that incorporates geographical influence between nearby regions. Our experiments on two real-world datasets demonstrate that our proposed methods can better answer region prediction queries on nonstationary geo-textual data. |
author2 | Cong Gao |
author_facet | Cong Gao; Liu, Yiding |
format | Thesis-Doctor of Philosophy |
author | Liu, Yiding |
author_sort | Liu, Yiding |
title | Latent representation models for mining geo-spatial data |
publisher | Nanyang Technological University |
publishDate | 2020 |
url | https://hdl.handle.net/10356/137358 |
_version_ | 1683493643693850624 |
spelling | sg-ntu-dr.10356-137358 2020-10-28T08:40:56Z. Latent representation models for mining geo-spatial data. Liu, Yiding; Cong Gao. School of Computer Science and Engineering. gaocong@ntu.edu.sg. Engineering::Computer science and engineering::Mathematics of computing::Probability and statistics. Doctor of Philosophy. 2020-03-18T06:52:21Z 2020-03-18T06:52:21Z 2020. Thesis-Doctor of Philosophy. Liu, Y. (2020). Latent representation models for mining geo-spatial data. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/137358 10.32657/10356/137358. en. This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). application/pdf. Nanyang Technological University |