PGeotopic: a distributed solution for mining geographical topic models

Geographical topic models have been used to mine geo-tagged documents for topical region and geographical topics, and also have applications in recommendations, user mobility modeling, event detection, etc. Existing studies focus on learning effective geographical topic models while ignoring the eff...

Bibliographic Details
Main Authors: Zhao, Kaiqi, Cong, Gao, Li, Xiucheng
Other Authors: School of Computer Science and Engineering
Format: Article
Language: English
Published: 2022
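Subjects: Engineering::Computer science and engineering; Geographical Topic Model; Distributed Machine Learning
Online Access: https://hdl.handle.net/10356/162122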
Institution: Nanyang Technological University
id sg-ntu-dr.10356-162122
record_format dspace
spelling sg-ntu-dr.10356-1621222022-10-04T08:11:46Z PGeotopic: a distributed solution for mining geographical topic models Zhao, Kaiqi Cong, Gao Li, Xiucheng School of Computer Science and Engineering Engineering::Computer science and engineering Geographical Topic Model Distributed Machine Learning Geographical topic models have been used to mine geo-tagged documents for topical region and geographical topics, and also have applications in recommendations, user mobility modeling, event detection, etc. Existing studies focus on learning effective geographical topic models while ignoring the efficiency issue. However, it is very expensive to train geographical topic models - it may take days to train a geographical topic model of a small scale on a collection of documents with millions of word tokens. In this paper, we propose the first distributed solution, called PGeoTopic, for training geographical topic models. The proposed solution comprises several novel technical components to increase parallelism, reduce memory requirement, and reduce communication cost. Experiments show that our approach for mining geographical topic models is scalable with both model size and data size on distributed systems. Ministry of Education (MOE) Nanyang Technological University This work was supported in part by a MOE Tier-2 grant MOE2016-T2-1-137, MOE Tier-1 grants RG114/19 and RG31/17, and an NTU ACE grant. 2022-10-04T08:11:45Z 2022-10-04T08:11:45Z 2020 Journal Article Zhao, K., Cong, G. & Li, X. (2020). PGeotopic: a distributed solution for mining geographical topic models. IEEE Transactions On Knowledge and Data Engineering, 34(2), 881-893. https://dx.doi.org/10.1109/TKDE.2020.2989142 1041-4347 https://hdl.handle.net/10356/162122 10.1109/TKDE.2020.2989142 2-s2.0-85123586100 2 34 881 893 en MOE2016-T2-1-137 RG114/19 RG31/17 IEEE Transactions on Knowledge and Data Engineering © 2020 IEEE. All rights reserved.
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Engineering::Computer science and engineering
Geographical Topic Model
Distributed Machine Learning
spellingShingle Engineering::Computer science and engineering
Geographical Topic Model
Distributed Machine Learning
Zhao, Kaiqi
Cong, Gao
Li, Xiucheng
PGeotopic: a distributed solution for mining geographical topic models
description Geographical topic models have been used to mine geo-tagged documents for topical region and geographical topics, and also have applications in recommendations, user mobility modeling, event detection, etc. Existing studies focus on learning effective geographical topic models while ignoring the efficiency issue. However, it is very expensive to train geographical topic models - it may take days to train a geographical topic model of a small scale on a collection of documents with millions of word tokens. In this paper, we propose the first distributed solution, called PGeoTopic, for training geographical topic models. The proposed solution comprises several novel technical components to increase parallelism, reduce memory requirement, and reduce communication cost. Experiments show that our approach for mining geographical topic models is scalable with both model size and data size on distributed systems.
author2 School of Computer Science and Engineering
author_facet School of Computer Science and Engineering
Zhao, Kaiqi
Cong, Gao
Li, Xiucheng
format Article
author Zhao, Kaiqi
Cong, Gao
Li, Xiucheng
author_sort Zhao, Kaiqi
title PGeotopic: a distributed solution for mining geographical topic models
title_short PGeotopic: a distributed solution for mining geographical topic models
title_full PGeotopic: a distributed solution for mining geographical topic models
title_fullStr PGeotopic: a distributed solution for mining geographical topic models
title_full_unstemmed PGeotopic: a distributed solution for mining geographical topic models
title_sort pgeotopic: a distributed solution for mining geographical topic models
publishDate 2022
url https://hdl.handle.net/10356/162122
_version_ 1746219680489537536