PGeotopic: a distributed solution for mining geographical topic models
Geographical topic models have been used to mine geo-tagged documents for topical region and geographical topics, and also have applications in recommendations, user mobility modeling, event detection, etc. Existing studies focus on learning effective geographical topic models while ignoring the eff...
Main Authors: | Zhao, Kaiqi; Cong, Gao; Li, Xiucheng |
---|---|
Other Authors: | School of Computer Science and Engineering |
Format: | Article |
Language: | English |
Published: | 2022 |
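Subjects: | Engineering::Computer science and engineering; Geographical Topic Model; Distributed Machine Learning |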
Online Access: | https://hdl.handle.net/10356/162122 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-162122 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-162122 2022-10-04T08:11:46Z PGeotopic: a distributed solution for mining geographical topic models Zhao, Kaiqi Cong, Gao Li, Xiucheng School of Computer Science and Engineering Engineering::Computer science and engineering Geographical Topic Model Distributed Machine Learning Geographical topic models have been used to mine geo-tagged documents for topical region and geographical topics, and also have applications in recommendations, user mobility modeling, event detection, etc. Existing studies focus on learning effective geographical topic models while ignoring the efficiency issue. However, it is very expensive to train geographical topic models - it may take days to train a geographical topic model of a small scale on a collection of documents with millions of word tokens. In this paper, we propose the first distributed solution, called PGeoTopic, for training geographical topic models. The proposed solution comprises several novel technical components to increase parallelism, reduce memory requirement, and reduce communication cost. Experiments show that our approach for mining geographical topic models is scalable with both model size and data size on distributed systems. Ministry of Education (MOE) Nanyang Technological University This work was supported in part by a MOE Tier-2 grant MOE2016-T2-1-137, MOE Tier-1 grants RG114/19 and RG31/17, and an NTU ACE grant. 2022-10-04T08:11:45Z 2022-10-04T08:11:45Z 2020 Journal Article Zhao, K., Cong, G. & Li, X. (2020). PGeotopic: a distributed solution for mining geographical topic models. IEEE Transactions On Knowledge and Data Engineering, 34(2), 881-893. https://dx.doi.org/10.1109/TKDE.2020.2989142 1041-4347 https://hdl.handle.net/10356/162122 10.1109/TKDE.2020.2989142 2-s2.0-85123586100 2 34 881 893 en MOE2016-T2-1-137 RG114/19 RG31/17 IEEE Transactions on Knowledge and Data Engineering © 2020 IEEE. All rights reserved. |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Engineering::Computer science and engineering Geographical Topic Model Distributed Machine Learning |
spellingShingle |
Engineering::Computer science and engineering Geographical Topic Model Distributed Machine Learning Zhao, Kaiqi Cong, Gao Li, Xiucheng PGeotopic: a distributed solution for mining geographical topic models |
description |
Geographical topic models have been used to mine geo-tagged documents for topical region and geographical topics, and also have applications in recommendations, user mobility modeling, event detection, etc. Existing studies focus on learning effective geographical topic models while ignoring the efficiency issue. However, it is very expensive to train geographical topic models - it may take days to train a geographical topic model of a small scale on a collection of documents with millions of word tokens. In this paper, we propose the first distributed solution, called PGeoTopic, for training geographical topic models. The proposed solution comprises several novel technical components to increase parallelism, reduce memory requirement, and reduce communication cost. Experiments show that our approach for mining geographical topic models is scalable with both model size and data size on distributed systems. |
author2 |
School of Computer Science and Engineering |
author_facet |
School of Computer Science and Engineering Zhao, Kaiqi Cong, Gao Li, Xiucheng |
format |
Article |
author |
Zhao, Kaiqi Cong, Gao Li, Xiucheng |
author_sort |
Zhao, Kaiqi |
title |
PGeotopic: a distributed solution for mining geographical topic models |
title_sort |
pgeotopic: a distributed solution for mining geographical topic models |
publishDate |
2022 |
url |
https://hdl.handle.net/10356/162122 |
_version_ |
1746219680489537536 |