PGeoTopic: a distributed solution for mining geographical topic models
Main Authors: | , , |
---|---|
Other Authors: | |
Format: | Article |
Language: | English |
Published: | 2022 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/162122 |
Institution: | Nanyang Technological University |
Summary: | Geographical topic models have been used to mine geo-tagged documents for topical regions and geographical topics, and they also have applications in recommendation, user mobility modeling, event detection, and other tasks. Existing studies focus on learning effective geographical topic models while ignoring efficiency. However, training geographical topic models is very expensive: it may take days to train even a small-scale model on a collection of documents with millions of word tokens. In this paper, we propose the first distributed solution, called PGeoTopic, for training geographical topic models. The proposed solution comprises several novel technical components that increase parallelism, reduce memory requirements, and reduce communication cost. Experiments show that our approach for mining geographical topic models scales with both model size and data size on distributed systems. |
---|