Exploiting user and venue characteristics for fine-grained tweet geolocation

Which venue is a tweet posted from? We call this a fine-grained geolocation problem. Given an observed tweet, the task is to infer its discrete posting venue, e.g., a specific restaurant. This recovers the venue context and differs from prior work, which geolocates tweets to location coordinates or c...

Bibliographic Details
Main Authors: CHONG, Wen Haw, LIM, Ee Peng
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2018
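Subjects: Tweet geolocation, Learning to rank, Spatial homophily, Spatial focus, Databases and Information Systems, Numerical Analysis and Scientific Computing, Social Media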
Online Access: https://ink.library.smu.edu.sg/sis_research/4077
https://ink.library.smu.edu.sg/context/sis_research/article/5080/viewcontent/Exploit_Fine_Grained_Tweet_Geolocation_2017_afv.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-5080
record_format dspace
spelling sg-smu-ink.sis_research-5080 2020-03-25T05:24:07Z 2018-04-01T07:00:00Z text application/pdf info:doi/10.1145/3156667 http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Tweet geolocation
Learning to rank
Spatial homophily
Spatial focus
Databases and Information Systems
Numerical Analysis and Scientific Computing
Social Media
description Which venue is a tweet posted from? We call this a fine-grained geolocation problem. Given an observed tweet, the task is to infer its discrete posting venue, e.g., a specific restaurant. This recovers the venue context and differs from prior work, which geolocates tweets to location coordinates or cities/neighborhoods. First, we conduct empirical analysis to uncover venue and user characteristics for improving geolocation. For venues, we observe spatial homophily, in which venues near each other have more similar tweet content (i.e., text representations) compared to venues further apart. For users, we observe that they are spatially focused and more likely to visit venues near their previous visits. We also find that a substantial proportion of users post one or more geocoded tweet(s), thus providing their location history data. We then propose geolocation models that exploit spatial homophily and spatial focus characteristics plus posting time information. Our models rank candidate venues of test tweets such that the actual posting venue is ranked high. To better tune model parameters, we introduce a learning-to-rank framework. Our best model significantly outperforms state-of-the-art baselines. Furthermore, we show that tweets without any location-indicative words can be geolocated meaningfully as well.
format text
author CHONG, Wen Haw
LIM, Ee Peng
title Exploiting user and venue characteristics for fine-grained tweet geolocation
publisher Institutional Knowledge at Singapore Management University
publishDate 2018