Extracting fine-grained location with temporal awareness in tweets: A two-stage approach
Twitter has attracted billions of users for life logging and sharing activities and opinions. In their tweets, users often reveal their location information and short-term visiting histories or plans. Capturing users' short-term activities could benefit many applications for providing the right...
Main Authors: | Li, Chenliang; Sun, Aixin |
---|---|
Other Authors: | School of Computer Science and Engineering |
Format: | Article |
Language: | English |
Published: | 2017 |
Subjects: | POI extraction; Tweet location |
Online Access: | https://hdl.handle.net/10356/85144 http://hdl.handle.net/10220/43663 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-85144 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-85144 2020-03-07T11:48:54Z Extracting fine-grained location with temporal awareness in tweets: A two-stage approach Li, Chenliang Sun, Aixin School of Computer Science and Engineering POI extraction Tweet location MOE (Min. of Education, S’pore) Accepted version 2017-08-31T03:59:10Z 2019-12-06T15:58:02Z 2017-08-31T03:59:10Z 2019-12-06T15:58:02Z 2017 Journal Article Li, C., & Sun, A. (2017). Extracting fine-grained location with temporal awareness in tweets: A two-stage approach. Journal of the Association for Information Science and Technology, 68(7), 1652-1670. 2330-1635 https://hdl.handle.net/10356/85144 http://hdl.handle.net/10220/43663 10.1002/asi.23816 en Journal of the Association for Information Science and Technology © 2017 Association for Information Science and Technology (ASIS&T). This is the author created version of a work that has been peer reviewed and accepted for publication by Journal of the Association for Information Science and Technology, Association for Information Science and Technology (ASIS&T). It incorporates referee’s comments but changes resulting from the publishing process, such as copyediting, structural formatting, may not be reflected in this document. The published version is available at: [http://dx.doi.org/10.1002/asi.23816]. 29 p. application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
country |
Singapore |
collection |
DR-NTU |
language |
English |
topic |
POI extraction Tweet location |
description |
Twitter has attracted billions of users for life logging and sharing activities and opinions. In their tweets, users often reveal their location information and short-term visiting histories or plans. Capturing users' short-term activities could benefit many applications for providing the right context at the right time and location. In this paper, we are interested in extracting locations mentioned in tweets at fine-grained granularity, with temporal awareness. Specifically, we recognize the points-of-interest (POIs) mentioned in a tweet and predict whether the user has visited, is currently at, or will soon visit the mentioned POIs. A POI can be a restaurant, a shopping mall, a bookstore, or any other fine-grained location. Our proposed framework, named TS-Petar (Two-Stage POI Extractor with Temporal Awareness), consists of two main components: a POI inventory and a two-stage time-aware POI tagger. The POI inventory is built by exploiting the crowd wisdom of the Foursquare community. It contains both POIs' formal names and their informal abbreviations, commonly observed in Foursquare check-ins. The time-aware POI tagger, based on the Conditional Random Field (CRF) model, is devised to disambiguate the POI mentions and to resolve their associated temporal awareness accordingly. Three sets of contextual features (linguistic, temporal, and inventory features) and two labeling schema features (OP and BILOU schemas) are explored for the time-aware POI extraction task. Our empirical study shows that the subtask of POI disambiguation and the subtask of temporal awareness resolution call for different feature settings for best performance. We have also evaluated the proposed TS-Petar against several strong baseline methods. The experimental results demonstrate that the two-stage approach achieves the best accuracy and outperforms all baseline methods in terms of both effectiveness and efficiency. |
author2 |
School of Computer Science and Engineering |
author_facet |
School of Computer Science and Engineering Li, Chenliang Sun, Aixin |
format |
Article |
author |
Li, Chenliang Sun, Aixin |
author_sort |
Li, Chenliang |
title |
Extracting fine-grained location with temporal awareness in tweets: A two-stage approach |
publishDate |
2017 |
url |
https://hdl.handle.net/10356/85144 http://hdl.handle.net/10220/43663 |
_version_ |
1681038378376626176 |
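
The description above mentions the BILOU labeling schema as one of the two schemas explored for the CRF-based tagger. As a rough illustration only, the sketch below shows how the standard BILOU scheme (Begin, Inside, Last, Outside, Unit) would label POI mentions in a tokenized tweet. It is not the authors' code: the function name `bilou_tags`, the label name `POI`, and the example tweet are all hypothetical, and the temporal-awareness labels from the paper are not modeled here.

```python
from typing import List, Tuple


def bilou_tags(tokens: List[str],
               spans: List[Tuple[int, int]],
               label: str = "POI") -> List[str]:
    """Assign BILOU tags to tokens.

    `spans` holds (start, end) token indices of each mention, end exclusive.
    The label name "POI" is an assumption made for this illustration.
    """
    tags = ["O"] * len(tokens)  # every token starts as Outside
    for start, end in spans:
        if not (0 <= start < end <= len(tokens)):
            raise ValueError(f"invalid span: ({start}, {end})")
        if end - start == 1:
            tags[start] = f"U-{label}"        # single-token mention: Unit
        else:
            tags[start] = f"B-{label}"        # first token: Begin
            for i in range(start + 1, end - 1):
                tags[i] = f"I-{label}"        # middle tokens: Inside
            tags[end - 1] = f"L-{label}"      # final token: Last
    return tags


# Hypothetical tweet with two POI mentions.
tokens = "heading to marina bay sands after lunch at starbucks".split()
spans = [(2, 5), (8, 9)]  # "marina bay sands" and "starbucks"
tags = bilou_tags(tokens, spans)
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

A sequence model such as a CRF would be trained to predict one such tag per token. Compared with simpler schemes, BILOU marks mention boundaries explicitly, which is one reason it is commonly tried for short, noisy text.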