TopicSketch: Real-time Bursty Topic Detection from Twitter

Twitter has become one of the largest microblogging platforms for users around the world to share anything happening around them with friends and beyond. A bursty topic in Twitter is one that triggers a surge of relevant tweets within a short period of time, which often reflects important events of...

Bibliographic Details
Main Authors: XIE, Wei; ZHU, Feida; JIANG, Jing; LIM, Ee-Peng; WANG, Ke
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2016
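Subjects: Realtime; TopicSketch; Tweet stream; bursty topic; Twitter; Real-time systems; Databases and Information Systems; Numerical Analysis and Scientific Computing; Social Media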
Online Access: https://ink.library.smu.edu.sg/sis_research/3200
Institution: Singapore Management University
id sg-smu-ink.sis_research-4201
record_format dspace
spelling sg-smu-ink.sis_research-4201
2020-04-24T09:48:15Z
TopicSketch: Real-time Bursty Topic Detection from Twitter
XIE, Wei
ZHU, Feida
JIANG, Jing
LIM, Ee-Peng
WANG, Ke
Twitter has become one of the largest microblogging platforms for users around the world to share anything happening around them with friends and beyond. A bursty topic in Twitter is one that triggers a surge of relevant tweets within a short period of time, which often reflects important events of mass interest. How to leverage Twitter for early detection of bursty topics has therefore become an important research problem with immense practical value. Despite the wealth of research work on topic modelling and analysis in Twitter, it remains a challenge to detect bursty topics in real time. As existing methods can hardly scale to handle the tweet stream in real time, we propose in this paper TopicSketch, a sketch-based topic model together with a set of techniques to achieve real-time detection. We evaluate our solution on a tweet stream with over 30 million tweets. Our experimental results show both the efficiency and the effectiveness of our approach. In particular, they demonstrate that TopicSketch on a single machine can potentially handle hundreds of millions of tweets per day, which is on the same scale as the total number of daily tweets in Twitter, and can present bursty events at a finer granularity.
2016-08-01T07:00:00Z
text
application/pdf
https://ink.library.smu.edu.sg/sis_research/3200
info:doi/10.1109/TKDE.2016.2556661
http://creativecommons.org/licenses/by-nc-nd/4.0/
Research Collection School Of Computing and Information Systems
eng
Institutional Knowledge at Singapore Management University
Realtime
TopicSketch
Tweet stream
bursty topic
Twitter
Real-time systems
Databases and Information Systems
Numerical Analysis and Scientific Computing
Social Media
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Realtime
TopicSketch
Tweet stream
bursty topic
Twitter
Real-time systems
Databases and Information Systems
Numerical Analysis and Scientific Computing
Social Media
description Twitter has become one of the largest microblogging platforms for users around the world to share anything happening around them with friends and beyond. A bursty topic in Twitter is one that triggers a surge of relevant tweets within a short period of time, which often reflects important events of mass interest. How to leverage Twitter for early detection of bursty topics has therefore become an important research problem with immense practical value. Despite the wealth of research work on topic modelling and analysis in Twitter, it remains a challenge to detect bursty topics in real time. As existing methods can hardly scale to handle the tweet stream in real time, we propose in this paper TopicSketch, a sketch-based topic model together with a set of techniques to achieve real-time detection. We evaluate our solution on a tweet stream with over 30 million tweets. Our experimental results show both the efficiency and the effectiveness of our approach. In particular, they demonstrate that TopicSketch on a single machine can potentially handle hundreds of millions of tweets per day, which is on the same scale as the total number of daily tweets in Twitter, and can present bursty events at a finer granularity.
format text
author XIE, Wei
ZHU, Feida
JIANG, Jing
LIM, Ee-Peng
WANG, Ke
author_sort XIE, Wei
title TopicSketch: Real-time Bursty Topic Detection from Twitter
publisher Institutional Knowledge at Singapore Management University
publishDate 2016
url https://ink.library.smu.edu.sg/sis_research/3200
_version_ 1770572976381493248