Twitter-LDA

Latent Dirichlet Allocation (LDA) has been widely used in textual analysis. The original LDA is used to find hidden "topics" in the documents, where a topic is a subject like "arts" or "education" that is discussed in the documents. The original setting in LDA, where ea...

Full description

Saved in:
Bibliographic Details
Main Authors: ZHAO, Wayne Xin, JIANG, Jing, WENG, Jianshu, HE, Jing, LIM, Ee Peng, YAN, Hongfei, LI, Xiaoming
Format: text
Published: Institutional Knowledge at Singapore Management University 2011
Subjects:
Online Access:https://ink.library.smu.edu.sg/researchdata/12
https://github.com/minghui/Twitter-LDA
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Singapore Management University
Description
Summary:Latent Dirichlet Allocation (LDA) has been widely used in textual analysis. The original LDA is used to find hidden "topics" in the documents, where a topic is a subject like "arts" or "education" that is discussed in the documents. The original setting in LDA, where each word has a topic label, may not work well with Twitter as tweets are short and a single tweet is more likely to talk about one topic. Hence, Twitter-LDA (T-LDA) has been proposed to address this issue. T-LDA also addresses the noisy nature of tweets, where it captures background words in tweets. As experiments in [7] have shown that T-LDA could capture more meaningful topics than LDA in Microblogs. The original setting in Latent Dirichlet Allocation (LDA), where each word has a topic label, may not work well with Twitter as tweets are short and a single tweet is more likely to talk about one topic. Hence, Twitter-LDA (T-LDA) has been proposed to address this issue. T-LDA also addresses the noisy nature of tweets, where it captures background words in tweets.