Lifetime lexical variation in social media

Bibliographic Details
Main Authors: LIAO, Lizi, JIANG, Jing, DING, Ying, HUANG, Heyan, LIM, Ee-peng
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2014
Online Access: https://ink.library.smu.edu.sg/sis_research/7713
https://ink.library.smu.edu.sg/context/sis_research/article/8716/viewcontent/8942_Article_Text_12470_1_2_20201228.pdf
Institution: Singapore Management University
Description
Summary: The rapid growth of online social media has attracted a large number of Internet users, and the large volume of content these users generate provides an opportunity to study lexical variation across people of different ages. In this paper, we present a latent variable model that jointly models the lexical content of tweets and Twitter users' ages. Our model inherently assumes that a topic has not only a word distribution but also an age distribution. We propose a Gibbs-EM algorithm to perform inference on our model. Empirical evaluation shows that our model can learn meaningful age-specific topics such as "school" for teenagers and "health" for older people. Our model can also be used for age prediction and performs better than a number of baseline methods.
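
The idea that a topic carries both a word distribution and an age distribution can be illustrated with a toy sketch. This is not the paper's model or its Gibbs-EM inference: the two topics, their word probabilities, the Gaussian-style (mean, standard deviation) age parameters, the one-topic-per-user simplification, and the function names `topic_posterior` and `expected_age` are all assumptions made up for illustration.

```python
import math

# Hypothetical toy topics. Each topic has a word distribution and an age
# distribution; the (mean, std) form and all numbers are invented here.
TOPICS = {
    "school": {"words": {"class": 0.5, "homework": 0.4, "doctor": 0.1},
               "age": (16.0, 3.0)},
    "health": {"words": {"class": 0.1, "homework": 0.1, "doctor": 0.8},
               "age": (55.0, 10.0)},
}


def topic_posterior(tokens):
    """Return P(topic | tokens), assuming one topic per user and a uniform prior."""
    log_scores = {}
    for name, topic in TOPICS.items():
        score = math.log(1.0 / len(TOPICS))
        for tok in tokens:
            # Small floor so an unseen word does not zero out a topic.
            score += math.log(topic["words"].get(tok, 1e-6))
        log_scores[name] = score
    # Normalise in a numerically stable way.
    top = max(log_scores.values())
    weights = {n: math.exp(s - top) for n, s in log_scores.items()}
    total = sum(weights.values())
    return {n: w / total for n, w in weights.items()}


def expected_age(tokens):
    """Point estimate of age: posterior-weighted mean of the topic age means."""
    post = topic_posterior(tokens)
    return sum(p * TOPICS[n]["age"][0] for n, p in post.items())
```

With these invented numbers, `expected_age(["homework", "class"])` lands near the "school" topic's mean age and `expected_age(["doctor", "doctor"])` near the "health" topic's. Only the age means are used for the point estimate; the standard deviations are kept to show that each topic holds a full age distribution.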