Affinity-driven blog cascade analysis and prediction

Information propagation within the blogosphere is important for implementing policies, marketing research, launching new products, and other applications. In this paper, we take a microscopic view of the information propagation pattern in the blogosphere by investigating blog cascade affinity. A...

Bibliographic Details
Main Authors: Li, Hui, Sun, Aixin, Cui, Jiangtao, Bhowmick, Sourav S.
Other Authors: School of Computer Engineering
Format: Article
Language: English
Published: 2013
Subjects: DRNTU::Engineering::Computer science and engineering
Online Access: https://hdl.handle.net/10356/98213
http://hdl.handle.net/10220/17314
Institution: Nanyang Technological University
id sg-ntu-dr.10356-98213
record_format dspace
spelling sg-ntu-dr.10356-98213 2020-05-28T07:17:16Z Affinity-driven blog cascade analysis and prediction Li, Hui Sun, Aixin Cui, Jiangtao Bhowmick, Sourav S. School of Computer Engineering DRNTU::Engineering::Computer science and engineering 2013-11-05T07:42:26Z 2019-12-06T19:52:07Z 2013-11-05T07:42:26Z 2019-12-06T19:52:07Z 2013 2013 Journal Article Li, H., Bhowmick, S. S., Sun, A., & Cui, J. (2013). Affinity-driven blog cascade analysis and prediction. Data mining and knowledge discovery. https://hdl.handle.net/10356/98213 http://hdl.handle.net/10220/17314 10.1007/s10618-013-0307-0 en Data mining and knowledge discovery
institution Nanyang Technological University
building NTU Library
country Singapore
collection DR-NTU
language English
topic DRNTU::Engineering::Computer science and engineering
spellingShingle DRNTU::Engineering::Computer science and engineering
Li, Hui
Sun, Aixin
Cui, Jiangtao
Bhowmick, Sourav S.
Affinity-driven blog cascade analysis and prediction
description Information propagation within the blogosphere is important for implementing policies, marketing research, launching new products, and other applications. In this paper, we take a microscopic view of the information propagation pattern in the blogosphere by investigating blog cascade affinity. A blog cascade is a group of linked posts discussing the same topic, and cascade affinity refers to a blog’s inclination to join a specific cascade. We identify and analyze an array of macroscopic and microscopic content-oblivious features that may affect a blogger’s cascade-joining behavior and use these features to predict the cascade affinity of blogs. Based on these features, we present a non-probabilistic and a probabilistic strategy, namely a support vector machine (SVM) classification-based approach and a Bipartite Markov Random Field-based (BiMRF) approach, respectively, to predict the probability of blogs’ affinity to a cascade and rank them accordingly. Evaluated on a real dataset of 873,496 posts, our experimental results demonstrate that our prediction strategy generates high-quality results (F1-measure of 72.5% for SVM and 71.1% for BiMRF) compared with approaches that use only traditional or singular features such as elapsed time and number of participants, which achieve around 11.2% and 8.9%, respectively. Our experiments also show that, among all features identified, the number of quasi-friends is the most important factor affecting bloggers’ inclination to join cascades.
author2 School of Computer Engineering
author_facet School of Computer Engineering
Li, Hui
Sun, Aixin
Cui, Jiangtao
Bhowmick, Sourav S.
format Article
author Li, Hui
Sun, Aixin
Cui, Jiangtao
Bhowmick, Sourav S.
author_sort Li, Hui
title Affinity-driven blog cascade analysis and prediction
publishDate 2013
url https://hdl.handle.net/10356/98213
http://hdl.handle.net/10220/17314
_version_ 1681057394101059584
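
The description in this record reports results as an F1-measure, the harmonic mean of precision and recall. As a reminder of what that figure summarizes, here is a minimal sketch of the computation for one cascade. The function name `f1_measure`, the blog identifiers, and the toy sets are hypothetical and chosen only for illustration; they are not taken from the paper or its dataset.

```python
def f1_measure(predicted, actual):
    """Return (precision, recall, F1) for two collections of blog identifiers.

    predicted: blogs a model says will join the cascade (hypothetical input)
    actual:    blogs that really joined the cascade (hypothetical input)
    """
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    if true_positives == 0:
        # No correct predictions: all three scores are defined as zero here.
        return 0.0, 0.0, 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(actual)
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy example with made-up blog identifiers (not data from the paper).
predicted_joiners = {"blog_a", "blog_b", "blog_c", "blog_d"}
actual_joiners = {"blog_b", "blog_c", "blog_e"}

precision, recall, f1 = f1_measure(predicted_joiners, actual_joiners)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

Because the harmonic mean is pulled toward the smaller of the two values, a predictor cannot score well on F1 by maximizing only precision or only recall.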