Predicting the popularity of Web 2.0 items based on user comments

Bibliographic Details
Main Authors: HE, Xiangnan, GAO, Ming, KAN, Min-Yen, LIU, Yiqun, SUGIYAMA, Kazunari
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2014
Online Access: https://ink.library.smu.edu.sg/sis_research/4228
https://ink.library.smu.edu.sg/context/sis_research/article/5231/viewcontent/sigir2014_he.pdf
Institution: Singapore Management University
Description
Summary: In the current Web 2.0 era, the popularity of Web resources fluctuates ephemerally, based on trends and social interest. As a result, content-based relevance signals are insufficient to meet users' constantly evolving information needs in searching for Web 2.0 items. Incorporating future popularity into ranking is one way to counter this. However, predicting popularity as a third party (as in the case of general search engines) is difficult in practice, because third parties have limited access to item view histories. To enable popularity prediction externally without excessive crawling, we propose an alternative solution that leverages user comments, which are more accessible than view counts. Due to the sparsity of comments, traditional solutions that are solely based on view histories do not perform well. To deal with this sparsity, we mine comments to recover additional signals, such as social influence. By modeling comments as a time-aware bipartite graph, we propose a regularization-based ranking algorithm that accounts for temporal, social influence and current popularity factors to predict the future popularity of items. Experimental results on three real-world datasets, crawled from YouTube, Flickr and Last.fm, show that our method consistently outperforms competitive baselines in several evaluation tasks.
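The summary does not give the algorithm's details, so the following is only a rough illustration of the general idea, not the authors' method. It assumes a generic graph-regularization scheme: comments become user-item edges weighted by an exponential time decay, and item scores are smoothed over the graph while being pulled toward a prior that stands in for current popularity. The function name `rank_items`, the parameters `half_life` and `alpha`, and the decay form are all hypothetical choices made for this sketch.

```python
import math
from collections import defaultdict


def rank_items(comments, now, prior, half_life=7.0, alpha=0.85, iters=50):
    """Rank items by a smoothed score (illustrative sketch only).

    comments: iterable of (user, item, timestamp), timestamps in days
    prior:    dict item -> current popularity estimate (assumed given)
    """
    # Time-aware edge weights: a comment loses half its weight
    # every `half_life` days (assumed decay form).
    w = defaultdict(float)
    for user, item, t in comments:
        w[(user, item)] += 0.5 ** ((now - t) / half_life)

    # Weighted degrees, used for symmetric normalisation of the graph.
    du, di = defaultdict(float), defaultdict(float)
    for (u, i), x in w.items():
        du[u] += x
        di[i] += x

    # Normalise the prior so that it sums to 1.
    items = set(prior) | set(di)
    total = sum(prior.get(i, 0.0) for i in items) or 1.0
    p0 = {i: prior.get(i, 0.0) / total for i in items}

    score = dict(p0)
    for _ in range(iters):
        # Items -> users: a user's score is a crude proxy for influence.
        user_score = defaultdict(float)
        for (u, i), x in w.items():
            user_score[u] += x / math.sqrt(du[u] * di[i]) * score[i]
        # Users -> items, blended with the prior (the regularization pull).
        new = {i: (1 - alpha) * p0[i] for i in items}
        for (u, i), x in w.items():
            new[i] += alpha * x / math.sqrt(du[u] * di[i]) * user_score[u]
        score = new

    return sorted(items, key=lambda i: score[i], reverse=True)
```

With equal priors, an item that draws more recent comments from users connected to the rest of the graph ends up ranked higher, while an old comment contributes almost nothing. Because of the symmetric normalisation, items in disconnected parts of the graph are distinguished mainly by their prior, which is one reason a real system would need a more careful formulation than this sketch.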