Topic-driven reader comments summarization
Readers of a news article often read the comments contributed by other readers. From comments, readers obtain not only information complementary to the news article but also the opinions of other readers. However, existing ranking mechanisms for comments (e.g., by recency or by user rating) fail to offer an overall picture of the topics discussed in the comments. In this paper, we propose to study the Topic-driven Reader Comments Summarization (Torcs) problem. We observe that many news articles in a news stream are related to each other, and so are their comments; hence, news articles and their associated comments provide context information for user commenting. To implicitly capture this context information, we propose two topic models to address the Torcs problem: the Master-Slave Topic Model (MSTM) and the Extended Master-Slave Topic Model (EXTM). Both models treat a news article as a master document and each of its comments as a slave document. MSTM constrains the topics discussed in comments to be derived from the commented news article, whereas EXTM allows the words of comments to be generated both from topics derived from the commented news article and from topics derived from the comments themselves. Both models are used to group comments into topic clusters. We then use two ranking mechanisms, Maximal Marginal Relevance (MMR) and Rating & Length (RL), to select the most representative comments from each cluster. To evaluate the two models, we conducted experiments on 1,005 Yahoo! News articles with more than one million comments. Our experimental results show that EXTM significantly outperforms MSTM in perplexity. Through a user study, we also confirm that the comment summaries generated by EXTM achieve better intra-cluster topic cohesion and inter-cluster topic diversity.
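The selection step named in the abstract, Maximal Marginal Relevance over each topic cluster, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the Jaccard word-overlap similarity and the lambda trade-off value are assumptions made for the sketch.

```python
# Sketch of MMR-based comment selection from one topic cluster.
# Assumptions (not from the paper): Jaccard word overlap as the
# similarity function, and an illustrative lambda weight.

def jaccard(a, b):
    """Word-overlap similarity between two texts (illustrative choice)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def mmr_select(comments, topic_repr, k=3, lam=0.7):
    """Pick k comments that are relevant to the cluster's topic
    representation yet not redundant with comments already selected."""
    selected = []
    candidates = list(comments)
    while candidates and len(selected) < k:
        def score(c):
            relevance = jaccard(c, topic_repr)
            redundancy = max((jaccard(c, s) for s in selected), default=0.0)
            return lam * relevance - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

Lowering `lam` penalizes redundancy more heavily, so a near-duplicate of an already-selected comment loses to a less relevant but novel one.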
Saved in:
Main Authors: | Ma, Zongyang; Sun, Aixin; Yuan, Quan; Cong, Gao |
---|---|
Other Authors: | School of Computer Engineering |
Format: | Conference or Workshop Item |
Language: | English |
Published: | 2013 |
Online Access: | https://hdl.handle.net/10356/97966 http://hdl.handle.net/10220/12257 |
Institution: | Nanyang Technological University |
id | sg-ntu-dr.10356-97966 |
---|---|
record_format | dspace |
conference | 21st ACM International Conference on Information and Knowledge Management (2012, Maui, USA) |
authors | Ma, Zongyang; Sun, Aixin; Yuan, Quan; Cong, Gao (School of Computer Engineering) |
citation | Ma, Z., Sun, A., Yuan, Q., & Cong, G. (2012). Topic-driven reader comments summarization. Proceedings of the 21st ACM International Conference on Information and Knowledge Management. |
doi | 10.1145/2396761.2396798 |
online_access | https://hdl.handle.net/10356/97966 http://hdl.handle.net/10220/12257 |
rights | © 2012 ACM. |
building | NTU Library |
country | Singapore |
collection | DR-NTU |