Comments-oriented document summarization: Understanding documents with readers' feedback

Comments left by readers on Web documents contain valuable information that can be utilized in different information retrieval tasks including document search, visualization, and summarization. In this paper, we study the problem of comments-oriented document summarization and aim to summarize a Web...

Bibliographic Details
Main Authors: HU, Meishan, SUN, Aixin, LIM, Ee Peng
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2008
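Subjects: Blog; Comments; Document summarization; Graph-based scoring; Tensor-based scoring; Databases and Information Systems; Numerical Analysis and Scientific Computing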
Online Access: https://ink.library.smu.edu.sg/sis_research/330
https://ink.library.smu.edu.sg/context/sis_research/article/1329/viewcontent/sun_sigir08.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-1329
record_format dspace
spelling sg-smu-ink.sis_research-1329 2018-06-22T03:28:11Z
2008-07-01T07:00:00Z application/pdf info:doi/10.1145/1390334.1390385
http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Blog
Comments
Document summarization
Graph-based scoring
Tensor-based scoring
Databases and Information Systems
Numerical Analysis and Scientific Computing
description Comments left by readers on Web documents contain valuable information that can be utilized in different information retrieval tasks, including document search, visualization, and summarization. In this paper, we study the problem of comments-oriented document summarization and aim to summarize a Web document (e.g., a blog post) by considering not only its content but also the comments left by its readers. We identify three relations (namely, topic, quotation, and mention) by which comments can be linked to one another, and model the relations in three graphs. The importance of each comment is then scored by either (i) a graph-based method, where the three graphs are merged into a multi-relation graph, or (ii) a tensor-based method, where the three graphs are used to construct a 3rd-order tensor. To generate a comments-oriented summary, we extract sentences from the given Web document using either a feature-biased approach or a uniform-document approach. The former scores sentences with a bias toward keywords derived from comments, while the latter scores sentences uniformly with comments. In our experiments using a set of blog posts with manually labeled sentences, our proposed summarization methods utilizing comments showed significant improvement over those not using comments. The methods using the feature-biased sentence extraction approach were observed to outperform those using the uniform-document approach.
format text
author HU, Meishan
SUN, Aixin
LIM, Ee Peng
title Comments-oriented document summarization: Understanding documents with readers' feedback
publisher Institutional Knowledge at Singapore Management University
publishDate 2008
url https://ink.library.smu.edu.sg/sis_research/330
https://ink.library.smu.edu.sg/context/sis_research/article/1329/viewcontent/sun_sigir08.pdf
_version_ 1770570388310327296
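
The pipeline outlined in the description (relation graphs over comments, a merged multi-relation graph, comment scoring, then feature-biased sentence extraction) could be sketched roughly as follows. This is an illustration only, not the authors' implementation: the record does not say how the graphs are merged or scored, so the equal relation weights, the PageRank-style power iteration, the stopword list, the length-normalized sentence score, and every function name and sample text below are assumptions made for the example.

```python
import re
from collections import defaultdict

# Assumption: a tiny stopword list so that function words do not dominate scores.
STOPWORDS = {"the", "is", "a", "its", "in", "about", "and", "of", "to"}

def tokenize(text):
    """Lowercase word tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]

def merge_graphs(graphs, relation_weights):
    """Weighted sum of same-sized adjacency matrices (assumed merge rule)."""
    n = len(graphs[0])
    merged = [[0.0] * n for _ in range(n)]
    for graph, weight in zip(graphs, relation_weights):
        for i in range(n):
            for j in range(n):
                merged[i][j] += weight * graph[i][j]
    return merged

def score_comments(adj, damping=0.85, iters=100):
    """PageRank-style power iteration; adj[i][j] > 0 links comment i to j."""
    n = len(adj)
    scores = [1.0 / n] * n
    out_weight = [sum(row) for row in adj]
    for _ in range(iters):
        new = [(1.0 - damping) / n] * n
        for i in range(n):
            for j in range(n):
                if out_weight[i] == 0:
                    # A comment with no outgoing links spreads its score evenly.
                    new[j] += damping * scores[i] / n
                else:
                    new[j] += damping * scores[i] * adj[i][j] / out_weight[i]
        scores = new
    return scores

def keyword_weights(comments, scores):
    """Each word inherits the scores of the comments that contain it."""
    weights = defaultdict(float)
    for text, score in zip(comments, scores):
        for word in set(tokenize(text)):
            weights[word] += score
    return weights

def summarize(sentences, weights, k=1):
    """Keep the k sentences with the highest mean keyword weight, in document order."""
    def sentence_score(sentence):
        words = tokenize(sentence)
        return sum(weights.get(w, 0.0) for w in words) / max(len(words), 1)
    top = set(sorted(sentences, key=sentence_score, reverse=True)[:k])
    return [s for s in sentences if s in top]

# Hypothetical blog post and comments, invented for this example.
comments = ["The battery life is terrible", "Agreed, battery drains fast", "Nice photos"]
topic = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]      # comments 0 and 1 share a topic
quotation = [[0, 0, 0], [1, 0, 0], [0, 0, 0]]  # comment 1 quotes comment 0
mention = [[0, 0, 0], [0, 0, 0], [1, 0, 0]]    # comment 2 mentions comment 0's author
sentences = [
    "The phone ships in three colours.",
    "Its battery lasts about a day.",
    "The camera is decent.",
]

merged = merge_graphs([topic, quotation, mention], [1.0, 1.0, 1.0])
scores = score_comments(merged)
summary = summarize(sentences, keyword_weights(comments, scores), k=1)
print(summary)
```

The tensor-based scoring and the uniform-document extraction named in the description are not covered by this sketch.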