Automatic multi document summarization approaches

Problem statement: Text summarization can be of different nature ranging from indicative summary that identifies the topics of the document to informative summary which is meant to represent the concise description of the original document, providing an idea of what the whole content of document is...

Full description

Saved in:
Bibliographic Details
Main Authors: Kumar, Yogan Jaya, Salim, Naomie
Format: Article
Published: 2012
Subjects:
Online Access:http://eprints.utm.my/id/eprint/46633/
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Universiti Teknologi Malaysia
Description
Summary:Problem statement: Text summarization can be of different nature ranging from indicative summary that identifies the topics of the document to informative summary which is meant to represent the concise description of the original document, providing an idea of what the whole content of document is all about. Approach: Single document summary seems to capture both the information well but it has not been the case for multi document summary where the overall comprehensive quality in presenting informative summary often lacks. It is found that most of the existing methods tend to focus on sentence scoring and less consideration is given to the contextual information content in multiple documents. Results: In this study, some survey on multi document summarization approaches has been presented. We will direct our focus notably on four well known approaches to multi document summarization namely the feature based method, cluster based method, graph based method and knowledge based method. The general ideas behind these methods have been described. Conclusion: Besides the general idea and concept, we discuss the benefits and limitations concerning these methods. With the aim of enhancing multi document summarization, specifically news documents, a novel type of approach is outlined to be developed in the future, taking into account the generic components of a news story in order to generate a better summary.