Genetic semantic graph approach for multidocument abstractive summarization
The aim of automatic multi-document abstractive summarization is to create a compressed version of the source text that preserves the salient information. Existing graph-based summarization methods treat a sentence as a bag of words, rely on a content similarity measure, and do not consider semantic relati...
Main Authors: | Khan, Atif; Salim, Naomie; Kumar, Yogan Jaya |
---|---|
Format: | Conference or Workshop Item |
Published: | 2015 |
Subjects: | |
Online Access: | http://eprints.utm.my/id/eprint/61392/ http://icdipc2015.sdiwc.us/ |
Institution: | Universiti Teknologi Malaysia |
id |
my.utm.61392 |
---|---|
record_format |
eprints |
spelling |
my.utm.61392 2017-08-02T01:09:32Z http://eprints.utm.my/id/eprint/61392/ Genetic semantic graph approach for multidocument abstractive summarization. Khan, Atif; Salim, Naomie; Kumar, Yogan Jaya. QA Mathematics. 2015. Conference or Workshop Item. PeerReviewed. Khan, Atif and Salim, Naomie and Kumar, Yogan Jaya (2015) Genetic semantic graph approach for multidocument abstractive summarization. In: Digital Information Processing and Communications (ICDIPC), 2015 Fifth International Conference, 7-9 Oct, 2015, Switzerland.
http://icdipc2015.sdiwc.us/ |
institution |
Universiti Teknologi Malaysia |
building |
UTM Library |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Teknologi Malaysia |
content_source |
UTM Institutional Repository |
url_provider |
http://eprints.utm.my/ |
topic |
QA Mathematics |
description |
The aim of automatic multi-document abstractive summarization is to create a compressed version of the source text that preserves the salient information. Existing graph-based summarization methods treat a sentence as a bag of words, rely on a content similarity measure, and do not consider semantic relationships between sentences. These methods may fail to identify redundant sentences that are semantically equivalent. This paper introduces a genetic semantic graph-based approach for multi-document abstractive summarization. A semantic graph is constructed from the document set such that the graph nodes represent the predicate argument structures (PASs), extracted automatically by semantic role labeling (SRL), and the graph edges carry a semantic similarity weight determined from PAS-to-PAS semantic similarity and the PAS-to-document set relationship. The PAS-to-document set relationship is represented by different features, which are weighted and optimized by a genetic algorithm. The salient graph nodes (PASs) are ranked using a modified graph-based ranking algorithm. To reduce redundancy, we use maximal marginal relevance (MMR) to re-rank the PASs and use language generation to generate summary sentences from the top-ranked PASs. The experiments are carried out on DUC-2002, a standard corpus for text summarization. Experimental results show that the proposed approach performs better than other summarization systems. |
format |
Conference or Workshop Item |
author |
Khan, Atif Salim, Naomie Kumar, Yogan Jaya |
author_facet |
Khan, Atif Salim, Naomie Kumar, Yogan Jaya |
author_sort |
Khan, Atif |
title |
Genetic semantic graph approach for multidocument abstractive summarization |
publishDate |
2015 |
url |
http://eprints.utm.my/id/eprint/61392/ http://icdipc2015.sdiwc.us/ |
_version_ |
1643655157269921792 |
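
Note on the description: the MMR re-ranking step it mentions can be illustrated with a minimal sketch of the standard greedy maximal marginal relevance procedure. This is not the authors' implementation. The function name `mmr_rerank`, the trade-off parameter `lam`, and the toy scores and similarity matrix are assumptions made for illustration; in the paper's setting the relevance scores would presumably come from the graph ranking and the similarities from the PAS-to-PAS semantic similarity.

```python
from typing import List, Optional, Sequence


def mmr_rerank(
    scores: Sequence[float],
    similarity: Sequence[Sequence[float]],
    lam: float = 0.7,
    k: Optional[int] = None,
) -> List[int]:
    """Greedy MMR re-ranking (illustrative sketch, not the paper's code).

    scores[i]        -- relevance of item i (e.g. a graph ranking score)
    similarity[i][j] -- pairwise similarity between items i and j
    lam              -- hypothetical trade-off: 1.0 = relevance only
    k                -- number of items to return (default: all)
    """
    n = len(scores)
    k = n if k is None else min(k, n)
    selected: List[int] = []
    remaining = list(range(n))

    while remaining and len(selected) < k:
        def mmr_value(i: int) -> float:
            # Redundancy is the highest similarity to anything already chosen.
            redundancy = max((similarity[i][j] for j in selected), default=0.0)
            return lam * scores[i] - (1.0 - lam) * redundancy

        best = max(remaining, key=mmr_value)
        selected.append(best)
        remaining.remove(best)

    return selected


# Toy data (assumed): items 0 and 1 are near-duplicates, item 2 is distinct.
toy_scores = [0.9, 0.85, 0.5]
toy_similarity = [
    [1.0, 0.95, 0.1],
    [0.95, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]
order = mmr_rerank(toy_scores, toy_similarity, lam=0.5)
```

With `lam=0.5` the near-duplicate item 1 is pushed behind the less relevant but distinct item 2, which is the redundancy-reducing behaviour the description attributes to MMR; with `lam=1.0` the ordering reduces to plain relevance ranking.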