Genetic semantic graph approach for multidocument abstractive summarization

The aim of automatic multi-document abstractive summarization is to create a compressed version of the source text that preserves the salient information. Existing graph-based summarization methods treat sentences as bags of words, rely on content similarity measures, and do not consider semantic relati...

Bibliographic Details
Main Authors: Khan, Atif, Salim, Naomie, Kumar, Yogan Jaya
Format: Conference or Workshop Item
Published: 2015
Subjects: QA Mathematics
Online Access: http://eprints.utm.my/id/eprint/61392/
http://icdipc2015.sdiwc.us/
Institution: Universiti Teknologi Malaysia
id my.utm.61392
record_format eprints
spelling my.utm.61392 2017-08-02T01:09:32Z http://eprints.utm.my/id/eprint/61392/ Genetic semantic graph approach for multidocument abstractive summarization Khan, Atif Salim, Naomie Kumar, Yogan Jaya QA Mathematics 2015 Conference or Workshop Item PeerReviewed Khan, Atif and Salim, Naomie and Kumar, Yogan Jaya (2015) Genetic semantic graph approach for multidocument abstractive summarization.
In: Digital Information Processing and Communications (ICDIPC), 2015 Fifth International Conference, 7-9 Oct, 2015, Switzerland. http://icdipc2015.sdiwc.us/
institution Universiti Teknologi Malaysia
building UTM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Malaysia
content_source UTM Institutional Repository
url_provider http://eprints.utm.my/
topic QA Mathematics
spellingShingle QA Mathematics
Khan, Atif
Salim, Naomie
Kumar, Yogan Jaya
Genetic semantic graph approach for multidocument abstractive summarization
description The aim of automatic multi-document abstractive summarization is to create a compressed version of the source text that preserves the salient information. Existing graph-based summarization methods treat sentences as bags of words, rely on content similarity measures, and do not consider semantic relationships between sentences. These methods may therefore fail to detect redundant sentences that are semantically equivalent. This paper introduces a genetic semantic graph-based approach for multi-document abstractive summarization. A semantic graph is constructed from the document set such that the graph nodes represent the predicate argument structures (PASs), extracted automatically by semantic role labeling (SRL), and the graph edges carry semantic similarity weights determined from PAS-to-PAS semantic similarity and the PAS-to-document set relationship. The PAS-to-document set relationship is represented by different features, which are weighted and optimized by a genetic algorithm. The salient graph nodes (PASs) are ranked using a modified graph-based ranking algorithm. To reduce redundancy, maximal marginal relevance (MMR) is used to re-rank the PASs, and language generation is used to generate summary sentences from the top-ranked PASs. The experiments in this study are carried out on DUC-2002, a standard corpus for text summarization. Experimental results reveal that the proposed approach performs better than other summarization systems.
format Conference or Workshop Item
author Khan, Atif
Salim, Naomie
Kumar, Yogan Jaya
author_facet Khan, Atif
Salim, Naomie
Kumar, Yogan Jaya
author_sort Khan, Atif
title Genetic semantic graph approach for multidocument abstractive summarization
title_short Genetic semantic graph approach for multidocument abstractive summarization
title_full Genetic semantic graph approach for multidocument abstractive summarization
title_fullStr Genetic semantic graph approach for multidocument abstractive summarization
title_full_unstemmed Genetic semantic graph approach for multidocument abstractive summarization
title_sort genetic semantic graph approach for multidocument abstractive summarization
publishDate 2015
url http://eprints.utm.my/id/eprint/61392/
http://icdipc2015.sdiwc.us/
_version_ 1643655157269921792
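
Illustrative note on the description field: the abstract names maximal marginal relevance (MMR) as the redundancy-reduction step but does not give its formulation. The sketch below shows the standard greedy MMR selection on toy numbers only. The function name `mmr_rerank`, the trade-off parameter `lam`, and the example scores are assumptions made for illustration; they are not taken from the paper, and the paper's own PAS features and similarity measure are not reproduced here.

```python
from typing import List, Sequence


def mmr_rerank(
    relevance: Sequence[float],
    similarity: Sequence[Sequence[float]],
    k: int,
    lam: float = 0.7,
) -> List[int]:
    """Greedily pick up to k item indices by maximal marginal relevance.

    relevance[i]     : salience score of item i (hypothetical, e.g. a graph rank)
    similarity[i][j] : pairwise similarity between items i and j, in [0, 1]
    lam              : trade-off; 1.0 = relevance only, 0.0 = diversity only
    """
    candidates = list(range(len(relevance)))
    selected: List[int] = []

    while candidates and len(selected) < k:

        def mmr_score(i: int) -> float:
            # Redundancy is the highest similarity to anything already chosen.
            redundancy = max((similarity[i][j] for j in selected), default=0.0)
            return lam * relevance[i] - (1.0 - lam) * redundancy

        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)

    return selected


# Toy data (assumed): items 0 and 1 are near-duplicates, item 2 is distinct.
toy_relevance = [0.9, 0.85, 0.4]
toy_similarity = [
    [1.0, 0.95, 0.1],
    [0.95, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]
print(mmr_rerank(toy_relevance, toy_similarity, k=2, lam=0.5))
```

With `lam=0.5` the second pick skips the near-duplicate item 1 in favour of the less relevant but distinct item 2; with `lam=1.0` the selection falls back to plain relevance order.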