Hybrid differential evolution based automatic single document text summarization

Automatic single document text summarization is a process of condensing an input text document. In this process, a summary extraction approach summarizes a document by extracting the most informative sentences in a document. To select such sentences, a sentence scoring approach is used to assign a s...

Bibliographic Details
Main Author: Mohammed Ali Abuobieda, Albaraa Abuobieda
Format: Thesis
Language: English
Published: 2013
Online Access: http://eprints.utm.my/id/eprint/38967/5/AlbaraaAbuobiedaPFSKSM2013.pdf
http://eprints.utm.my/id/eprint/38967/
Institution: Universiti Teknologi Malaysia
id my.utm.38967
record_format eprints
spelling my.utm.38967 2017-06-22T02:47:20Z http://eprints.utm.my/id/eprint/38967/ Hybrid differential evolution based automatic single document text summarization Mohammed Ali Abuobieda, Albaraa Abuobieda QA75 Electronic computers. Computer science Automatic single-document text summarization is the process of condensing an input text document. In this process, an extractive approach summarizes a document by extracting its most informative sentences. To select such sentences, a sentence-scoring approach assigns a score to each input sentence and ranks the sentences accordingly. Based on a user-defined summary ratio, only the top-ranked sentences are selected for the summary. Selecting the most informative sentences remains a challenge for researchers in extractive automatic text summarization. Thus, this research proposed extraction-based automatic single-document text summarization methods by investigating a single meta-heuristic evolutionary algorithm, Differential Evolution (DE), to generate high-quality summaries. The DE algorithm is used (i) to find the best feature weights for discriminating between important and unimportant features, (ii) to act as a clustering method, using the Normalized Google Distance and Jaccard similarity measures, to generate a highly diverse summary, (iii) to employ an opposition-based learning (OBL) approach to improve the performance of the DE algorithm, and (iv) to develop a hybrid model for investigating the advantages of combining the feature weighting, diversity, and OBL approaches. To evaluate the proposed methods, the standard dataset from the Document Understanding Conference (DUC) 2002 and the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) toolkit, the standard evaluation measure, were used. Experimental results showed that the hybrid models, as well as all the proposed individual methods, performed well for text summarization compared with four benchmark methods (Microsoft Word, Copernic, and the best and worst DUC 2002 summarizers) and with a human-against-human summarizer comparison. In addition, the proposed DE-based methods outperformed other evolutionary methods based on the Genetic Algorithm and on fuzzy swarm diversity. The experimental results showed that the proposed hybrid models generate better-quality text summaries. 2013-09 Thesis NonPeerReviewed application/pdf en http://eprints.utm.my/id/eprint/38967/5/AlbaraaAbuobiedaPFSKSM2013.pdf Mohammed Ali Abuobieda, Albaraa Abuobieda (2013) Hybrid differential evolution based automatic single document text summarization. PhD thesis, Universiti Teknologi Malaysia, Faculty of Computing.
institution Universiti Teknologi Malaysia
building UTM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Malaysia
content_source UTM Institutional Repository
url_provider http://eprints.utm.my/
language English
topic QA75 Electronic computers. Computer science
format Thesis
author Mohammed Ali Abuobieda, Albaraa Abuobieda
title Hybrid differential evolution based automatic single document text summarization
publishDate 2013
url http://eprints.utm.my/id/eprint/38967/5/AlbaraaAbuobiedaPFSKSM2013.pdf
http://eprints.utm.my/id/eprint/38967/
_version_ 1643650289050320896