Hybrid differential evolution based automatic single document text summarization
Automatic single document text summarization is a process of condensing an input text document. In this process, a summary extraction approach summarizes a document by extracting the most informative sentences in a document. To select such sentences, a sentence scoring approach is used to assign a s...
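The extraction step described in the abstract (score each sentence, rank the sentences, keep the top-ranked ones up to a user-defined summary ratio) can be illustrated with a minimal Python sketch. The three features (position, length, title overlap), the naive sentence splitter, and the names `split_sentences`, `sentence_features` and `summarize` are assumptions made for illustration only; the record does not list the features or code used in the thesis.

```python
import math
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def sentence_features(sentence, index, total, title_words):
    """Return three hypothetical features, each scaled to [0, 1]."""
    words = re.findall(r"[a-z0-9]+", sentence.lower())
    position = 1.0 - index / total          # earlier sentences score higher
    length = min(len(words) / 20.0, 1.0)    # saturates at 20 words
    if title_words:
        overlap = len(set(words) & title_words) / len(title_words)
    else:
        overlap = 0.0
    return [position, length, overlap]


def summarize(text, title, weights, ratio=0.2):
    """Score each sentence as a weighted feature sum and keep the top ones."""
    sentences = split_sentences(text)
    if not sentences:
        return []
    title_words = set(re.findall(r"[a-z0-9]+", title.lower()))
    scores = []
    for i, sentence in enumerate(sentences):
        feats = sentence_features(sentence, i, len(sentences), title_words)
        scores.append(sum(w * f for w, f in zip(weights, feats)))
    # Number of sentences to keep, derived from the summary ratio.
    keep = max(1, math.ceil(ratio * len(sentences)))
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    # Return the selected sentences in their original document order.
    return [sentences[i] for i in sorted(ranked[:keep])]
```

In this sketch the `weights` argument plays the role of the feature weights that, according to the abstract, the DE algorithm is used to tune.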
Main Author: | Mohammed Ali Abuobieda, Albaraa Abuobieda |
---|---|
Format: | Thesis |
Language: | English |
Published: | 2013 |
Subjects: | QA75 Electronic computers. Computer science |
Online Access: | http://eprints.utm.my/id/eprint/38967/5/AlbaraaAbuobiedaPFSKSM2013.pdf http://eprints.utm.my/id/eprint/38967/ |
Institution: | Universiti Teknologi Malaysia |
id |
my.utm.38967 |
---|---|
record_format |
eprints |
spelling |
my.utm.38967 2017-06-22T02:47:20Z http://eprints.utm.my/id/eprint/38967/ Hybrid differential evolution based automatic single document text summarization Mohammed Ali Abuobieda, Albaraa Abuobieda QA75 Electronic computers. Computer science Automatic single document text summarization is the process of condensing an input text document. In this process, an extractive approach summarizes a document by extracting its most informative sentences. To select such sentences, a sentence scoring approach assigns a score to each input sentence and ranks the sentences accordingly. Based on a user-defined summary ratio, only the top-ranked sentences are selected to be part of the summary. Selecting the most informative sentences remains a challenge for researchers in extractive automatic text summarization. Thus, this research proposed extraction-based automatic single document text summarization methods by investigating a single meta-heuristic evolutionary algorithm, Differential Evolution (DE), to generate high-quality summaries. The DE algorithm is used (i) to find the best feature weight scores to discriminate between important and unimportant features, (ii) to act as a clustering method, using the Normalized Google Distance and Jaccard similarity measures, to generate a highly diverse summary, (iii) to employ an opposition-based learning (OBL) approach to improve the performance of the DE algorithm, and (iv) to develop a hybrid model that investigates the advantages of combining the feature weighting, diversity and OBL approaches. To evaluate the proposed methods, the standard dataset from the Document Understanding Conference (DUC) 2002 and the Recall-Oriented Understudy for Gisting Evaluation (ROUGE), the standard evaluation toolkit, were used. Experimental results showed that the hybrid models, as well as all the proposed individual methods, performed well for text summarization compared with four benchmark summarizers (Microsoft Word, Copernic, and the best and worst DUC 2002 systems) and with a human summarizer evaluated against another human summarizer. In addition, the proposed DE-based methods outperformed evolutionary methods based on the Genetic Algorithm and on fuzzy swarm diversity. The experimental results show that the proposed hybrid models generate better-quality text summaries. 2013-09 Thesis NonPeerReviewed application/pdf en http://eprints.utm.my/id/eprint/38967/5/AlbaraaAbuobiedaPFSKSM2013.pdf Mohammed Ali Abuobieda, Albaraa Abuobieda (2013) Hybrid differential evolution based automatic single document text summarization. PhD thesis, Universiti Teknologi Malaysia, Faculty of Computing. |
institution |
Universiti Teknologi Malaysia |
building |
UTM Library |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Teknologi Malaysia |
content_source |
UTM Institutional Repository |
url_provider |
http://eprints.utm.my/ |
language |
English |
topic |
QA75 Electronic computers. Computer science |
spellingShingle |
QA75 Electronic computers. Computer science Mohammed Ali Abuobieda, Albaraa Abuobieda Hybrid differential evolution based automatic single document text summarization |
description |
Automatic single document text summarization is the process of condensing an input text document. In this process, an extractive approach summarizes a document by extracting its most informative sentences. To select such sentences, a sentence scoring approach assigns a score to each input sentence and ranks the sentences accordingly. Based on a user-defined summary ratio, only the top-ranked sentences are selected to be part of the summary. Selecting the most informative sentences remains a challenge for researchers in extractive automatic text summarization. Thus, this research proposed extraction-based automatic single document text summarization methods by investigating a single meta-heuristic evolutionary algorithm, Differential Evolution (DE), to generate high-quality summaries. The DE algorithm is used (i) to find the best feature weight scores to discriminate between important and unimportant features, (ii) to act as a clustering method, using the Normalized Google Distance and Jaccard similarity measures, to generate a highly diverse summary, (iii) to employ an opposition-based learning (OBL) approach to improve the performance of the DE algorithm, and (iv) to develop a hybrid model that investigates the advantages of combining the feature weighting, diversity and OBL approaches. To evaluate the proposed methods, the standard dataset from the Document Understanding Conference (DUC) 2002 and the Recall-Oriented Understudy for Gisting Evaluation (ROUGE), the standard evaluation toolkit, were used. Experimental results showed that the hybrid models, as well as all the proposed individual methods, performed well for text summarization compared with four benchmark summarizers (Microsoft Word, Copernic, and the best and worst DUC 2002 systems) and with a human summarizer evaluated against another human summarizer. In addition, the proposed DE-based methods outperformed evolutionary methods based on the Genetic Algorithm and on fuzzy swarm diversity. The experimental results show that the proposed hybrid models generate better-quality text summaries. |
format |
Thesis |
author |
Mohammed Ali Abuobieda, Albaraa Abuobieda |
author_facet |
Mohammed Ali Abuobieda, Albaraa Abuobieda |
author_sort |
Mohammed Ali Abuobieda, Albaraa Abuobieda |
title |
Hybrid differential evolution based automatic single document text summarization |
title_short |
Hybrid differential evolution based automatic single document text summarization |
title_full |
Hybrid differential evolution based automatic single document text summarization |
title_fullStr |
Hybrid differential evolution based automatic single document text summarization |
title_full_unstemmed |
Hybrid differential evolution based automatic single document text summarization |
title_sort |
hybrid differential evolution based automatic single document text summarization |
publishDate |
2013 |
url |
http://eprints.utm.my/id/eprint/38967/5/AlbaraaAbuobiedaPFSKSM2013.pdf http://eprints.utm.my/id/eprint/38967/ |
_version_ |
1643650289050320896 |
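
The abstract describes using Differential Evolution (DE) with opposition-based learning (OBL) to search for feature weights. The Python sketch below shows one common form of that idea under stated assumptions: the classic DE/rand/1/bin scheme, OBL applied only at initialization, and a toy fitness function that is minimized. The thesis may use a different DE variant and a ROUGE-based fitness; the names `opposite`, `differential_evolution`, `toy_fitness` and `target` are hypothetical, as are all parameter values.

```python
import random


def opposite(x, low, high):
    """Opposition-based learning: reflect each coordinate within [low, high]."""
    return [low + high - xi for xi in x]


def differential_evolution(fitness, dim, low, high, pop_size=20,
                           f=0.5, cr=0.9, generations=100, seed=0):
    """Minimize `fitness` with DE/rand/1/bin and opposition-based initialization."""
    rng = random.Random(seed)
    # Draw random points, add their opposites, keep the pop_size fittest.
    randoms = [[rng.uniform(low, high) for _ in range(dim)]
               for _ in range(pop_size)]
    candidates = randoms + [opposite(x, low, high) for x in randoms]
    pop = sorted(candidates, key=fitness)[:pop_size]
    scores = [fitness(x) for x in pop]

    for _ in range(generations):
        for i in range(pop_size):
            # Mutation uses three distinct individuals other than i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees one mutated coordinate
            trial = []
            for j in range(dim):
                if rng.random() < cr or j == j_rand:
                    v = pop[a][j] + f * (pop[b][j] - pop[c][j])
                    v = min(max(v, low), high)  # clamp to the search bounds
                else:
                    v = pop[i][j]
                trial.append(v)
            # Greedy selection: keep the trial only if it is no worse.
            trial_score = fitness(trial)
            if trial_score <= scores[i]:
                pop[i], scores[i] = trial, trial_score

    best = min(range(pop_size), key=lambda i: scores[i])
    return pop[best], scores[best]


# Toy stand-in for a summary-quality objective (an assumption, not ROUGE):
# squared distance to an arbitrary target weight vector.
target = [0.2, 0.5, 0.8]


def toy_fitness(weights):
    return sum((w - t) ** 2 for w, t in zip(weights, target))


best_weights, best_score = differential_evolution(
    toy_fitness, dim=3, low=0.0, high=1.0, generations=200)
```

In a summarization setting, the fitness function would instead score the summary produced by a candidate weight vector, for example with a ROUGE measure negated so that lower values are better.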