Swarm based features selection for text summarization

Bibliographic Details
Main Authors: Binwahlan, Mohammed Salem, Salim, Naomie, Suanmali, Ladda
Format: Article
Language: English
Published: International Journal of Computer Science and Network Security 2009
Online Access: http://eprints.utm.my/id/eprint/11825/1/NaomieSalim2009_SwarmBasedFeaturesSelectionFor.pdf
http://eprints.utm.my/id/eprint/11825/
http://paper.ijcsns.org/07_book/200901/20090125.pdf
Institution: Universiti Teknologi Malaysia
Description
Summary: Features are the main inputs to text summarization, and treating all features equally leads to poor summaries. In this paper, we investigate the effect of feature structure on feature selection using particle swarm optimization (PSO). The PSO is trained on DUC 2002 data to learn the weight of each feature. The features differ in structure: some are formed as a combination of more than one feature, while others are simple (individual) features. Determining the effectiveness of each type of feature could therefore lead to a mechanism for differentiating between features of high importance and those of low importance. We assume that combined features have a higher priority for selection than simple features. In each iteration, the PSO selects some features; the corresponding weights of those features are then used to score the sentences, and the top-ranking sentences are selected as the summary. The selected features of each best summary are used in calculating the final feature weights. The experimental results show that the simple features are less effective than the combined features.
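
The loop described in the summary (select features, score sentences with the weights of the selected features, keep the top-ranking sentences) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a standard binary PSO with a sigmoid update, treats the feature weights as fixed inputs rather than learned values, and uses overlap with a reference summary as a toy stand-in for a real evaluation measure. All function names, parameters, and constants below are hypothetical.

```python
import math
import random

def score_sentences(features, weights, mask):
    """Score each sentence as the weighted sum of its selected features."""
    return [
        sum(w * f for w, f, m in zip(weights, row, mask) if m)
        for row in features
    ]

def top_k(scores, k):
    """Indices of the k highest-scoring sentences, returned in document order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])

def overlap_fitness(selected, reference):
    """Toy fitness (assumption): fraction of reference sentences recovered."""
    return len(set(selected) & set(reference)) / len(reference)

def binary_pso(features, weights, reference, k,
               n_particles=8, n_iter=20, seed=0):
    """Search for a 0/1 feature mask; constants are illustrative only."""
    rng = random.Random(seed)
    n_feat = len(weights)

    def fitness(mask):
        scores = score_sentences(features, weights, mask)
        return overlap_fitness(top_k(scores, k), reference)

    # Each particle position is a bit vector: 1 means the feature is selected.
    pos = [[rng.randint(0, 1) for _ in range(n_feat)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_feat for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(n_feat):
                v = (0.7 * vel[i][d]
                     + 1.5 * rng.random() * (pbest[i][d] - pos[i][d])
                     + 1.5 * rng.random() * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-4.0, min(4.0, v))  # clamp the velocity
                # Sigmoid of the velocity gives the probability of a 1 bit.
                prob = 1.0 / (1.0 + math.exp(-vel[i][d]))
                pos[i][d] = 1 if rng.random() < prob else 0
            fit = fitness(pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit

# Made-up data: 4 sentences, 3 feature values each.
features = [[0.9, 0.1, 0.2], [0.1, 0.8, 0.1], [0.2, 0.2, 0.9], [0.8, 0.1, 0.1]]
weights = [1.0, 0.5, 0.2]
summary = top_k(score_sentences(features, weights, [1, 0, 0]), k=2)  # [0, 3]
best_mask, best_fit = binary_pso(features, weights, reference=[0, 3], k=2)
```

In the paper, the selected features of each best summary feed into the final feature weights. That aggregation step is omitted here because the summary does not specify how it is computed.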