MULTIDOCUMENT ABSTRACTIVE SUMMARIZATION USING ABSTRACT MEANING REPRESENTATION FOR INDONESIAN LANGUAGE

Bibliographic Details
Main Author: Severina, Verena
Format: Final Project
Language: Indonesian
Online Access: https://digilib.itb.ac.id/gdl/view/40136
Institution: Institut Teknologi Bandung
Description
Summary: Automatic summarization is needed to facilitate the distribution of information. For English, abstractive summarization based on Abstract Meaning Representation (AMR) graphs can capture predicate structure when combining information, which produces more coherent summaries. However, that method is designed to summarize a single English document. This final project implements AMR-based summarization of multiple documents in Indonesian. It faces two challenges: representing the source documents as Indonesian AMR graphs, and adapting the method to multiple documents. The Indonesian AMR graph is constructed using a set of rules and dictionaries. The summary graph is selected from the AMR graph by extracting features, applying Integer Linear Programming (ILP), and learning the parameters with a perceptron. For the multi-document setting, the sentences to be summarized are chosen with Agglomerative Hierarchical Clustering, taking one sentence from each cluster. Experiments were carried out to determine the number of epochs for learning the weights used in graph selection, and the threshold on the distance between cluster members produced by Agglomerative Hierarchical Clustering. The results were evaluated with ROUGE-1 and ROUGE-2. The highest scores were obtained with a clustering threshold of 5% and a single epoch.
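
The sentence selection step described in the summary (Agglomerative Hierarchical Clustering with a distance threshold, then one sentence per cluster) can be illustrated with a minimal Python sketch. This is not the thesis implementation. The record does not state the distance measure, the linkage criterion, or how the sentence is picked within a cluster, so the word-set Jaccard distance, the average linkage, and the medoid rule below are all assumptions, as are the function names.

```python
from itertools import combinations


def jaccard_distance(a: str, b: str) -> float:
    """1 - |A ∩ B| / |A ∪ B| over lowercase word sets (assumed distance measure)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 0.0
    return 1.0 - len(sa & sb) / len(sa | sb)


def average_linkage(c1, c2, sentences):
    """Mean pairwise distance between the members of two clusters (assumed linkage)."""
    total = sum(jaccard_distance(sentences[i], sentences[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def cluster_sentences(sentences, threshold):
    """Bottom-up clustering: merge the closest pair of clusters until the
    closest pair is farther apart than the threshold."""
    clusters = [[i] for i in range(len(sentences))]
    while len(clusters) > 1:
        (a, b), dist = min(
            (((x, y), average_linkage(clusters[x], clusters[y], sentences))
             for x, y in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if dist > threshold:
            break
        # a < b always holds here, so deleting index b leaves index a valid.
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


def select_representatives(sentences, threshold):
    """Pick one sentence per cluster: the medoid, i.e. the member with the
    smallest total distance to the others (assumed selection rule)."""
    chosen = []
    for cluster in cluster_sentences(sentences, threshold):
        medoid = min(
            cluster,
            key=lambda i: sum(jaccard_distance(sentences[i], sentences[j]) for j in cluster),
        )
        chosen.append(sentences[medoid])
    return chosen
```

Under these assumptions, a low threshold such as 0.05 merges only near-duplicate sentences, so most sentences survive into the summarization step, while a higher threshold collapses more of them into shared clusters. Whether the 5% reported in the summary corresponds to 0.05 on a normalized distance of this kind is not stated in the record.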