MULTIDOCUMENT ABSTRACTIVE SUMMARIZATION USING ABSTRACT MEANING REPRESENTATION FOR INDONESIAN LANGUAGE
Main Author: | |
---|---|
Format: | Final Project |
Language: | Indonesia |
Online Access: | https://digilib.itb.ac.id/gdl/view/40136 |
Institution: | Institut Teknologi Bandung |
Summary: | Automatic summarization is needed to facilitate the distribution of information. Abstractive
summarization for English using Abstract Meaning Representation (AMR) graphs can
capture predicate structure when combining information, which yields more coherent
summaries. However, this method was designed to summarize a single document
in English. This final project implements AMR-based multi-document summarization for
Indonesian.
The project faces two challenges: representing the source documents as
Indonesian AMR graphs, and adapting the method to multiple documents. The Indonesian
AMR graphs are built with a set of rules and dictionaries. The summary graph is
selected from the AMR graph by performing feature extraction,
applying Integer Linear Programming (ILP), and learning the parameters with a perceptron.
For multi-document summarization, the sentences to be summarized are
selected with Agglomerative Hierarchical Clustering, taking one sentence from each
cluster.
Experiments were carried out to determine the number of epochs for learning the weights
used in graph selection, and the threshold on the distance between cluster members
produced by Agglomerative Hierarchical Clustering. The summaries were evaluated with
ROUGE-1 and ROUGE-2. The highest scores were obtained
with a 5% clustering threshold and 1 epoch. |
---|
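
The sentence selection step described in the summary (Agglomerative Hierarchical Clustering with a distance threshold, then one sentence per cluster) can be illustrated with a minimal sketch. It is not the thesis implementation. The record does not state the sentence features, the distance measure, the linkage criterion, or how the representative sentence is chosen, so the bag-of-words vectors, cosine distance, complete linkage, and "earliest sentence in the cluster" rule below are all assumptions. Reading the reported 5% threshold as a distance of 0.05 is also an assumption. The function names `cosine_distance`, `cluster_sentences`, and `select_representatives` are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def cosine_distance(a: Counter, b: Counter) -> float:
    """Return 1 - cosine similarity of two bag-of-words vectors (assumed measure)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return 1.0 if norm == 0 else 1.0 - dot / norm


def cluster_sentences(sentences, threshold):
    """Complete-linkage agglomerative clustering, stopped at a distance threshold.

    Returns a list of clusters, each a list of sentence indices.
    """
    # Assumption: whitespace-tokenized, lowercased bag-of-words features.
    vectors = [Counter(s.lower().split()) for s in sentences]
    clusters = [[i] for i in range(len(sentences))]

    def linkage(c1, c2):
        # Complete linkage: the largest distance between any two members.
        return max(cosine_distance(vectors[i], vectors[j]) for i in c1 for j in c2)

    while len(clusters) > 1:
        # Find the closest pair of clusters under the linkage criterion.
        (x, y), dist = min(
            (
                ((x, y), linkage(clusters[x], clusters[y]))
                for x, y in combinations(range(len(clusters)), 2)
            ),
            key=lambda item: item[1],
        )
        if dist > threshold:
            break  # no remaining pair is close enough to merge
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]  # safe: combinations() guarantees x < y
    return clusters


def select_representatives(sentences, threshold):
    """Pick one sentence per cluster (assumption: the earliest one in input order)."""
    clusters = cluster_sentences(sentences, threshold)
    return [sentences[min(c)] for c in sorted(clusters, key=min)]


if __name__ == "__main__":
    docs = [
        "the cat sat on the mat",
        "the cat sat on the mat",
        "stock prices fell sharply today",
    ]
    # 0.05 is a hypothetical reading of the "5% threshold" in the summary.
    for sentence in select_representatives(docs, threshold=0.05):
        print(sentence)
```

With a low threshold, only near-duplicate sentences from different source documents are merged, so the selection mainly removes redundancy before the AMR-based summarization runs. A higher threshold merges more loosely related sentences and passes fewer of them on.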