INDONESIAN SEMANTIC ROLE LABELING FOR SINGLE DOCUMENT SUMMARIZATION
Main Author: | |
---|---|
Format: | Final Project |
Language: | Indonesian |
Online Access: | https://digilib.itb.ac.id/gdl/view/65820 |
Institution: | Institut Teknologi Bandung |
Summary: | Existing semantic role labeling (SRL) for Indonesian is limited: it labels the arguments
of only one predicate per sentence and can identify only single-word predicates. The existing
SRL corpus is also of poor quality, which confuses annotators who want to extend and
validate it. These limitations degrade the performance of SRL-based automatic summarization
of Indonesian news articles. This final project therefore develops a span-based SRL model
with biaffine scoring that removes these limitations, and applies it to an automatic
summarization system for Indonesian news articles.
The SRL model is span-based and can label argument spans for multiple predicates within a
single output structure. It also accepts a span as a predicate, so it can identify predicates
consisting of more than one word. The model uses biaffine scoring to score labeled
argument-predicate pairs. Corpus construction began with analyzing and writing labeling
guidelines for 200 predicates; the resulting SRL corpus consists of 3681 sentences. The SRL
model is then used in the automatic summarization system for news articles.
Experiments were carried out to determine the SRL model configuration that produces the best
labeling and to determine the most suitable summarization alternative once the limitations of
the SRL model had been removed. The best SRL model achieved F1 scores of 0.798 and 0.698
on test data 1 and test data 2, respectively, and the best summarization system configuration
achieved ROUGE-1, ROUGE-2, and ROUGE-L F1 scores of 0.3213, 0.1526, and 0.2967, respectively.
|
---|
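The summary mentions biaffine scoring of labeled argument-predicate pairs without giving a formula. The following is a minimal sketch of one common biaffine form, score = p^T U a + w^T [p; a] + b, computed per label. It is not taken from the thesis: the function names, the label names ARG0 and ARG1, the vector sizes, and the parameter values are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Sequence, Tuple

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]
# Hypothetical per-label parameters: (U, w, b)
LabelParams = Tuple[Matrix, Vector, float]


def biaffine_score(pred: Vector, arg: Vector, U: Matrix, w: Vector, b: float) -> float:
    """Score one (predicate span, argument span) pair for a single label.

    Assumed form: score = pred^T U arg + w^T [pred; arg] + b
    """
    # Bilinear term: pred^T U arg
    bilinear = sum(
        pred[i] * U[i][j] * arg[j]
        for i in range(len(pred))
        for j in range(len(arg))
    )
    # Linear term over the concatenation [pred; arg]
    concat = list(pred) + list(arg)
    linear = sum(wk * xk for wk, xk in zip(w, concat))
    return bilinear + linear + b


def best_label(pred: Vector, arg: Vector, params: Dict[str, LabelParams]) -> Tuple[str, Dict[str, float]]:
    """Return the highest-scoring label and the score of every label."""
    scores = {
        label: biaffine_score(pred, arg, U, w, b)
        for label, (U, w, b) in params.items()
    }
    return max(scores, key=scores.get), scores


# Toy span representations and parameters (illustrative values only).
toy_pred = [1.0, 0.0]
toy_arg = [0.0, 2.0]
toy_params = {
    "ARG0": ([[0.0, 1.0], [0.0, 0.0]], [0.0, 0.0, 0.0, 0.0], 0.0),
    "ARG1": ([[0.0, 0.0], [0.0, 0.0]], [0.5, 0.0, 0.0, 0.25], 0.5),
}
label, scores = best_label(toy_pred, toy_arg, toy_params)
```

In a trained model the span representations and the parameters U, w, and b would be learned. Because every candidate predicate span is scored against every candidate argument span, a scorer of this kind can label arguments for several predicates in the same sentence.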