INDONESIAN SEMANTIC ROLE LABELING FOR SINGLE DOCUMENT SUMMARIZATION

Semantic role labeling (SRL) currently available for Indonesian is limited: it labels arguments for only one predicate per sentence and can identify only single-word predicates. The existing SRL corpus is also of poor quality, so that...

Bibliographic Details
Main Author:	Gojali, Felicia
Format:	Final Project
Language:	Indonesian
Online Access:	https://digilib.itb.ac.id/gdl/view/65820
Institution:	Institut Teknologi Bandung

id	id-itb.:65820
spelling	id-itb.:65820 2022-06-25T03:41:31Z INDONESIAN SEMANTIC ROLE LABELING FOR SINGLE DOCUMENT SUMMARIZATION Gojali, Felicia Indonesia Final Project span-based semantic role labeling, SRL labeling guidelines, automatic summary system INSTITUT TEKNOLOGI BANDUNG https://digilib.itb.ac.id/gdl/view/65820
institution	Institut Teknologi Bandung
building	Institut Teknologi Bandung Library
continent	Asia
country	Indonesia
content_provider	Institut Teknologi Bandung
collection	Digital ITB
language	Indonesia
description	Semantic role labeling (SRL) currently available for Indonesian is limited: it labels arguments for only one predicate per sentence and can identify only single-word predicates. The existing SRL corpus is also of poor quality, which confuses annotators who want to extend and validate it. These limitations degrade the performance of SRL-based automatic summarization of Indonesian news articles. This final project therefore develops a span-based SRL model with biaffine scoring that removes these limitations, and applies it to an automatic summarization system for Indonesian news articles. The model is span-based and can label argument spans for multiple predicates in the same output structure. It also accepts a span as a predicate, so it can identify predicates consisting of more than one word. Biaffine scoring is used to compute the scores of predicate-argument pairs and their labels. Construction of the SRL corpus began with analyzing and writing labeling guidelines for 200 predicates; the resulting corpus consists of 3681 sentences. The SRL model was then used in the automatic news summarization system. Experiments were carried out to determine the SRL model configuration that gives the best labeling, and to determine the most suitable summarization alternative once the SRL model's limitations had been lifted. The best SRL model achieved F1 scores of 0.798 and 0.698 on test data 1 and test data 2, and the best summarization configuration achieved ROUGE-1, ROUGE-2, and ROUGE-L F1 scores of 0.3213, 0.1526, and 0.2967, respectively.
format	Final Project
author	Gojali, Felicia
spellingShingle	Gojali, Felicia INDONESIAN SEMANTIC ROLE LABELING FOR SINGLE DOCUMENT SUMMARIZATION
author_facet	Gojali, Felicia
author_sort	Gojali, Felicia
title	INDONESIAN SEMANTIC ROLE LABELING FOR SINGLE DOCUMENT SUMMARIZATION
title_short	INDONESIAN SEMANTIC ROLE LABELING FOR SINGLE DOCUMENT SUMMARIZATION
title_full	INDONESIAN SEMANTIC ROLE LABELING FOR SINGLE DOCUMENT SUMMARIZATION
title_fullStr	INDONESIAN SEMANTIC ROLE LABELING FOR SINGLE DOCUMENT SUMMARIZATION
title_full_unstemmed	INDONESIAN SEMANTIC ROLE LABELING FOR SINGLE DOCUMENT SUMMARIZATION
title_sort	indonesian semantic role labeling for single document summarization
url	https://digilib.itb.ac.id/gdl/view/65820
_version_	1822932861004546048
