CROSS-LINGUAL TRANSFER FOR SEMANTIC ROLE LABELING IN INDONESIAN

Bibliographic Details
Main Author: Haqi, Bariza
Format: Final Project
Language: Indonesian
Online Access: https://digilib.itb.ac.id/gdl/view/82458
Institution: Institut Teknologi Bandung
Description
Summary: Semantic role labeling (SRL) is a semantic analysis task that identifies the semantic relationships in a sentence: who did what to whom, where, when, and so on. Existing SRL models for Indonesian still perform poorly because far less annotated training data is available for Indonesian than for English. This thesis therefore develops an Indonesian SRL model using cross-lingual transfer, which compensates for the small Indonesian annotated corpus by also drawing on the much larger English annotated corpus. The method requires a multilingual model and datasets in two languages from the same domain. The multilingual models used are XLM-R and mT5, each in base and large sizes. The Indonesian data come from Universal PropBank and Gojali's data, and the English data come from CoNLL-2012. The resulting SRL models were evaluated on test data from Universal PropBank Indonesia and Gojali's data. Of all the models produced, the XLM-R large model trained with cross-lingual transfer performs best, with an F1 score of 0.916 on Gojali's data alone and 0.858 on the combination of Universal PropBank Indonesia and Gojali's data.
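
The summary reports F1 scores but does not say how they were computed. As a minimal sketch, assuming the common convention of micro-averaged F1 over exact-match argument spans, the metric could be computed as below. The `Span` tuple layout and the `srl_f1` function name are hypothetical and chosen for illustration; they are not taken from the thesis.

```python
from typing import List, Set, Tuple

# Hypothetical span representation (an assumption, not the thesis's format):
# (predicate_index, start_token, end_token, role_label)
Span = Tuple[int, int, int, str]


def srl_f1(gold: List[Set[Span]], pred: List[Set[Span]]) -> Tuple[float, float, float]:
    """Return micro-averaged (precision, recall, F1) over exact-match spans.

    A predicted argument counts as correct only if its predicate, boundaries,
    and role label all match a gold argument in the same sentence.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must cover the same sentences")

    true_pos = sum(len(g & p) for g, p in zip(gold, pred))
    n_pred = sum(len(p) for p in pred)
    n_gold = sum(len(g) for g in gold)

    precision = true_pos / n_pred if n_pred else 0.0
    recall = true_pos / n_gold if n_gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Two toy sentences; the second prediction has one wrong span boundary.
    gold = [
        {(1, 0, 0, "ARG0"), (1, 2, 3, "ARG1")},
        {(2, 0, 1, "ARG0"), (2, 3, 3, "ARGM-LOC")},
    ]
    pred = [
        {(1, 0, 0, "ARG0"), (1, 2, 3, "ARG1")},
        {(2, 0, 1, "ARG0"), (2, 3, 4, "ARGM-LOC")},
    ]
    print(srl_f1(gold, pred))
```

In the toy data, three of the four predicted spans match gold spans exactly, so precision, recall, and F1 all come out to 0.75. Under this convention, a reported score such as 0.916 would be the same quantity aggregated over the whole test set.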