CROSS-LINGUAL TRANSFER FOR SEMANTIC ROLE LABELING IN INDONESIAN

Bibliographic Details
Main Author: Haqi, Bariza
Format: Final Project
Language: Indonesia
Subjects:
Online Access: https://digilib.itb.ac.id/gdl/view/82458
Institution: Institut Teknologi Bandung
id id-itb.:82458
spelling id-itb.:82458 2024-07-08T13:49:07Z
keywords transformer-based semantic role labeling, cross-lingual transfer
institution Institut Teknologi Bandung
building Institut Teknologi Bandung Library
continent Asia
country Indonesia
content_provider Institut Teknologi Bandung
collection Digital ITB
language Indonesia
topic Engineering (engineering and related activities)
description Semantic role labeling (SRL) is a semantic analysis task that identifies the semantic relationships in a sentence: who did what to whom, where, when, and so on. Existing SRL models for Indonesian still perform poorly because far less annotated training data is available for Indonesian than for English. This thesis therefore develops an SRL model using cross-lingual transfer, which compensates for the small Indonesian annotated corpus by also drawing on the much larger English annotated corpus. The method requires a multilingual model and datasets in two different languages that cover the same domain. The multilingual models used are XLM-R and mT5, each in base and large sizes. The Indonesian datasets are Universal PropBank and Gojali's data; the English dataset is CoNLL-2012. The resulting SRL models were evaluated on test data from Universal PropBank Indonesia and Gojali's data. Of all the models produced, the XLM-R large model trained with cross-lingual transfer performs best, with an F1 score of 0.916 on Gojali's data alone and 0.858 on the combination of Universal PropBank Indonesia and Gojali's data.
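The record reports F1 scores but does not say how they were computed. As a rough illustration only, the sketch below assumes the common convention of micro-averaged F1 over labeled argument spans, where a predicted span counts as correct only if its boundaries and role label both match the gold annotation. The tuple layout, the function name span_f1, and the toy data are hypothetical and are not taken from the thesis.

```python
from typing import Iterable, Set, Tuple

# Hypothetical span layout (an assumption, not the thesis's format):
# (sentence_id, predicate_index, start_token, end_token, role_label)
Span = Tuple[int, int, int, int, str]


def span_f1(gold: Iterable[Span], pred: Iterable[Span]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) over exactly matching labeled spans."""
    gold_set: Set[Span] = set(gold)
    pred_set: Set[Span] = set(pred)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy data: one sentence, one predicate at token 1, three gold arguments.
gold = [
    (0, 1, 0, 0, "ARG0"),
    (0, 1, 2, 3, "ARG1"),
    (0, 1, 4, 5, "ARGM-LOC"),
]
# The prediction gets the boundaries of the second span right but mislabels it.
pred = [
    (0, 1, 0, 0, "ARG0"),
    (0, 1, 2, 3, "ARG2"),
    (0, 1, 4, 5, "ARGM-LOC"),
]

precision, recall, f1 = span_f1(gold, pred)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

Under this convention a mislabeled span lowers both precision and recall, because it is simultaneously a wrong prediction and a missed gold argument.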
format Final Project
author Haqi, Bariza
title CROSS-LINGUAL TRANSFER FOR SEMANTIC ROLE LABELING IN INDONESIAN
url https://digilib.itb.ac.id/gdl/view/82458