FEW-SHOT LEARNING IN INDONESIAN LANGUAGE DOMAIN TEXT CLASSIFICATION
| Main Author: | |
|---|---|
| Format: | Theses |
| Language: | Indonesian |
| Online Access: | https://digilib.itb.ac.id/gdl/view/68650 |
| Institution: | Institut Teknologi Bandung |
Summary: Lack of labelled data is a long-standing problem in the field of natural language processing (NLP), particularly for low-resource languages such as Indonesian. Transfer learning via pre-trained transformer-based language models (LMs) has been a common approach to address this. The two most popular types of pre-trained models are encoder-only models such as BERT and decoder-only models such as GPT. Standard fine-tuning is the de facto method for transfer learning; however, another approach, few-shot learning, can be used when very little labelled data is available.

To understand the effectiveness of pre-trained LMs in this low-resource setting, a comprehensive study of prompt-based few-shot learning methods was conducted on IndoNLU, an existing Indonesian natural language understanding benchmark. Three methods were tested: standard fine-tuning and two few-shot learning methods, namely prompt-based fine-tuning (LM-BFF) and few-shot in-context learning. The language models tested fall into three categories: multilingual models (XGLM and XLM-R), English monolingual models (GPT-Neo), and Indonesian monolingual models (IndoBERT and IndoGPT).

In-context learning with the multilingual decoder model XGLM was found to outperform the English GPT-Neo models. Prompt-based fine-tuning using LM-BFF with XLM-R was also shown to generally outperform in-context learning, by up to roughly 20 points of macro-averaged F1.
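As an illustration of the few-shot in-context learning setup described in the summary, the sketch below classifies an Indonesian review by prepending a handful of labelled demonstrations to the prompt and letting a decoder-only LM score candidate label words; no parameters are updated. The checkpoint name, prompt template, label words, and example sentences are assumptions made for this sketch, not the configuration used in the thesis.

```python
# A minimal sketch of few-shot in-context learning for Indonesian sentiment
# classification with a decoder-only LM. The checkpoint, prompt template,
# label words, and examples are illustrative assumptions, not the thesis'
# exact setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "facebook/xglm-564M"   # assumed checkpoint; any causal LM works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

# A handful of labelled demonstrations (the "shots") plus the unlabelled query.
demonstrations = [
    ("Filmnya sangat bagus dan menyentuh.", "positif"),
    ("Pelayanannya lambat dan mengecewakan.", "negatif"),
]
query = "Makanannya enak sekali."
label_words = ["positif", "negatif"]  # verbalised class labels

# Build the prompt: demonstrations followed by the query, no gradient updates.
prompt = "".join(f"Ulasan: {t}\nSentimen: {l}\n\n" for t, l in demonstrations)
prompt += f"Ulasan: {query}\nSentimen:"

def label_score(label: str) -> float:
    """Log-likelihood the LM assigns to a candidate label after the prompt."""
    full = prompt + " " + label
    prompt_len = tokenizer(prompt, add_special_tokens=False,
                           return_tensors="pt").input_ids.shape[1]
    ids = tokenizer(full, add_special_tokens=False,
                    return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    # Logits at position t predict token t + 1, so shift by one.
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    label_ids = ids[0, prompt_len:]
    positions = torch.arange(prompt_len - 1, ids.shape[1] - 1)
    return log_probs[positions, label_ids].sum().item()

# The label with the highest likelihood is the prediction.
print(max(label_words, key=label_score))
```

Scoring the candidate label words by likelihood, rather than generating text freely, keeps the prediction restricted to the known label set.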
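For contrast, here is a similarly minimal sketch of LM-BFF-style prompt-based fine-tuning with a masked LM such as XLM-R: classification is recast as filling a mask slot in a template, a verbalizer maps each class to a label word scored by the MLM head, and the model is fine-tuned with cross-entropy on the few labelled examples. The template, label words, checkpoint, and toy data are again assumptions for illustration; the full LM-BFF method additionally searches for templates and label words automatically and can append demonstrations to the input.

```python
# A minimal sketch of LM-BFF-style prompt-based fine-tuning with a masked LM.
# The template, verbalizer, checkpoint, and toy data are illustrative
# assumptions; the full LM-BFF method also searches templates automatically.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_name = "xlm-roberta-base"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

# Verbalizer: each class is mapped to a single label word.
label_words = {"positive": "bagus", "negative": "buruk"}
label_ids = [tokenizer.convert_tokens_to_ids(tokenizer.tokenize(" " + w)[0])
             for w in label_words.values()]

def class_logits(text: str) -> torch.Tensor:
    # Template: "<text> Ulasan ini <mask>." -- the MLM head fills the mask.
    batch = tokenizer(f"{text} Ulasan ini {tokenizer.mask_token}.",
                      return_tensors="pt")
    mask_pos = (batch.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
    logits = model(**batch).logits[0, mask_pos]
    # Keep only the logits of the label words: one score per class.
    return logits[label_ids]

# Few-shot "training": cross-entropy over the label-word logits, updating the
# whole model, i.e. standard fine-tuning but reusing the MLM head instead of
# a freshly initialised classification head.
few_shot = [("Filmnya sangat menyenangkan.", 0), ("Ceritanya membosankan.", 1)]
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
model.train()
for _ in range(10):  # a few passes over the tiny labelled set
    for text, y in few_shot:
        loss = torch.nn.functional.cross_entropy(
            class_logits(text).unsqueeze(0), torch.tensor([y]))
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

model.eval()
with torch.no_grad():
    pred = class_logits("Makanannya enak sekali.").argmax().item()
print(list(label_words)[pred])
```

Reusing the pre-trained MLM head through a prompt, instead of training a new classification head from scratch, is what makes this approach effective when only a handful of labelled examples are available.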