TRANSFER LEARNING USING POST-TRAINING FOR INDONESIAN ASPECT-BASED SENTIMENT ANALYSIS

Bibliographic Details
Main Author: Putu Eka Surya Aditya, I
Format: Thesis
Language: Indonesian
Online Access:https://digilib.itb.ac.id/gdl/view/63359
Institution: Institut Teknologi Bandung
Description
Summary: ABSTRACT

TRANSFER LEARNING USING POST-TRAINING FOR INDONESIAN ASPECT-BASED SENTIMENT ANALYSIS

By I Putu Eka Surya Aditya
NIM: 23530053
(Master's Program in Informatics)

Aspect-based sentiment analysis plays an important role in business development because it makes it easier for businesses to evaluate customer feedback on every aspect of a service. In recent years, pre-trained language models such as ELMo, BERT, XLM-R, and XLNet have achieved great success in natural language processing (NLP) tasks, especially aspect-based sentiment analysis. For Indonesian, there have been several studies on aspect-based sentiment analysis. The most recent, by Azhar and Khodra (2020), used mBERT as the pre-trained language model and achieved the best performance on hotel-domain review data. The approach used by Azhar and Khodra (2020) is auxiliary sentences, adapted from Sun et al. (2019). Another approach that also achieves good performance on aspect-based sentiment analysis is post-training, proposed by Xu et al. (2019), who applied post-training to the aspect-based sentiment analysis task and joint post-training to the Review Reading Comprehension (RRC) task.

In this study, tests were conducted to measure the effect of post-training and joint post-training on an aspect-based sentiment classification task using pre-trained language models that differ from previous research. Three pre-trained language models were used: BERT (mBERT and IndoBERT), XLM-R, and XLNet (XLNet English and XLNet Malay). Two problem-solving approaches were applied: auxiliary sentences (Sun et al., 2019) and post-training/joint post-training (Xu et al., 2019); both approaches are sketched in code below. The data used in this study are divided into three types: data for post-training, data for joint post-training, and data for training and testing. The post-training data are unlabeled hotel reviews (unsupervised), the joint post-training data are car reviews, and the training and testing data are the same as those used by Azhar and Khodra (2020).

The test results show that IndoBERT performs better than the baseline model (mBERT) both with and without the post-training approach. Post-training on XLM-R achieved the best performance, with an F1-score of 0.9875 on Test 1 data and 0.9614 on Test 2 data, outperforming the baseline (mBERT without post-training) by 1.04% on Test 1 data and 2.92% on Test 2 data. This is because XLM-R is trained with far more parameters and a much larger vocabulary than mBERT. The test results also show that the post-training model outperformed the joint post-training model across all pre-trained language models. The model in this thesis achieves the best performance on the Indonesian hotel review data (HoASA).

Keywords: aspect-based sentiment analysis, NLP, pre-trained language model, IndoBERT, XLM-R, XLNet, auxiliary sentences, post-training, joint post-training.
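
To make the first approach concrete, below is a minimal sketch of the auxiliary-sentence method of Sun et al. (2019), which recasts aspect-based sentiment classification as BERT-style sentence-pair classification: each review is paired with a question-like auxiliary sentence about one aspect. The checkpoint name, aspect wording, and label set are illustrative assumptions, not the exact configuration used in the thesis.

    # Sketch: auxiliary-sentence ABSA as sentence-pair classification.
    # Checkpoint and labels are assumptions, not the thesis's exact setup.
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    MODEL_NAME = "indobenchmark/indobert-base-p1"  # assumed IndoBERT checkpoint
    LABELS = ["negative", "neutral", "positive"]   # assumed label set

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForSequenceClassification.from_pretrained(
        MODEL_NAME, num_labels=len(LABELS)
    )

    review = "Kamarnya bersih tetapi makanannya mengecewakan."
    aspect = "food"
    # The auxiliary sentence turns the target aspect into a pseudo-question.
    auxiliary = f"What do you think of the {aspect}?"

    # BERT-style sentence-pair input: [CLS] review [SEP] auxiliary [SEP]
    inputs = tokenizer(review, auxiliary, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    # The classification head is untrained here, so the prediction is
    # arbitrary until the model is fine-tuned on labeled ABSA data.
    print(LABELS[logits.argmax(dim=-1).item()])

One sentence-pair example is built per (review, aspect) combination, so a review mentioning three aspects yields three training examples.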
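The second approach, post-training in the spirit of Xu et al. (2019), is sketched below, simplified to masked language modeling (MLM) on unlabeled in-domain hotel reviews before fine-tuning; the thesis's actual objectives, data files, and hyperparameters may differ. The sketch uses the Hugging Face transformers and datasets libraries, and the file name hotel_reviews.txt is hypothetical.

    # Sketch: post-training a pre-trained model with MLM on unlabeled
    # in-domain reviews (simplified from Xu et al., 2019). File name and
    # hyperparameters are illustrative assumptions.
    from datasets import load_dataset
    from transformers import (
        AutoTokenizer, AutoModelForMaskedLM,
        DataCollatorForLanguageModeling, Trainer, TrainingArguments,
    )

    MODEL_NAME = "indobenchmark/indobert-base-p1"  # assumed checkpoint
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForMaskedLM.from_pretrained(MODEL_NAME)

    # One unlabeled hotel review per line (hypothetical file).
    raw = load_dataset("text", data_files={"train": "hotel_reviews.txt"})
    tokenized = raw["train"].map(
        lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
        batched=True, remove_columns=["text"],
    )

    # Randomly mask 15% of tokens, the standard BERT MLM setting.
    collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15)
    trainer = Trainer(
        model=model,
        args=TrainingArguments("post_trained_indobert", num_train_epochs=1,
                               per_device_train_batch_size=16),
        train_dataset=tokenized,
        data_collator=collator,
    )
    trainer.train()
    # The post-trained weights are then fine-tuned on the labeled ABSA task.
    trainer.save_model("post_trained_indobert")

The point of the extra MLM pass is domain adaptation: general-purpose pre-training rarely covers review-specific vocabulary, so continuing training on in-domain text before fine-tuning tends to improve downstream ABSA performance.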