SENTIMENT BASED ASPECT CLASSIFICATION ON HOTEL DOMAIN BY FINE-TUNING BERT
Aspect-based sentiment analysis is a technique for analyzing the sentiment polarity of a text with respect to a given aspect. It can be applied to text reviews to find out the writer’s impression of a certain aspect. The amount of data available can become a problem for aspect-b...
Main Author: | Adhiwijna, Abner |
---|---|
Format: | Final Project |
Language: | Indonesia |
Online Access: | https://digilib.itb.ac.id/gdl/view/69769 |
Tags: | No Tags |
Institution: | Institut Teknologi Bandung |
id |
id-itb.:69769 |
---|---|
spelling |
id-itb.:69769 | 2022-11-28T07:38:11Z | SENTIMENT BASED ASPECT CLASSIFICATION ON HOTEL DOMAIN BY FINE-TUNING BERT | Adhiwijna, Abner | Indonesia | Final Project | aspect based sentiment analysis, BERT, sentence-pair, imbalanced data | INSTITUT TEKNOLOGI BANDUNG | https://digilib.itb.ac.id/gdl/view/69769 | Aspect-based sentiment analysis is a technique for analyzing the sentiment polarity of a text with respect to a given aspect. It can be applied to text reviews to find out the writer’s impression of a certain aspect. The amount of data available can become a problem for aspect-based sentiment analysis in a specific domain. When building a machine learning model for aspect-based sentiment analysis, it is usually necessary to train a separate model for each aspect. This has the drawbacks of requiring a lot of training time and of splitting the dataset by aspect. Another problem is the imbalance of the data across sentiment classes. This work fine-tunes a BERT model to build a single machine learning model for aspect-based sentiment analysis that is not separated per aspect. This is achieved with the sentence-pair technique, while the imbalanced-data problem is addressed with oversampling. This work attempts to find the optimal hyperparameters for an aspect-based sentiment analysis model based on fine-tuned BERT. Test results show that the best hyperparameters are learning rate = 2e-5, batch size = 16, and epochs = 5 when oversampling is applied, and learning rate = 2e-5, batch size = 16, and epochs = 8 when it is not. The F1-scores of the models with the optimal hyperparameters are 0.912 and 0.909 with and without oversampling, respectively. In addition, oversampling generally yields a better F1-score when the batch size is smaller and the learning rate is higher. text |
institution |
Institut Teknologi Bandung |
building |
Institut Teknologi Bandung Library |
continent |
Asia |
country |
Indonesia |
content_provider |
Institut Teknologi Bandung |
collection |
Digital ITB |
language |
Indonesia |
description |
Aspect-based sentiment analysis is a technique for analyzing the sentiment polarity
of a text with respect to a given aspect. It can be applied to text reviews to find
out the writer’s impression of a certain aspect. The amount of data available can
become a problem for aspect-based sentiment analysis in a specific domain. When
building a machine learning model for aspect-based sentiment analysis, it is usually
necessary to train a separate model for each aspect. This has the drawbacks of
requiring a lot of training time and of splitting the dataset by aspect. Another
problem is the imbalance of the data across sentiment classes.
This work fine-tunes a BERT model to build a single machine learning model for
aspect-based sentiment analysis that is not separated per aspect. This is achieved
with the sentence-pair technique, while the imbalanced-data problem is addressed
with oversampling. This work attempts to find the optimal hyperparameters for an
aspect-based sentiment analysis model based on fine-tuned BERT.
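The sentence-pair technique mentioned above can be sketched as follows: each review is paired with an auxiliary "sentence" naming an aspect, so a single model can score every aspect. This is a minimal illustration, not the thesis code; the aspect names, the label scheme ("positive" / "negative" / "none"), and the helper `make_sentence_pairs` are all hypothetical.

```python
# Hypothetical aspect inventory for the hotel domain (illustrative only).
ASPECTS = ["cleanliness", "service", "location", "food"]

def make_sentence_pairs(review, aspect_labels):
    """Turn one review into (review, aspect) sentence pairs with a label.

    aspect_labels maps aspect -> "positive" / "negative"; aspects the
    review does not mention get the label "none".
    """
    pairs = []
    for aspect in ASPECTS:
        label = aspect_labels.get(aspect, "none")
        # A BERT tokenizer would encode this pair as:
        #   [CLS] review tokens [SEP] aspect tokens [SEP]
        pairs.append({"text_a": review, "text_b": aspect, "label": label})
    return pairs

pairs = make_sentence_pairs(
    "The room was spotless but the staff were unhelpful.",
    {"cleanliness": "positive", "service": "negative"},
)
for p in pairs:
    print(p["text_b"], "->", p["label"])
```

Because every (review, aspect) pair is a training example for the same classifier, no per-aspect dataset split or per-aspect model is needed.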
Test results show that the best hyperparameters are learning rate = 2e-5, batch
size = 16, and epochs = 5 when oversampling is applied, and learning rate = 2e-5,
batch size = 16, and epochs = 8 when it is not. The F1-scores of the models with
the optimal hyperparameters are 0.912 and 0.909 with and without oversampling,
respectively. In addition, oversampling generally yields a better F1-score when
the batch size is smaller and the learning rate is higher. |
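The oversampling step can be sketched as plain random oversampling, i.e. duplicating minority-class examples until every sentiment class matches the majority count; the abstract does not specify which oversampling variant was used, so this is an assumption, and the function name `oversample` is hypothetical.

```python
import random
from collections import Counter

def oversample(examples, label_key="label", seed=0):
    """Randomly duplicate minority-class examples until every class
    matches the majority class count (plain random oversampling)."""
    rng = random.Random(seed)
    by_label = {}
    for ex in examples:
        by_label.setdefault(ex[label_key], []).append(ex)
    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        # Sample with replacement to top the class up to the target size.
        balanced.extend(rng.choices(group, k=target - len(group)))
    return balanced

data = [{"label": "positive"}] * 6 + [{"label": "negative"}] * 2
balanced = oversample(data)
print(Counter(ex["label"] for ex in balanced))  # both classes reach 6
```

With classes balanced this way, the model would then be fine-tuned using the reported settings (e.g. learning rate 2e-5, batch size 16, 5 epochs for the oversampled run).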
format |
Final Project |
author |
Adhiwijna, Abner |
spellingShingle |
Adhiwijna, Abner SENTIMENT BASED ASPECT CLASSIFICATION ON HOTEL DOMAIN BY FINE-TUNING BERT |
author_facet |
Adhiwijna, Abner |
author_sort |
Adhiwijna, Abner |
title |
SENTIMENT BASED ASPECT CLASSIFICATION ON HOTEL DOMAIN BY FINE-TUNING BERT |
title_short |
SENTIMENT BASED ASPECT CLASSIFICATION ON HOTEL DOMAIN BY FINE-TUNING BERT |
title_full |
SENTIMENT BASED ASPECT CLASSIFICATION ON HOTEL DOMAIN BY FINE-TUNING BERT |
title_fullStr |
SENTIMENT BASED ASPECT CLASSIFICATION ON HOTEL DOMAIN BY FINE-TUNING BERT |
title_full_unstemmed |
SENTIMENT BASED ASPECT CLASSIFICATION ON HOTEL DOMAIN BY FINE-TUNING BERT |
title_sort |
sentiment based aspect classification on hotel domain by fine-tuning bert |
url |
https://digilib.itb.ac.id/gdl/view/69769 |
_version_ |
1822006130001313792 |