SENTIMENT BASED ASPECT CLASSIFICATION ON HOTEL DOMAIN BY FINE-TUNING BERT

Bibliographic Details
Main Author: Adhiwijna, Abner
Format: Final Project
Language: Indonesian
Online Access: https://digilib.itb.ac.id/gdl/view/69769
Institution: Institut Teknologi Bandung
Description
Summary: Aspect-based sentiment analysis is a technique for analyzing the sentiment polarity of a text with respect to a given aspect. It can be applied to text reviews to find out the writer's impression of a certain aspect. The amount of available data can be a problem for aspect-based sentiment analysis in a specific domain. When building a machine learning model for aspect-based sentiment analysis, it is usually necessary to train a separate model for each aspect, which has the drawbacks of long training time and the need to split the dataset by aspect. Another common problem is class imbalance in the sentiment labels. This work fine-tunes a BERT model to build a single aspect-based sentiment analysis model that is not separated per aspect. This is achieved with the sentence-pair technique, while the imbalanced-data problem is addressed with oversampling. The work then searches for the optimal hyperparameters for fine-tuning BERT on this task. Testing results show that the best hyperparameters are learning rate = 2e-5, batch size = 16, and epochs = 5 when oversampling is applied, and learning rate = 2e-5, batch size = 16, and epochs = 8 when it is not. The F1-scores of the models with the optimal hyperparameters are 0.912 and 0.909 with and without oversampling, respectively. In addition, oversampling generally yields a better F1-score when the batch size is smaller and the learning rate is higher.
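
The core idea described in the abstract is the sentence-pair formulation: each review is paired with its aspect as the second segment of the BERT input, so one classifier serves every aspect, and minority sentiment classes are randomly oversampled before training. The following Python sketch shows one way this could be set up with the Hugging Face transformers library; the checkpoint name ("indobenchmark/indobert-base-p1"), the sentiment label set, and the naive random-oversampling helper are illustrative assumptions, not details taken from the thesis.

# Minimal sketch of sentence-pair BERT fine-tuning with oversampling,
# assuming a generic (review, aspect, sentiment) dataset. Model checkpoint
# and label names are placeholders, not taken from the source.
from collections import Counter
import random

import torch
from torch.utils.data import DataLoader, Dataset
from transformers import BertForSequenceClassification, BertTokenizerFast

LABELS = ["negative", "neutral", "positive"]  # hypothetical sentiment classes


class SentencePairDataset(Dataset):
    """Pairs each review with its aspect so a single model covers all aspects."""

    def __init__(self, samples, tokenizer, oversample=False):
        # samples: list of (review_text, aspect_name, sentiment_label)
        if oversample:
            samples = self._oversample(samples)
        self.samples = samples
        self.tokenizer = tokenizer

    @staticmethod
    def _oversample(samples):
        # Naive random oversampling: duplicate minority-class examples until
        # every sentiment class is as frequent as the largest one.
        counts = Counter(label for _, _, label in samples)
        target = max(counts.values())
        out = list(samples)
        for label, n in counts.items():
            pool = [s for s in samples if s[2] == label]
            out += random.choices(pool, k=target - n)
        return out

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        review, aspect, label = self.samples[idx]
        # Sentence-pair input: [CLS] review [SEP] aspect [SEP]
        enc = self.tokenizer(review, aspect, truncation=True, max_length=128,
                             padding="max_length", return_tensors="pt")
        item = {k: v.squeeze(0) for k, v in enc.items()}
        item["labels"] = torch.tensor(LABELS.index(label))
        return item


def fine_tune(train_samples, lr=2e-5, batch_size=16, epochs=5):
    # Default hyperparameters mirror the best configuration reported above
    # for the oversampled setting; the checkpoint name is an assumption.
    tokenizer = BertTokenizerFast.from_pretrained("indobenchmark/indobert-base-p1")
    model = BertForSequenceClassification.from_pretrained(
        "indobenchmark/indobert-base-p1", num_labels=len(LABELS))
    loader = DataLoader(
        SentencePairDataset(train_samples, tokenizer, oversample=True),
        batch_size=batch_size, shuffle=True)
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for batch in loader:
            optimizer.zero_grad()
            loss = model(**batch).loss
            loss.backward()
            optimizer.step()
    return model, tokenizer

In this sketch, oversampling simply duplicates minority-class examples before training, which matches the general idea of the oversampling step in the abstract; the exact sampling strategy used in the thesis is not specified here.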