ASPECT AND OPINION TERMS EXTRACTION USING DOUBLE EMBEDDINGS AND ATTENTION MECHANISM FOR INDONESIAN TEXT REVIEWS

Aspect-based sentiment analysis (ABSA) of product or service reviews is one way to measure customer satisfaction. The double embeddings and coupled multi-layer attentions approaches outperform the best results reported in SemEval 2016 Task 5 for aspect and opinion term extraction...

Bibliographic Details
Main Author: Fernando, Jordhy
Format: Final Project
Language: Indonesian
Online Access: https://digilib.itb.ac.id/gdl/view/39582
Institution: Institut Teknologi Bandung
id id-itb.:39582
spelling id-itb.:39582; 2019-06-27T09:50:55Z; ASPECT AND OPINION TERMS EXTRACTION USING DOUBLE EMBEDDINGS AND ATTENTION MECHANISM FOR INDONESIAN TEXT REVIEWS; Fernando, Jordhy; Indonesia; Final Project; aspect and opinion terms extraction, coupled multi-layer attentions, double embeddings; INSTITUT TEKNOLOGI BANDUNG; https://digilib.itb.ac.id/gdl/view/39582; text
institution Institut Teknologi Bandung
building Institut Teknologi Bandung Library
continent Asia
country Indonesia
content_provider Institut Teknologi Bandung
collection Digital ITB
language Indonesia
description Aspect-based sentiment analysis (ABSA) of product or service reviews is one way to measure customer satisfaction. The double embeddings and coupled multi-layer attentions approaches outperform the best results reported in SemEval 2016 Task 5 for aspect and opinion term extraction. This thesis adapts both approaches to extract aspect and opinion terms from Indonesian hotel reviews. The double embeddings approach was adapted by trying several types of word embeddings and by training the embeddings on Indonesian resources, namely the Indonesian Wikipedia corpus and Indonesian hotel reviews. The coupled multi-layer attentions approach was adapted by trying different RNN variants in the model. The experiments used 5000 hotel reviews, split into 3000 reviews for training, 1000 for validation, and 1000 for testing. Based on the experimental results, the best model configuration is: word embedding type, double embeddings; RNN type, BiLSTM; number of hidden units, 50; number of coupled attention layers, 2; number of tensors, 20; dropout rate, 0.5. On the test data, the model reaches F1-measure scores of 0.914 at the token level and 0.90 at the entity level, better than the baseline model, a Bidirectional Long Short-Term Memory network with a Conditional Random Field (BiLSTM-CRF), which scores 0.895 and 0.885 at the token level and entity level respectively.
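The description reports F1 at both the token level and the entity level without saying how each is computed. The sketch below shows one common way to score the two, assuming BIO-style tags; the tag names (`B-ASPECT`, `I-OPINION`, and so on), the function names, and the exact-match rule for entities are illustrative assumptions, not details taken from the thesis.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, label)


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect (start, end, label) spans from a BIO tag sequence."""
    spans: Set[Span] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        continues = tag.startswith("I-") and label == tag[2:]
        if label is not None and not continues:
            spans.add((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def f1(tp: int, n_pred: int, n_gold: int) -> float:
    """F1 from true positives and the predicted/gold totals."""
    if tp == 0:
        return 0.0
    precision, recall = tp / n_pred, tp / n_gold
    return 2 * precision * recall / (precision + recall)


def token_f1(gold: List[str], pred: List[str]) -> float:
    """Token level: every non-O token is scored on its own."""
    tp = sum(1 for g, p in zip(gold, pred) if g != "O" and g == p)
    n_pred = sum(1 for p in pred if p != "O")
    n_gold = sum(1 for g in gold if g != "O")
    return f1(tp, n_pred, n_gold)


def entity_f1(gold: List[str], pred: List[str]) -> float:
    """Entity level: a term counts only if its whole span and label match."""
    g, p = extract_spans(gold), extract_spans(pred)
    return f1(len(g & p), len(p), len(g))


# Hypothetical five-token review; the prediction cuts the aspect term short.
gold = ["B-ASPECT", "I-ASPECT", "O", "B-OPINION", "I-OPINION"]
pred = ["B-ASPECT", "O", "O", "B-OPINION", "I-OPINION"]
print(round(token_f1(gold, pred), 3), entity_f1(gold, pred))
```

Under these assumptions, a partially recovered term still earns token-level credit but none at the entity level, which is consistent with the entity-level scores in the description being slightly lower than the token-level ones.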
format Final Project
author Fernando, Jordhy
title ASPECT AND OPINION TERMS EXTRACTION USING DOUBLE EMBEDDINGS AND ATTENTION MECHANISM FOR INDONESIAN TEXT REVIEWS
url https://digilib.itb.ac.id/gdl/view/39582
_version_ 1821997816845697024