ASPECT AND OPINION TERMS EXTRACTION USING DOUBLE EMBEDDINGS AND ATTENTION MECHANISM FOR INDONESIAN TEXT REVIEWS

Aspect-based sentiment analysis (ABSA) of product or service reviews is one way to measure customer satisfaction. The double embeddings and coupled multi-layer attentions approaches yield better performance than the best prior work on SemEval 2016 Task 5 for aspect and opinion terms extraction...

Bibliographic Details
Main Author:	Fernando, Jordhy
Format:	Final Project
Language:	Indonesia
Online Access:	https://digilib.itb.ac.id/gdl/view/39582
Institution:	Institut Teknologi Bandung

id	id-itb.:39582
spelling	id-itb.:39582; 2019-06-27T09:50:55Z; Fernando, Jordhy; Indonesia; Final Project; aspect and opinion terms extraction, coupled multi-layer attentions, double embeddings; INSTITUT TEKNOLOGI BANDUNG; https://digilib.itb.ac.id/gdl/view/39582
institution	Institut Teknologi Bandung
building	Institut Teknologi Bandung Library
continent	Asia
country	Indonesia
content_provider	Institut Teknologi Bandung
collection	Digital ITB
language	Indonesia
description	Aspect-based sentiment analysis (ABSA) of product or service reviews is one way to measure customer satisfaction. The double embeddings and coupled multi-layer attentions approaches yield better performance than the best prior work on SemEval 2016 Task 5 for aspect and opinion terms extraction. This thesis adapted both approaches to perform aspect and opinion terms extraction for Indonesian hotel reviews. The double embeddings approach was adapted by trying various types of word embeddings and by training the word embeddings on Indonesian resources, namely the Indonesian Wikipedia corpus and Indonesian hotel reviews. The coupled multi-layer attentions approach was adapted by trying different RNN variants in the model. The experiments were conducted on 5000 hotel reviews, divided into 3000 reviews for training, 1000 reviews for validation, and 1000 reviews for testing. Based on the experimental results, the best model configuration for word embeddings type, RNN type, number of hidden units, number of coupled attention layers, number of tensors, and dropout rate is, respectively, double embeddings, BiLSTM, 50, 2, 20, and 0.5. The F1-measure scores on the test data are 0.914 at the token level and 0.90 at the entity level, better than the baseline model, Bidirectional Long Short-Term Memory with Conditional Random Field (BiLSTM-CRF), which scores 0.895 and 0.885 at the token level and entity level respectively.
format	Final Project
author	Fernando, Jordhy
title	ASPECT AND OPINION TERMS EXTRACTION USING DOUBLE EMBEDDINGS AND ATTENTION MECHANISM FOR INDONESIAN TEXT REVIEWS
url	https://digilib.itb.ac.id/gdl/view/39582
_version_	1821997816845697024
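
The abstract reports F1-measure at both the token level and the entity level but does not say how either is computed. The following is a minimal sketch of one common way to distinguish the two, assuming BIO-style tags (the tagging scheme, the `ASP`/`OPN` label names, and all function names here are illustrative assumptions, not taken from the thesis). Token-level scoring counts each non-`O` tag independently, while entity-level scoring credits a term only when its full span and type match exactly.

```python
from typing import List, Set, Tuple

# (start index, end index exclusive, term type); hypothetical representation
Span = Tuple[int, int, str]


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect (start, end, type) spans from a BIO tag sequence."""
    spans: Set[Span] = set()
    start, kind = None, None
    # A trailing "O" sentinel closes a span that runs to the end.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and kind == tag[2:]
        if start is not None and not continues:
            spans.add((start, i, kind))
            start, kind = None, None
        if tag.startswith("B-"):
            start, kind = i, tag[2:]
    return spans


def f1(true_pos: int, n_pred: int, n_gold: int) -> float:
    """Harmonic mean of precision and recall; 0.0 when nothing matches."""
    if true_pos == 0:
        return 0.0
    precision = true_pos / n_pred
    recall = true_pos / n_gold
    return 2 * precision * recall / (precision + recall)


def token_f1(gold: List[str], pred: List[str]) -> float:
    """Score every non-O tag on its own (assumed definition)."""
    true_pos = sum(1 for g, p in zip(gold, pred) if g == p and g != "O")
    n_pred = sum(1 for p in pred if p != "O")
    n_gold = sum(1 for g in gold if g != "O")
    return f1(true_pos, n_pred, n_gold)


def entity_f1(gold: List[str], pred: List[str]) -> float:
    """Score whole terms: span boundaries and type must match exactly."""
    gold_spans, pred_spans = extract_spans(gold), extract_spans(pred)
    return f1(len(gold_spans & pred_spans), len(pred_spans), len(gold_spans))


# Toy example: the prediction truncates a two-token aspect term.
gold_tags = ["B-ASP", "I-ASP", "O", "B-OPN", "O"]
pred_tags = ["B-ASP", "O", "O", "B-OPN", "O"]
token_score = token_f1(gold_tags, pred_tags)
entity_score = entity_f1(gold_tags, pred_tags)
```

In this toy example the truncated aspect term still earns partial credit at the token level but none at the entity level, which is consistent with entity-level scores being the lower of the two in the reported results.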

Similar Items