SENTIMENT-SPECIFIC WORD EMBEDDING EFFECT ON INDONESIAN SENTIMENT ANALYSIS

Sentiment analysis aims to determine the sentiment polarity of opinions expressed in text data. The feature representation of the text has a significant influence on the performance of a sentiment analysis system. In addition to the lexical representation of bag of words and TF-IDF, word embedding as a representati...

Bibliographic Details
Main Author:	NAUFAL FARHAN (NIM : 13513049), AHMAD
Format:	Final Project
Language:	Indonesia
Online Access:	https://digilib.itb.ac.id/gdl/view/20853
Institution:	Institut Teknologi Bandung

id	id-itb.:20853
spelling	id-itb.:20853 2017-10-09T10:28:08Z
institution	Institut Teknologi Bandung
building	Institut Teknologi Bandung Library
continent	Asia
country	Indonesia
content_provider	Institut Teknologi Bandung
collection	Digital ITB
language	Indonesia
description	Sentiment analysis aims to determine the sentiment polarity of opinions expressed in text data. The feature representation of the text has a significant influence on the performance of a sentiment analysis system. In addition to lexical representations such as bag of words and TF-IDF, word embeddings, which represent the semantic features of words, have been widely used in sentiment analysis research. However, general-purpose word embeddings model only semantics and do not take word sentiment into account. Sentiment-specific word embedding (SSWE) is a representation that models not only semantics but also word sentiment. SSWE produces an n-dimensional feature vector for each word in the training corpus. The embedding is trained with an artificial neural network using the backpropagation algorithm. To date, no Indonesian sentiment analysis research using SSWE has been found. This final project examines the influence of SSWE on sentiment classification in the Indonesian language. The corpus and dataset were collected from TripAdvisor reviews: 306,448 reviews were used as the SSWE corpus and 12,389 reviews as training data. The experiments showed that SSWE gives better sentiment classification performance than Word2Vec. With an artificial neural network classifier, the F1-score obtained with SSWE reached 0.7602 on the test set and 0.7687 in 10-fold cross-validation. However, the F1-scores of both SSWE and Word2Vec remained below those of the TF-IDF baseline, which reached 0.8521 on the test set and 0.8492 in 10-fold cross-validation.
format	Final Project
author	NAUFAL FARHAN (NIM : 13513049), AHMAD
title	SENTIMENT-SPECIFIC WORD EMBEDDING EFFECT ON INDONESIAN SENTIMENT ANALYSIS
title_sort	sentiment-specific word embedding effect on indonesian sentiment analysis
url	https://digilib.itb.ac.id/gdl/view/20853
_version_	1821120285734076416

Similar Items