INDONESIAN SENTIMENT ANALYSIS USING DEEP NEURAL NETWORK AND DOCUMENT REPRESENTATION VECTOR

Sentiment analysis, one of the fields of natural language processing, can be performed at several extraction levels: document level, sentence level, and aspect level. This research focuses on the document level. Document-level sentiment analysis is carried out to extract sentiments or opinions regarding...

Bibliographic Details
Main Author: Ayu Putu Ari Crisdayanti, Ida
Format: Final Project
Language: Indonesia
Online Access: https://digilib.itb.ac.id/gdl/view/39113
Institution: Institut Teknologi Bandung
id id-itb.:39113
timestamp 2019-06-24T08:49:42Z
keywords Sentiment Analysis, Document Level, DNN, Document Vector
building Institut Teknologi Bandung Library
continent Asia
country Indonesia
content_provider Institut Teknologi Bandung
collection Digital ITB
description Sentiment analysis, one of the fields of natural language processing, can be performed at several extraction levels: document level, sentence level, and aspect level. This research focuses on the document level, where sentiment analysis extracts the sentiments or opinions expressed about an entity as a whole. The sentiment analysis problem can be solved with a Deep Neural Network (DNN) approach. The DNN topologies used in the experiments are the Convolutional Neural Network (CNN), the Gated Recurrent Neural Network (GRNN) in the form of Bi-LSTM and Bi-GRU, and the Hierarchical Deep Neural Network (HDNN). To build an Indonesian sentiment analysis model, a DNN requires documents to be represented as numerical vectors. The experiments therefore also examine the effectiveness of document representation vectors produced by two techniques, paragraph vector and deep document embedding. Both techniques aim to capture the full information context of a document in order to maximize the performance of the sentiment analysis model. The experiments were conducted on two datasets: the TripAdvisor dataset from the research baseline and the dataset from Prosa.ai, a collection of texts gathered from Twitter, Zomato, Facebook, Instagram, and Qraved. The experimental results show that the best DNN model for the TripAdvisor dataset is the CNN, with an f1-score of 0.8341; this model outperforms the best model in the research baseline. For the Prosa dataset, all DNN models also outperform the baseline, each with an f1-score above 90%. On the Prosa dataset, using a document representation technique (paragraph vector) increases the f1-score of the sentiment analysis model by 1.4648-2.4401%. On the TripAdvisor dataset, by contrast, the paragraph vector does not improve model performance because many text documents in the training and test data are incomplete.
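The results in the description are reported as f1-scores. As a reminder of what that metric measures, the following is a minimal Python sketch, not the author's evaluation code. The record does not say which averaging scheme was used, so the per-class and macro-averaged variants shown here are assumptions, and the function names and the sample labels are made up for illustration.

```python
from typing import Sequence


def f1_for_class(y_true: Sequence[str], y_pred: Sequence[str], label: str) -> float:
    """F1 for one class: the harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)  # true positives
    fp = sum(1 for t, p in pairs if t != label and p == label)  # false positives
    fn = sum(1 for t, p in pairs if t == label and p != label)  # false negatives
    if tp == 0:
        # Precision and recall are both zero (or undefined); report 0.0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Unweighted mean of the per-class F1 values (assumed averaging scheme)."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, lab) for lab in labels) / len(labels)


# Hypothetical gold labels and model predictions for five documents.
y_true = ["positive", "positive", "negative", "negative", "positive"]
y_pred = ["positive", "negative", "negative", "positive", "positive"]

positive_f1 = f1_for_class(y_true, y_pred, "positive")
overall_f1 = macro_f1(y_true, y_pred)
print(f"positive-class F1: {positive_f1:.4f}, macro F1: {overall_f1:.4f}")
```

Because F1 combines precision and recall, it is less sensitive to class imbalance than plain accuracy, which may be why it suits sentiment datasets collected from mixed sources.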