CONVERSATIONAL SPEECH EMOTION RECOGNITION FROM INDONESIAN SPOKEN LANGUAGE USING RECURRENT NEURAL NETWORK BASED MODEL

In human interaction, emotion plays a fundamental role in shaping the information conveyed. Existing studies on Indonesian emotion recognition systems model emotion at the utterance level, treating utterances as independent entities. However, in the n...

Bibliographic Details
Main Author: Nurul Izzah Adma, Aisyah
Format: Final Project
Language: Indonesia
Online Access: https://digilib.itb.ac.id/gdl/view/56346
Institution: Institut Teknologi Bandung
id id-itb.:56346
timestamp 2021-06-22T07:44:25Z
keywords emotion, conversation, contextual, acoustic, lexical, Indonesian
institution Institut Teknologi Bandung
building Institut Teknologi Bandung Library
continent Asia
country Indonesia
content_provider Institut Teknologi Bandung
collection Digital ITB
language Indonesia
description In human interaction, emotion plays a fundamental role in shaping the information conveyed. Existing studies on Indonesian emotion recognition systems model emotion at the utterance level, treating utterances as independent entities. However, the relations among utterances affect the emotional context: humans can recognize an overall emotion from a sequence of consecutive utterances (termed a conversation), within which the emotion may change or transition. This study therefore set out to build a conversational emotion recognition system for Indonesian. Such a system requires a conversational emotion corpus, and no corpus suitable for conversation-based modeling was available. A new emotion corpus was therefore built from data acquired from 46 podcast shows. The corpus consists of 2003 conversations and 10822 utterances labeled among 6 emotion classes: happy, sad, angry, disgusted, afraid, and surprised. The system was built through experiments using a Recurrent Neural Network (RNN) to capture information across consecutive utterances, with learning based on acoustic features and lexical features. The experiments searched for the features and modeling techniques that yield the best-performing model. The models were evaluated on emotion recognition over the conversation data. The feature-level context-dependent combined model, which combines acoustic and lexical features, performed best, with an F-measure of 0.5817 for 6 emotion classes and 0.7252 for 4 emotion classes. The decision-level context-dependent combined model achieved an F-measure of 0.5578 for 6 emotion classes and 0.6924 for 4 emotion classes. The experiments also produced a feature-level context-independent combined model, a decision-level context-independent combined model, a context-independent acoustic model, a context-dependent acoustic model, a context-independent lexical model, and a context-dependent lexical model, each for both the 6-class and the 4-class settings.
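The description distinguishes feature-level from decision-level combination of the acoustic and lexical modalities but does not say how either was implemented. The sketch below only illustrates the general distinction, under assumptions that are not from the record: feature-level fusion is shown as concatenation of per-utterance feature vectors, decision-level fusion as a weighted average of class probabilities, and all function names, the `weight` parameter, and the numbers are hypothetical.

```python
# Illustrative sketch only: toy vectors, not the thesis's features or models.

def feature_level_fusion(acoustic, lexical):
    """Concatenate per-utterance acoustic and lexical feature vectors
    so that a single downstream model sees both modalities at once."""
    return [a + l for a, l in zip(acoustic, lexical)]

def decision_level_fusion(acoustic_probs, lexical_probs, weight=0.5):
    """Combine class probabilities from two separately trained models
    by a weighted average (the weight is an assumed hyperparameter)."""
    return [
        [weight * pa + (1.0 - weight) * pl for pa, pl in zip(ua, ul)]
        for ua, ul in zip(acoustic_probs, lexical_probs)
    ]

# One conversation with two utterances (hypothetical numbers).
acoustic_feats = [[0.1, 0.2], [0.3, 0.4]]
lexical_feats = [[1.0], [2.0]]
fused_feats = feature_level_fusion(acoustic_feats, lexical_feats)

# Per-utterance probabilities over two classes from each modality's model.
acoustic_probs = [[0.9, 0.1], [0.2, 0.8]]
lexical_probs = [[0.5, 0.5], [0.4, 0.6]]
fused_probs = decision_level_fusion(acoustic_probs, lexical_probs)

# Index of the most probable class for each utterance.
predicted = [max(range(len(p)), key=p.__getitem__) for p in fused_probs]
```

In a context-dependent variant, the fused per-utterance vectors of a conversation would be fed in order to a recurrent model rather than classified one at a time; that step is omitted here.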
format Final Project
author Nurul Izzah Adma, Aisyah
title CONVERSATIONAL SPEECH EMOTION RECOGNITION FROM INDONESIAN SPOKEN LANGUAGE USING RECURRENT NEURAL NETWORK BASED MODEL
url https://digilib.itb.ac.id/gdl/view/56346