CONVERSATIONAL SPEECH EMOTION RECOGNITION FROM INDONESIAN SPOKEN LANGUAGE USING RECURRENT NEURAL NETWORK BASED MODEL
In human interaction, emotion plays a fundamental role in shaping the information conveyed. Existing studies on Indonesian emotion recognition systems model emotion at the utterance level, treating utterances as independent entities. However, in the n...
Main Author: | Nurul Izzah Adma, Aisyah |
---|---|
Format: | Final Project |
Language: | Indonesian |
Online Access: | https://digilib.itb.ac.id/gdl/view/56346 |
Institution: | Institut Teknologi Bandung |
id |
id-itb.:56346 |
---|---|
timestamp |
2021-06-22T07:44:25Z |
keywords |
emotion, conversation, contextual, acoustic, lexical, Indonesian |
institution |
Institut Teknologi Bandung |
building |
Institut Teknologi Bandung Library |
continent |
Asia |
country |
Indonesia |
content_provider |
Institut Teknologi Bandung |
collection |
Digital ITB |
language |
Indonesian |
description |
In human interaction, emotion plays a fundamental role in shaping the
information conveyed. Existing studies on Indonesian emotion recognition
systems model emotion at the utterance level, treating utterances as
independent entities. However, the relation among utterances affects the
emotional context. Humans can abstract an emotion from consecutive utterances
(termed a conversation), which may contain changes or transitions of emotion.
Therefore, an experiment was carried out to build a conversational emotion
recognition system for Indonesian.
Building a conversational emotion recognition system requires a conversational
emotion corpus. A corpus suitable for conversation-based modeling was not yet
available. In this study, a new emotion corpus was built from data acquired
from 46 podcast shows. The corpus consists of 2003 conversations and 10822
utterances labeled among 6 emotion classes: happy, sad, angry, disgusted,
afraid, and surprised.
The conversational emotion recognition system for Indonesian was built through
experiments using the Recurrent Neural Network (RNN) algorithm to capture
information across consecutive utterances. Learning is based on acoustic
features and lexical features. The experiments searched for the features and
modeling techniques that produce the best-performing model.
The models were evaluated on emotion recognition over the conversation data.
The feature-level context-dependent combined model, which combines acoustic
and lexical features, performs best, with an F-measure of 0.5817 for 6 emotion
classes and 0.7252 for 4 emotion classes. The decision-level context-dependent
combined model gives an F-measure of 0.5578 for 6 emotion classes and 0.6924
for 4 emotion classes. The experiments also produced a feature-level
context-independent combined model, a decision-level context-independent
combined model, a context-independent acoustic model, a context-dependent
acoustic model, a context-independent lexical model, and a context-dependent
lexical model, each for both 6 emotion classes and 4 emotion classes. |
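The difference between the feature-level and decision-level combined models named in the description can be illustrated with a minimal sketch. The record does not state how either combination was implemented, so everything here is an assumption for illustration: the function names, the toy feature sizes, and in particular the use of a weighted average of class probabilities for decision-level fusion.

```python
# Illustrative sketch only; not the thesis implementation.

def feature_level_fusion(acoustic, lexical):
    """Join each utterance's acoustic and lexical feature vectors into one
    vector, so a single downstream classifier sees both modalities."""
    return [a + l for a, l in zip(acoustic, lexical)]


def decision_level_fusion(acoustic_probs, lexical_probs, weight=0.5):
    """Combine the class probabilities of two separately trained models.

    A weighted average followed by argmax is ASSUMED here; the record does
    not say which combination rule was used.
    """
    labels = []
    for pa, pl in zip(acoustic_probs, lexical_probs):
        fused = [weight * a + (1.0 - weight) * l for a, l in zip(pa, pl)]
        labels.append(max(range(len(fused)), key=fused.__getitem__))
    return labels


# Hypothetical conversation: 3 utterances, 4 acoustic dims, 2 lexical dims.
acoustic = [[0.0] * 4 for _ in range(3)]
lexical = [[1.0] * 2 for _ in range(3)]
fused_features = feature_level_fusion(acoustic, lexical)

# Hypothetical per-utterance probabilities over 6 emotion classes.
acoustic_probs = [[0.6, 0.1, 0.1, 0.1, 0.05, 0.05],
                  [0.1, 0.5, 0.1, 0.1, 0.1, 0.1]]
lexical_probs = [[0.2, 0.5, 0.1, 0.1, 0.05, 0.05],
                 [0.1, 0.2, 0.5, 0.1, 0.05, 0.05]]
labels = decision_level_fusion(acoustic_probs, lexical_probs)
```

In the feature-level case the fused vectors would be passed to one model, while in the decision-level case each modality keeps its own model and only the outputs are merged.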
format |
Final Project |
author |
Nurul Izzah Adma, Aisyah |
title |
CONVERSATIONAL SPEECH EMOTION RECOGNITION FROM INDONESIAN SPOKEN LANGUAGE USING RECURRENT NEURAL NETWORK BASED MODEL |
url |
https://digilib.itb.ac.id/gdl/view/56346 |