Speech emotion recognition using deep neural networks on multilingual databases
Main Authors: | , , , , , |
---|---|
Format: | Book Chapter |
Language: | English |
Published: | Springer, 2021 |
Subjects: | |
Online Access: | http://irep.iium.edu.my/88878/1/Paper_110.pdf http://irep.iium.edu.my/88878/7/88878_Speech%20emotion%20recognition.pdf http://irep.iium.edu.my/88878/13/88878_Speech%20emotion%20recognition%20using%20deep%20neural_SCOPUS.pdf http://irep.iium.edu.my/88878/ https://link.springer.com/book/10.1007%2F978-3-030-70917-4 |
Institution: | Universiti Islam Antarabangsa Malaysia |
Summary: | With the research community's ever-increasing interest in human-computer interaction (HCI), systems that deduce and identify the emotional aspects of a speech signal have emerged as a hot research topic. Speech Emotion Recognition (SER) has made automated and intelligent analysis of human utterances a reality. Typically, an SER system extracts features from speech signals, such as pitch frequency, formant features, and energy-related and spectral features, and then applies a classifier to identify the underlying emotion. The success of an SER system depends chiefly on selecting appropriate emotional feature extraction techniques. In this paper, the Mel-frequency Cepstral Coefficient (MFCC) and Teager Energy Operator (TEO), along with a newly proposed feature fusion of MFCC and TEO referred to as Teager-MFCC (TMFCC), are examined over a multilingual database consisting of English, German and Hindi. Deep Neural Networks are used to classify the four emotions considered: happy, sad, angry, and neutral. Evaluation results show that the proposed TMFCC fusion, with a recognition rate of 92.7%, outperforms both TEO and MFCC, which achieve recognition rates of 88.5% and 90.0%, respectively. |
---|
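The summary names the Teager Energy Operator (TEO) without defining it. As a minimal sketch, the standard discrete form is ψ[n] = x[n]² − x[n−1]·x[n+1]. The function name `teager_energy` and the test tone below are illustrative assumptions, not taken from the chapter, and the summary does not state how the authors combine TEO with MFCC to form TMFCC, so that step is not shown.

```python
import math


def teager_energy(x):
    """Standard discrete Teager Energy Operator.

    psi[n] = x[n]^2 - x[n-1] * x[n+1]

    Returns len(x) - 2 values, because the first and last samples
    lack one of the two neighbours the operator needs.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


# Hypothetical test tone (not data from the chapter): A * cos(omega * n).
amplitude, omega = 0.5, 0.3
tone = [amplitude * math.cos(omega * n) for n in range(64)]

psi = teager_energy(tone)

# For a pure sinusoid the operator is constant: A^2 * sin^2(omega).
expected = (amplitude * math.sin(omega)) ** 2
```

For a pure tone every entry of `psi` equals `expected` up to floating-point error, which is why the operator is often described as tracking both the amplitude and the frequency of a signal.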