Enhancing stress speech classification through the fusion of emotional datasets utilizing MFCCs with CNN

Bibliographic Details
Main Authors: Zainal, Nur Aishah; Asnawi, Ani Liza
Format: Proceeding Paper
Language: English
Published: IEEE 2024
Online Access: http://irep.iium.edu.my/114504/7/114504_Enhancing%20Stress%20Speech%20Classification%20Through%20the%20Fusion%20of%20Emotional%20Datasets%20Utilizing%20MFCCs%20with%20CNN.pdf
http://irep.iium.edu.my/114504/1/Enhancing%20Stress%20Speech%20Classification%20Through%20the%20Fusion%20of%20Emotional%20Datasets%20Utilizing%20MFCCs%20with%20CNN.pdf
http://irep.iium.edu.my/114504/
https://ieeexplore.ieee.org/abstract/document/10652270
Institution: Universiti Islam Antarabangsa Malaysia
Description
Summary: Stress classification involves categorizing an individual's perceived stress state. Analyzing human speech is one approach; because it is non-invasive, it offers advantages over traditional methods that require intrusive procedures. Two main types of datasets are currently used in this research field: scripted and unscripted. Scripted datasets feature staged performances by actors depicting emotions, while unscripted datasets capture natural reactions, though acquiring them is challenging and requires collaboration with experts. Convolutional Neural Networks (CNNs) have been favored for stress classification, but they require a substantial number of data points per class. Traditional machine learning classifiers have shown promise with smaller datasets, though their accuracy often falls short. This study fused two scripted datasets, RAVDESS and TESS, to enhance stress classification. Using Mel-frequency Cepstral Coefficients (MFCCs) with a CNN proved vital in highlighting stress attributes, yielding 94.5% classification accuracy and surpassing previous studies.
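Illustrative note: the dataset fusion step mentioned in the summary could be sketched roughly as below. This is a minimal sketch and not the authors' pipeline. It assumes the publicly distributed file naming of the two corpora (RAVDESS names with seven hyphen-separated numeric fields, the third being the emotion code; TESS names ending in an emotion word). The grouping of emotions into "stress" and "non-stress" (STRESS_EMOTIONS) and the helper names emotion_from_filename and fuse are hypothetical; the record does not state the mapping the study used.

```python
from pathlib import PurePath

# RAVDESS: third hyphen-separated field of the file name is the emotion code.
RAVDESS_EMOTIONS = {
    "01": "neutral", "02": "calm", "03": "happy", "04": "sad",
    "05": "angry", "06": "fearful", "07": "disgust", "08": "surprised",
}

# TESS: file names end in an emotion word, e.g. "OAF_back_angry.wav".
# The values normalize TESS names to the RAVDESS vocabulary.
TESS_EMOTIONS = {
    "neutral": "neutral", "happy": "happy", "sad": "sad", "angry": "angry",
    "fear": "fearful", "disgust": "disgust", "ps": "surprised",
}

# ASSUMPTION: hypothetical grouping for illustration only; the study's
# actual emotion-to-stress mapping is not given in this record.
STRESS_EMOTIONS = {"angry", "fearful", "sad", "disgust"}


def emotion_from_filename(name):
    """Return a normalized emotion name, or None if the name is unrecognized."""
    stem = PurePath(name).stem
    fields = stem.split("-")
    if len(fields) == 7 and fields[2] in RAVDESS_EMOTIONS:
        return RAVDESS_EMOTIONS[fields[2]]
    suffix = stem.rsplit("_", 1)[-1].lower()
    return TESS_EMOTIONS.get(suffix)


def fuse(filenames):
    """Merge files from both corpora into one list of (file, emotion, label)."""
    fused = []
    for name in filenames:
        emotion = emotion_from_filename(name)
        if emotion is None:
            continue  # skip files that match neither naming scheme
        label = "stress" if emotion in STRESS_EMOTIONS else "non-stress"
        fused.append((name, emotion, label))
    return fused


if __name__ == "__main__":
    sample = ["03-01-06-01-02-01-12.wav", "OAF_back_angry.wav", "YAF_dog_happy.wav"]
    for row in fuse(sample):
        print(row)
```

In a setup like this, the fused list would then feed a feature-extraction step (MFCCs) and a CNN classifier, with both corpora sharing one label vocabulary.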