Speech-based depression detection for Bahasa Malaysia female speakers using deep learning
Main Authors: | , , , |
---|---|
Format: | Article |
Language: | English |
Published: | Penerbit UTM Press, 2021 |
Subjects: | |
Online Access: | http://irep.iium.edu.my/94038/7/94038_Speech-based%20depression%20detection%20for%20Bahasa%20Malaysia.pdf http://irep.iium.edu.my/94038/ https://elektrika.utm.my/index.php/ELEKTRIKA_Journal/article/view/318/195 |
Institution: | Universiti Islam Antarabangsa Malaysia |
Summary: | Depression is a highly prevalent mental disorder with negative effects on individuals, society, and the economy. Traditional clinical diagnosis methods are subjective and require extensive expert involvement. Furthermore, the severe shortage of psychiatrists relative to the population in Malaysia delays patients in seeking treatment and contributes to poor follow-up compliance. The social stigma of visiting psychiatric clinics also prevents patients from seeking early treatment. Speech is a promising biometric for automatic depression detection because it is fast, convenient, and non-invasive. This research attempts to develop an end-to-end deep learning model to classify depression from female Bahasa Malaysia speech using our dataset. Depression status was identified by the Patient Health Questionnaire 9, the Malay Beck Depression Inventory-II, and subjects’ declaration of a Major Depressive Disorder diagnosis by a trained clinician. The dataset consists of 110 female participants. We provide a detailed implementation of deep learning models using raw audio input. Multiple combinations of speech types were analyzed using various deep neural network models. After hyperparameter tuning, raw audio input from the combination of female read and spontaneous speech, used with the AttCRNN model, achieved an accuracy of 91%. |