Speech emotion recognition using WaveNet

Speech emotion recognition is known to be a challenging and complex task for machine learning models. Two challenges arise in speech emotion recognition: 1) human emotions are hard to distinguish, and 2) emotion can be detected only at specific moments in an utter...

Bibliographic Details
Main Author:	Nurul Sabrina Mohammed Riduwan
Other Authors:	Jagath C Rajapakse
Format:	Final Year Project
Language:	English
Published:	Nanyang Technological University 2022
Subjects:	Engineering::Computer science and engineering
Online Access:	https://hdl.handle.net/10356/156592

id	sg-ntu-dr.10356-156592
record_format	dspace
spelling	sg-ntu-dr.10356-156592 2022-04-21T00:28:52Z Speech emotion recognition using WaveNet Nurul Sabrina Mohammed Riduwan Jagath C Rajapakse School of Computer Science and Engineering ASJagath@ntu.edu.sg Engineering::Computer science and engineering Bachelor of Engineering (Computer Science) 2022 Final Year Project (FYP) Nurul Sabrina Mohammed Riduwan (2022). Speech emotion recognition using WaveNet. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/156592 en SCSE21-0421 application/pdf Nanyang Technological University
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	Engineering::Computer science and engineering
description	Speech emotion recognition is known to be a challenging and complex task for machine learning models. Two challenges arise in speech emotion recognition: 1) human emotions are hard to distinguish, and 2) emotion can be detected only at specific moments in an utterance. This paper proposes a Speech Emotion Recognition (SER) architecture inspired by the WaveNet architecture. The architecture relies neither on tedious pre-processing nor on recurrent layers. The novelty of our approach lies in using both speech waveforms and audio features as inputs, causal dilated convolutions to capture temporal dependencies, and a self-attention mechanism. Self-attention lets the inputs interact with each other, so the model attends closely to the valuable parts of the input and learns the connections between them. We show that our model improves SER performance on the EMO-DB dataset over existing baseline models. Index Terms: speech emotion recognition, self-attention, deep learning, computational paralinguistics
author2	Jagath C Rajapakse
author_facet	Jagath C Rajapakse Nurul Sabrina Mohammed Riduwan
format	Final Year Project
author	Nurul Sabrina Mohammed Riduwan
author_sort	Nurul Sabrina Mohammed Riduwan
title	Speech emotion recognition using WaveNet
publisher	Nanyang Technological University
publishDate	2022
url	https://hdl.handle.net/10356/156592
_version_	1731235748223385600
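
The abstract names causal dilated convolutions as the mechanism for capturing temporal dependencies. The record does not give the thesis's layer configuration, so the following is only a minimal sketch of the general technique in plain Python; the function names, the kernel size of 2, and the dilation schedule 1, 2, 4, 8 are illustrative assumptions, not values taken from the project.

```python
def causal_dilated_conv1d(x, weights, dilation):
    """Compute y[t] = sum_k weights[k] * x[t - k * dilation].

    Samples before t = 0 are treated as zero, so each output depends
    only on the current and past inputs (the "causal" property).
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:  # implicit left zero-padding
                acc += w * x[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Number of input samples visible to one output of the stacked layers."""
    return 1 + (kernel_size - 1) * sum(dilations)


# Hypothetical toy signal and filter, chosen only for illustration.
signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(causal_dilated_conv1d(signal, [1.0, 1.0], dilation=2))
# -> [1.0, 2.0, 4.0, 6.0, 8.0, 10.0]
print(receptive_field(2, [1, 2, 4, 8]))
# -> 16
```

With a dilation of 2, each output adds the current sample to the one two steps earlier. Stacking layers with doubling dilations grows the receptive field exponentially with depth, which is the usual reason for choosing this design over recurrent layers.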
