Multimodal emotion recognition system for spontaneous vocal and facial signals: SMERFS

Human-computer interaction is moving towards giving computers the ability to adapt and give feedback according to a user's emotions. Initial research on multimodal emotion recognition shows that combining vocal and facial signals performs better than using physiological signal...

Bibliographic Details
Main Authors: Dy, Marc Lanze Ivan C., Espinoza, Ivan Vener L., Go, Paul Patrick V., Mendez, Charles Martin M.
Format: text
Language: English
Published: Animo Repository 2010
Subjects: Human-computer interaction; Pattern recognition systems; Computer vision; Artificial intelligence
Online Access:https://animorepository.dlsu.edu.ph/etd_bachelors/14653
Institution: De La Salle University
id oai:animorepository.dlsu.edu.ph:etd_bachelors-15295
record_format eprints
institution De La Salle University
building De La Salle University Library
continent Asia
country Philippines
content_provider De La Salle University Library
collection DLSU Institutional Repository
language English
topic Human-computer interaction
Pattern recognition systems
Computer vision
Artificial intelligence
description Human-computer interaction is moving towards giving computers the ability to adapt and give feedback according to a user's emotions. Initial research on multimodal emotion recognition shows that combining vocal and facial signals performs better than using physiological signals. In addition, most emotion corpora used in both unimodal and multimodal systems were modeled on acted data, in which actors tend to exaggerate emotions. This study improves on the accuracy of single-modality systems by developing a multimodal emotion recognition system based on vocal and facial expressions and a spontaneous emotion corpus. The corpus used is FilMED2, which contains spontaneous clips from reality television shows. The clips carry discrete emotion labels restricted to happiness, sadness, anger, fear, and neutral. The system extracts facial feature points and prosodic features, including pitch and energy, which are passed to machine learning for classification. The classifier is an SVM, first tested on each modality for both the acted and the spontaneous corpus; the acted corpus yielded higher results than the spontaneous corpus for both modalities. The two modalities were then combined using decision-level fusion. Using the face alone gave 60% accuracy, while the voice alone gave 32% accuracy. Combining both results with a weight distribution of 75% face and 25% voice gave an accuracy of 80%.
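The decision-level fusion step described above can be illustrated with a minimal sketch. It assumes each unimodal SVM yields per-class scores on a comparable scale; only the five emotion labels and the 75% face / 25% voice weighting come from the abstract, while the function and variable names here are hypothetical.

```python
# Minimal sketch of weighted decision-level fusion of two modalities.
# Assumptions: face/voice scores are per-class values on a comparable scale;
# only the 75%/25% split and the label set come from the thesis abstract.

EMOTIONS = ["happiness", "sadness", "anger", "fear", "neutral"]

def fuse_decisions(face_scores, voice_scores, w_face=0.75, w_voice=0.25):
    """Combine per-class scores from the two modalities by weighted sum
    and return the winning label together with the fused scores."""
    fused = {
        emotion: w_face * face_scores[emotion] + w_voice * voice_scores[emotion]
        for emotion in EMOTIONS
    }
    return max(fused, key=fused.get), fused

# Example with made-up scores (e.g., SVM decision values rescaled to [0, 1]):
face = {"happiness": 0.70, "sadness": 0.10, "anger": 0.10, "fear": 0.05, "neutral": 0.05}
voice = {"happiness": 0.30, "sadness": 0.40, "anger": 0.10, "fear": 0.10, "neutral": 0.10}
label, scores = fuse_decisions(face, voice)
print(label)  # happiness (the face modality dominates at a 75/25 weighting)
```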
format text
author Dy, Marc Lanze Ivan C.
Espinoza, Ivan Vener L.
Go, Paul Patrick V.
Mendez, Charles Martin M.
title Multimodal emotion recognition system for spontaneous vocal and facial signals: SMERFS
publisher Animo Repository
publishDate 2010
url https://animorepository.dlsu.edu.ph/etd_bachelors/14653