Real-time multimodal affect recognition in laughter episodes
Main Author: Santos, Jose Miguel
Format: text
Language: English
Published: Animo Repository, 2013
Subjects: Emotion recognition; Laughter
Online Access: https://animorepository.dlsu.edu.ph/etd_masteral/4383
Institution: De La Salle University

id: oai:animorepository.dlsu.edu.ph:etd_masteral-11221
record_format: eprints
building: De La Salle University Library
continent: Asia
country: Philippines
content_provider: De La Salle University Library
collection: DLSU Institutional Repository

description:
Emotion recognition is a widely studied subject due to its importance in human interaction and decision making. Recognizing emotion in laughter is particularly important because laughter can signal non-basic affective states such as distress, anxiety, and boredom. Existing systems, however, are unable to classify the emotion of laughter in real time. This research proposes a real-time multimodal affect recognition system for laughter episodes that uses facial expressions and voiced laughter as modalities.
The system takes a video stream as input, either from a web camera with an attached microphone for audio or from a video file. Because laughter takes place over a period of time rather than in a single frame, the system segments the stream into windows 1.62 seconds in length. Within each window, image and audio data are extracted, and the facial action units (AUs) at the apex of the window are detected. At the end of each window, the pitch and MFCC values of the audio collected within the window are computed, and decision-level fusion is applied to the audio and face features. The resulting features are then passed to the emotion recognition model, which produces the final valence and arousal values for the window.
The emotion recognition model achieved correlation coefficients of 0.68 for valence and 0.61 for arousal on the Semaine corpus, and 0.75 for valence and 0.83 for arousal on the Pinoy Laughter 2 corpus. The overhead of the whole emotion recognition process is 610.98 ms; this overhead is difficult to eliminate completely because of the number of processing steps required to perform emotion recognition.
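
The pipeline described in the abstract (fixed 1.62-second windows, AU detection at the window apex, pitch and MFCC extraction, decision-level fusion, and a per-window valence/arousal output) can be sketched in code. The Python sketch below is illustrative only: the window length and feature types come from the abstract, while the frame rate, audio sample rate, the choice of the middle frame as the apex, the averaging fusion rule, and every helper function are assumptions made here, not the thesis's actual implementation.

```python
# Minimal sketch of the windowed multimodal pipeline described in the abstract.
# Assumptions (not taken from the thesis): the 25 fps frame rate, the 16 kHz
# audio rate, the middle frame standing in for the laughter apex, the simple
# averaging fusion rule, and all helper functions below are placeholders.
import numpy as np

WINDOW_SECONDS = 1.62   # window length stated in the abstract
VIDEO_FPS = 25          # assumed video frame rate
AUDIO_SR = 16_000       # assumed audio sample rate


def detect_action_units(apex_frame):
    """Placeholder AU detector: returns a vector of facial action unit activations."""
    return np.zeros(17)  # dummy output so the sketch runs end to end


def extract_pitch_and_mfcc(audio_window):
    """Placeholder audio front end: returns pitch and MFCC statistics for one window."""
    return np.zeros(14)  # e.g. 1 pitch value + 13 MFCCs (dummy output)


def predict_from_face(au_vector):
    """Placeholder face-only regressor: AU vector -> (valence, arousal)."""
    return np.zeros(2)


def predict_from_audio(audio_features):
    """Placeholder audio-only regressor: pitch/MFCC features -> (valence, arousal)."""
    return np.zeros(2)


def process_window(frames, audio_window):
    """Run one 1.62 s window: detect AUs at the apex, extract audio features,
    then fuse the two per-modality predictions at the decision level."""
    apex_frame = frames[len(frames) // 2]        # middle frame as a stand-in for the apex
    au_vector = detect_action_units(apex_frame)
    audio_features = extract_pitch_and_mfcc(audio_window)

    face_pred = predict_from_face(au_vector)
    audio_pred = predict_from_audio(audio_features)

    # Decision-level fusion: combine the two (valence, arousal) predictions.
    # A plain average is assumed here; the abstract does not state the fusion rule.
    valence, arousal = (face_pred + audio_pred) / 2.0
    return float(valence), float(arousal)


def run(stream_frames, stream_audio):
    """Segment the stream into consecutive 1.62 s windows and emit one
    (valence, arousal) pair per window."""
    frames_per_window = int(WINDOW_SECONDS * VIDEO_FPS)    # 40 frames
    samples_per_window = int(WINDOW_SECONDS * AUDIO_SR)    # 25920 samples
    results = []
    for start in range(0, len(stream_frames) - frames_per_window + 1, frames_per_window):
        frames = stream_frames[start:start + frames_per_window]
        a0 = int(start / VIDEO_FPS * AUDIO_SR)
        audio = stream_audio[a0:a0 + samples_per_window]
        results.append(process_window(frames, audio))
    return results
```

As a usage note, ten seconds of 25 fps frames and 16 kHz audio passed to run() would produce six windows and six (valence, arousal) pairs; in a real system the per-window cost (the roughly 611 ms overhead reported in the abstract) would come from actual AU detection, pitch/MFCC extraction, and model inference rather than from these placeholders.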