Real-time multimodal affect recognition in laughter episodes

Emotion recognition is a widely studied subject due to its importance in human interaction and decision making. The recognition of emotion in laughter is particularly important, as laughter can reveal non-basic affective states such as distress, anxiety, and boredom. However, existing systems are unable to classify the emotion of laughter in real time. This research proposes a real-time multimodal affect recognition system for laughter episodes, using facial expressions and voiced laughter as modalities. The system takes as input a video stream, which can come either from a web camera with an attached microphone for audio or from a video file. As laughter takes place over a period of time rather than in a single frame, the system segments the stream into windows 1.62 seconds in length. Within each window, image and audio data are extracted, and the facial action units (AUs) at the apex of the window are detected. At the end of each window, the pitch and MFCC values of the audio data collected within the window are computed, and decision-level fusion is applied to the audio and face features. The fused results are then passed to the emotion recognition model, which produces the final valence and arousal values for the window. The emotion recognition model achieved a correlation coefficient of 0.68 for valence and 0.61 for arousal on the Semaine corpus, and 0.75 for valence and 0.83 for arousal on the Pinoy Laughter 2 corpus. The overhead of the whole emotion recognition process is 610.98 ms; this overhead is difficult to eliminate completely because of the number of processing steps required to perform emotion recognition.
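The pipeline described in the abstract segments the input into 1.62-second windows and extracts pitch and MFCC features from the audio in each window. A minimal sketch of that audio step follows, assuming Python with librosa; the library, the 16 kHz sampling rate, the YIN pitch estimator, the 65–600 Hz pitch range, and the 13-coefficient MFCC summary are illustrative assumptions, not the thesis' actual implementation.

```python
# Sketch of per-window audio feature extraction (pitch + MFCC), assuming
# librosa. The window length (1.62 s) comes from the abstract; everything
# else (sampling rate, pitch range, number of MFCCs) is an illustrative choice.
import numpy as np
import librosa

WINDOW_SEC = 1.62  # window length reported in the abstract

def audio_features(window: np.ndarray, sr: int) -> np.ndarray:
    """Return pitch and MFCC statistics for one analysis window."""
    # Fundamental frequency (pitch) track via the YIN estimator.
    f0 = librosa.yin(window, fmin=65.0, fmax=600.0, sr=sr)
    # 13 MFCCs per frame, summarised over the window with mean and std.
    mfcc = librosa.feature.mfcc(y=window, sr=sr, n_mfcc=13)
    return np.concatenate([
        [np.nanmean(f0), np.nanstd(f0)],      # pitch statistics
        mfcc.mean(axis=1), mfcc.std(axis=1),  # MFCC statistics
    ])

# Usage: cut the audio stream into consecutive 1.62 s windows and featurise each.
# y, sr = librosa.load("laughter_clip.wav", sr=16000)
# step = int(WINDOW_SEC * sr)
# feats = [audio_features(y[i:i + step], sr) for i in range(0, len(y) - step + 1, step)]
```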

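For the later steps, the abstract reports decision-level fusion of the audio and face results followed by valence and arousal prediction, evaluated with correlation coefficients. A common reading of decision-level fusion, sketched below, combines the per-modality valence/arousal predictions with fixed weights; the equal weights and the Pearson form of the correlation are assumptions for illustration, not the configuration actually used in the thesis.

```python
# Sketch of decision-level fusion and correlation-coefficient evaluation.
# The equal fusion weights and the Pearson correlation are assumptions;
# the abstract does not specify them.
import numpy as np
from scipy.stats import pearsonr

def fuse_decisions(face_pred, audio_pred, w_face=0.5, w_audio=0.5):
    """Weighted fusion of per-window (valence, arousal) predictions."""
    face_pred, audio_pred = np.asarray(face_pred), np.asarray(audio_pred)
    return w_face * face_pred + w_audio * audio_pred

def correlation_report(pred, gold):
    """Correlation coefficient per affect dimension (columns: valence, arousal)."""
    pred, gold = np.asarray(pred), np.asarray(gold)
    return {
        "valence_cc": pearsonr(pred[:, 0], gold[:, 0])[0],
        "arousal_cc": pearsonr(pred[:, 1], gold[:, 1])[0],
    }

# Usage with per-window predictions and gold annotations:
# fused = fuse_decisions(face_model_outputs, audio_model_outputs)
# print(correlation_report(fused, gold_labels))
```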

Bibliographic Details
Main Author: Santos, Jose Miguel
Format: text
Language: English
Published: Animo Repository 2013
Subjects: Emotion recognition, Laughter
Online Access:https://animorepository.dlsu.edu.ph/etd_masteral/4383
Institution: De La Salle University
Language: English
id oai:animorepository.dlsu.edu.ph:etd_masteral-11221
record_format eprints
spelling oai:animorepository.dlsu.edu.ph:etd_masteral-11221 2021-01-18T07:45:54Z Real-time multimodal affect recognition in laughter episodes Santos, Jose Miguel Emotion recognition is a widely studied subject due to its importance in human interaction and decision making. The recognition of emotion in laughter is particularly important, as laughter can reveal non-basic affective states such as distress, anxiety, and boredom. However, existing systems are unable to classify the emotion of laughter in real time. This research proposes a real-time multimodal affect recognition system for laughter episodes, using facial expressions and voiced laughter as modalities. The system takes as input a video stream, which can come either from a web camera with an attached microphone for audio or from a video file. As laughter takes place over a period of time rather than in a single frame, the system segments the stream into windows 1.62 seconds in length. Within each window, image and audio data are extracted, and the facial action units (AUs) at the apex of the window are detected. At the end of each window, the pitch and MFCC values of the audio data collected within the window are computed, and decision-level fusion is applied to the audio and face features. The fused results are then passed to the emotion recognition model, which produces the final valence and arousal values for the window. The emotion recognition model achieved a correlation coefficient of 0.68 for valence and 0.61 for arousal on the Semaine corpus, and 0.75 for valence and 0.83 for arousal on the Pinoy Laughter 2 corpus. The overhead of the whole emotion recognition process is 610.98 ms; this overhead is difficult to eliminate completely because of the number of processing steps required to perform emotion recognition. 2013-01-01T08:00:00Z text https://animorepository.dlsu.edu.ph/etd_masteral/4383 Master's Theses English Animo Repository Emotion recognition Laughter
institution De La Salle University
building De La Salle University Library
continent Asia
country Philippines
content_provider De La Salle University Library
collection DLSU Institutional Repository
language English
topic Emotion recognition
Laughter
spellingShingle Emotion recognition
Laughter
Santos, Jose Miguel
Real-time multimodal affect recognition in laughter episodes
description Emotion recognition is a widely studied subject due to its importance in human interaction and decision making. The recognition of emotion in laughter is particularly important, as laughter can reveal non-basic affective states such as distress, anxiety, and boredom. However, existing systems are unable to classify the emotion of laughter in real time. This research proposes a real-time multimodal affect recognition system for laughter episodes, using facial expressions and voiced laughter as modalities. The system takes as input a video stream, which can come either from a web camera with an attached microphone for audio or from a video file. As laughter takes place over a period of time rather than in a single frame, the system segments the stream into windows 1.62 seconds in length. Within each window, image and audio data are extracted, and the facial action units (AUs) at the apex of the window are detected. At the end of each window, the pitch and MFCC values of the audio data collected within the window are computed, and decision-level fusion is applied to the audio and face features. The fused results are then passed to the emotion recognition model, which produces the final valence and arousal values for the window. The emotion recognition model achieved a correlation coefficient of 0.68 for valence and 0.61 for arousal on the Semaine corpus, and 0.75 for valence and 0.83 for arousal on the Pinoy Laughter 2 corpus. The overhead of the whole emotion recognition process is 610.98 ms; this overhead is difficult to eliminate completely because of the number of processing steps required to perform emotion recognition.
format text
author Santos, Jose Miguel
author_facet Santos, Jose Miguel
author_sort Santos, Jose Miguel
title Real-time multimodal affect recognition in laughter episodes
title_short Real-time multimodal affect recognition in laughter episodes
title_full Real-time multimodal affect recognition in laughter episodes
title_fullStr Real-time multimodal affect recognition in laughter episodes
title_full_unstemmed Real-time multimodal affect recognition in laughter episodes
title_sort real-time multimodal affect recognition in laughter episodes
publisher Animo Repository
publishDate 2013
url https://animorepository.dlsu.edu.ph/etd_masteral/4383
_version_ 1772834470449119232