Robust representation and recognition of facial emotions

Facial emotion detection under natural conditions is an interesting topic with a wide range of potential applications, such as human-computer interaction. Although research in this field has made significant progress, challenges remain in real-world unconstrained situations. One essenti...

Bibliographic Details
Main Author:	Shojaeilangari, Seyedehsamaneh
Other Authors:	Teoh Eam Khwang
Format:	Theses and Dissertations
Language:	English
Published:	2015
Subjects:	DRNTU::Engineering::Electrical and electronic engineering::Control and instrumentation
Online Access:	https://hdl.handle.net/10356/62922
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-62922
record_format	dspace
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	DRNTU::Engineering::Electrical and electronic engineering::Control and instrumentation
spellingShingle	DRNTU::Engineering::Electrical and electronic engineering::Control and instrumentation Shojaeilangari, Seyedehsamaneh Robust representation and recognition of facial emotions
description	Facial emotion detection under natural conditions is an interesting topic with a wide range of potential applications, such as human-computer interaction. Although research in this field has made significant progress, challenges remain in real-world unconstrained situations. One essential challenge is to find pose-invariant spatio-temporal volumetric features for analyzing video sequences efficiently. Another important issue is how to deal with noisy and imperfect data recorded in uncontrolled environments, which are affected by illumination variations, partial occlusion, and head movements. The focus of this research is to develop a robust system for recognizing facial expressions as dynamic events in natural situations. Two strategies are proposed to address the challenges of uncontrolled environments. Robust representation framework: we propose a novel spatio-temporal descriptor based on Optical Flow (OF) components that is highly distinctive and also pose-invariant. Robust recognition framework: we explore the effectiveness of sparse representation obtained by supervised learning of a set of basis vectors (a dictionary). Extreme Sparse Learning (ESL) is proposed to jointly learn a dictionary and a nonlinear classification model in order to robustly detect facial expressions in real-world natural situations. The proposed approach combines the discriminative power of the Extreme Learning Machine (ELM) with the reconstruction property of sparse representation to deal with noisy signals and imperfect data recorded in natural settings.
Since facial feature extraction performance is highly dependent on facial pose, we propose a novel spatio-temporal descriptor that is robust to facial pose variations. However, feature encoding may still fail in the presence of extreme head pose variations, where some parts of the face are not visible in the recorded images. To address this problem, and also to deal with illumination variations and occlusion, we follow the idea of sparse representation, in which noisy data can be reconstructed from the clean data provided by the dictionary. While sparse representation can enhance noisy data using a dictionary learned from clean data, this alone is not sufficient, because the end goal is to correctly recognize the facial expression. In a sparse-representation-based classification task, the desired dictionary should have both representational ability and discriminative power. Since separating classifier training from dictionary learning may cause the learned dictionary to be sub-optimal for the classification task, we propose to jointly learn a dictionary and a classification model. In other words, in contrast with most existing schemes, which update the dictionary and classifier parameters alternately by iteratively solving each sub-problem, we propose to solve them simultaneously. This joint dictionary learning and classifier training can be expected to yield a dictionary that is both reconstructive and discriminative, as a robust recognition system requires. To the best of our knowledge, this is the only work that attempts to simultaneously learn the sparse representation of the signal and train a nonlinear classifier that is discriminative for sparse codes. The proposed method jointly learns a single dictionary and an optimal nonlinear classifier.
We have performed extensive experiments on both acted and spontaneous emotion databases to evaluate the effectiveness of the proposed feature extraction and classification schemes under different scenarios. Our results demonstrate the robustness of the proposed emotion recognition framework, especially in challenging scenarios that involve illumination changes, occlusion, and head pose variations.
author2	Teoh Eam Khwang
author_facet	Teoh Eam Khwang Shojaeilangari, Seyedehsamaneh
format	Theses and Dissertations
author	Shojaeilangari, Seyedehsamaneh
author_sort	Shojaeilangari, Seyedehsamaneh
title	Robust representation and recognition of facial emotions
title_short	Robust representation and recognition of facial emotions
title_full	Robust representation and recognition of facial emotions
title_fullStr	Robust representation and recognition of facial emotions
title_full_unstemmed	Robust representation and recognition of facial emotions
title_sort	robust representation and recognition of facial emotions
publishDate	2015
url	https://hdl.handle.net/10356/62922
_version_	1772827412631912448
spelling	sg-ntu-dr.10356-62922 2023-07-04T16:31:49Z Robust representation and recognition of facial emotions Shojaeilangari, Seyedehsamaneh Teoh Eam Khwang Yau Wei Yun School of Electrical and Electronic Engineering DRNTU::Engineering::Electrical and electronic engineering::Control and instrumentation DOCTOR OF PHILOSOPHY (EEE) 2015-05-04T02:18:13Z 2015-05-04T02:18:13Z 2014 2014 Thesis Shojaeilangari, S. (2014). Robust representation and recognition of facial emotions. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/62922 10.32657/10356/62922 en 139 p. application/pdf

Similar Items