Visual recognition using deep learning (emotion recognition using artificial intelligence)


Bibliographic Details
Main Author: Li, Jian
Other Authors: Yap Kim Hui
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2022
Subjects:
Online Access:https://hdl.handle.net/10356/157946
Institution: Nanyang Technological University
Description
Summary: Emotion recognition, or facial expression recognition (FER), is a challenging task because people express emotions differently depending on age, gender and other factors, especially in unconstrained environments where complex backgrounds, head pose variation and occlusion hinder the network from learning useful features. Classical approaches to FER rely on hand-crafted features, which attain satisfactory recognition rates on lab-controlled datasets where the face is centred and occupies almost the whole image; when the data become noisy, however, such methods cannot accurately predict the expression. Recently, deep learning methods have become popular, achieving state-of-the-art performance even on challenging in-the-wild FER datasets such as FERPlus. In this project, a FER model is proposed that uses ResNet18 as the backbone together with a local attention module based on the Convolutional Block Attention Module (CBAM). The low-level features extracted after the first convolution block of ResNet18 are passed to the local attention module, where important local features can be learnt. By combining the output feature maps from ResNet18 and the local attention module, both holistic and local features are extracted and used for expression classification. Experiments show that the proposed model obtains reasonable recognition rates: 84.84% on FERPlus, 86.92% on RAF-DB and 54.52% on SFEW.
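The architecture described above, in which low-level features from an early ResNet18 stage feed a CBAM-based local attention branch whose output is combined with the holistic backbone features for classification, can be sketched in PyTorch. This is a minimal illustration, not the project's actual implementation: the stem and backbone below are small stand-ins for ResNet18 stages, and the class names, layer sizes and fusion-by-concatenation are assumptions.

```python
import torch
import torch.nn as nn


class CBAM(nn.Module):
    """Convolutional Block Attention Module:
    channel attention followed by spatial attention."""

    def __init__(self, channels, reduction=16, spatial_kernel=7):
        super().__init__()
        # Channel attention: shared MLP over global avg- and max-pooled descriptors
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )
        # Spatial attention: conv over stacked channel-wise avg and max maps
        self.spatial = nn.Conv2d(2, 1, spatial_kernel, padding=spatial_kernel // 2)

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))
        mx = self.mlp(x.amax(dim=(2, 3)))
        x = x * torch.sigmoid(avg + mx).view(b, c, 1, 1)   # channel attention
        s = torch.cat([x.mean(dim=1, keepdim=True),
                       x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))          # spatial attention


class FERSketch(nn.Module):
    """Hypothetical FER model: a stand-in conv stem for ResNet18's first block,
    a CBAM-based local branch on the low-level features, and a global branch;
    pooled features from both branches are concatenated for classification
    (8 classes, as in FERPlus)."""

    def __init__(self, num_classes=8):
        super().__init__()
        self.stem = nn.Sequential(           # stands in for ResNet18 conv1 + block 1
            nn.Conv2d(3, 64, 7, stride=2, padding=3),
            nn.BatchNorm2d(64), nn.ReLU(inplace=True),
            nn.MaxPool2d(3, stride=2, padding=1),
        )
        self.local_attn = CBAM(64)           # local branch on low-level features
        self.backbone = nn.Sequential(       # stands in for the remaining ResNet18 stages
            nn.Conv2d(64, 128, 3, stride=2, padding=1),
            nn.BatchNorm2d(128), nn.ReLU(inplace=True),
        )
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(64 + 128, num_classes)

    def forward(self, x):
        low = self.stem(x)
        local = self.pool(self.local_attn(low)).flatten(1)    # local features
        holistic = self.pool(self.backbone(low)).flatten(1)   # holistic features
        return self.fc(torch.cat([local, holistic], dim=1))   # fused classification


logits = FERSketch()(torch.randn(2, 3, 224, 224))
print(logits.shape)  # torch.Size([2, 8])
```

The key design point mirrored here is that attention is applied to early, high-resolution feature maps, where local facial details (eyes, mouth corners) are still spatially distinct, while the deeper backbone provides the holistic representation.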