Visual recognition using deep learning (emotion recognition using artificial intelligence)
Format: | Final Year Project |
---|---|
Language: | English |
Published: | Nanyang Technological University, 2022 |
Online Access: | https://hdl.handle.net/10356/157946 |
Institution: | Nanyang Technological University |
Summary: Emotion recognition, or facial expression recognition (FER), is a difficult task because people express emotions differently depending on factors such as age and gender, especially in unconstrained environments where complex backgrounds, head pose variation and occlusion hinder the network from learning useful features. Classical approaches to FER rely on hand-crafted features, which attain satisfactory recognition rates on lab-controlled datasets where the face is centred in the image and occupies almost the whole frame. When the data become noisy, however, such methods fail to predict the expression accurately. More recently, deep learning methods have become popular and achieve state-of-the-art performance even on challenging in-the-wild FER datasets such as FERPlus.

In this project, a FER model is proposed that uses ResNet18 as the backbone together with a local attention module based on the Convolutional Block Attention Module (CBAM). The low-level features extracted after the first convolution block of ResNet18 are passed to the local attention module, where important local features can be learnt. By combining the output feature maps of ResNet18 and the local attention module, both holistic and local features are extracted and used for expression classification. In experiments, the proposed model achieved recognition rates of 84.84% on FERPlus, 86.92% on RAF-DB and 54.52% on SFEW.
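The local attention module described in the summary follows the CBAM pattern: channel attention computed from pooled per-channel statistics passed through a shared MLP, followed by spatial attention computed from channel-pooled maps. The sketch below is a minimal NumPy illustration of that pattern only, not the project's trained module: the MLP weights are random placeholders, and a simple elementwise gate stands in for CBAM's 7x7 convolution in the spatial branch.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, reduction=2, seed=0):
    """Scale each channel of feat (C, H, W) by an attention weight in (0, 1)."""
    C = feat.shape[0]
    avg = feat.mean(axis=(1, 2))          # global average pool -> (C,)
    mx = feat.max(axis=(1, 2))            # global max pool -> (C,)
    # Shared two-layer MLP; random weights for illustration only.
    rng = np.random.default_rng(seed)
    W1 = rng.standard_normal((C // reduction, C)) * 0.1
    W2 = rng.standard_normal((C, C // reduction)) * 0.1
    mlp = lambda v: W2 @ np.maximum(W1 @ v, 0.0)
    att = sigmoid(mlp(avg) + mlp(mx))     # (C,)
    return feat * att[:, None, None]

def spatial_attention(feat):
    """Scale each spatial location of feat (C, H, W) by a weight in (0, 1)."""
    avg = feat.mean(axis=0, keepdims=True)  # (1, H, W)
    mx = feat.max(axis=0, keepdims=True)    # (1, H, W)
    # CBAM applies a 7x7 conv over [avg; mx]; a simple sum is used here as a stand-in.
    att = sigmoid(avg + mx)
    return feat * att

def cbam_block(feat):
    """CBAM ordering: channel attention first, then spatial attention."""
    return spatial_attention(channel_attention(feat))
```

In the proposed model, a block like this would be applied to the low-level feature maps from ResNet18's first convolution block, and its output combined with the backbone's holistic features before classification.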