Facial expression recognition using deep learning

Bibliographic Details
Main Author: Wang, Xiao Yi
Other Authors: Yap Kim Hui
Format: Thesis-Master by Coursework
Language: English
Published: Nanyang Technological University 2024
Online Access: https://hdl.handle.net/10356/178999
Institution: Nanyang Technological University
Description
Summary: Facial expression recognition is a fundamental computer vision application, and many prior research studies have sought more robust recognition performance. Despite the success of Vision Transformer (ViT) in many other areas, applying it to facial expression recognition remains challenging: existing datasets are insufficient for training a robust Vision Transformer, which limits progress in this research area. In this dissertation, to train a high-performance Vision Transformer for facial expression recognition, three existing public datasets are merged into a new standard dataset with unified samples, and data augmentation and other methods bring the sample size under each label to 20,000. We also implement a Vision Transformer and train it on this augmented dataset. Under the same parameter settings, we compare ViT with four other baseline models and demonstrate its superiority. The optimal ViT configuration is obtained by analyzing and comparing training statistics across different configurations on our dataset, together with test results on a noisy test set. In addition, a real-time facial expression recognition prototype, using a web camera and a Single Shot Multibox Detector (SSD) face detection module, is implemented for real-world evaluation.
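
The summary does not spell out how each label is brought to 20,000 samples, so the following is only a minimal sketch of one plausible approach, not the thesis's actual procedure. It assumes the merged dataset is a list of `(item, label)` pairs; the function name `balance_by_label` and the `needs_augmentation` flag are hypothetical names introduced for illustration. Labels with too many samples are subsampled, and labels with too few keep all originals and draw extra copies that are flagged so a later step could apply augmentation (flips, crops, and so on) to them.

```python
import random
from collections import defaultdict


def balance_by_label(samples, target_per_label, seed=0):
    """Return (item, label, needs_augmentation) triples with exactly
    target_per_label entries per label. Hypothetical helper, for illustration."""
    rng = random.Random(seed)

    # Group items by their label.
    by_label = defaultdict(list)
    for item, label in samples:
        by_label[label].append(item)

    balanced = []
    for label, items in sorted(by_label.items()):
        if len(items) >= target_per_label:
            # Enough originals: subsample without replacement.
            chosen = rng.sample(items, target_per_label)
            balanced.extend((item, label, False) for item in chosen)
        else:
            # Keep every original, then draw extras (with replacement)
            # that a later step should augment before use.
            balanced.extend((item, label, False) for item in items)
            extras = rng.choices(items, k=target_per_label - len(items))
            balanced.extend((item, label, True) for item in extras)
    return balanced
```

Under these assumptions, calling `balance_by_label(merged_samples, 20000)` would yield 20,000 entries per label, with the flagged entries marking where augmentation is needed to avoid exact duplicates.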