Improving cross-domain face anti-spoofing under different source training data configurations

Bibliographic Details
Main Author: Cai, Rizhao
Other Authors: Alex Chichung Kot
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University 2024
Subjects:
Online Access: https://hdl.handle.net/10356/181095
Institution: Nanyang Technological University
Description
Summary: Facial Recognition (FR), popular for its user-friendly identity verification, faces significant risks from face spoofing attacks, also known as Presentation Attacks (PA). Developing face anti-spoofing systems is therefore crucial for deploying FR in security-sensitive environments. Current face anti-spoofing methods are effective in intra-domain evaluation, where the distribution of the test data matches that of the training data. In cross-domain evaluation, however, the distributions differ and performance declines significantly. This decline results from exposure to new-domain data from unknown distributions, unseen during training. The cross-domain problem is the main challenge in face anti-spoofing research. In this thesis, we study how to improve the cross-domain capability of face anti-spoofing models under different configurations of source training data. Depending on the actual development situation, the training data can be configured differently, and this thesis makes contributions in three configurations. 1) Single-domain data configuration: the training data comes from a limited set of distributions because the data-capturing environments and camera sensors are limited. This is a common situation because collecting large-scale data is not easy. Under this configuration, we contribute a novel framework that uses reinforcement learning to mine fine-grained local features and fuses the global and local features to obtain more discriminative features for anti-spoofing. 2) Multi-domain data configuration: training data from multiple source domains is available. These data come from more complex distributions because the capturing environments and camera sensors are more diverse. To simulate this configuration, we collect multiple datasets, each representing one source domain. Based on multiple data domains, we propose a meta-learning-based method to learn our proposed Meta Pattern, which serves as prior information that generalizes across different domains. 3) Continual-domain configuration: the training data is collected continually over time. The model is trained whenever newly collected data arrives, but it may forget knowledge learned from previous data, which is known as the catastrophic forgetting problem. Under this configuration, we continually fine-tune Vision Transformer models with our proposed adapter modules, which improve cross-domain generalization capability and mitigate catastrophic forgetting. For the single-domain and multi-domain configurations, we evaluate our proposed methods on existing cross-domain benchmarks, where they surpass the performance of state-of-the-art methods. For the continual-domain configuration, we design and contribute the first rehearsal-free domain-continual protocols for face anti-spoofing. Experimental results show that our proposed method outperforms state-of-the-art methods by a clear margin, achieving better generalization and less forgetting across different evaluation protocols.