Color and deep learning features in face recognition

Bibliographic Details
Main Author: Lu, Ze
Other Authors: Kot Chichung, Alex
Format: Theses and Dissertations
Language: English
Published: 2018
Subjects:
Online Access:http://hdl.handle.net/10356/75855
Institution: Nanyang Technological University
id sg-ntu-dr.10356-75855
record_format dspace
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Electrical and electronic engineering
spellingShingle DRNTU::Engineering::Electrical and electronic engineering
Lu, Ze
Color and deep learning features in face recognition
description Face recognition (FR) has been one of the most active research topics in computer vision for more than three decades. It has been widely applied in practical scenarios such as access control systems, mass surveillance, and human-computer interaction. A conventional FR system consists of four stages: face detection, face alignment, face representation, and face matching. Among the four, face representation most significantly affects the performance of an FR system, because it determines whether the system is robust to real-world variations such as illumination, pose, and occlusion. Most early FR works were limited to grayscale images. Recently, research efforts have been dedicated to incorporating color information into the feature extraction process to improve FR performance. Specifically, different feature representations are extracted from face images in a certain color space and then fused for classification. The main challenges of such color FR tasks are how to construct an effective color space to represent color images and how to fuse the different feature representations extracted from face images. To tackle the challenge of color space construction, we propose a framework to derive an effective color space $LuC_{1}C_{2}$ from the fundamental color space $RGB$. For the fusion of color features, we propose a Color Channel Fusion (CCF) method, a Covariance Matrix Regularization (CMR) method, and a color face descriptor, Ternary Color Local Binary Patterns (TCLBP). More recently, Convolutional Neural Networks (CNNs) have proven effective for extracting high-level visual features from face images. However, CNN feature representations still have several problems. For example, the generalization ability of pre-trained CNN features is limited when training and testing data differ significantly, and the performance of CNNs drops dramatically on images of low resolution.
Moreover, feature fusion across different CNN architectures has not been thoroughly studied yet. To enhance the generalization ability of pre-trained CNN features, we investigate combining high-level CNN representations with low-level features, namely color pixel values, by score fusion. For different CNN architectures, we train a simplified ResNet model, ResNetShort, and fuse its features with those of VGG-Face by CMR. For low-resolution FR (LRFR), we propose a Deep Coupled ResNet model (DCR). Color in a machine vision system is defined by a combination of three color components specified by a color space. Existing color spaces are based on different criteria, and their performance is not consistent across datasets. This motivates us to propose a framework for constructing effective color spaces. The proposed color space, $LuC_{1}C_{2}$, consists of one luminance component and two chrominance components. The luminance component $Lu$ is selected from four luminance candidates in existing color models by analysing their R, G, B coefficients and the color sensor properties. The chrominance components are derived by discriminant analysis and covariance analysis. Experiments show that both hand-crafted and CNN feature representations extracted from $LuC_{1}C_{2}$ images perform consistently better than those extracted from images in other color spaces. The fusion of multiple color features is important for achieving state-of-the-art FR performance. Existing color feature fusion methods either reduce the dimensionality of the feature vector in each color channel first and then concatenate all the low-dimensional feature vectors, referred to as DR-Cat, or vice versa, referred to as Cat-DR. In DR-Cat, existing methods simply reduce the features in different color channels to the same number of dimensions and concatenate them. However, features in different color channels differ in importance and reliability.
We propose a Color Channel Fusion (CCF) approach that selects more features from the more reliable and discriminative channels. Moreover, DR-Cat ignores the correlation between different features, while Cat-DR fully uses it; yet the correlation estimated from training data may not be reliable. We propose a Covariance Matrix Regularization (CMR) technique to regularize the feature correlation estimated from training data before using it to train the feature fusion model. In addition to fusing different color features, we also jointly consider the three color channels during feature extraction by proposing the Ternary Color LBP (TCLBP) descriptor. Besides intra-channel LBP features, we extract inter-channel LBP features by encoding the spectral structure of the R, G, B component images at the same location. CNNs have very large numbers of parameters that must be trained on millions of examples. For each application scenario, the common practice is to pre-train a CNN model on a very large dataset and then use it either as a fixed feature extractor or as an initialization for fine-tuning on images from the application of interest. After successive convolutional layers, high-level features are formed in the top layer of the CNN, but low-level feature information might be lost there. Combining high-level CNN features with low-level features can reduce this possible information loss; furthermore, low-level features depict basic characteristics of face images from the application of interest. We therefore investigate fusing CNN features with the lowest-level features, color pixel values, to boost the generalization ability of pre-trained CNNs across application scenarios. The two types of features are fused by score fusion rather than feature-level fusion, owing to the large differences between them.
To further improve the performance of CNNs, we train a simplified ResNet model, ResNetShort, and combine its features with those of VGG-Face using our proposed CMR technique. The two CNN models are trained on different face images by optimizing different loss functions through different architectures. This makes the discriminative information learned in ResNetShort features and VGG-Face features mutually complementary, so their fusion achieves better performance. The FR performance of CNNs drops sharply when they are applied to low-resolution face images, and existing CNN methods cannot handle probe images of varying resolutions. We propose a deep coupled network that learns coupled mappings from face images to tackle the resolution degradation of probe images.
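The $LuC_{1}C_{2}$ construction described above is a linear transform of $RGB$ into one luminance and two chrominance components. The abstract does not reproduce the actual coefficients, so the sketch below uses familiar $YCbCr$-style weights purely as a placeholder; the function name `rgb_to_luc1c2_like` is illustrative, not from the thesis:

```python
import numpy as np

# Placeholder 3x3 transform: one luminance-like row plus two chrominance rows.
# The actual LuC1C2 coefficients are derived in the thesis by discriminant and
# covariance analysis; the YCbCr-style weights here are illustrative only.
T = np.array([
    [ 0.299,  0.587,  0.114],   # luminance-like component (rows sum to 1)
    [-0.169, -0.331,  0.500],   # chrominance component 1 (rows sum to 0)
    [ 0.500, -0.419, -0.081],   # chrominance component 2 (rows sum to 0)
])

def rgb_to_luc1c2_like(img_rgb):
    """Apply a linear color-space transform to an H x W x 3 RGB image."""
    return img_rgb @ T.T

img = np.random.rand(4, 4, 3)   # toy RGB image with values in [0, 1]
out = rgb_to_luc1c2_like(img)
print(out.shape)                # (4, 4, 3)
```

Because the chrominance rows sum to zero, any achromatic (gray) pixel maps to zero chrominance, which is the separation of luminance from color that such a transform is meant to achieve.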
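The CMR idea above, regularizing a feature covariance estimated from limited training data before using it in a fusion model, can be illustrated with a generic shrinkage scheme. This is a sketch of the general technique, not the exact regularizer proposed in the thesis; the shrinkage strength `alpha` is an assumed illustrative parameter:

```python
import numpy as np

def regularize_covariance(X, alpha=0.3):
    """Shrink the sample covariance of X (n_samples x n_features) toward a
    scaled identity, damping off-diagonal correlations that may be unreliable
    when estimated from few training samples. alpha in [0, 1] sets how strongly
    the estimate is pulled toward the identity target."""
    S = np.cov(X, rowvar=False)
    mu = np.trace(S) / S.shape[0]     # average variance across features
    return (1.0 - alpha) * S + alpha * mu * np.eye(S.shape[0])

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 5))      # few samples -> noisy covariance
S_reg = regularize_covariance(X)
print(S_reg.shape)                    # (5, 5)
```

The key effect is that every off-diagonal entry of the regularized matrix has smaller magnitude than in the raw sample covariance, so a fusion model trained on it relies less on possibly spurious cross-feature correlations.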
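Score fusion, used above to combine high-level CNN features with raw color pixel values, operates on per-gallery-subject similarity scores rather than on the feature vectors themselves. A minimal sketch, assuming min-max normalization and a weighted sum (the weight `w=0.7` and the toy scores are illustrative, not values from the thesis):

```python
import numpy as np

def min_max_norm(scores):
    """Map a score vector to [0, 1] so scores from different feature types
    become comparable before fusion."""
    lo, hi = scores.min(), scores.max()
    return (scores - lo) / (hi - lo) if hi > lo else np.zeros_like(scores)

def fuse_scores(cnn_scores, pixel_scores, w=0.7):
    """Weighted-sum score fusion of two normalized score vectors; w is an
    illustrative weight for the CNN branch."""
    return w * min_max_norm(cnn_scores) + (1 - w) * min_max_norm(pixel_scores)

cnn = np.array([0.9, 0.2, 0.4])       # toy CNN similarity per gallery subject
pix = np.array([120.0, 80.0, 200.0])  # toy pixel-based similarity (other scale)
fused = fuse_scores(cnn, pix)
print(int(np.argmax(fused)))          # index of the predicted identity: 0
```

Working at the score level sidesteps the scale and dimensionality mismatch between a CNN embedding and raw pixel values, which is why the abstract prefers it over feature-level fusion for these two very different feature types.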
author2 Kot Chichung, Alex
author_facet Kot Chichung, Alex
Lu, Ze
format Theses and Dissertations
author Lu, Ze
author_sort Lu, Ze
title Color and deep learning features in face recognition
title_short Color and deep learning features in face recognition
title_full Color and deep learning features in face recognition
title_fullStr Color and deep learning features in face recognition
title_full_unstemmed Color and deep learning features in face recognition
title_sort color and deep learning features in face recognition
publishDate 2018
url http://hdl.handle.net/10356/75855
_version_ 1772826467389931520
spelling sg-ntu-dr.10356-75855 2023-07-04T17:30:25Z Color and deep learning features in face recognition Lu, Ze Kot Chichung, Alex Jiang Xudong School of Electrical and Electronic Engineering DRNTU::Engineering::Electrical and electronic engineering Doctor of Philosophy (EEE) 2018-06-20T04:06:39Z 2018-06-20T04:06:39Z 2018 Thesis Lu, Z. (2018). Color and deep learning features in face recognition. Doctoral thesis, Nanyang Technological University, Singapore. http://hdl.handle.net/10356/75855 10.32657/10356/75855 en 189 p. application/pdf