Color and deep learning features in face recognition

Bibliographic Details
Main Author: Lu, Ze
Other Authors: Kot Chichung, Alex
Format: Theses and Dissertations
Language: English
Published: 2018
Subjects:
Online Access:http://hdl.handle.net/10356/75855
Institution: Nanyang Technological University
id sg-ntu-dr.10356-75855
record_format dspace
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Electrical and electronic engineering
spellingShingle DRNTU::Engineering::Electrical and electronic engineering
Lu, Ze
Color and deep learning features in face recognition
description Face recognition (FR) has been one of the most active research topics in computer vision for more than three decades. It has been widely applied in practical scenarios such as access control systems, mass surveillance, and human-computer interaction. A conventional FR system consists of four stages: face detection, face alignment, face representation, and face matching. Among the four, face representation most significantly affects the performance of an FR system, because it determines whether the system is robust to real-world variations such as illumination, pose, and occlusion. Most early FR works were limited to grayscale images. Recently, research efforts have been dedicated to incorporating color information into the feature extraction process to improve FR performance. Specifically, different feature representations are extracted from face images in a certain color space and then fused for classification. The main challenges of such color FR tasks are how to construct an effective color space to represent color images and how to fuse the different feature representations extracted from face images. To tackle the challenge of color space construction, we propose a framework to derive an effective color space $LuC_{1}C_{2}$ from the fundamental color space $RGB$. For the fusion of color features, we propose a Color Channel Fusion (CCF) method, a Covariance Matrix Regularization (CMR) method, and a color face descriptor, Ternary Color Local Binary Patterns (TCLBP). More recently, Convolutional Neural Networks (CNNs) have proven effective for extracting high-level visual features from face images. However, CNN feature representations still have several problems. For example, the generalization ability of pre-trained CNN features is limited when training and testing data differ significantly, and the performance of CNNs drops dramatically on images of low resolution.
Moreover, feature fusion across different CNN architectures has not been thoroughly studied yet. To enhance the generalization ability of pre-trained CNN features, we investigate combining high-level CNN representations with low-level features, namely color pixel values, by score fusion. For different CNN architectures, we train a simplified ResNet model, ResNetShort, and fuse its features with those of VGG-Face by CMR. For low-resolution FR (LRFR), we propose a Deep Coupled ResNet model (DCR). Color in a machine vision system is defined by a combination of three color components specified by a color space. Existing color spaces are based on different criteria, and their performance is not consistent across datasets. This motivates us to propose a framework for constructing effective color spaces. The proposed color space, $LuC_{1}C_{2}$, consists of one luminance component and two chrominance components. The luminance component $Lu$ is selected from four luminance candidates in existing color models by analysing their R, G, B coefficients and the color sensor properties. The chrominance components are derived by discriminant analysis and covariance analysis. Experiments show that both hand-crafted and CNN feature representations extracted from $LuC_{1}C_{2}$ images perform consistently better than those extracted from images in other color spaces. The fusion of multiple color features is important for achieving state-of-the-art FR performance. Existing color feature fusion methods either reduce the dimensionality of the feature vector in each color channel first and then concatenate all the low-dimensional feature vectors, referred to as DR-Cat, or vice versa, referred to as Cat-DR. In DR-Cat, existing methods simply reduce the features in different color channels to the same number of dimensions and concatenate them. However, features in different color channels differ in importance and reliability.
We propose a Color Channel Fusion (CCF) approach that selects more features from the more reliable and discriminative channels. Moreover, DR-Cat ignores the correlation between different features, while Cat-DR fully uses it; yet the correlation estimated from training data may not be reliable. We propose a Covariance Matrix Regularization (CMR) technique to regularize the feature correlation estimated from training data before using it to train the feature fusion model. In addition to fusing different color features, we also jointly consider the three color channels during feature extraction by proposing the Ternary Color LBP (TCLBP) descriptor. Besides intra-channel LBP features, we extract inter-channel LBP features by encoding the spectral structure of the R, G, B component images at the same location. CNNs have very large numbers of parameters that must be trained on millions of examples. For each application scenario, the common practice is to pre-train a CNN model on a very large dataset and then use it either as a fixed feature extractor or as an initialization for fine-tuning on images from the application of interest. After successive convolutional layers, high-level features are formed in the top layer of the CNN, but low-level feature information might be lost there. Combining high-level CNN features with low-level features can reduce this possible information loss; furthermore, low-level features depict basic characteristics of face images from the application of interest. We therefore investigate fusing CNN features with the lowest-level features, color pixel values, to boost the generalization ability of pre-trained CNNs across application scenarios. The two types of features are fused by score fusion rather than feature-level fusion, owing to the large differences between them.
To further improve the performance of CNNs, we train a simplified ResNet model, ResNetShort, and combine its features with those of VGG-Face using our proposed CMR technique. The two CNN models are trained on different face images by optimizing different loss functions through different architectures. This makes the discriminative information learned in ResNetShort features and VGG-Face features mutually complementary, so their fusion achieves better performance. The FR performance of CNNs drops sharply when they are applied to low-resolution face images, and existing CNN methods cannot handle probe images of varying resolutions. We propose a deep coupled network that learns coupled mappings from face images to tackle the resolution degradation of probe images.
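The $LuC_{1}C_{2}$ construction described above is a linear transform of $RGB$ into one luminance and two chrominance components. The abstract does not reproduce the actual coefficients, so the sketch below uses familiar $YCbCr$-style weights purely as a placeholder; the function name `rgb_to_luc1c2_like` is illustrative, not from the thesis:

```python
import numpy as np

# Placeholder 3x3 transform: one luminance-like row plus two chrominance rows.
# The actual LuC1C2 coefficients are derived in the thesis by discriminant and
# covariance analysis; the YCbCr-style weights here are illustrative only.
T = np.array([
    [ 0.299,  0.587,  0.114],   # luminance-like component (rows sum to 1)
    [-0.169, -0.331,  0.500],   # chrominance component 1 (rows sum to 0)
    [ 0.500, -0.419, -0.081],   # chrominance component 2 (rows sum to 0)
])

def rgb_to_luc1c2_like(img_rgb):
    """Apply a linear color-space transform to an H x W x 3 RGB image."""
    return img_rgb @ T.T

img = np.random.rand(4, 4, 3)   # toy RGB image with values in [0, 1]
out = rgb_to_luc1c2_like(img)
print(out.shape)                # (4, 4, 3)
```

Because the chrominance rows sum to zero, any achromatic (gray) pixel maps to zero chrominance, which is the separation of luminance from color that such a transform is meant to achieve.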
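The CMR idea above, regularizing a feature covariance estimated from limited training data before using it in a fusion model, can be illustrated with a generic shrinkage scheme. This is a sketch of the general technique, not the exact regularizer proposed in the thesis; the shrinkage strength `alpha` is an assumed illustrative parameter:

```python
import numpy as np

def regularize_covariance(X, alpha=0.3):
    """Shrink the sample covariance of X (n_samples x n_features) toward a
    scaled identity, damping off-diagonal correlations that may be unreliable
    when estimated from few training samples. alpha in [0, 1] sets how strongly
    the estimate is pulled toward the identity target."""
    S = np.cov(X, rowvar=False)
    mu = np.trace(S) / S.shape[0]     # average variance across features
    return (1.0 - alpha) * S + alpha * mu * np.eye(S.shape[0])

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 5))      # few samples -> noisy covariance
S_reg = regularize_covariance(X)
print(S_reg.shape)                    # (5, 5)
```

The key effect is that every off-diagonal entry of the regularized matrix has smaller magnitude than in the raw sample covariance, so a fusion model trained on it relies less on possibly spurious cross-feature correlations.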
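Score fusion, used above to combine high-level CNN features with raw color pixel values, operates on per-gallery-subject similarity scores rather than on the feature vectors themselves. A minimal sketch, assuming min-max normalization and a weighted sum (the weight `w=0.7` and the toy scores are illustrative, not values from the thesis):

```python
import numpy as np

def min_max_norm(scores):
    """Map a score vector to [0, 1] so scores from different feature types
    become comparable before fusion."""
    lo, hi = scores.min(), scores.max()
    return (scores - lo) / (hi - lo) if hi > lo else np.zeros_like(scores)

def fuse_scores(cnn_scores, pixel_scores, w=0.7):
    """Weighted-sum score fusion of two normalized score vectors; w is an
    illustrative weight for the CNN branch."""
    return w * min_max_norm(cnn_scores) + (1 - w) * min_max_norm(pixel_scores)

cnn = np.array([0.9, 0.2, 0.4])       # toy CNN similarity per gallery subject
pix = np.array([120.0, 80.0, 200.0])  # toy pixel-based similarity (other scale)
fused = fuse_scores(cnn, pix)
print(int(np.argmax(fused)))          # index of the predicted identity: 0
```

Working at the score level sidesteps the scale and dimensionality mismatch between a CNN embedding and raw pixel values, which is why the abstract prefers it over feature-level fusion for these two very different feature types.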
author2 Kot Chichung, Alex
author_facet Kot Chichung, Alex
Lu, Ze
format Theses and Dissertations
author Lu, Ze
author_sort Lu, Ze
title Color and deep learning features in face recognition
title_short Color and deep learning features in face recognition
title_full Color and deep learning features in face recognition
title_fullStr Color and deep learning features in face recognition
title_full_unstemmed Color and deep learning features in face recognition
title_sort color and deep learning features in face recognition
publishDate 2018
url http://hdl.handle.net/10356/75855
_version_ 1772826467389931520
spelling sg-ntu-dr.10356-75855 2023-07-04T17:30:25Z Color and deep learning features in face recognition Lu, Ze Kot Chichung, Alex Jiang Xudong School of Electrical and Electronic Engineering DRNTU::Engineering::Electrical and electronic engineering Doctor of Philosophy (EEE) 2018-06-20T04:06:39Z 2018-06-20T04:06:39Z 2018 Thesis Lu, Z. (2018). Color and deep learning features in face recognition. Doctoral thesis, Nanyang Technological University, Singapore. http://hdl.handle.net/10356/75855 10.32657/10356/75855 en 189 p. application/pdf