Cross Euclidean-to-Riemannian metric learning with application to face recognition from video

Riemannian manifolds have been widely employed for video representations in visual classification tasks including video-based face recognition. The success mainly derives from learning a discriminant Riemannian metric which encodes the non-linear geometry of the underlying Riemannian manifolds. In t...

Full description

Saved in:
Bibliographic Details
Main Authors: HUANG, Zhiwu, WANG, R., SHAN, S., VAN, Gool L
Format: text
Language:English
Published: Institutional Knowledge at Singapore Management University 2018
Subjects:
Online Access:https://ink.library.smu.edu.sg/sis_research/6384
https://ink.library.smu.edu.sg/context/sis_research/article/7387/viewcontent/Cross_euclidean_to_riemannian_metric_learning_with_application_to_face_recognition_from_video.pdf
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Singapore Management University
Language: English
Description
Summary:Riemannian manifolds have been widely employed for video representations in visual classification tasks including video-based face recognition. The success mainly derives from learning a discriminant Riemannian metric which encodes the non-linear geometry of the underlying Riemannian manifolds. In this paper, we propose a novel metric learning framework to learn a distance metric across a Euclidean space and a Riemannian manifold to fuse average appearance and pattern variation of faces within one video. The proposed metric learning framework can handle three typical tasks of video-based face recognition: Video-to-Still, Still-to-Video and Video-to-Video settings. To accomplish this new framework, by exploiting typical Riemannian geometries for kernel embedding, we map the source Euclidean space and Riemannian manifold into a common Euclidean subspace, each through a corresponding high-dimensional Reproducing Kernel Hilbert Space (RKHS). With this mapping, the problem of learning a cross-view metric between the two source heterogeneous spaces can be converted to learning a single-view Euclidean distance metric in the target common Euclidean space. By learning information on heterogeneous data with the shared label, the discriminant metric in the common space improves face recognition from videos. Extensive experiments on four challenging video face databases demonstrate that the proposed framework has a clear advantage over the state-of-the-art methods in the three classical video-based face recognition scenarios.