Pose-robust face recognition


Full description

Bibliographic Details
Main Author: Xie, Zhuofan
Other Authors: Yap Kim Hui
Format: Final Year Project
Language: English
Published: Nanyang Technological University, 2020
Subjects:
Online Access: https://hdl.handle.net/10356/139988
Institution: Nanyang Technological University
Physical Description
Summary: With the development of deep learning technology, smart photo gallery systems that support text-based searching have become deployable on personal smartphones. To retrieve images containing a specific person, it is necessary to determine who appears in each image. This feature is a typical face recognition task, which usually comprises a detection stage and an identification stage. Our work focuses on the identification stage, where the detected face is encoded into a vector and compared with those in the database; the label of the closest match is taken as the identity. The problem is that the encoded vectors of a person's frontal face images may differ greatly from those generated from the same person's profile face images. This occurs mainly because training data typically contain far more frontal face images than profile face images. To solve this problem, we need a way to map a profile vector to the frontal vector of the same person. Specifically, our method consists of two parts. First, a head rotation estimator is developed to obtain the yaw angle as a weight parameter representing how much modification is needed. Second, a lightweight CNN is trained to learn the profile-to-frontal mapping. Images of groups of people taken from different angles are used to train the network. The goal is to minimize the Euclidean distance between the encoded profile face vectors and the encoded frontal face vectors of each person. Our model achieves an accuracy of 97.4% on the LFW dataset, compared with 95.6% achieved by the original FaceNet.
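
The identification stage described in the summary amounts to a nearest-neighbor search over face embeddings. The following is a minimal sketch of that step, assuming a 128-dimensional FaceNet-style embedding; the encoder output, gallery contents, and function names are illustrative stand-ins, not the project's actual code.

import numpy as np

def identify(face_embedding: np.ndarray,
             gallery: dict[str, np.ndarray]) -> str:
    """Return the label whose stored embedding is closest (Euclidean) to the query."""
    distances = {label: np.linalg.norm(face_embedding - ref)
                 for label, ref in gallery.items()}
    return min(distances, key=distances.get)

# Toy usage with random 128-d vectors standing in for encoder outputs.
rng = np.random.default_rng(0)
gallery = {"alice": rng.normal(size=128), "bob": rng.normal(size=128)}
query = gallery["alice"] + 0.05 * rng.normal(size=128)  # slightly perturbed probe
print(identify(query, gallery))  # -> "alice"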
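
The training objective for the mapping network can be sketched as below. The abstract only states that a yaw angle weights the amount of modification and that the Euclidean distance between mapped profile embeddings and frontal embeddings of the same person is minimized; the layer sizes, the residual blending scheme, and the batch setup here are assumptions for illustration, not the thesis's exact architecture.

import torch
import torch.nn as nn

class ProfileToFront(nn.Module):
    """Lightweight mapper from a profile embedding toward the frontal embedding."""
    def __init__(self, dim: int = 128):
        super().__init__()
        self.mapper = nn.Sequential(
            nn.Linear(dim, 256), nn.ReLU(), nn.Linear(256, dim))

    def forward(self, profile_emb: torch.Tensor,
                yaw_deg: torch.Tensor) -> torch.Tensor:
        # Yaw-based weight in [0, 1]: a near-frontal face (yaw ~ 0) needs little
        # correction, a full profile (yaw ~ 90 degrees) gets the full correction.
        w = (yaw_deg.abs() / 90.0).clamp(0.0, 1.0).unsqueeze(-1)
        return profile_emb + w * self.mapper(profile_emb)

model = ProfileToFront()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# One training step on random stand-in data: minimize the Euclidean distance
# between the mapped profile embedding and the same person's frontal embedding.
profile_emb = torch.randn(32, 128)  # encoder outputs for profile shots
front_emb = torch.randn(32, 128)    # encoder outputs for frontal shots
yaw = torch.rand(32) * 90.0         # yaw angles from the head rotation estimator

loss = (model(profile_emb, yaw) - front_emb).norm(dim=-1).mean()
opt.zero_grad()
loss.backward()
opt.step()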