Search based face annotation

Bibliographic Details
Main Author: Wang, Dayong
Other Authors: Hoi, Chu Hong
Format: Theses and Dissertations
Language: English
Published: 2014
Subjects:
Online Access:https://hdl.handle.net/10356/59861
Institution: Nanyang Technological University
Description
Summary: In today's era of big data, the rapid growth of facial images has posed many research challenges and created many opportunities for real-world applications. An important emerging research problem in this area is automatic face annotation, which aims to automatically tag a query facial image with human names. Unlike conventional "model-based face annotation" techniques, some recent studies have attempted to tackle this problem within the "Search-based Face Annotation" (SBFA) framework, which mines the massive collections of facial images freely available on the World Wide Web (WWW) using content-based image retrieval (CBIR) techniques. Because web images are inherently noisy, the raw labels of web facial images are often unreliable: some facial images are associated with incorrect or incomplete names. Such a raw facial image database is referred to in this thesis as a "weakly labeled web facial image database".

This thesis investigates a comprehensive framework for the "Search-based Face Annotation" paradigm, which in general consists of three main stages: first, given a query facial image, a pre-processing stage performs face detection, face alignment, and facial feature extraction, so that the query face is aligned into a consistent position and represented as a feature vector in the facial feature space; second, the top-ranked facial images most similar to the query are retrieved from a large-scale weakly labeled web facial image database using content-based image retrieval techniques; third, the query image is named by mining the top-ranked similar images and their corresponding weak label information. This thesis mainly focuses on three open and challenging tasks: (i) how to effectively enhance the initial weak name labels, (ii) how to effectively exploit a short list of candidate facial images and their weak label information for the face-name annotation task, and (iii) how to effectively boost the annotation performance of the SBFA problem by adopting sophisticated machine learning techniques in a unified framework. This thesis investigates a family of effective and efficient algorithms to tackle these challenges. In particular, its main contributions are as follows:

(1) To enhance the initial weak label information, we propose an Unsupervised Label Refinement (ULR) approach based on graph-based learning techniques. An effective optimization algorithm is proposed to solve the large-scale learning task, and a Clustering-based Approximation (CBA) algorithm is proposed to improve system scalability.

(2) To fully mine the top-ranked similar images and their corresponding weak label information, we propose a Weak Label Regularized Local Coordinate Coding (WLRLCC) algorithm, which exploits the principle of local coordinate coding to learn more discriminative facial features for efficient label propagation within a sparse reconstruction scheme.

(3) By combining two different learning schemes, "transductive learning" and "inductive learning", in the same framework based on entropy information, we further propose a "Unified Learning Framework for Auto Face Annotation" (UTIL), which efficiently improves annotation performance without introducing much training effort.
(4) To go beyond the limitations of a single facial feature representation and the standard Euclidean distance space, we propose a multimodal "Learn to Name Faces" (L2NF) algorithm, which optimizes the fusion of multiple annotation modalities in a learning-to-rank framework.

(5) Building on the search-based face annotation framework, we built a real-world demonstration system, FANS, which is available at http://msm.cais.ntu.edu.sg/fans. It can be used to predict the celebrity name for a query image or to find the most similar celebrity images. In addition to the web interface, we also deployed an Android app for mobile devices, making the system more widely accessible.

This thesis presents an extensive set of experiments evaluating the proposed techniques against state-of-the-art methods on large-scale real-world facial image databases. Encouraging results show that the proposed techniques significantly improve upon the state-of-the-art performance for automated face annotation.
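As a rough illustration of the three-stage SBFA pipeline described in the summary, the following Python sketch shows a baseline retrieval-and-voting step: it retrieves the top-ranked weakly labeled faces for a query feature vector and scores candidate names by similarity-weighted voting. This is a minimal sketch under simplified assumptions, not the thesis's actual algorithms (ULR, WLRLCC, and UTIL are designed to refine exactly this kind of baseline); all function and variable names are illustrative.

# Minimal sketch (not the thesis's implementation) of the SBFA baseline:
# content-based retrieval of weakly labeled web faces followed by
# similarity-weighted voting over their (noisy) name labels.
import numpy as np
from collections import defaultdict

def annotate_by_search(query_feat, db_feats, db_weak_labels, top_k=40):
    """Rank candidate names for one query face.

    query_feat     : (d,)   L2-normalized feature of the aligned query face
    db_feats       : (n, d) L2-normalized features of weakly labeled web faces
    db_weak_labels : list of n lists of (possibly incorrect/incomplete) names
    """
    # Stage 2: content-based retrieval -- cosine similarity, keep top_k neighbors.
    sims = db_feats @ query_feat
    neighbors = np.argsort(-sims)[:top_k]

    # Stage 3: mine the weak labels of the retrieved faces by
    # similarity-weighted voting; noisy labels make this step fragile,
    # which is what the thesis's label-refinement algorithms address.
    scores = defaultdict(float)
    for i in neighbors:
        for name in db_weak_labels[i]:
            scores[name] += max(float(sims[i]), 0.0)

    return sorted(scores.items(), key=lambda kv: -kv[1])

In practice the retrieval stage would typically rely on an approximate nearest-neighbor index rather than this brute-force dot product, and the simple vote counting is the part that the thesis replaces with its refinement and learning machinery.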
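Similarly, the multimodal fusion idea behind L2NF can be pictured as late fusion of per-modality name scores. The sketch below simply combines the score lists produced by several feature modalities using a weight per modality; the weights are placeholders here, whereas L2NF learns the fusion within a learning-to-rank framework, which this sketch does not implement.

# Illustrative late-fusion sketch (placeholder weights, not the L2NF learner):
# combine per-modality name scores for a query face with one weight per modality.
def fuse_modalities(name_scores_per_modality, weights):
    """name_scores_per_modality : list of m dicts mapping name -> score, e.g.
                                  dict(annotate_by_search(...)) computed on m
                                  different facial feature types
       weights                  : length-m list of non-negative fusion weights
    """
    all_names = {n for scores in name_scores_per_modality for n in scores}
    fused = {
        name: sum(w * scores.get(name, 0.0)
                  for w, scores in zip(weights, name_scores_per_modality))
        for name in all_names
    }
    return sorted(fused.items(), key=lambda kv: -kv[1])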