Performance comparison on graph-based sparse coding methods for face representation
Saved in:
Main Author:
Other Authors:
Format: Theses and Dissertations
Language: English
Published: 2015
Subjects:
Online Access: http://hdl.handle.net/10356/65114
Institution: Nanyang Technological University
Summary:

Face recognition has emerged as a major topic of research interest in the domain of computer vision. Among the many face recognition techniques available, sparse coding has received much attention, since its soft assignment can achieve significantly better performance than conventional vector quantization approaches. The goal of this dissertation is to compare the performance of different graph-based sparse coding methods.

Vector quantization constrains each feature to be assigned only to its nearest center, which works poorly for features located at the boundary of several clusters; sparse coding removes this constraint. Sparse coding has a lower reconstruction error than vector quantization, but a higher sensitivity to small feature variations. In the literature, two models have been proposed to address this sensitivity: Laplacian Sparse Coding and Smooth Sparse Coding.

This dissertation makes two major contributions. Firstly, we compare the recognition rate, time complexity and reconstruction error of both of the above methods with those of the original sparse coding technique. The impact of dictionary size, choice of distance metric, and PCA dimension reduction on face recognition accuracy is also examined.
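The contrast between vector quantization's hard assignment and sparse coding's soft assignment can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the dictionary, feature dimensions, sparsity weight `lam`, and the use of ISTA to solve the lasso are all illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
D = rng.normal(size=(16, 8))            # dictionary: 16-dim features, 8 atoms
D /= np.linalg.norm(D, axis=0)          # unit-norm atoms (cluster centers)
x = rng.normal(size=16)                 # one feature vector

# Vector quantization: hard-assign x to its single nearest atom (one-hot code).
dists = np.linalg.norm(D - x[:, None], axis=0)
vq_code = np.zeros(8)
vq_code[np.argmin(dists)] = 1.0

# Sparse coding: soft assignment via the lasso,
#   min_a 0.5*||x - D a||^2 + lam*||a||_1,
# solved here with a few ISTA (proximal gradient) steps.
lam = 0.1
step = 1.0 / np.linalg.norm(D.T @ D, 2)          # 1 / Lipschitz constant
a = np.zeros(8)
for _ in range(200):
    g = D.T @ (D @ a - x)                        # gradient of the quadratic term
    a = a - step * g
    a = np.sign(a) * np.maximum(np.abs(a) - step * lam, 0.0)  # soft-threshold

print("VQ nonzeros:", int((vq_code != 0).sum()))  # exactly one, by construction
print("SC nonzeros:", int((a != 0).sum()))        # typically several atoms share the weight
```

Because several atoms can carry weight at once, a feature near the boundary of two clusters is no longer forced entirely onto one center.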
Secondly, we propose a new approach to realizing sparsity in marginal regression by introducing a weight threshold. Instead of setting a bound directly, we use a weight ratio together with a greedy pursuit approach to select the required coefficients.
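One way to read this scheme is sketched below, under stated assumptions: the function name, the exact greedy rule (keep coefficients in decreasing magnitude while each stays above `weight_ratio` times the largest), and all parameter values are hypothetical, not taken from the thesis.

```python
import numpy as np

def marginal_sparse_code(x, D, weight_ratio=0.5):
    """Hypothetical sketch of sparsity via marginal regression with a
    weight-ratio threshold: coefficients are first estimated independently
    as D^T x, then greedily retained in order of magnitude while each kept
    coefficient is at least `weight_ratio` times the largest one."""
    a = D.T @ x                          # marginal regression estimates
    order = np.argsort(-np.abs(a))       # greedy pursuit: largest first
    keep = np.zeros(a.shape, dtype=bool)
    top = np.abs(a[order[0]])
    for i in order:
        if np.abs(a[i]) >= weight_ratio * top:
            keep[i] = True
        else:
            break                        # sorted, so the rest are smaller
    return np.where(keep, a, 0.0)

rng = np.random.default_rng(1)
D = rng.normal(size=(16, 32))
D /= np.linalg.norm(D, axis=0)
x = rng.normal(size=16)
code = marginal_sparse_code(x, D, weight_ratio=0.5)
print(int((code != 0).sum()), "of", code.size, "coefficients kept")
```

The ratio makes the sparsity level adapt to the data instead of fixing the number of nonzero coefficients in advance.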
Two face recognition datasets, YaleB and CMU-PIE, were used as the experimental data. We found that both Laplacian sparse coding and smooth sparse coding perform better than the original sparse coding method. On the YaleB dataset, smooth sparse coding works better than Laplacian sparse coding, whereas on the CMU-PIE dataset Laplacian sparse coding performs better than smooth sparse coding. Smooth sparse coding has a much higher reconstruction error than Laplacian sparse coding, but a much lower time complexity. The effect of the energy reservation ratio and weight threshold parameters on the performance of the smooth sparse coding technique was also analyzed. The weight threshold was found to have no significant effect on the results, while accuracy drops significantly when the energy reservation ratio exceeds 70%. In the experiments on the weight kernel using different distance metrics, the results show that the KNN-based binary distance metric performs worse than the heat kernel and the cosine distance metrics.
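The three weight kernels compared in those experiments can be sketched as below. This is an illustrative construction only: the function name, `sigma`, `k`, and the convention of zeroing the self-edge in the KNN kernel are assumptions, not the thesis's exact definitions.

```python
import numpy as np

def weight_matrices(X, sigma=1.0, k=3):
    """Sketch of three graph weight kernels on the rows of X:
    heat kernel, cosine similarity, and a KNN-based binary kernel."""
    n = X.shape[0]
    # pairwise squared Euclidean distances
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)

    W_heat = np.exp(-sq / (2.0 * sigma ** 2))        # heat kernel

    norms = np.linalg.norm(X, axis=1, keepdims=True)
    W_cos = (X @ X.T) / (norms @ norms.T)            # cosine similarity

    W_knn = np.zeros((n, n))                         # binary: 1 iff j in kNN(i)
    for i in range(n):
        nn = np.argsort(sq[i])[1:k + 1]              # index 0 is the point itself
        W_knn[i, nn] = 1.0
    return W_heat, W_cos, W_knn

rng = np.random.default_rng(2)
X = rng.normal(size=(6, 4))
Wh, Wc, Wk = weight_matrices(X, sigma=1.0, k=3)
```

The binary kernel discards all distance information beyond neighborhood membership, which is one plausible reading of why it underperforms the heat and cosine kernels in the reported results.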