Feature and classifier development for human detection

Computer vision has been gaining increasing popularity in this age of automation and advancing technology. The desire to automate a number of labour-intensive tasks and reduce dependence on humans has contributed to this. In particular, human detection has been looked into quite thoroughly in the re...

Bibliographic Details
Main Author: Amit Satpathy
Other Authors: Jiang Xudong
Format: Theses and Dissertations
Language: English
Published: 2014
Online Access: http://hdl.handle.net/10356/55735
Institution: Nanyang Technological University
id sg-ntu-dr.10356-55735
record_format dspace
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
DRNTU::Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision
DRNTU::Engineering::Computer science and engineering::Computing methodologies::Pattern recognition
description Computer vision has grown in popularity in this age of automation and advancing technology, driven by the desire to automate labour-intensive tasks and reduce dependence on humans. Human detection in particular has been studied thoroughly in recent years. It is important in surveillance, pedestrian detection, human-computer interaction and other areas where humans need to be detected very accurately. With accurate detection, post-processing of human actions can then be done effectively. Accurate detection requires that the features and the classification framework adopted are robust and effective. In this thesis, we propose several features for human detection that use the gradient and/or texture information present in still images, together with a non-linear classification framework. We present comprehensive results to validate the proposed features and classification framework.
We first investigate the limitations of Histogram of Oriented Gradients and Histogram of Gradients, which are popular features in human detection. Histogram of Gradients distinguishes a dark human against a bright background from a bright human against a dark background. This increases the intra-class variation of humans; for the purpose of detection, this variation is insignificant and undesirable. Histogram of Oriented Gradients solves this issue by using unsigned gradients, which treat gradients of opposite directions as gradients of the same orientation. However, within a cell, it therefore maps gradients of opposite directions to the same histogram bin, so some different structures receive the same feature representation. Analyzing these limitations, we propose Extended Histogram of Gradients, a concatenation of 2 histograms derived from Histogram of Gradients. The first histogram is Histogram of Oriented Gradients, which we observe to be the sum of the 2 halves of Histogram of Gradients. The second histogram is Difference of Histogram of Gradients, obtained by taking the absolute difference between each bin of Histogram of Gradients and its opposite-direction bin.
In addition to the proposed feature, we propose an alternative normalization scheme for Histogram of Oriented Gradients and Extended Histogram of Gradients. We find that the default normalization scheme for Histogram of Oriented Gradients, which applies clipped L2 normalization in the last step, causes some different patterns to be represented similarly after clipping. This reduces the discriminative capability of Histogram of Oriented Gradients. We also find that, for Extended Histogram of Gradients, the default scheme does not effectively suppress noisy gradient pixels with large magnitudes or abrupt intensity changes in the image: typically, only the noise in Histogram of Oriented Gradients is suppressed, while the noise in Difference of Histogram of Gradients remains. In the proposed scheme, clipped L2 normalization is first performed on Histogram of Gradients, and the Histogram of Oriented Gradients and Extended Histogram of Gradients features are then computed from the normalized Histogram of Gradients.
Local Binary and Ternary Patterns are another set of features widely used in texture classification and face detection. However, their application to human detection is limited because they, too, differentiate a bright human against a dark background from a dark human against a bright background, which increases the intra-class variation of humans. Different objects have different shapes and textures, so it is desirable to represent objects using both texture and edge information. To be robust to illumination and contrast variations, Local Binary and Ternary Patterns do not differentiate between a weak-contrast local pattern and a similar strong-contrast one; they capture only the object texture information. Object contours, which also contain discriminatory information, tend to lie in strong-contrast regions. By discarding contrast information entirely, these descriptors may therefore fail to represent the object contour effectively. In this thesis, we address these issues of Local Binary and Ternary Patterns in human detection and propose new features, Discriminative Robust Local Binary and Ternary Patterns. The proposed features are a concatenation of 2 histograms: Robust Local Binary/Ternary Patterns and Difference of Local Binary/Ternary Patterns. The first histogram is obtained by summing the 2 halves of the Local Binary/Ternary Pattern histogram. It alleviates the intensity reversal problem of object and background. However, because codes and their complements in the same block are merged into the same histogram bin, some patterns are misrepresented. Hence, we propose the second histogram, Difference of Local Binary/Ternary Patterns, which takes the absolute difference between each bin and its complement bin. Concatenating the 2 histograms resolves the misrepresentation of patterns. In addition, the proposed features do not completely ignore the contrast information of image patterns, because the histograms are weighted by the gradient magnitudes at the corresponding pixels. The proposed features thus contain both edge and texture information, which is desirable for object recognition.
Existing classification frameworks use Support Vector Machines or boosting-based classifiers. We examine the limitations of such frameworks and analyze the distribution of features in the feature space to determine an appropriate classification boundary. This analysis shows that a hyper-quadratic boundary is appropriate. We therefore propose a classification framework that combines a modified Minimum Mahalanobis Distance classifier with Asymmetric Principal Component Analysis for dimensionality reduction. For high-dimensional features, the estimated eigenvalues in some feature dimensions deviate greatly from those of the data population, which causes the Minimum Mahalanobis Distance classifier to overfit. Hence, the feature dimensions need to be reduced to minimize overfitting. Furthermore, training sets usually contain far fewer positive samples than negative ones, so the negative covariance matrix is more reliable than the positive covariance matrix. Principal Component Analysis is inefficient here because it does not effectively remove the unreliable dimensions from the less reliable covariance matrix. To tackle both dimensionality reduction and the asymmetry of human training sets, we propose using Asymmetric Principal Component Analysis for dimension reduction. The projected features allow a more robust classifier to be trained that overfits the training data less.
We extend the use of our features to general visual object detection and image classification to demonstrate that their application is not limited to human detection. Furthermore, using Extended Histogram of Gradients, we demonstrate that parts-based modeling improves detection performance.
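The construction of Extended Histogram of Gradients described above can be illustrated with a minimal Python sketch. It is not the thesis implementation. It assumes that the signed Histogram of Gradients for one block is a vector with an even number of bins, ordered so that bin i and bin i + n/2 hold opposite gradient directions. The function names `clipped_l2` and `ehg_from_hg`, and the clipping threshold of 0.2, are hypothetical choices made for illustration only.

```python
import numpy as np


def clipped_l2(v, clip=0.2, eps=1e-12):
    """Clipped L2 normalization: L2-normalize, clip large values, renormalize.

    The threshold 0.2 is an assumed value, not one taken from the abstract.
    """
    v = v / (np.linalg.norm(v) + eps)
    v = np.minimum(v, clip)
    return v / (np.linalg.norm(v) + eps)


def ehg_from_hg(hg, normalize_first=True):
    """Build an Extended Histogram of Gradients vector from a signed histogram.

    Assumes hg has an even number of bins and that bin i and bin i + n/2
    correspond to opposite gradient directions (an assumption of this sketch).
    """
    hg = np.asarray(hg, dtype=float)
    if hg.size % 2 != 0:
        raise ValueError("signed histogram must have an even number of bins")
    if normalize_first:
        # Alternative scheme from the abstract: normalize the signed
        # histogram first, then derive both parts from it.
        hg = clipped_l2(hg)
    half = hg.size // 2
    first, second = hg[:half], hg[half:]
    hog = first + second          # sum of the 2 halves (unsigned histogram)
    dhg = np.abs(first - second)  # absolute difference of opposite bins
    return np.concatenate([hog, dhg])


# Two toy structures with the same unsigned histogram:
hg_a = np.array([4, 0, 0, 0, 0, 0, 0, 0])  # all gradients in one direction
hg_b = np.array([2, 0, 0, 0, 2, 0, 0, 0])  # split between opposite directions
ehg_a = ehg_from_hg(hg_a, normalize_first=False)
ehg_b = ehg_from_hg(hg_b, normalize_first=False)
```

In this toy case the first halves of `ehg_a` and `ehg_b` (the sum part) coincide, while the second halves (the difference part) separate the two structures. Swapping the two halves of the input, which models an intensity reversal between object and background, leaves the output unchanged because both the sum and the absolute difference are symmetric in the two halves.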
author2 Jiang Xudong
author_facet Jiang Xudong
Amit Satpathy
format Theses and Dissertations
author Amit Satpathy
author_sort Amit Satpathy
title Feature and classifier development for human detection
publishDate 2014
url http://hdl.handle.net/10356/55735
_version_ 1772825214762090496
spelling sg-ntu-dr.10356-55735 2023-07-04T16:16:58Z Feature and classifier development for human detection Amit Satpathy Jiang Xudong School of Electrical and Electronic Engineering A*STAR Eng How-Lung DRNTU::Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence DRNTU::Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision DRNTU::Engineering::Computer science and engineering::Computing methodologies::Pattern recognition Doctor of Philosophy (EEE) 2014-03-24T04:15:20Z 2014-03-24T04:15:20Z 2014 2014 Thesis Amit Satpathy. (2014). Feature and classifier development for human detection. Doctoral thesis, Nanyang Technological University, Singapore. http://hdl.handle.net/10356/55735 en 162 p. application/pdf