Sparse representation for object recognition: understanding detectors and descriptors
Main Author:
Other Authors:
Format: Final Year Project
Language: English
Published: 2009
Subjects:
Online Access: http://hdl.handle.net/10356/16893
Institution: Nanyang Technological University
Summary: Image category recognition is important for accessing visual information at the level of objects and scene types. Much importance has recently been placed on the detection and recognition of locally (weakly) affine-invariant region descriptors for object recognition. SIFT descriptors are well known for their invariance to image scale and rotation, and have been shown to provide robust matching across a substantial range of affine distortion, change in 3D viewpoint, addition of noise, and change in illumination (Lowe D. G., 2004). Therefore, the first section of the report discusses experiments on object recognition conducted to test SIFT's properties and reliability. Recognition proceeds by matching the detected interest points using the Euclidean distance between their descriptors, as sketched below. These experiments used the COIL dataset (S. Nene, 1996). The results showed that the SIFT descriptor is scale invariant up to a scale factor of 2 and is distinctive enough for accurate object recognition.
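The snippet below is a minimal sketch of this matching step, not the report's actual code: SIFT keypoints from two views of an object are compared by Euclidean (L2) distance, with Lowe's ratio test used here to keep only distinctive matches. The file names are hypothetical placeholders, and OpenCV with SIFT support is assumed.

```python
# Sketch only: SIFT matching by Euclidean distance between two object views.
# File names are illustrative placeholders, not files from the COIL dataset layout.
import cv2

img1 = cv2.imread("coil_obj_view1.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("coil_obj_view2.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)   # interest points + 128-D descriptors
kp2, des2 = sift.detectAndCompute(img2, None)

# Brute-force matching on Euclidean (L2) distance, then Lowe's ratio test
# to retain only matches that are clearly better than the second-best candidate.
matcher = cv2.BFMatcher(cv2.NORM_L2)
matches = matcher.knnMatch(des1, des2, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]

print(f"{len(good)} distinctive matches out of {len(matches)} candidates")
```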
On top of that, object recognition through image classification has become another area of growing significance. The size of images is a concern in many applications, so there is a need to find the smallest image size that still gives good recognition performance. A group of researchers from the Massachusetts Institute of Technology found that many recognition tasks can be solved with images as small as 32×32 pixels (Antonio Torralba, 2007). Thus, the second section of this report presents investigations conducted to find the smallest acceptable image size. The image classifications were conducted using PHOW descriptors and the Caltech-256 dataset; a stand-in sketch of the resizing and dense-feature step is given below. The outcomes suggested that 64×64 pixels is the smallest size at which simple images with a plain background are classified correctly. However, for more complex real-world images, the smallest size that achieves good classification is 128×128 pixels; images smaller than that are difficult to classify.
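The following is a minimal stand-in sketch, not the report's PHOW/Caltech-256 pipeline: each image is resized to the candidate sizes discussed above and SIFT descriptors are computed on a fixed dense grid as a rough approximation of PHOW's dense sampling. The file name and grid step are assumptions for illustration.

```python
# Sketch only: resize to the candidate sizes and extract descriptors on a dense grid
# (an approximation of PHOW-style dense sampling, not the actual PHOW implementation).
import cv2

sizes = [32, 64, 128]   # 32x32, 64x64, and 128x128 pixels, as discussed above
step = 8                # dense sampling step in pixels (assumed for illustration)
sift = cv2.SIFT_create()

img = cv2.imread("caltech256_example.jpg", cv2.IMREAD_GRAYSCALE)  # placeholder file
for s in sizes:
    small = cv2.resize(img, (s, s), interpolation=cv2.INTER_AREA)
    # Fixed-scale keypoints on a regular grid (dense sampling).
    grid = [cv2.KeyPoint(float(x), float(y), float(step))
            for y in range(step // 2, s, step)
            for x in range(step // 2, s, step)]
    _, desc = sift.compute(small, grid)
    print(f"{s}x{s}: {0 if desc is None else len(desc)} dense descriptors")
```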
In the rest of this report, the term “32px” refers to an image of size 32×32 pixels. Likewise, “64px” refers to 64×64 pixels, “128px” refers to 128×128 pixels, and so on.