A comparison of features in parts-based object recognition hierarchies
Summary
1 Introduction
- The human brain employs different kinds of interrelated representations and processes to recognize objects, depending on the familiarity of the object and the required level of recognition, which is defined by the current task.
- A parts-based representation is especially efficient for storing and categorizing novel objects, because the largest variance in unseen views of an object can be expected in the position and arrangement of parts, while each part of an object will be visible under a large variety of 3D object transformations.
- Here, hierarchies of feature layers are used, as in the ventral visual pathway, to combine the specificity and invariance of features.
- In other holistic methods the receptive fields of the features cover the whole image.
- The approach selects features based on the maximization of mutual information for a single class.
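The mutual-information criterion in the last point can be illustrated with a small sketch. It assumes that each feature has already been thresholded to a binary "detected / not detected" value per image and that the class label is binary (target class versus everything else); the function name and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log2


def mutual_information(feature_on, is_class):
    """Mutual information I(F; C) in bits between two 0/1 sequences.

    feature_on[i] is 1 if the feature fired on image i (assumed binarized),
    is_class[i] is 1 if image i belongs to the target class.
    """
    n = len(feature_on)
    joint = Counter(zip(feature_on, is_class))  # counts of (f, c) pairs
    count_f = Counter(feature_on)               # marginal counts of f
    count_c = Counter(is_class)                 # marginal counts of c
    mi = 0.0
    for (f, c), count in joint.items():
        p_fc = count / n
        p_f = count_f[f] / n
        p_c = count_c[c] / n
        mi += p_fc * log2(p_fc / (p_f * p_c))
    return mi


# Toy example: one feature that tracks the class, one that ignores it.
labels = [1, 1, 0, 0]
informative = [1, 1, 0, 0]
uninformative = [1, 0, 1, 0]
print(mutual_information(informative, labels))
print(mutual_information(uninformative, labels))
```

A selection scheme of this kind would rank candidate features by this score for one class at a time, which is why it is described as a single-class criterion.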
2 Analytic Features
- To generalize from few training examples, parts-based recognition follows the notion that similar combinations of parts are specific for a certain category over a wide range of variations.
- So the authors need a reasonable feature selection strategy that evaluates which and how many views of a certain category a feature can separate from other categories and, based on those results, choose the subset of features that in combination can describe the whole scenario best.
- The maximum activated bin in this histogram is used to normalize the rotation of the patch in advance.
- A similar clustering step was also used in [15] to improve the generalization performance of the otherwise very specific SIFT descriptors.
- The feedforward hierarchy proposed in [3] is shown in Fig. 1a.
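The rotation normalization mentioned above (taking the most strongly activated bin of an orientation histogram) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the 36-bin resolution, the central-difference gradients, and the name `dominant_orientation` are assumptions, and the sketch only finds the dominant bin rather than actually rotating the patch.

```python
from math import atan2, hypot, pi


def dominant_orientation(patch, n_bins=36):
    """Return (index of the strongest bin, histogram) for a gray-value patch.

    patch is a list of rows of floats. Each interior pixel votes for the bin
    of its gradient direction, weighted by its gradient magnitude.
    Image rows grow downward, so angles are measured in image coordinates.
    """
    hist = [0.0] * n_bins
    rows, cols = len(patch), len(patch[0])
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            dx = patch[y][x + 1] - patch[y][x - 1]  # central differences
            dy = patch[y + 1][x] - patch[y - 1][x]
            magnitude = hypot(dx, dy)
            if magnitude == 0.0:
                continue  # flat region: no direction to vote for
            angle = atan2(dy, dx) % (2 * pi)        # map to [0, 2*pi)
            bin_index = int(angle / (2 * pi) * n_bins) % n_bins
            hist[bin_index] += magnitude
    best_bin = max(range(n_bins), key=hist.__getitem__)
    return best_bin, hist


# A patch whose brightness increases from left to right.
ramp = [[float(x) for x in range(5)] for _ in range(5)]
best_bin, _ = dominant_orientation(ramp)
print(best_bin)
```

Rotating the patch by the negative of the angle associated with `best_bin` before computing the descriptor would then make the descriptor approximately invariant to in-plane rotation.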
3 Results
- The authors tested the performance of the different feature types on the categorization scenario shown in Fig.
- For the different tests, the authors then varied the number of features used and the number of training views given to a single-layer perceptron (SLP), which served as the final classifier.
- C2-H is similar to C2-P and GRAY-P, and takes the lead when using a large number of views.
- The performance of SIFT-P and GRAY-P on cups(7) is very poor and does not improve with more training views.
- This is especially true for categories where the rotation in depth looks like rotation in plane (bottle(2), brush(4), phone(9), tool(10)).
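The role of the SLP as the final classifier can be made concrete with a small sketch. The summary does not say how the authors trained it, so the code below swaps in the classic multiclass perceptron update rule as an assumption; the function names, learning rate, number of epochs, and toy feature vectors are all hypothetical.

```python
def predict(weights, x):
    """Return the index of the category whose linear unit scores highest."""
    scores = [sum(w * v for w, v in zip(row, x)) + row[-1] for row in weights]
    return max(range(len(scores)), key=scores.__getitem__)


def train_slp(samples, labels, n_classes, epochs=20, lr=0.1):
    """Train one linear unit per category on feature-activation vectors.

    Each weight row holds one weight per feature plus a trailing bias term.
    On a mistake, the correct unit is pulled toward the sample and the
    wrongly winning unit is pushed away from it.
    """
    n_features = len(samples[0])
    weights = [[0.0] * (n_features + 1) for _ in range(n_classes)]
    for _ in range(epochs):
        for x, target in zip(samples, labels):
            predicted = predict(weights, x)
            if predicted == target:
                continue
            for i, value in enumerate(x):
                weights[target][i] += lr * value
                weights[predicted][i] -= lr * value
            weights[target][-1] += lr      # bias of the correct unit
            weights[predicted][-1] -= lr   # bias of the wrong unit
    return weights


# Toy data: three categories, each dominated by one feature response.
views = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0],
         [0.0, 1.0, 0.0], [0.1, 0.9, 0.0],
         [0.0, 0.0, 1.0], [0.0, 0.1, 0.9]]
categories = [0, 0, 1, 1, 2, 2]
model = train_slp(views, categories, n_classes=3)
print([predict(model, v) for v in views])
```

In an experiment like the one summarized above, `views` would hold the responses of the selected features for each training view, so changing the number of features changes the vector length and changing the number of training views changes the number of rows.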
4 Conclusion
- The authors evaluated the performance of different types of local features when used in parts-based recognition.
- The biologically motivated feedforward hierarchy in [3] is powerful in holistic recognition given a sufficient number of training examples, but the patches from its output layer are too general and therefore perform weakly in parts-based recognition.
- First, features are used that extract the magnitudes for 8 different local gradient directions.
- This could be beneficial for both feature types.
- The most related work in the direction of analytic features was done in [16], where Ullman introduced invariance over viewpoint in his fragments approach, or in the work of Dorko et al. in [15], where highly informative clusters of SIFT descriptors are used.
Citations
4 citations
2 citations
Cites methods from "A comparison of features in parts-b..."
...Parts-based methods (Leibe et al., 2004, Hasler et al., 2007) dominate most of the current work in this field....
References
46,906 citations
14,562 citations
"A comparison of features in parts-b..." refers methods in this paper
...in [7] features obtained by principal component analysis (PCA) on gray-scale images were used to classify faces....
11,500 citations
"A comparison of features in parts-b..." refers methods in this paper
...In contrary to PCA other methods produce so called parts-based features like the nonnegative matrix factorization (NMF) proposed in [8] or a similar scheme proposed in [9] yielding more class-specific features....
9,604 citations
7,057 citations
"A comparison of features in parts-b..." refers background in this paper
...How well certain local descriptors can be re-detected under different image transformations, as scale, rotation and viewpoint changes, was investigated in [14]....
Frequently Asked Questions (2)
Q2. What future work is suggested in "A comparison of features in parts-based object recognition hierarchies"?
Besides the normalization of rotation for SIFT, it would be interesting to investigate other reasons for the differences in performance in future work. Since neither approach has yet been applied to scenarios with multiple categories, the authors hope that their comparative study provides further helpful insight into parts-based 3D object recognition.