Isolated 3D object recognition through next view planning
Summary
Introduction
- A hierarchical knowledge representation scheme facilitates recognition and the planning process.
- A single view may not contain sufficient features to recognize the object unambiguously.
- A simple feature set is applicable for a larger class of objects than a model base specific complex feature set.
- The purpose of this paper is to investigate the use of suitably planned multiple views and two-dimensional (2-D) invariants for 3-D object recognition.
A. Relation with Other Work
- Tarabanis et al. [5] survey the field of sensor planning for vision tasks.
- S. Dutta Roy and S. Banerjee are with the Department of Computer Science and Engineering, Indian Institute of Technology, New Delhi-110 016, India (e-mail: sumantra@ee.iitd.ernet.in; suban@cse.iitd.ernet.in).
- The next view planning strategy acts on the basis of these hypotheses.
- The authors use a hierarchical knowledge representation scheme which not only ensures a low-order polynomial-time complexity of the hypothesis generation process, but also plays an important role in planning the next view.
- There are six aspects of the object shown, belonging to three classes.
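The hierarchy described above (feature-classes linking to classes, classes grouping aspects, aspects belonging to objects) can be sketched as a set of linked tables. All names and values below are illustrative stand-ins, not entries from the paper's model base:

```python
# Sketch of a hierarchical knowledge representation:
# feature-classes -> classes -> aspects -> objects.
knowledge_base = {
    "feature_classes": {            # feature-class id -> detected feature values
        "f1": {"h": 2, "v": 3},     # e.g. 2 horizontal lines, 3 vertical lines
        "f2": {"h": 4, "v": 0},
    },
    "classes": {                    # class -> the feature-class it links to
        "C1": "f1",
        "C2": "f2",
    },
    "aspects": {                    # aspect -> (object, class) membership
        "a1": ("O1", "C1"),
        "a2": ("O1", "C2"),
        "a3": ("O2", "C1"),
    },
}

def classes_matching(features, kb):
    """Return the classes whose linked feature-class matches the observed features."""
    return [c for c, f in kb["classes"].items()
            if kb["feature_classes"][f] == features]

print(classes_matching({"h": 2, "v": 3}, knowledge_base))  # ['C1']
```

Because each aspect links to exactly one class, hypothesis generation only needs to follow these links rather than test every object against the evidence.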
A. Class Identification, Accounting for Uncertainty
- 2) Class Probability Calculations Using the Knowledge Representation Scheme: in (2), the probability is 1 for those classes which have a link from feature-class f_jk.
- The computation of (2) takes O(N_C) time; this is done for each feature-class.
- Due to possible errors in the feature detection process, a degree of uncertainty is associated with the evidence.
- The summation reduces to a single term, p_jrk.
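As a hedged sketch of how detector uncertainty can be folded into the class probabilities: a Bayesian update in which the likelihood is 1 for classes linked to the observed feature-class and an assumed detector error rate `p_error` otherwise. The dictionaries and the error model are illustrative, not the paper's exact formula (2):

```python
def class_posteriors(observed_fc, links, priors, p_error=0.05):
    """Posterior over classes given an observed feature-class: likelihood is
    1 for classes linked to that feature-class, p_error (assumed detector
    error rate) otherwise; weight by the priors and normalise."""
    unnorm = {c: (1.0 if links[c] == observed_fc else p_error) * priors[c]
              for c in priors}
    z = sum(unnorm.values())
    return {c: v / z for c, v in unnorm.items()}

priors = {"C1": 0.5, "C2": 0.5}
links = {"C1": "f1", "C2": "f2"}   # class -> linked feature-class
post = class_posteriors("f1", links, priors)
print(post)  # C1 dominates, but C2 keeps non-zero mass to absorb detector errors
```

Keeping a small non-zero likelihood for unlinked classes is what lets later views overturn an early misdetection.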
B. Object Identification
- Based on the outcome of the class recognition scheme, the authors estimate the object probabilities as follows.
- A particular movement may preclude the occurrence of some aspects for a given class observed.
- Let c_ij and a_ij represent the minimum angles necessary to move out of the current assumed aspect in the clockwise and counterclockwise directions, respectively; moves of this kind are termed auxiliary moves.
- The authors construct search tree nodes corresponding to both moves.
- From these, the authors finally select one with the minimum total movement.
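A toy stand-in for that selection, using a greedy pass instead of the paper's full search-tree expansion (angle values and the greedy rule are assumptions for illustration):

```python
def min_total_movement(exit_angles):
    """exit_angles: list of (cw, ccw) pairs, the minimum clockwise (c_ij) and
    counterclockwise (a_ij) angles needed to leave each successively assumed
    aspect. Greedily take the cheaper direction at each step and return the
    accumulated rotation together with the chosen directions."""
    total, path = 0.0, []
    for cw, ccw in exit_angles:
        direction, angle = ("cw", cw) if cw <= ccw else ("ccw", ccw)
        total += angle
        path.append(direction)
    return total, path

total, path = min_total_movement([(30.0, 80.0), (100.0, 45.0)])
print(total, path)  # 75.0 ['cw', 'ccw']
```

The paper's search tree compares whole move sequences rather than committing step by step, but the objective is the same: minimum total camera movement.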
A. The Planning Process and Object Recognition
- In their object identification algorithm, aspect and object probabilities are initialized to their a priori values.
- Otherwise, the algorithm initiates the search process to obtain the best distinguishing move to resolve the ambiguity associated with this view.
- All of the above steps, starting at step (a), are repeated; the planning scheme is global, since its reactive nature incorporates all previous movements and observations, both in the probability calculations (Section III-B) and in the planning process.
- The authors' robust class recognition algorithm can recover from many feature detection errors at the class recognition phase itself (Section III-A-2).
- The angular extent of the smallest aspect observed so far is also defined and used in the analysis.
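The overall observe-update-plan loop can be sketched as follows. The `bayes_update` rule, the observation stream, and the threshold are placeholders; the paper's planner also chooses a camera move between iterations, which this sketch elides:

```python
def recognize(observations, update, priors, threshold=0.95):
    """Reactive recognition loop: object probabilities start at their a
    priori values and are updated after each view; recognition stops as
    soon as one hypothesis crosses the confidence threshold."""
    probs = dict(priors)
    for obs in observations:
        probs = update(probs, obs)
        best = max(probs, key=probs.get)
        if probs[best] >= threshold:
            return best
    return max(probs, key=probs.get)

def bayes_update(probs, likelihood):
    """Toy update: multiply in a per-observation likelihood and renormalise."""
    unnorm = {o: probs[o] * likelihood.get(o, 1e-6) for o in probs}
    z = sum(unnorm.values())
    return {o: v / z for o, v in unnorm.items()}

views = [{"O1": 0.6, "O2": 0.4}, {"O1": 0.9, "O2": 0.1}]
print(recognize(views, bayes_update, {"O1": 0.5, "O2": 0.5}))  # O1
```

Because the probabilities carry over between iterations, every previous view keeps influencing the current hypothesis, which is the "global" character the summary describes.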
B. Bounds on the Number of Observations
- It is instructive to consider bounds on T_avg(n), the number of observations required to disambiguate between a set of n aspects (corresponding to the initially observed class).
- An interesting case is observed in Fig. 10(c) and (f)—an opportunistic case when the number of steps with primary moves is less than the one with both primary and auxiliary moves.
- 3) Ordering of Feature Detectors: the third image in Fig. 9(a) shows the advantage of the authors' scheduling of feature detectors.
- 7) Average Number of Observations for a Given Number of Competing Aspects.
A. Experiments with Model Base II
- The authors use the number of horizontal and vertical lines (⟨h, v⟩) and the number of circles (⟨c⟩) as features.
- The recognition scheme has the ability to correctly identify objects even when they have a large number of similar views.
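The paper uses Hough-transform line and circle detectors [12]. As a self-contained toy substitute, counting fully-set rows and columns of a small binary image yields the same kind of ⟨h, v⟩ feature vector (the image and counting rule are illustrative assumptions, not the paper's detector):

```python
def line_features(img):
    """Count fully-set rows (horizontal lines) and fully-set columns
    (vertical lines) in a binary image; a toy stand-in for the
    Hough-transform detectors used in the paper."""
    return {"h": sum(all(row) for row in img),
            "v": sum(all(col) for col in zip(*img))}

img = [
    [1, 1, 1, 1],   # row 0 is a horizontal line
    [0, 1, 0, 0],
    [0, 1, 0, 0],   # column 1 is a vertical line
]
print(line_features(img))  # {'h': 1, 'v': 1}
```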
Frequently Asked Questions
Q2. What is the next view planning strategy?
The next view planning strategy that this paper presents is reactive and on-line—the evidence obtained from each view is used in the hypothesis generation and the planning process.
Q3. How do the authors get the number of regions in an image?
For getting the number of regions in the image, the authors perform sequential labeling (connected components: pixel labeling) [12] on a thresholded gradient image.
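Sequential labeling can be sketched with a BFS flood fill under 4-connectivity; this mirrors the region-counting step the authors describe, though their exact implementation may differ:

```python
from collections import deque

def label_regions(img):
    """Connected-components labeling of a binary image (4-connectivity):
    scan pixel by pixel, and flood-fill a new label from every unlabeled
    foreground pixel. Returns (number of regions, label image)."""
    h, w = len(img), len(img[0])
    labels = [[0] * w for _ in range(h)]
    n = 0
    for y in range(h):
        for x in range(w):
            if img[y][x] and not labels[y][x]:
                n += 1                      # found a new region
                labels[y][x] = n
                q = deque([(y, x)])
                while q:                    # flood-fill its 4-neighbours
                    cy, cx = q.popleft()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if 0 <= ny < h and 0 <= nx < w \
                                and img[ny][nx] and not labels[ny][nx]:
                            labels[ny][nx] = n
                            q.append((ny, nx))
    return n, labels

n, _ = label_regions([[1, 1, 0],
                      [0, 0, 0],
                      [0, 1, 1]])
print(n)  # 2 connected regions
```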
Q4. What other features may be used to recognize objects?
While the authors use simple features for the purpose of illustration, one may use other features such as texture, color, specularities, and reflectance ratios.
Q5. What is the role of the sensor in the planning of a view?
With an active sensor, object recognition involves identification of a view of an object and if necessary, planning further views.
Q6. How many experiments have been done to demonstrate the effectiveness of using simple features and multiple views?
Over 100 experiments demonstrate the effectiveness of using simple features and multiple views even on a relatively complex class of objects with a high degree of ambiguity associated with a view of the object.
Q7. Why do they use a hierarchical representation scheme?
Due to the non-hierarchical nature of Hutchinson and Kak’s system [9], many redundant hypotheses are proposed, which have to be later removed through consistency checks.
Q8. What is the probability of a move until observation 3?
The sequence of moves until observation 3 could correspond to O4, O5, O6, and O7 with probabilities 0.877, 0.102, 0.014, and 0.007, respectively.
Q9. What is the role of the knowledge representation scheme in generating hypotheses?
The knowledge representation scheme should support an efficient mechanism to generate hypotheses on the basis of the evidence received.
Q10. How can the authors compute class Ci from their knowledge representation scheme?
The authors can compute P(C_i) from their knowledge representation scheme by considering each aspect node belonging to an object and testing if it has a link to node C_i; this takes O(N_C + N_a) time.
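A minimal sketch of that computation, assuming all aspects are a priori equally likely (the dictionaries are illustrative stand-ins for the paper's aspect and class nodes):

```python
def class_priors(aspects, class_ids):
    """P(C_i) from the knowledge representation: initialise a counter per
    class (O(N_C)), then scan every aspect node once and credit the class
    it links to (O(N_a)), so the whole pass is O(N_C + N_a)."""
    counts = {c: 0 for c in class_ids}
    for _aspect, (_obj, cls) in aspects.items():
        counts[cls] += 1
    total = sum(counts.values())
    return {c: counts[c] / total for c in counts}

aspects = {"a1": ("O1", "C1"), "a2": ("O1", "C2"), "a3": ("O2", "C1")}
print(class_priors(aspects, ["C1", "C2"]))  # C1 -> 2/3, C2 -> 1/3
```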
Q11. What is the probable aspect of the view?
If the view indeed corresponds to the most probable aspect at a particular stage, then their search process using primary and auxiliary moves is guaranteed to perform aspect resolution and uniquely identify the object in the following step, assuming no feature detection errors.
Q12. What is the overhead of tracking the region of interest?
Though Dickinson et al. [8] use Bayes nets for hypothesis generation, their system incurs the overhead of tracking the region of interest through successive frames.
Q13. What is the correct number for the first image?
In the first image in Fig. 13(b), due to the shadow of the wing on the fuselage of the aircraft, the feature detector detects four vertical lines instead of three, the correct number.
Q14. What is the way to recover from feature detection errors?
Their robust class recognition algorithm can recover from many feature detection errors at the class recognition phase itself (Section III-A-2).