Robust wide baseline stereo from maximally stable extremal regions
Summary (2 min read)
1 Introduction
- Finding reliable correspondences in two images of a scene taken from arbitrary viewpoints viewed with possibly different cameras and in different illumination conditions is a difficult and critical step towards fully automatic reconstruction of 3D scenes [5].
- Successful wide-baseline experiments on indoor and outdoor datasets presented in Section 4 demonstrate the potential of MSERs.
- Finding epipolar geometry consistent with the largest number of tentative correspondences is the final step of all wide-baseline algorithms.
- Baumberg [1] applied an iterative scheme originally proposed by Lindeberg and Garding to associate affine-invariant measurement regions with Harris interest points.
- Maximally Stable Extremal Regions are defined and their detection algorithm is described in Section 2.
2 Maximally Stable Extremal Regions
- The authors introduce a new type of image elements useful in wide-baseline matching — the Maximally Stable Extremal Regions.
- The concept can be explained informally as follows.
- Finally, intensity levels that are local minima of the rate of change of the area function are selected as thresholds producing maximally stable extremal regions.
- Every extremal region is a connected component of a thresholded image. (Footnote: even faster, but more complex, connected component algorithms exist with O(nα(n)) complexity, where α is the inverse Ackermann function; α(n) ≤ 4 for all practical n.)
- The output of the MSER detector is not a binarized image.
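The threshold-selection idea can be illustrated with a small brute-force sketch. It is not the authors' near-linear algorithm: it simply re-floods the image at every threshold. The stability measure q(i) = (|Q_{i+Δ}| - |Q_{i-Δ}|) / |Q_i| is one common formalisation of "rate of change of the area function" and is assumed here; the names `component_area`, `stability` and the parameter `delta` are hypothetical.

```python
from collections import deque

import numpy as np


def component_area(img, seed, t):
    """Area of the 4-connected component of {pixels <= t} that contains seed.

    Returns 0 when the seed pixel itself is brighter than t.
    """
    h, w = img.shape
    if img[seed] > t:
        return 0
    seen = np.zeros((h, w), dtype=bool)
    seen[seed] = True
    queue = deque([seed])
    area = 0
    while queue:
        r, c = queue.popleft()
        area += 1
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < h and 0 <= nc < w and not seen[nr, nc] and img[nr, nc] <= t:
                seen[nr, nc] = True
                queue.append((nr, nc))
    return area


def stability(areas, i, delta):
    """Relative area change around threshold i (assumed measure, see lead-in).

    Regions are nested as the threshold grows, so the size of the set
    difference equals the difference of the two areas.
    """
    if areas[i] == 0:
        return float("inf")
    return (areas[i + delta] - areas[i - delta]) / areas[i]


# Toy image: a dark 3x3 square (intensity 10) on a bright background (200).
img = np.full((7, 7), 200, dtype=np.uint8)
img[2:5, 2:5] = 10

areas = [component_area(img, (3, 3), t) for t in range(256)]
delta = 5
q = {t: stability(areas, t, delta) for t in range(delta, 256 - delta)}
```

In this toy image the dark square keeps the same area over a wide range of thresholds, so `q` is zero there; it rises sharply just before the square merges with the background.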
3 The proposed robust wide-baseline algorithm
- As a first step, the distinguished regions (DRs) are detected: the MSERs computed on the intensity image (MSER+) and on the inverted image (MSER-).
- Smaller measurement regions are both more likely to satisfy the planarity condition and not to cross a discontinuity in depth or orientation.
- In all experiments, rotational invariants (based on complex moments) were used after applying a transformation that diagonalises the covariance matrix of the DR.
- First, an affine transformation between pairs of potentially corresponding DRs, i.e. the DRs consistent with the rough EG, is computed.
- Next, DR correspondences are pruned and only those with correlation of their transformed images above a threshold are selected.
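The covariance-diagonalising step can be sketched as follows. This is a minimal illustration, assuming a region is given as a list of pixel coordinates and that the transformation maps the covariance to the identity (a whitening transform); the summary only says that the covariance matrix is diagonalised, so the exact normalisation is an assumption, and `normalising_transform` is a hypothetical name. After such a transform the remaining ambiguity is a rotation, which is why rotational invariants are a natural description.

```python
import numpy as np


def normalising_transform(points):
    """Return (A, mean) such that A @ (p - mean) has identity covariance."""
    pts = np.asarray(points, dtype=float)
    mean = pts.mean(axis=0)
    cov = np.cov((pts - mean).T)          # 2x2 covariance of the region
    evals, evecs = np.linalg.eigh(cov)    # cov = evecs @ diag(evals) @ evecs.T
    A = np.diag(1.0 / np.sqrt(evals)) @ evecs.T
    return A, mean


# Synthetic elongated, sheared point cloud standing in for a region.
rng = np.random.default_rng(0)
region = rng.normal(size=(500, 2)) @ np.array([[3.0, 1.0], [0.0, 0.5]])

A, mean = normalising_transform(region)
normalised = (region - mean) @ A.T
```

Two affinely related regions normalised this way differ only by a rotation (and possibly a reflection), under the assumption that their point sets correspond.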
4 Experiments
- The following experiments were conducted: Bookshelf (Fig. 1).
- The part of the scene visible in both views covers a small fraction of the image.
- The regions matched on the box demonstrate performance on a non-planar surface.
- The final number of correspondences is given in the penultimate column ’fine EG’.
- The precision of the estimated epipolar geometry is very high, much higher than the precision of the rough EG.
5 Conclusions
- In the paper, a new method for wide-baseline matching was proposed.
- The three main novelties are: the introduction of MSERs, robust matching of local features and the use of multiple scaled measurement regions.
- Another novelty of the approach is the use of a robust similarity measure for establishing tentative correspondences.
- The average distance from corresponding points to the epipolar line was below 0.09 of the inter-pixel distance.
- Test images included both outdoor and indoor scenes, some already used in published work.
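The reported accuracy figure is a point-to-epipolar-line distance. A minimal sketch of that measurement is given below, assuming a fundamental matrix `F` with the convention x2ᵀ F x1 = 0 and points in pixel coordinates; the function name and the toy data are hypothetical and are not taken from the paper's experiments.

```python
import numpy as np


def epipolar_distances(F, pts1, pts2):
    """Distance from each point in image 2 to the epipolar line of its match."""
    x1 = np.column_stack([pts1, np.ones(len(pts1))])   # homogeneous points, image 1
    x2 = np.column_stack([pts2, np.ones(len(pts2))])   # homogeneous points, image 2
    lines = x1 @ F.T                                    # row i is F @ x1[i] = (a, b, c)
    num = np.abs(np.sum(lines * x2, axis=1))            # |a*x + b*y + c|
    den = np.hypot(lines[:, 0], lines[:, 1])            # sqrt(a^2 + b^2)
    return num / den


# Toy geometry: pure horizontal camera translation, so epipolar lines are rows.
F = np.array([[0.0, 0.0, 0.0],
              [0.0, 0.0, -1.0],
              [0.0, 1.0, 0.0]])
pts1 = np.array([[10.0, 20.0], [30.0, 40.0]])
pts2 = np.array([[15.0, 20.0], [33.0, 40.5]])

d = epipolar_distances(F, pts1, pts2)
mean_distance = d.mean()
```

Averaging `d` over all correspondences gives a figure comparable in kind to the one quoted above; a symmetric version would also measure distances in the first image.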
Citations
46,906 citations
Cites background or methods from "Robust wide baseline stereo from ma..."
...In what appears to be the most affine-invariant method, Mikolajczyk (2002) has proposed and run detailed experiments with the Harris-affine detector....
[...]
...Matas et al. (2002) have shown that their maximally-stable extremal regions can produce large numbers of matching features with good stability....
[...]
43,540 citations
13,011 citations
12,449 citations
Cites background from "Robust wide baseline stereo from ma..."
...Key words: interest points, local features, feature description, camera calibration, object recognition PACS:...
[...]
6,938 citations
Additional excerpts
...The implementation details are given in [7]....
[...]
References
16,989 citations
"Robust wide baseline stereo from ma..." refers background in this paper
...Lowe [7] describes the ‘Scale Invariant Feature Transform’ approach which produces a scale and orientation-invariant characterisation of interest points....
[...]
...Recently, a whole class of stereo matching and object recognition algorithms with common structure has emerged [1,3,7,9,10,13,15,18,20,21]....
[...]
15,558 citations
14,282 citations
"Robust wide baseline stereo from ma..." refers background or methods in this paper
...Finding reliable correspondences in two images of a scene taken from arbitrary viewpoints viewed with possibly different cameras and in different illumination conditions is a difficult and critical step towards fully automatic reconstruction of 3D scenes [5]....
[...]
...After establishing the ‘rough EG’ the so-called ‘guided matching’ step is applied [2,5]....
[...]
4,983 citations
"Robust wide baseline stereo from ma..." refers background in this paper
...The structure of the above algorithm and of an efficient watershed algorithm [22] is essentially identical....
[...]
1,756 citations
"Robust wide baseline stereo from ma..." refers methods in this paper
...Since the influential paper by Schmid and Mohr [11] many image matching and wide-baseline stereo algorithms have been proposed, most commonly using Harris interest points as distinguished regions....
[...]
...Since the influential paper by Schmid and Mohr [16] many image matching and wide-baseline stereo algorithms have been proposed, most commonly using Harris interest points as DRs....
[...]
...Typically, DRs or their scaled version serve as measurement regions and tentative correspondences are established by comparing invariants using Mahalanobis distance [14,16,21]....
[...]
Frequently Asked Questions (18)
Q2. What have the authors contributed in "Robust wide baseline stereo from maximally stable extremal regions" ?
The wide-baseline stereo problem, i.e. the problem of establishing correspondences between a pair of images taken from different viewpoints, is studied. A new set of image elements that are put into correspondence, the so-called extremal regions, is introduced. Extremal regions possess highly desirable properties: the set is closed under (1) continuous (and thus projective) transformation of image coordinates and (2) monotonic transformation of image intensities. An efficient (near linear complexity) and practically fast detection algorithm (near frame rate) is presented for an affinely invariant stable subset of extremal regions, the maximally stable extremal regions (MSER).
Q3. What are the future works mentioned in the paper "Robust wide baseline stereo from maximally stable extremal regions" ?
In future work, the authors intend to proceed towards fully automatic projective reconstruction of the 3D scene, which requires computing projective reconstruction and dense matching.
Q4. What is the final step of all wide-baseline algorithms?
Finding epipolar geometry consistent with the largest number of tentative (local) correspondences is the final step of all wide-baseline algorithms.
Q5. What are the main novelties of the paper?
The three main novelties are: the introduction of MSERs, robust matching of local features and the use of multiple scaled measurement regions.
Q6. What is the common method of establishing tentative correspondences?
Typically, distinguished regions or their scaled versions serve as measurement regions, and tentative correspondences are established by comparing invariants using Mahalanobis distance [10, 16, 11].
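For reference, the Mahalanobis distance mentioned in this answer can be sketched as below. The covariance matrix `cov` of the invariant vectors is assumed to be known or estimated elsewhere; the function name and the numbers are illustrative only.

```python
import numpy as np


def mahalanobis(u, v, cov):
    """Mahalanobis distance between invariant vectors u and v under covariance cov."""
    diff = np.asarray(u, dtype=float) - np.asarray(v, dtype=float)
    # Solve cov @ y = diff instead of forming the explicit inverse.
    return float(np.sqrt(diff @ np.linalg.solve(cov, diff)))


cov = np.array([[4.0, 0.0],
                [0.0, 1.0]])                       # first component is noisier
d_a = mahalanobis([2.0, 0.0], [0.0, 0.0], cov)     # Euclidean distance 2
d_b = mahalanobis([0.0, 1.0], [0.0, 0.0], cov)     # Euclidean distance 1
```

Both pairs end up at the same Mahalanobis distance because the difference along the noisier component is down-weighted.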
Q8. What are the main design decisions at this stage?
Important design decisions at this stage include: 1. the choice of measurement regions, i.e. the parts of the image on which invariants are computed, 2. the method of selecting tentative correspondences given the invariant description and 3.
Q9. What is the definition of a merge of two components?
A merge of two components is viewed as termination of existence of the smaller component and an insertion of all pixels of the smaller component into the larger one.
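This merge rule matches the "union by size" strategy of a union-find (disjoint-set) structure. A minimal sketch follows, assuming pixels are indexed 0..n-1; the class and method names are hypothetical, and the bookkeeping of area as a function of intensity that the detector needs is omitted.

```python
class Components:
    """Disjoint-set forest in which the smaller component is absorbed by the larger."""

    def __init__(self, n):
        self.parent = list(range(n))   # each pixel starts as its own component
        self.size = [1] * n

    def find(self, x):
        """Return the representative of the component containing x."""
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def merge(self, a, b):
        """Merge the components of a and b; the smaller one ceases to exist."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return ra
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra            # make ra the larger component
        self.parent[rb] = ra           # insert all pixels of the smaller into the larger
        self.size[ra] += self.size[rb]
        return ra


comps = Components(5)
comps.merge(0, 1)
comps.merge(1, 2)
root = comps.merge(3, 0)   # the single pixel 3 is absorbed by {0, 1, 2}
```

Always absorbing the smaller component keeps the trees shallow, which is what gives union-find its near-linear behaviour.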
Q10. What is the procedure for determining the EG?
First, an affine transformation between pairs of potentially corresponding DRs, i.e. the DRs consistent with the rough EG, is computed.
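The summary does not say how this affine transformation is computed. One plausible way, shown purely as an assumption, is a least-squares fit to corresponding points of the two regions (for example boundary points); `estimate_affine` and the sample data are hypothetical.

```python
import numpy as np


def estimate_affine(src, dst):
    """Least-squares affine map (A, t) with dst ≈ src @ A.T + t.

    Needs at least three non-collinear point correspondences.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    X = np.column_stack([src, np.ones(len(src))])       # N x 3 design matrix
    params, *_ = np.linalg.lstsq(X, dst, rcond=None)    # 3 x 2 solution
    A = params[:2].T                                    # 2 x 2 linear part
    t = params[2]                                       # translation
    return A, t


# Synthetic check: generate points with a known affine map and recover it.
A_true = np.array([[1.2, 0.3],
                   [-0.1, 0.9]])
t_true = np.array([5.0, -2.0])
src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 3.0]])
dst = src @ A_true.T + t_true

A_est, t_est = estimate_affine(src, dst)
```

With the map in hand, one region's image patch can be warped onto the other before computing their correlation.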
Q11. What is the advantage of a robust MR matching algorithm?
Since matching is accomplished in a robust manner, the authors benefit from the increase of distinctiveness of large regions without being severely affected by clutter or non-planarity of the DR’s pre-image.
Q12. What is the important paper by Schmid and Mohr?
Since the influential paper by Schmid and Mohr [11] many image matching and wide-baseline stereo algorithms have been proposed, most commonly using Harris interest points as distinguished regions.
Q13. What is the definition of a good measurement?
A measurement taken from an almost planar patch of the scene with stable invariant description will be referred to as a ’good measurement’.
Q14. What is the description of the proposed similarity measure?
The robustness of the proposed similarity measure allows us to use invariants from a collection of measurement regions, even some that are much larger than the associated distinguished region.
Q15. Why did the authors have to consider invariants from multiple measurement regions?
Due to the robustness, the authors were able to consider invariants from multiple measurement regions, even some that were significantly larger (and hence probably more discriminative) than the associated MSER.
Q16. What is the probability of the success of the procedure?
Probabilistic analysis of the likelihood of the success of the procedure is not simple, since the distribution of invariants and their noise is image-dependent.
Q17. What is the way to define a MSER?
Finally the authors remark that MSERs can be defined on any image (even high-dimensional) whose pixel values are from a totally ordered set.
Q18. What is the way to find a reliable correspondence between two images?
In the wide-baseline set-up, local image deformations cannot be realistically approximated by translation or translation with rotation and a full affine model is required.