Factorization methods for projective structure and motion
Summary (3 min read)
1 Introduction
- There has been considerable progress on scene reconstruction from multiple images in the last few years, aimed at applications ranging from very precise industrial measurement systems with several fixed cameras, to approximate structure and motion from real time video for active robot navigation.
- The key result is that projective reconstruction is the best that can be done without calibration or metric information about the scene, and that it is possible from at least two views of point-scenes or three views of line-scenes [2, 3, 8, 6].
- The unknown projective depths are exactly the scale factors that are missing from the measurement matrix before it can be factorized.
- However, projective factorization methods also require a depth recovery phase that is not present in the affine case.
- For matrices of fixed low rank r (as here, where the rank is 3 for the affine method or 4 for the projective one), approximate factorizations can be computed in time O(mnr), i.e. directly proportional to the size of the input data.
2 Point Reconstruction
- Modulo some scale factors $\lambda_{ip}$, the image points are projected from the world points: $\lambda_{ip} x_{ip} = P_i X_p$.
- The $\lambda$’s ‘cancel out’ the arbitrary scales of the image points, but there is still the freedom to: (i) arbitrarily rescale each world point $X_p$ and each projection $P_i$; (ii) apply an arbitrary nonsingular $4 \times 4$ projective deformation $T$: $X_p \to T X_p$, $P_i \to P_i T^{-1}$.
- Modulo changes of the $\lambda_{ip}$, the image projections are invariant under both of these transformations.
- The scale factors $\lambda_{ip}$ will be called projective depths.
- In fact, [18, 19] argues that just as the key to calibrated stereo reconstruction is the recovery of Euclidean depth, the essence of projective reconstruction is precisely the recovery of a coherent set of projective depths modulo overall projection and world point rescalings.
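The projection equation and the two remaining freedoms can be checked numerically. The following is a minimal NumPy sketch on synthetic data; the random cameras and points, the array layout and all variable names are assumptions made for illustration and do not come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 4, 12  # number of images and of 3D points (arbitrary demo sizes)

# Hypothetical random scene: 3x4 projections P_i and homogeneous points X_p.
P = rng.standard_normal((m, 3, 4))
X = np.vstack([rng.standard_normal((3, n)), np.ones((1, n))])

# Unscaled projections P_i X_p, stored with shape (m, 3, n).
PX = P @ X

# Image points normalised so that their third coordinate is 1. With this
# (assumed) normalisation the projective depth lambda_ip is simply the
# third coordinate of P_i X_p.
lam = PX[:, 2, :]                    # (m, n)
x = PX / lam[:, None, :]             # (m, 3, n)

# Rescaled measurement matrix W (3m x n) built from lambda_ip * x_ip,
# and the same image points stacked without their depths.
W = (lam[:, None, :] * x).reshape(3 * m, n)
W_raw = x.reshape(3 * m, n)

rank_W = np.linalg.matrix_rank(W, tol=1e-9)
rank_raw = np.linalg.matrix_rank(W_raw, tol=1e-9)

# Freedom (ii): X_p -> T X_p, P_i -> P_i T^{-1} leaves every projection unchanged.
T = rng.standard_normal((4, 4))
PX_T = (P @ np.linalg.inv(T)) @ (T @ X)
invariant = np.allclose(PX, PX_T)

print("rank with depths:", rank_W)
print("rank without depths:", rank_raw)
print("projections invariant under T:", invariant)
```

With the correct depths the stacked matrix factors as the product of the stacked projections and the point matrix, so its rank is at most 4; without them the same image points generally do not admit such a factorization.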
2.1 Factorization
- One practical method of factorizing $W$ is the Singular Value Decomposition [12].
- The decomposition is unique when the singular values are distinct, and can be computed stably and reliably in time $O(kl \min(k, l))$ for a $k \times l$ matrix.
- Ideally, one would like to find reconstructions in time $O(mn)$ (the size of the input data).
- Rank r matrices can be factorized in ‘output sensitive’ time O(mnr).
- The method repeatedly sweeps the matrix, at each sweep guessing and subtracting a column-vector that ‘explains’ as much as possible of the residual error in the matrix columns.
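The two factorization routes described above can be compared on a small example. This is only a sketch: the summary does not say how each sweep "guesses" its column-vector, so the rule used here (take the residual column with the largest norm) is an assumption, as are the function names `svd_factorize` and `sweep_factorize`. The final small SVD over `2r` extracted columns follows the refinement mentioned in the FAQ below.

```python
import numpy as np

def svd_factorize(W, r):
    """Best rank-r factorization W ~ A @ B from a truncated SVD."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U[:, :r], s[:r, None] * Vt[:r]

def sweep_factorize(W, r, sweeps=None):
    """Approximate rank-r factorization by repeated sweeps (illustrative).

    Each sweep costs O(kl) for a k x l matrix, so r-proportional sweeps
    cost O(klr) in total.
    """
    sweeps = 2 * r if sweeps is None else sweeps
    R = np.array(W, dtype=float)               # residual matrix (a copy of W)
    total = np.sum(R * R)
    basis = []
    for _ in range(sweeps):
        norms = np.einsum("ij,ij->j", R, R)    # squared norms of residual columns
        j = int(np.argmax(norms))
        if norms[j] <= 1e-24 * total:          # nothing left to explain
            break
        q = R[:, j] / np.sqrt(norms[j])        # guessed direction (assumed rule)
        R -= np.outer(q, q @ R)                # subtract what q explains
        basis.append(q)
    Q, _ = np.linalg.qr(np.column_stack(basis))  # re-orthonormalise the guesses
    C = Q.T @ W                                # small (<= 2r) x l coefficient matrix
    U, s, Vt = np.linalg.svd(C, full_matrices=False)
    return Q @ U[:, :r], s[:r, None] * Vt[:r]  # best r combinations of the guesses

def rel_error(W, A, B):
    return np.linalg.norm(W - A @ B) / np.linalg.norm(W)

rng = np.random.default_rng(0)
W_exact = rng.standard_normal((30, 4)) @ rng.standard_normal((4, 50))
W_noisy = W_exact + 1e-4 * rng.standard_normal(W_exact.shape)

err_exact_sweep = rel_error(W_exact, *sweep_factorize(W_exact, 4))
err_noisy_svd = rel_error(W_noisy, *svd_factorize(W_noisy, 4))
err_noisy_sweep = rel_error(W_noisy, *sweep_factorize(W_noisy, 4))
print(err_exact_sweep, err_noisy_svd, err_noisy_sweep)
```

Because the truncated SVD is the optimal rank-r approximation in the least-squares sense, the sweep-based result can match it but not beat it; the point of the sweeps is the lower cost on large matrices.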
2.2 Projective Depth Recovery
- The key technical advance that makes this work possible is a practical method for estimating the projective depths using fundamental matrices and epipoles obtained from the image data.
- These turn out to be the expressions for the fundamental matrix $F_{ij}$ and epipole $e_{ji}$ of camera $j$ in image $i$ in terms of projection matrix components [19, 4].
- The two methods give similar results except when there are many (>40) images, when the shorter chains of the parallel system become more robust.
- Theoretically this is not a problem as the overall scales are arbitrary, but it could easily make the factorization phase numerically ill-conditioned.
- For each point the depths are estimated as above, and then: (i) each row of the estimated depth matrix is rescaled to have length $\sqrt{n}$; (ii) each column of the resulting matrix is rescaled to length $\sqrt{m}$.
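Depth recovery and balancing can be sketched as follows. Several things here are assumptions for the sake of a self-checking demo: the fundamental matrix and epipole are built from known synthetic projections as $F = [e]_\times P_i P_j^{+}$ with $e = P_i C_j$ (the paper estimates them from image data), which gives the consistently scaled relation $F x_{jp} \lambda_{jp} = (e \wedge x_{ip}) \lambda_{ip}$; every image is linked directly to the first one (a "parallel" chain); and the helper names are hypothetical.

```python
import numpy as np

def skew(v):
    """Matrix [v]_x such that skew(v) @ w == np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def epipolar_pair(P_i, P_j):
    """Consistently scaled F and epipole of camera j in image i (demo only)."""
    centre_j = np.linalg.svd(P_j)[2][-1]       # null vector: P_j @ centre_j = 0
    e = P_i @ centre_j
    F = skew(e) @ P_i @ np.linalg.pinv(P_j)    # F x_j lam_j = (e ^ x_i) lam_i
    return F, e

def recover_depths(x, P):
    """Depths lam[i, p], with every image chained directly to image 0."""
    m, _, n = x.shape
    lam = np.ones((m, n))                      # arbitrary start: lam[0, p] = 1
    for i in range(1, m):
        F, e = epipolar_pair(P[i], P[0])
        a = np.cross(e, x[i].T).T              # e ^ x_ip, shape (3, n)
        b = F @ x[0]                           # F x_0p, shape (3, n)
        lam[i] = (a * b).sum(axis=0) / (a * a).sum(axis=0) * lam[0]
    return lam

def balance_depths(lam, iterations=3):
    """Alternately rescale rows to length sqrt(n) and columns to sqrt(m)."""
    m, n = lam.shape
    lam = lam.copy()
    for _ in range(iterations):
        lam *= np.sqrt(n) / np.linalg.norm(lam, axis=1, keepdims=True)
        lam *= np.sqrt(m) / np.linalg.norm(lam, axis=0, keepdims=True)
    return lam

# Synthetic, noise-free scene (hypothetical cameras roughly facing the points).
rng = np.random.default_rng(1)
m, n = 5, 20
R = np.eye(3) + 0.3 * rng.standard_normal((m, 3, 3))
t = rng.standard_normal((m, 3, 1)) + np.array([[0.0], [0.0], [5.0]])
P = np.concatenate([R, t], axis=2)             # (m, 3, 4)
X = np.vstack([rng.uniform(-1, 1, (3, n)), np.ones((1, n))])
PX = P @ X
x = PX / PX[:, 2:3, :]                         # image points, third coordinate 1

lam = balance_depths(recover_depths(x, P))
W = (lam[:, None, :] * x).reshape(3 * m, n)    # rescaled measurement matrix
s = np.linalg.svd(W, compute_uv=False)
print("fifth / first singular value:", s[4] / s[0])
```

Row and column rescalings of the depth matrix only rescale images and points, so balancing changes the conditioning of $W$ but not its rank.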
3 Line Reconstruction
- 3D lines can also be reconstructed using the above techniques.
- In fact, epipolar transfer and depth recovery can be done in one step.
- Let $y_i$ stand for the rescaled via-points $P_i Y$.
- The required fundamental matrices cannot be found directly from line matches, but they can be estimated from point matches, or from the trilinear line matching constraints (trivalent tensor) [6, 14, 4, 19, 18].
- This works with the $3m \times 2n_{\text{lines}}$ ‘W’ matrix of via-points, iteratively rescaling all coordinates of each image (triple of rows) and all coordinates of each line (pair of columns) until an approximate equilibrium is reached, where the overall mean square size of each coordinate is O(1) in each case.
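The one-step transfer of via-points can be illustrated in the same synthetic setting. The transfer formula used below, $y_i = -\,l_i \wedge (F y_0) / (l_i \cdot e)$, is derived here from the assumed convention $F = [e]_\times P_i P_0^{+}$, $e = P_i C_0$; it should be read as a sketch consistent with that convention and not as a quotation of the paper's equation (2). Choosing the via-points as intersections of each line with the two coordinate axes is also an arbitrary demo choice.

```python
import numpy as np

def skew(v):
    """Matrix [v]_x such that skew(v) @ w == np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def epipolar_pair(P_i, P_j):
    """Consistently scaled F and epipole of camera j in image i (demo only)."""
    centre_j = np.linalg.svd(P_j)[2][-1]
    e = P_i @ centre_j
    return skew(e) @ P_i @ np.linalg.pinv(P_j), e

def transfer_via_point(y_0, l_i, F, e):
    """Intersect the epipolar line F y_0 with l_i, with depth scale included."""
    return -np.cross(l_i, F @ y_0) / (l_i @ e)

rng = np.random.default_rng(2)
m, n_lines = 4, 8
R = np.eye(3) + 0.3 * rng.standard_normal((m, 3, 3))
t = rng.standard_normal((m, 3, 1)) + np.array([[0.0], [0.0], [5.0]])
P = np.concatenate([R, t], axis=2)

# Each 3D line is spanned by two hypothetical points A_k, B_k; its image in
# view i is the cross product of their projections.
A = np.vstack([rng.uniform(-1, 1, (3, n_lines)), np.ones((1, n_lines))])
B = np.vstack([rng.uniform(-1, 1, (3, n_lines)), np.ones((1, n_lines))])
lines = np.cross(P @ A, P @ B, axis=1)         # (m, 3, n_lines)

aux_lines = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])  # the lines x=0, y=0
pairs = [epipolar_pair(P[i], P[0]) for i in range(1, m)]

W = np.zeros((3 * m, 2 * n_lines))             # 3m x 2n_lines via-point matrix
for k in range(n_lines):
    for v, aux in enumerate(aux_lines):
        col = 2 * k + v
        y_0 = np.cross(lines[0, :, k], aux)    # a via-point on the line in image 0
        y_0 /= np.linalg.norm(y_0)
        W[0:3, col] = y_0
        for i, (F, e) in enumerate(pairs, start=1):
            W[3 * i:3 * i + 3, col] = transfer_via_point(y_0, lines[i, :, k], F, e)

s = np.linalg.svd(W, compute_uv=False)
print("fifth / first singular value:", s[4] / s[0])
```

Each transferred via-point is the projection of a single 3D point on the line, so the stacked matrix again has rank at most 4 and can be factorized in the same way as the point measurement matrix.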
4 Implementation
- This section summarizes the complete algorithm for factorization-based 3D projective reconstruction from image points and lines, and discusses a few important implementation details and variants.
- Build and balance the depth matrix $\lambda_{ip}$, and use it to build the rescaled point measurement matrix $W$.
- 4) For each line choose two via-points and transfer them to the other images using the transfer equations (2).
- 6) Un-standardize the projection matrices (see below).
- The basic idea is to choose working coordinates that reflect the least squares trade-offs implicit in the factorization algorithm.
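The standardize and un-standardize steps can be sketched as a fixed change of image coordinates. The summary does not give the exact scaling, so the choice below (map pixel coordinates to roughly $[-1, 1] \times [-1, 1]$ around the image centre) and the function names are assumptions.

```python
import numpy as np

def standardizing_transform(width, height):
    """3x3 matrix mapping pixel coordinates to roughly [-1, 1] x [-1, 1]."""
    s = 2.0 / max(width, height)
    return np.array([[s, 0.0, -s * width / 2.0],
                     [0.0, s, -s * height / 2.0],
                     [0.0, 0.0, 1.0]])

def unstandardize_projection(P_std, T):
    """If x_std = T x_pix, then P_pix = T^{-1} P_std."""
    return np.linalg.solve(T, P_std)

T = standardizing_transform(512, 512)
corners_pix = np.array([[0.0, 512.0],
                        [0.0, 512.0],
                        [1.0, 1.0]])           # two opposite image corners
corners_std = T @ corners_pix

# Round trip on a hypothetical projection matrix expressed in pixel coordinates.
P_pix = np.random.default_rng(3).standard_normal((3, 4))
P_back = unstandardize_projection(T @ P_pix, T)
print(corners_std)
```

Working in such coordinates keeps all entries of the measurement matrix at a comparable size, which matters because the factorization minimizes an unweighted least-squares error.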
4.1 Generalizations & Variants
- I have implemented and experimented with a number of variants of the above algorithm, the more promising of which are featured in the experiments described below.
- The projective depths depend on the 3D structure, which in turn derives from the depths.
- With SVD-based factorization and standardized image coordinates the iteration turns out to be extremely stable, and always improves the recovered structure slightly (often significantly for lines).
- The ‘linear’ factorization-based projective reconstruction methods described above are a suitable starting point for more refined nonlinear least-squares estimation.
- This can take account of image point error models, camera calibrations, or Euclidean constraints, as in the work of Szeliski and Kang [16], Hartley [5] and Mohr, Boufama and Brand [10].
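The circular dependence between depths and structure suggests an alternation: factorize, re-estimate the depths from the reprojected points, and refactorize. The sketch below shows one plausible form of such an iteration; the least-squares depth update, the single balancing pass per iteration and the names `polish_depths` and `rank4_residual` are assumptions, not the paper's stated procedure.

```python
import numpy as np

def build_W(lam, x):
    """Stack lambda_ip * x_ip into the 3m x n rescaled measurement matrix."""
    m, _, n = x.shape
    return (lam[:, None, :] * x).reshape(3 * m, n)

def balance(lam):
    """One pass: rows to length sqrt(n), then columns to length sqrt(m)."""
    m, n = lam.shape
    lam = lam * (np.sqrt(n) / np.linalg.norm(lam, axis=1, keepdims=True))
    return lam * (np.sqrt(m) / np.linalg.norm(lam, axis=0, keepdims=True))

def polish_depths(x, lam, iterations=10):
    """Alternate rank-4 SVD factorization with depth re-estimation."""
    m, _, n = x.shape
    lam = balance(lam)
    for _ in range(iterations):
        U, s, Vt = np.linalg.svd(build_W(lam, x), full_matrices=False)
        W4 = (U[:, :4] * s[:4]) @ Vt[:4]       # rank-4 structure and motion
        proj = W4.reshape(m, 3, n)             # reprojected points P_i X_p
        # Least-squares depth: scale of x_ip that best matches its reprojection.
        lam = balance((x * proj).sum(axis=1) / (x * x).sum(axis=1))
    return lam

def rank4_residual(lam, x):
    s = np.linalg.svd(build_W(lam, x), compute_uv=False)
    return s[4] / s[0]

# Synthetic, noise-free scene (hypothetical cameras and points).
rng = np.random.default_rng(4)
m, n = 5, 20
R = np.eye(3) + 0.3 * rng.standard_normal((m, 3, 3))
t = rng.standard_normal((m, 3, 1)) + np.array([[0.0], [0.0], [5.0]])
P = np.concatenate([R, t], axis=2)
X = np.vstack([rng.uniform(-1, 1, (3, n)), np.ones((1, n))])
PX = P @ X
lam_true = PX[:, 2, :]
x = PX / PX[:, 2:3, :]

lam_start = np.ones((m, n))                    # crude initial depths
print("residual before:", rank4_residual(balance(lam_start), x))
print("residual after: ", rank4_residual(polish_depths(x, lam_start), x))
```

Exact depths are a fixed point of this update, since the rank-4 approximation then reproduces $W$ and the re-estimated depths are unchanged up to balancing. How far the iteration improves a poor starting point is an empirical question that this sketch does not settle.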
5 Experiments
- To quantify the performance of the various algorithms, I have run a large number of simulations using synthetic data, and also tested the algorithms on manually matched primitives derived from real images.
- Reconstruction error is measured over 50 trials, after least-squares projective alignment with the true 3D structure.
- Figure 1: Mean 3D reconstruction error for points and lines, vs. noise, number of views and number of primitives. Legend (lines panel): parallel trilinear SVD, serial bilinear SVD, parallel bilinear SVD, iterative bilinear SVD, bilinear SVD + L-M.
- Iterating the SVD makes a small improvement, and nonlinear least-squares is slightly more accurate again.
- The rapid increase in error at scales below 0.1 is caused by floating-point truncation error.
6 Discussion & Conclusions
- Within the limitations of the factorization paradigm, factorization-based projective reconstruction seems quite successful.
- For points, the methods studied have proved simple, stable, and surprisingly accurate.
- Fixed-rank factorization works well, although (as might be expected) SVD always produces slightly more accurate results.
- All of these allow various trade-offs between redundancy, computation and implementation effort.
- Projective structure and motion can be recovered from multiple perspective images of a scene consisting of points and lines, by estimating fundamental matrices and epipoles from the image data, using these to rescale the image measurements, and then factorizing the resulting rescaled measurement matrix using either SVD or a fast approximate factorization algorithm.
References
"Factorization methods for projectiv..." refers methods in this paper
...The standard workhorse for such problems is Levenberg-Marquardt iteration [12], so for comparison with the linear methods I have implemented simple L-M based projective reconstruction algorithms....
Author manuscript (inria-00548364, version 1 - 20 Dec 2010), published in "International Conference on Computer Vision & Pattern Recognition (CVPR '96) (1996) 845--851"
...Perhaps the most significant result in the paper is the extension of the method to work for lines as well as points, but I will also show how the factorization can be iteratively ‘polished’ (with results similar to nonlinear least squares iteration), and how any factorization-based method can be speeded up significantly for large problems, by using an approximate fixed-rank factorization technique in place of the Singular Value Decomposition....
Frequently Asked Questions (13)
Q2. What are the future works mentioned in the paper "Factorization methods for projective structure and motion" ?
Future work will expand on this.
Q3. How are the fundamental matrices and epipoles estimated?
Fundamental matrices and epipoles are estimated using the linear least squares method with all the available point matches, followed by a supplementary SVD to project the fundamental matrices to rank 2 and find the epipoles.
Q4. What is the main reason for the expansion of the epipolar constraint?
As part of the current blossoming of interest in multi-image reconstruction, Shashua [14] recently extended the well-known two-image epipolar constraint to a trilinear constraint between matching points in three images.
Q5. How can the authors recover projective structure and motion from multiple perspective images?
Projective structure and motion can be recovered from multiple perspective images of a scene consisting of points and lines, by estimating fundamental matrices and epipoles from the image data, using these to rescale the image measurements, and then factorizing the resulting rescaled measurement matrix using either SVD or a fast approximate factorization algorithm.
Q6. What is the key technical advance that makes this work possible?
The key technical advance that makes this work possible is a practical method for estimating these using fundamental matrices and epipoles obtained from the image data.
Q7. How many points can be recovered from a scene?
The authors need to recover 3D structure (point locations) and motion (camera calibrations and locations) from m uncalibrated perspective images of a scene containing n 3D points.
Q8. What are the key attractions of the factorization paradigm?
The factorization paradigm has two key attractions that are only enhanced by moving from the affine to the projective case: (i) All of the data in all of the images is treated uniformly — there is no need to single out ‘privileged’ features or images for special treatment; (ii) No initialization is required and convergence is virtually guaranteed by the nature of the numerical methods used.
Q9. How can the authors find the depths for each point p?
With such a non-redundant set of equations the depths for each point p can be found trivially by chaining together the solutions for each image, starting from some arbitrary initial value such as $\lambda_{1p} = 1$.
Q10. What is the way to estimate the r combinations of columns?
When the matrix is not exactly of rank r the guesses are not quite optimal and it is useful to include further sweeps (say 2r in total) and then SVD the matrix of extracted columns to estimate the best r combinations of them.
Q11. How can one factorize a rank r matrix?
Although SVD is probably near-optimal for full-rank matrices, rank r matrices can be factorized in ‘output sensitive’ time O(mnr).
Q12. What is the solution to the error modelling problem?
There is no obvious solution to the error modelling problem, beyond using the factorization to initialize a nonlinear least squares routine (as is done in some of the experiments below).
Q13. What is the underlying theory of projective depth recovery?
The full theory of projective depth recovery applies equally to two, three and four image matching tensors, but throughout this paper the author will concentrate on the two-image (fundamental matrix) case for simplicity.