
Factorization methods for projective structure and motion

18 Jun 1996, pp. 845-851

Summary

1 Introduction

  • There has been considerable progress on scene reconstruction from multiple images in the last few years, aimed at applications ranging from very precise industrial measurement systems with several fixed cameras, to approximate structure and motion from real time video for active robot navigation.
  • The key result is that projective reconstruction is the best that can be done without calibration or metric information about the scene, and that it is possible from at least two views of point-scenes or three views of line-scenes [2, 3, 8, 6].
  • These are exactly the missing factorization scales mentioned above.
  • However they also require a depth recovery phase that is not present in the affine case.
  • For matrices of fixed low rank r (as here, where the rank is 3 for the affine method or 4 for the projective one), approximate factorizations can be computed in time O(mnr), i.e. directly proportional to the size of the input data.

2 Point Reconstruction

  • Modulo some scale factors λ_ip, the image points are projected from the world points: λ_ip x_ip = P_i X_p.
  • The λ's 'cancel out' the arbitrary scales of the image points, but there is still the freedom to: (i) arbitrarily rescale each world point X_p and each projection P_i; (ii) apply an arbitrary nonsingular 4×4 projective deformation T: X_p → T X_p, P_i → P_i T^{-1}.
  • Modulo changes of the λ_ip, the image projections are invariant under both of these transformations.
  • The scale factors λ_ip will be called projective depths.
  • In fact, [18, 19] argues that just as the key to calibrated stereo reconstruction is the recovery of Euclidean depth, the essence of projective reconstruction is precisely the recovery of a coherent set of projective depths modulo overall projection and world point rescalings.

2.1 Factorization

  • One practical method of factorizing W is the Singular Value Decomposition [12].
  • The decomposition is unique when the singular values are distinct, and can be computed stably and reliably in time O(kl·min(k, l)).
  • Ideally, one would like to find reconstructions in time O(mn) (the size of the input data).
  • Rank r matrices can be factorized in 'output sensitive' time O(mnr).
  • The method repeatedly sweeps the matrix, at each sweep guessing and subtracting a column-vector that ‘explains’ as much as possible of the residual error in the matrix columns.

2.2 Projective Depth Recovery

  • The key technical advance that makes this work possible is a practical method for estimating these using fundamental matrices and epipoles obtained from the image data.
  • These turn out to be the expressions for the fundamental matrix F_ij and epipole e_ji of camera j in image i in terms of projection matrix components [19, 4].
  • The two methods give similar results except when there are many (>40) images, when the shorter chains of the parallel system become more robust.
  • Theoretically this is not a problem as the overall scales are arbitrary, but it could easily make the factorization phase numerically ill-conditioned.
  • For each point the depths are estimated as above, and then: (i) each row of the estimated depth matrix is rescaled to have length √n; (ii) each column of the resulting matrix is rescaled to length √m.

3 Line Reconstruction

  • 3D lines can also be reconstructed using the above techniques.
  • In fact, epipolar transfer and depth recovery can be done in one step.
  • Let y_i stand for the rescaled via-points P_i Y.
  • The required fundamental matrices can not be found directly from line matches, but they can be estimated from point matches, or from the trilinear line matching constraints (trivalent tensor) [6, 14, 4, 19, 18].
  • This works with the 3m × 2n_lines 'W' matrix of via-points, iteratively rescaling all coordinates of each image (triple of rows) and all coordinates of each line (pair of columns) until an approximate equilibrium is reached, where the overall mean square size of each coordinate is O(1) in each case.

4 Implementation

  • This section summarizes the complete algorithm for factorization-based 3D projective reconstruction from image points and lines, and discusses a few important implementation details and variants.
  • Build and balance the depth matrix λ_ip, and use it to build the rescaled point measurement matrix W.
  • 4) For each line choose two via-points and transfer them to the other images using the transfer equations (2).
  • 6) Un-standardize the projection matrices (see below).
  • The basic idea is to choose working coordinates that reflect the least squares trade-offs implicit in the factorization algorithm.

4.1 Generalizations & Variants

  • I have implemented and experimented with a number of variants of the above algorithm, the more promising of which are featured in the experiments described below.
  • The projective depths depend on the 3D structure, which in turn derives from the depths.
  • With SVD-based factorization and standardized image coordinates the iteration turns out to be extremely stable, and always improves the recovered structure slightly (often significantly for lines).
  • The ‘linear’ factorization-based projective reconstruction methods described above are a suitable starting point for more refined nonlinear least-squares estimation.
  • This can take account of image point error models, camera calibrations, or Euclidean constraints, as in the work of Szeliski and Kang [16], Hartley [5] and Mohr, Boufama and Brand [10].

5 Experiments

  • To quantify the performance of the various algorithms, I have run a large number of simulations using synthetic data, and also tested the algorithms on manually matched primitives derived from real images.
  • Reconstruction error is measured over 50 trials, after least-squares projective alignment with the true 3D structure.
  • Figure 1: Mean 3D reconstruction error for points and lines, vs. noise, number of views and number of primitives. (Methods compared: parallel trilinear SVD, serial bilinear SVD, parallel bilinear SVD, iterative bilinear SVD, and bilinear SVD + Levenberg-Marquardt.)
  • Iterating the SVD makes a small improvement, and nonlinear least-squares is slightly more accurate again.
  • The rapid increase in error at scales below 0.1 is caused by floating-point truncation error.

6 Discussion & Conclusions

  • Within the limitations of the factorization paradigm, factorization-based projective reconstruction seems quite successful.
  • For points, the methods studied have proved simple, stable, and surprisingly accurate.
  • Fixed-rank factorization works well, although (as might be expected) SVD always produces slightly more accurate results.
  • All of these allow various trade-offs between redundancy, computation and implementation effort.
  • Projective structure and motion can be recovered from multiple perspective images of a scene consisting of points and lines, by estimating fundamental matrices and epipoles from the image data, using these to rescale the image measurements, and then factorizing the resulting rescaled measurement matrix using either SVD or a fast approximate factorization algorithm.


HAL Id: inria-00548364
https://hal.inria.fr/inria-00548364
Submitted on 20 Dec 2010
Factorization Methods for Projective Structure and Motion
Bill Triggs
To cite this version:
Bill Triggs. Factorization Methods for Projective Structure and Motion. International Conference on Computer Vision & Pattern Recognition (CVPR '96), Jun 1996, San Francisco, United States. pp. 845-851, 10.1109/CVPR.1996.517170. inria-00548364

Factorization Methods for Projective Structure and Motion
Bill Triggs
INRIA Rhône-Alpes, 655 avenue de l'Europe, 38330 Montbonnot Saint-Martin, France.
Bill.Triggs@inrialpes.fr http://www.inrialpes.fr/MOVI/Triggs
Abstract
This paper describes a family of factorization-based algorithms that recover 3D projective structure and motion from multiple uncalibrated perspective images of 3D points and lines. They can be viewed as generalizations of the Tomasi-Kanade algorithm from affine to fully perspective cameras, and from points to lines. They make no restrictive assumptions about scene or camera geometry, and unlike most existing reconstruction methods they do not rely on 'privileged' points or images. All of the available image data is used, and each feature in each image is treated uniformly. The key to projective factorization is the recovery of a consistent set of projective depths (scale factors) for the image points: this is done using fundamental matrices and epipoles estimated from the image data. We compare the performance of the new techniques with several existing ones, and also describe an approximate factorization method that gives similar results to SVD-based factorization, but runs much more quickly for large problems.

Keywords: Multi-image Structure, Projective Reconstruction, Matrix Factorization.
1 Introduction
There has been considerable progress on scene reconstruction from multiple images in the last few years, aimed at applications ranging from very precise industrial measurement systems with several fixed cameras, to approximate structure and motion from real time video for active robot navigation. One can usefully begin by ignoring the issues of camera calibration and metric structure, initially recovering the scene up to an overall projective transformation and only later adding metric information if needed [5, 10, 1]. The key result is that projective reconstruction is the best that can be done without calibration or metric information about the scene, and that it is possible from at least two views of point-scenes or three views of line-scenes [2, 3, 8, 6].
Most current reconstruction methods either work only for the minimal number of views (typically two), or single out a few 'privileged' views for initialization before bootstrapping themselves to the multi-view case [5, 10, 9]. For robustness and accuracy, there is a need for methods that uniformly take account of all the data in all the images, without making restrictive special assumptions or relying on privileged features or images for initialization. The orthographic and paraperspective structure/motion factorization methods of Tomasi, Kanade and Poelman [17, 11] partially fulfill these requirements, but they only apply when the camera projections are well approximated by affine mappings. This happens only for cameras viewing small, distant scenes, which is seldom the case in practice. Factorization methods for perspective images are needed, however it has not been clear how to find the unknown projective scale factors of the image measurements that are required for this. (In the affine case the scales are constant and can be eliminated).

[Footnote: To appear in CVPR'96. This work was supported by an EC HCM grant and INRIA Rhône-Alpes. I would like to thank Peter Sturm and Richard Hartley for enlightening discussions.]
As part of the current blossoming of interest in multi-image reconstruction, Shashua [14] recently extended the well-known two-image epipolar constraint to a trilinear constraint between matching points in three images. Hartley [6] showed that this constraint also applies to lines in three images, and Faugeras & Mourrain [4] and I [18, 19] completed that corner of the puzzle by systematically studying the constraints for lines and points in any number of images. A key aspect of the viewpoint presented in [18, 19] is that projective reconstruction is essentially a matter of recovering a coherent set of projective depths, projective scale factors that represent the depth information lost during image projection. These are exactly the missing factorization scales mentioned above. They satisfy a set of consistency conditions called 'joint image reconstruction equations' [18], that link them together via the corresponding image point coordinates and the various inter-image matching tensors.

In the MOVI group, we have recently been developing projective structure and motion algorithms based on this 'projective depth' picture. Several of these methods use the factorization paradigm, and so can be viewed as generalizations of the Tomasi-Kanade method from affine to fully perspective projections. However they also require a depth recovery phase that is not present in the affine case. The basic reconstruction method for point images was introduced in [15]. The current paper extends this in several directions, and presents a detailed assessment of the performance of the new methods in comparison to existing techniques such as Tomasi-Kanade factorization and Levenberg-Marquardt nonlinear least squares. Perhaps the most significant result in the paper is the extension of the method to work for lines as well as points, but I will also show how the factorization can be iteratively 'polished' (with results similar to nonlinear least squares iteration), and how any factorization-based method can be speeded up significantly for large problems, by using an approximate fixed-rank factorization technique in place of the Singular Value Decomposition.
The factorization paradigm has two key attractions that are only enhanced by moving from the affine to the projective case: (i) All of the data in all of the images is treated uniformly; there is no need to single out 'privileged' features or images for special treatment; (ii) No initialization is required and convergence is virtually guaranteed by the nature of the numerical methods used. Factorization also has some well known disadvantages:
1) Every primitive must be visible in every image. This is unrealistic in practice given occlusion and extraction and tracking failures.
2) It is not possible to incorporate a full statistical error model for the image data, although some sort of implicit least-squares trade-off is made.
3) It is not clear how to incorporate additional points or images incrementally: the whole calculation must be redone.
4) SVD-based factorization is slow for large problems.

Only the speed problem will be considered here. SVD is slow because it was designed for general, full rank matrices. For matrices of fixed low rank r (as here, where the rank is 3 for the affine method or 4 for the projective one), approximate factorizations can be computed in time O(mnr), i.e. directly proportional to the size of the input data.

The Tomasi-Kanade 'hallucination' process can be used to work around missing data [17], as in the affine case. However this greatly complicates the method and dilutes some of its principal benefits. There is no obvious solution to the error modelling problem, beyond using the factorization to initialize a nonlinear least squares routine (as is done in some of the experiments below). It would probably be possible to develop incremental factorization update methods, although there do not seem to be any in the standard numerical algebra literature.

The rest of the paper outlines the theory of projective factorization for points and lines, describes the final algorithms and implementation, reports on experimental results using synthetic and real data, and concludes with a discussion. The full theory of projective depth recovery applies equally to two, three and four image matching tensors, but throughout this paper I will concentrate on the two-image (fundamental matrix) case for simplicity. The underlying theory for the higher valency cases can be found in [18].
2 Point Reconstruction
We need to recover 3D structure (point locations) and motion (camera calibrations and locations) from m uncalibrated perspective images of a scene containing n 3D points. Without further information it is only possible to reconstruct the scene up to an overall projective transformation [2, 8], so we will work in homogeneous coordinates with respect to arbitrary projective coordinate frames. Let X_p (p = 1, ..., n) be the unknown homogeneous 3D point vectors, P_i (i = 1, ..., m) the unknown 3×4 image projections, and x_ip the measured homogeneous image point vectors. Modulo some scale factors λ_ip, the image points are projected from the world points: λ_ip x_ip = P_i X_p. Each object is defined only up to rescaling. The λ's 'cancel out' the arbitrary scales of the image points, but there is still the freedom to: (i) arbitrarily rescale each world point X_p and each projection P_i; (ii) apply an arbitrary nonsingular 4×4 projective deformation T: X_p → T X_p, P_i → P_i T^{-1}. Modulo changes of the λ_ip, the image projections are invariant under both of these transformations.

The scale factors λ_ip will be called projective depths. With correctly normalized points and projections they become true optical depths, i.e. orthogonal distances from the focal planes of the cameras. (NB: this is not the same as Shashua's 'projective depth' [13]). In general, m + n − 1 projective depths can be set arbitrarily by choosing appropriate scales for the X_p and P_i. However, once this is done the remaining (m − 1)(n − 1) degrees of freedom contain real information that can be used for 3D reconstruction: taken as a whole the projective depths have a strong internal coherence. In fact, [18, 19] argues that just as the key to calibrated stereo reconstruction is the recovery of Euclidean depth, the essence of projective reconstruction is precisely the recovery of a coherent set of projective depths modulo overall projection and world point rescalings. Once this is done, reconstruction reduces to choosing a projective basis for a certain abstract three dimensional 'joint image' subspace, and reading off point coordinates with respect to it.
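The depth-counting above is easy to sanity-check numerically. The following sketch (purely illustrative) verifies that the m·n projective depths split exactly into the m + n − 1 gauge freedoms plus the (m − 1)(n − 1) informative degrees of freedom:

```python
# Check the identity m*n = (m + n - 1) + (m - 1)*(n - 1): every projective
# depth is either fixed by a gauge choice (rescaling some X_p or P_i) or
# carries real reconstruction information.
for m in range(2, 20):          # number of images
    for n in range(2, 20):      # number of points
        gauge = m + n - 1
        informative = (m - 1) * (n - 1)
        assert m * n == gauge + informative
print("depth-counting identity holds")
```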
2.1 Factorization
Gather the point projections into a single 3m × n matrix equation:

$$
W \;\equiv\;
\begin{pmatrix}
\lambda_{11} x_{11} & \lambda_{12} x_{12} & \cdots & \lambda_{1n} x_{1n} \\
\lambda_{21} x_{21} & \lambda_{22} x_{22} & \cdots & \lambda_{2n} x_{2n} \\
\vdots & \vdots & & \vdots \\
\lambda_{m1} x_{m1} & \lambda_{m2} x_{m2} & \cdots & \lambda_{mn} x_{mn}
\end{pmatrix}
=
\begin{pmatrix} P_1 \\ P_2 \\ \vdots \\ P_m \end{pmatrix}
\begin{pmatrix} X_1 & X_2 & \cdots & X_n \end{pmatrix}
$$

Hence, with a consistent set of projective depths the rescaled measurement matrix W has rank at most 4. Any rank 4 matrix can be factorized into some 3m×4 matrix of 'projections' multiplying a 4×n matrix of 'points' as shown, and any such factorization corresponds to a valid projective reconstruction: the freedom in factorization is exactly a 4×4 nonsingular linear transformation P → P T^{-1}, X → T X, which can be regarded as a projective transformation of the reconstructed 3D space.
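The rank-4 property and the factorization can be sketched concretely in a few lines of NumPy on synthetic data (all names here are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 5, 20                                  # images, points (illustrative sizes)

# Synthetic ground truth: 3x4 projections P_i and homogeneous 3D points X_p.
P = rng.standard_normal((m, 3, 4))
X = rng.standard_normal((4, n))

# With consistent depths, lambda_ip * x_ip = P_i X_p, so stacking the
# depth-rescaled image points gives a 3m x n matrix W of rank at most 4.
W = np.vstack([P[i] @ X for i in range(m)])

U, s, Vt = np.linalg.svd(W, full_matrices=False)
assert s[4] < 1e-10 * s[0]                    # singular values beyond the 4th vanish

# Absorb the singular values into V (as in the text): W = P_hat X_hat.
P_hat = U[:, :4]                              # 3m x 4 stacked 'projections'
X_hat = np.diag(s[:4]) @ Vt[:4]               # 4 x n reconstructed 'points'
assert np.allclose(P_hat @ X_hat, W)
```

Any invertible 4×4 T applied as (P_hat T^{-1}, T X_hat) gives an equally valid factorization, which is exactly the projective gauge freedom described above.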
One practical method of factorizing W is the Singular Value Decomposition [12]. This decomposes an arbitrary k×l matrix W_{k×l} of rank r into a product W_{k×l} = U_{k×r} D_{r×r} V_{l×r}^T, where the columns of V_{l×r} and U_{k×r} are orthonormal bases for the input (co-kernel) and output (range) spaces of W_{k×l}, and D_{r×r} is a diagonal matrix of positive decreasing 'singular values'. The decomposition is unique when the singular values are distinct, and can be computed stably and reliably in time O(kl·min(k, l)). The matrix D of singular values can be absorbed into either U or V to give a decomposition of the projection/point form P X. (I absorb it into V to form X).

The SVD has been used by Tomasi, Kanade and Poelman [17, 11] for their affine (orthographic and paraperspective) reconstruction techniques. The current application can be viewed as a generalization of these methods to projective reconstruction. The projective case leads to slightly larger matrices (3m × n rank 4 as opposed to 2m × n rank 3), but is actually simpler than the affine case as there is no need to subtract translation terms or apply nonlinear constraints to guarantee the orthogonality of the projection matrices.

Ideally, one would like to find reconstructions in time O(mn) (the size of the input data). SVD is a factor of O(min(3m, n)) slower than this, which can be significant if there are many points and images. Although SVD is probably near-optimal for full-rank matrices, rank r matrices can be factorized in 'output sensitive' time O(mnr). I have experimented with one such 'fixed rank' method, and find it to be almost as accurate as SVD and significantly faster for large problems. The method repeatedly sweeps the matrix, at each sweep guessing and subtracting a column-vector that 'explains' as much as possible of the residual error in the matrix columns. A rank r matrix is factorized in r sweeps. When the matrix is not exactly of rank r the guesses are not quite optimal and it is useful to include further sweeps (say 2r in total) and then SVD the matrix of extracted columns to estimate the best r combinations of them.
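The kind of greedy sweep method described here can be sketched as follows (an interpretation of the description above, not the author's actual code; the column-selection rule is one plausible choice of 'guess'):

```python
import numpy as np

def fixed_rank_factorize(W, r):
    """Approximate fixed-rank factorization by greedy column sweeps.

    Each sweep extracts the residual column of largest norm and subtracts
    the component it 'explains' from every column; after up to 2r sweeps,
    a small SVD of the extracted columns picks the best rank-r basis.
    Each sweep costs O(mn), giving O(mnr) overall.
    """
    R = W.astype(float).copy()
    cols = []
    for _ in range(2 * r):
        j = int(np.argmax(np.linalg.norm(R, axis=0)))   # largest residual column
        u = R[:, j].copy()
        norm = np.linalg.norm(u)
        if norm < 1e-12:
            break                                        # residual already explained
        u /= norm
        cols.append(u)
        R -= np.outer(u, u @ R)                          # remove what u explains
    B = np.stack(cols, axis=1)                           # extracted orthonormal columns
    U, _, _ = np.linalg.svd(B @ (B.T @ W), full_matrices=False)
    U = U[:, :r]                                         # best rank-r column basis
    return U, U.T @ W                                    # W is approximately U (U^T W)
```

On an exactly rank-r matrix the sweeps terminate after r extractions and the factorization is exact; on noisy data the extra sweeps plus the final small SVD recover the dominant rank-r part.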
2.2 Projective Depth Recovery
The above factorization techniques can only be used if a self-consistent set of projective depths λ_ip can be found. The key technical advance that makes this work possible is a practical method for estimating these using fundamental matrices and epipoles obtained from the image data. The full theory can be found in [18], which also describes how to use trivalent and quadrivalent matching tensors for depth recovery. Here we briefly sketch the fundamental matrix case. The image projections λ_ip x_ip = P_i X_p imply that the 6×5 matrix

$$
\begin{pmatrix} P_i & \lambda_{ip} x_{ip} \\ P_j & \lambda_{jp} x_{jp} \end{pmatrix}
=
\begin{pmatrix} P_i \\ P_j \end{pmatrix}
\begin{pmatrix} I_{4\times 4} & X_p \end{pmatrix}
$$

has rank at most 4, so all of its 5×5 minors vanish. Expanding by cofactors in the last column gives homogeneous linear equations in the components of λ_ip x_ip and λ_jp x_jp, with coefficients that are 4×4 determinants of projection matrix rows. These turn out to be the expressions for the fundamental matrix F_ij and epipole e_ji of camera j in image i in terms of projection matrix components [19, 4]. The result is the projective depth recovery equation:

$$ (F_{ij}\, x_{jp})\; \lambda_{jp} \;=\; (e_{ji} \wedge x_{ip})\; \lambda_{ip} \qquad (1) $$

This says two things: (i) The epipolar line of x_jp in image i is the same as the line through the corresponding point x_ip and epipole e_ji (as is well known); (ii) With the correct projective depths and scalings for F_ij and e_ji, the two terms have exactly the same size. The equality is exact, not just up to scale. This is the new result that allows us to recover projective depths using fundamental matrices and epipoles. Analogous results based on higher order matching tensors can be found in [18].
It is straightforward to recover projective depths using (1). Each instance of it linearly relates the depths of a single 3D point in two images. By estimating a sufficient number of fundamental matrices and epipoles, we can amass a system of homogeneous linear equations that allows the complete set of depths for a given point to be found, up to an arbitrary overall scale factor. At a minimum, this can be done by selecting any set of m − 1 equations that link the m images into a single connected graph. With such a non-redundant set of equations the depths for each point p can be found trivially by chaining together the solutions for each image, starting from some arbitrary initial value such as λ_1p = 1. Solving the depth recovery equation in least squares gives a simple recursion relation for λ_ip in terms of λ_jp:

$$ \lambda_{ip} \;:=\; \frac{(e_{ji} \wedge x_{ip}) \cdot (F_{ij}\, x_{jp})}{\| e_{ji} \wedge x_{ip} \|^2}\; \lambda_{jp} $$

If additional depth recovery equations are used, this simple recursion must be replaced by a redundant (and hence potentially more robust) homogeneous linear system. However, care is needed. The depth recovery equations are sensitive to the scale factors chosen for the F's and e's, and these can not be recovered directly from the image data. This is irrelevant when a single chain of equations is used, as rescalings of F and e affect all points equally and hence amount to rescalings of the corresponding projection matrices. However with redundant equations it is essential to choose a mutually self-consistent set of scales for the F's and e's. I will not describe this process here, except to note that the consistency condition is the Grassmann identity F_kj e_ij = e_ik ∧ e_jk [18].

It is still unclear what the best trade-off between economy and robustness is for depth recovery. This paper considers only two simple non-redundant choices: either the images are taken pairwise in sequence, F_21, F_32, ..., F_{m,m−1}, or all subsequent images are scaled in parallel from the first, F_21, F_31, ..., F_m1. It might seem that long chains of rescalings would prove numerically unstable, but in practice depth recovery is surprisingly well conditioned. Both serial and parallel chains work very well despite their non-redundancy and chain length or reliance on a 'key' image. The two methods give similar results except when there are many (>40) images, when the shorter chains of the parallel system become more robust. Both are stable even when epipolar point transfer is ill-conditioned (e.g. for a camera moving in a straight line, when the epipolar lines of different images coincide): the image observations act as stable 'anchors' for the transfer process.
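A serial-chain implementation of the recursion is easy to sketch (illustrative NumPy, not the paper's code; the fundamental matrices and epipoles are assumed already estimated, with the mutually consistent scaling that equation (1) requires):

```python
import numpy as np

def chain_depths(x, F, e):
    """Recover projective depths lambda_ip by chaining the recursion
        lambda_ip = ((e_ji ^ x_ip) . (F_ij x_jp) / ||e_ji ^ x_ip||^2) lambda_jp
    serially through the images (j = i-1), starting from lambda_1p = 1.

    x    : (m, n, 3) array of homogeneous image points x_ip
    F[i] : fundamental matrix mapping points of image i to epipolar
           lines in image i+1
    e[i] : epipole of camera i in image i+1
    """
    m, n, _ = x.shape
    lam = np.ones((m, n))
    for p in range(n):
        for i in range(1, m):
            lhs = np.cross(e[i - 1], x[i, p])    # e_ji ^ x_ip
            rhs = F[i - 1] @ x[i - 1, p]         # F_ij x_jp
            lam[i, p] = (lhs @ rhs) / (lhs @ lhs) * lam[i - 1, p]
    return lam
```

With exact, consistently scaled F's and e's this reproduces the true depths of each point up to the arbitrary per-point scale fixed by λ_1p = 1; with redundant equations the recursion would be replaced by a least-squares solve over all pair constraints, as noted above.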
Balancing: A further point is that with arbitrary choices of scale for the fundamental matrices and epipoles, the average size of the recovered depths might tend to increase or decrease exponentially during the solution-chaining process. Theoretically this is not a problem as the overall scales are arbitrary, but it could easily make the factorization phase numerically ill-conditioned. To counter this the recovered matrix of projective depths must be balanced after it has been built, by judicious overall row and column rescalings. The process is very simple. The image points are normalized on input, so ideally all of the scale factors λ_ip should have roughly the same order of magnitude, O(1) say. For each point the depths are estimated as above, and then: (i) each row (image) of the estimated depth matrix is rescaled to have length √n; (ii) each column (point) of the resulting matrix is rescaled to length √m. This process is repeated until it roughly converges, which happens very quickly (within 2–3 iterations).
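The balancing step is a short alternating rescaling loop; a minimal sketch (illustrative names) is:

```python
import numpy as np

def balance(lam, iters=3):
    """Balance the m x n projective-depth matrix by alternately rescaling
    each row (image) to length sqrt(n) and each column (point) to length
    sqrt(m); a few iterations suffice, as the text notes."""
    m, n = lam.shape
    lam = lam.astype(float).copy()
    for _ in range(iters):
        lam *= np.sqrt(n) / np.linalg.norm(lam, axis=1, keepdims=True)   # rows
        lam *= np.sqrt(m) / np.linalg.norm(lam, axis=0, keepdims=True)   # columns
    return lam
```

After balancing, all entries are O(1) even if the chaining produced exponentially growing scales, which is what keeps the subsequent factorization well conditioned.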
3 Line Reconstruction
3D lines can also be reconstructed using the above techniques. A line L can be represented by any two 3D points lying on it, say Y and Z. In image i, L projects to some image line l_i, and Y and Z project to image points y_i and z_i lying on l_i. The points {y_i | i = 1, ..., m} are in epipolar correspondence, so they can be used in the depth recovery equation (1) to reconstruct Y, and similarly for Z. The representatives Y and Z can be fixed implicitly by choosing y_1 and z_1 arbitrarily on l_1 in the first image, and using the epipolar constraint to transfer these to the corresponding points in the remaining images: y_i lies on both l_i and the epipolar line of y_1, so is located at their intersection.
In fact, epipolar transfer and depth recovery can be done in one step. Let y_i stand for the rescaled via-points P_i Y. Substitute these into equation (1), cross-product with l_i, expand, and simplify using l_i · y_i = 0:

$$
l_i \wedge (F_{ij}\, y_j) \;=\; l_i \wedge (e_{ji} \wedge y_i)
\;=\; -(l_i \cdot e_{ji})\, y_i + (l_i \cdot y_i)\, e_{ji}
\;=\; -(l_i \cdot e_{ji})\, y_i
\qquad (2)
$$

Up to a factor of l_i · e_ji, the intersection l_i ∧ (F_ij y_j) of l_i with the epipolar line of y_j automatically gives the correct projective depth for reconstruction. Hence, factorization-based line reconstruction can be implemented by choosing a suitable (widely spaced) pair of via-points on each line in the first image, and then chaining together instances of equation (2) to find the corresponding, correctly scaled via-points in the other images. The required fundamental matrices can not be found directly from line matches, but they can be estimated from point matches, or from the trilinear line matching constraints (trivalent tensor) [6, 14, 4, 19, 18]. Alternatively, the trivalent tensor can be used directly: in tensorial notation [18], the trivalent via-point transfer equation is

$$ l_{B_k}\, G^{A_i B_k}_{C_j}\, y^{C_j} \;=\; (l_{B_k}\, e^{B_k}_{j})\, y^{A_i} $$
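The one-step transfer of equation (2) reduces to a single cross product once F_ij is available; a minimal sketch (illustrative, with F and e assumed consistently scaled as before):

```python
import numpy as np

def transfer_via_point(l_i, F_ij, y_j):
    """Equation (2): intersect the image line l_i with the epipolar line
    of the via-point y_j from image j. With consistent scalings the result
    is y_i scaled by -(l_i . e_ji), i.e. it already carries the correct
    projective depth for the factorization."""
    return np.cross(l_i, F_ij @ y_j)
```

The returned point lies on l_i by construction, so no separate intersection or depth recovery step is needed per image.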
As with points, redundant equations may be included if and only if a self-consistent normalization is chosen for the fundamental matrices and epipoles. For numerical stability, it is essential to balance the resulting via-points (i.e. depth estimates). This works with the 3m × 2n_lines 'W' matrix of via-points, iteratively rescaling all coordinates of each image (triple of rows) and all coordinates of each line (pair of columns) until an approximate equilibrium is reached, where the overall mean square size of each coordinate is O(1) in each case. To ensure that the via-points representing each line are on average well separated, I also orthonormalize the two 3m-component column vectors for each line with respect to one another. The via-point equations (2) are linear and hence invariant with respect to this, but it does of course change the 3D representatives Y and Z recovered for each line.
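The via-point separation step amounts to a Gram-Schmidt pass over each line's pair of stacked 3m-vectors; a minimal sketch (illustrative names):

```python
import numpy as np

def orthonormalize_pair(wY, wZ):
    """Orthonormalize a line's two 3m-component via-point columns against
    each other so its representatives stay well separated. Equation (2) is
    linear, so this only changes the 3D representatives Y and Z, not the
    reconstructed line itself."""
    wY = wY / np.linalg.norm(wY)
    wZ = wZ - (wY @ wZ) * wY
    return wY, wZ / np.linalg.norm(wZ)
```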
4 Implementation
This section summarizes the complete algorithm
for factorization-based 3D projective reconstruction from im-
age points and lines, and discusses a few important implemen-
tation details and variants. The algorithm goes as follows:
0) Extract and match points and lines across all images.
1) Standardize all image coordinates (see below).
2) Estimate a set of fundamental matrices and epipoles suffi-
cienttochainalltheimagestogether(e.g. using pointmatches).
3) For each point, estimate the projective depths using equa-
tion (1). Build and balance the depth matrix
ip
, and use it to
build the rescaled point measurement matrix
W
.
4) For each line choose two via-pointsandtransferthemto the
other images using the transfer equations (2). Build and bal-
ance the rescaled line via-point matrix.
5) Combine the line and point measurement matrices into a
3
m
(
n
points
+ 2
n
lines
)
data matrix and factorize it using either
SVD or the fixed-rank method. Recover 3D projective struc-
ture (point and via-point coordinates) and motion (projection
matrices) from the factorization.
6) Un-standardize the projection matrices (see below).

References

• Numerical Recipes: The Art of Scientific Computing (book, 1986; 2nd ed. 1992). A complete text and reference on scientific computing, covering among much else the SVD, QR and Cholesky decompositions, and nonlinear methods.

• Book (1989). Cited in this paper for the factorization method: "One practical method of factorizing W is the Singular Value Decomposition [12]."

• Journal article. Cited in this paper for nonlinear least squares: "The standard workhorse for such problems is Levenberg-Marquardt iteration [12], so for comparison with the linear methods I have implemented simple L-M based projective reconstruction algorithms."

• C. Tomasi and T. Kanade (journal article, 1992). Develops the factorization method for shape and motion under orthography: the 2F×P measurement matrix of P points tracked through F frames has rank 3 under orthographic projection, and the SVD factors it into object shape and camera rotation. Two of the three translation components are computed in a preprocessing stage, and partially filled-in measurement matrices arising from occlusions or tracking failures can be handled.

• Camera self-calibration from point matches in image sequences (book chapter, 19 May 1992). Shows that a camera can be calibrated without a known 3D calibration object, using only points of interest tracked through the images as the camera moves. Cited in this paper as background: "The key result is that projective reconstruction is the best that can be done without calibration or metric information about the scene, and that it is possible from at least two views of point-scenes or three views of line-scenes [2, 3, 8, 6]."

Frequently Asked Questions (13)
Q1. What contributions have the authors mentioned in the paper "Factorization methods for projective structure and motion" ?

This paper describes a family of factorization-based algorithms that recover 3D projective structure and motion from multiple uncalibrated perspective images of 3D points and lines. The key to projective factorization is the recovery of a consistent set of projective depths (scale factors) for the image points: this is done using fundamental matrices and epipoles estimated from the image data. The authors compare the performance of the new techniques with several existing ones, and also describe an approximate factorization method that gives similar results to SVD-based factorization, but runs much more quickly for large problems.

Future work will expand on this. Summary: Projective structure and motion can be recovered from multiple perspective images of a scene consisting of points and lines, by estimating fundamental matrices and epipoles from the image data, using these to rescale the image measurements, and then factorizing the resulting rescaled measurement matrix using either SVD or a fast approximate factorization algorithm. 

Fundamental matrices and epipoles are estimated using the linear least squares method with all the available point matches, followed by a supplementary SVD to project the fundamental matrices to rank 2 and find the epipoles. 
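The estimation step just described can be sketched as the classical linear method over all matches, followed by the supplementary SVD. This is a minimal illustration under my own naming; coordinate standardization and outlier handling are omitted:

```python
import numpy as np

def linear_fundamental(x1, x2):
    """Estimate F from n >= 8 homogeneous point matches (two 3 x n
    arrays): solve x2^T F x1 = 0 in linear least squares, project F to
    rank 2 with a second SVD, and read off the epipoles as the null
    vectors of F and F^T."""
    n = x1.shape[1]
    A = np.empty((n, 9))
    for k in range(n):
        # Row k holds the coefficients of vec(F) in x2_k^T F x1_k = 0.
        A[k] = np.outer(x2[:, k], x1[:, k]).ravel()
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)                  # least-squares solution
    U, s, Vt = np.linalg.svd(F)
    F = U @ np.diag([s[0], s[1], 0.0]) @ Vt   # enforce rank 2
    e1 = Vt[-1]                               # right epipole: F e1 = 0
    e2 = U[:, -1]                             # left epipole: e2^T F = 0
    return F, e1, e2
```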

As part of the current blossoming of interest in multi-image reconstruction, Shashua [14] recently extended the well-known two-image epipolar constraint to a trilinear constraint between matching points in three images.

The key technical advance that makes this work possible is a practical method for estimating these using fundamental matrices and epipoles obtained from the image data. 

The authors need to recover 3D structure (point locations) and motion (camera calibrations and locations) from m uncalibrated perspective images of a scene containing n 3D points. 

The factorization paradigm has two key attractions that are only enhanced by moving from the affine to the projective case: (i) All of the data in all of the images is treated uniformly — there is no need to single out ‘privileged’ features or images for special treatment; (ii) No initialization is required and convergence is virtually guaranteed by the nature of the numerical methods used. 

With such a non-redundant set of equations the depths for each point p can be found trivially by chaining together the solutions for each image, starting from some arbitrary initial value such as λ_1p = 1.

When the matrix is not exactly of rank r the guesses are not quite optimal and it is useful to include further sweeps (say 2r in total) and then SVD the matrix of extracted columns to estimate the best r combinations of them. 

Although SVD is probably near-optimal for full-rank matrices, rank r matrices can be factorized in ‘output sensitive’ time O(mnr). 
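The fixed-rank idea described above can be sketched as pivoted column extraction with deflation, followed by a small SVD of the extracted columns. This is a minimal illustration of the general approach under my own naming, not the paper's exact sweep algorithm:

```python
import numpy as np

def fixed_rank_factor(W, r, sweeps=None):
    """Approximate rank-r factorization W ~ U @ V without a full SVD:
    repeatedly extract the largest remaining column, deflate the
    residual against it, then SVD only the small matrix of extracted
    columns to get the best r combinations of them."""
    if sweeps is None:
        sweeps = 2 * r                    # extra sweeps help when rank(W) > r
    R = W.astype(float).copy()            # residual matrix
    cols = []
    for _ in range(sweeps):
        j = int(np.argmax(np.sum(R ** 2, axis=0)))   # pivot: biggest column
        nj = np.linalg.norm(R[:, j])
        if nj < 1e-12 * np.linalg.norm(W):
            break                         # residual exhausted
        q = R[:, j] / nj
        cols.append(W[:, j])
        R -= np.outer(q, q @ R)           # remove q's span from the residual
    C = np.stack(cols, axis=1)
    Uc, _, _ = np.linalg.svd(C, full_matrices=False)
    U = Uc[:, :r]                         # orthonormal rank-r column basis
    V = U.T @ W                           # least-squares coefficients
    return U, V
```

Each sweep touches every entry of W once, so for a 3m × n matrix the cost is O(mnr) rather than the full SVD's cubic dependence, which is where the 'output sensitive' speedup for large low-rank problems comes from.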

There is no obvious solution to the error modelling problem, beyond using the factorization to initialize a nonlinear least squares routine (as is done in some of the experiments below). 

The full theory of projective depth recovery applies equally to two, three and four image matching tensors, but throughout this paper the author will concentrate on the two-image (fundamental matrix) case for simplicity.