Factorization methods for projective structure and motion

Bill Triggs
- pp 845-851

HAL Id: inria-00548364
https://hal.inria.fr/inria-00548364
Submitted on 20 Dec 2010
To cite this version:
Bill Triggs. Factorization Methods for Projective Structure and Motion. International Conference on Computer Vision & Pattern Recognition (CVPR '96), Jun 1996, San Francisco, United States. pp. 845–851, doi:10.1109/CVPR.1996.517170. hal: inria-00548364.

Factorization Methods for Projective Structure and Motion
Bill Triggs
INRIA Rhône-Alpes,
655 avenue de l'Europe, 38330 Montbonnot Saint-Martin, France.
Bill.Triggs@inrialpes.fr http://www.inrialpes.fr/MOVI/Triggs
Abstract
This paper describes a family of factorization-based algorithms that recover 3D projective structure and motion from multiple uncalibrated perspective images of 3D points and lines. They can be viewed as generalizations of the Tomasi-Kanade algorithm from affine to fully perspective cameras, and from points to lines. They make no restrictive assumptions about scene or camera geometry, and unlike most existing reconstruction methods they do not rely on 'privileged' points or images. All of the available image data is used, and each feature in each image is treated uniformly. The key to projective factorization is the recovery of a consistent set of projective depths (scale factors) for the image points: this is done using fundamental matrices and epipoles estimated from the image data. We compare the performance of the new techniques with several existing ones, and also describe an approximate factorization method that gives similar results to SVD-based factorization, but runs much more quickly for large problems.
Keywords: Multi-image Structure, Projective Reconstruction, Matrix Factorization.
1 Introduction
There has been considerable progress on scene reconstruction from multiple images in the last few years, aimed at applications ranging from very precise industrial measurement systems with several fixed cameras, to approximate structure and motion from real-time video for active robot navigation. One can usefully begin by ignoring the issues of camera calibration and metric structure, initially recovering the scene up to an overall projective transformation and only later adding metric information if needed [5, 10, 1]. The key result is that projective reconstruction is the best that can be done without calibration or metric information about the scene, and that it is possible from at least two views of point-scenes or three views of line-scenes [2, 3, 8, 6].
Most current reconstruction methods either work only for the minimal number of views (typically two), or single out a few 'privileged' views for initialization before bootstrapping themselves to the multi-view case [5, 10, 9]. For robustness and accuracy, there is a need for methods that uniformly take account of all the data in all the images, without making restrictive special assumptions or relying on privileged features or images for initialization. The orthographic and paraperspective structure/motion factorization methods of Tomasi, Kanade and Poelman [17, 11] partially fulfill these requirements, but they only apply when the camera projections are well approximated by affine mappings. This happens only for cameras viewing small, distant scenes, which is seldom the case in practice. Factorization methods for perspective images are needed; however, it has not been clear how to find the unknown projective scale factors of the image measurements that are required for this. (In the affine case the scales are constant and can be eliminated.)

[To appear in CVPR'96. This work was supported by an EC HCM grant and INRIA Rhône-Alpes. I would like to thank Peter Sturm and Richard Hartley for enlightening discussions.]
As part of the current blossoming of interest in multi-image reconstruction, Shashua [14] recently extended the well-known two-image epipolar constraint to a trilinear constraint between matching points in three images. Hartley [6] showed that this constraint also applies to lines in three images, and Faugeras & Mourrain [4] and I [18, 19] completed that corner of the puzzle by systematically studying the constraints for lines and points in any number of images. A key aspect of the viewpoint presented in [18, 19] is that projective reconstruction is essentially a matter of recovering a coherent set of projective depths: projective scale factors that represent the depth information lost during image projection. These are exactly the missing factorization scales mentioned above. They satisfy a set of consistency conditions called 'joint image reconstruction equations' [18], that link them together via the corresponding image point coordinates and the various inter-image matching tensors.
In the MOVI group, we have recently been developing projective structure and motion algorithms based on this 'projective depth' picture. Several of these methods use the factorization paradigm, and so can be viewed as generalizations of the Tomasi-Kanade method from affine to fully perspective projections. However, they also require a depth recovery phase that is not present in the affine case. The basic reconstruction method for point images was introduced in [15]. The current paper extends this in several directions, and presents a detailed assessment of the performance of the new methods in comparison to existing techniques such as Tomasi-Kanade factorization and Levenberg-Marquardt nonlinear least squares. Perhaps the most significant result in the paper is the extension of the method to work for lines as well as points, but I will also show how the factorization can be iteratively 'polished' (with results similar to nonlinear least squares iteration), and how any factorization-based method can be speeded up significantly for large problems, by using an approximate fixed-rank factorization technique in place of the Singular Value Decomposition.
The factorization paradigm has two key attractions that are only enhanced by moving from the affine to the projective case: (i) all of the data in all of the images is treated uniformly, with no need to single out 'privileged' features or images for special treatment; (ii) no initialization is required and convergence is virtually guaranteed by the nature of the numerical methods used. Factorization also has some well known disadvantages:
1) Every primitive must be visible in every image. This is unrealistic in practice given occlusion and extraction and tracking failures.
2) It is not possible to incorporate a full statistical error model for the image data, although some sort of implicit least-squares trade-off is made.
3) It is not clear how to incorporate additional points or images incrementally: the whole calculation must be redone.
4) SVD-based factorization is slow for large problems.
Only the speed problem will be considered here. SVD is slow because it was designed for general, full rank matrices. For matrices of fixed low rank r (as here, where the rank is 3 for the affine method or 4 for the projective one), approximate factorizations can be computed in time O(mnr), i.e. directly proportional to the size of the input data.
The Tomasi-Kanade 'hallucination' process can be used to work around missing data [17], as in the affine case. However this greatly complicates the method and dilutes some of its principal benefits. There is no obvious solution to the error modelling problem, beyond using the factorization to initialize a nonlinear least squares routine (as is done in some of the experiments below). It would probably be possible to develop incremental factorization update methods, although there do not seem to be any in the standard numerical algebra literature.
The rest of the paper outlines the theory of projective factorization for points and lines, describes the final algorithms and implementation, reports on experimental results using synthetic and real data, and concludes with a discussion. The full theory of projective depth recovery applies equally to two, three and four image matching tensors, but throughout this paper I will concentrate on the two-image (fundamental matrix) case for simplicity. The underlying theory for the higher valency cases can be found in [18].
2 Point Reconstruction
We need to recover 3D structure (point locations) and motion (camera calibrations and locations) from m uncalibrated perspective images of a scene containing n 3D points. Without further information it is only possible to reconstruct the scene up to an overall projective transformation [2, 8], so we will work in homogeneous coordinates with respect to arbitrary projective coordinate frames. Let X_p (p = 1, ..., n) be the unknown homogeneous 3D point vectors, P_i (i = 1, ..., m) the unknown 3×4 image projections, and x_ip the measured homogeneous image point vectors. Modulo some scale factors λ_ip, the image points are projected from the world points: λ_ip x_ip = P_i X_p. Each object is defined only up to rescaling. The λ's 'cancel out' the arbitrary scales of the image points, but there is still the freedom to: (i) arbitrarily rescale each world point X_p and each projection P_i; (ii) apply an arbitrary nonsingular 4×4 projective deformation T: X_p → T X_p, P_i → P_i T^-1. Modulo changes of the λ_ip, the image projections are invariant under both of these transformations.
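As a quick concrete check of the projection equation (a numpy sketch, not from the paper; all variable names are mine), the depths λ_ip are simply whatever scale factors make the normalized homogeneous image measurements match the projections exactly:

```python
import numpy as np

# Synthetic check of lambda_ip * x_ip = P_i X_p for one camera i.
rng = np.random.default_rng(0)
n = 5
# Homogeneous 3D points X_p (positive entries so all depths are positive).
X = np.vstack([rng.uniform(0.5, 1.5, (3, n)), np.ones((1, n))])
P = rng.uniform(0.5, 1.5, (3, 4))   # one 3x4 projection matrix
w = P @ X                           # projected points, still carrying their scales
lam = w[2]                          # choose scales so image points end in a 1
x = w / lam                         # measured homogeneous image points x_ip
# lam are exactly the projective depths for this normalization:
assert np.allclose(lam * x, P @ X)
```

Rescaling any column of X or the matrix P changes lam but leaves the observed x untouched, which is the gauge freedom described above.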
The scale factors λ_ip will be called projective depths. With correctly normalized points and projections they become true optical depths, i.e. orthogonal distances from the focal planes of the cameras. (NB: this is not the same as Shashua's 'projective depth' [13]). In general, m + n − 1 projective depths can be set arbitrarily by choosing appropriate scales for the X_p and P_i. However, once this is done the remaining (m − 1)(n − 1) degrees of freedom contain real information that can be used for 3D reconstruction: taken as a whole, the projective depths have a strong internal coherence. In fact, [18, 19] argues that just as the key to calibrated stereo reconstruction is the recovery of Euclidean depth, the essence of projective reconstruction is precisely the recovery of a coherent set of projective depths, modulo overall projection and world point rescalings. Once this is done, reconstruction reduces to choosing a projective basis for a certain abstract three dimensional 'joint image' subspace, and reading off point coordinates with respect to it.
2.1 Factorization
Gather the point projections into a single 3m × n matrix equation:

    W  ≡  ( λ_11 x_11   λ_12 x_12   ···   λ_1n x_1n )      ( P_1 )
          ( λ_21 x_21   λ_22 x_22   ···   λ_2n x_2n )   =  ( P_2 )  ( X_1  X_2  ···  X_n )
          (     ⋮           ⋮                 ⋮     )      (  ⋮  )
          ( λ_m1 x_m1   λ_m2 x_m2   ···   λ_mn x_mn )      ( P_m )

Hence, with a consistent set of projective depths the rescaled measurement matrix W has rank at most 4. Any rank 4 matrix can be factorized into some 3m × 4 matrix of 'projections' multiplying a 4 × n matrix of 'points' as shown, and any such factorization corresponds to a valid projective reconstruction: the freedom in factorization is exactly a 4 × 4 nonsingular linear transformation P → P T^-1, X → T X, which can be regarded as a projective transformation of the reconstructed 3D space.
One practical method of factorizing W is the Singular Value Decomposition [12]. This decomposes an arbitrary k × l matrix W_{k×l} of rank r into a product W_{k×l} = U_{k×r} D_{r×r} V_{l×r}^T, where the columns of V_{l×r} and U_{k×r} are orthonormal bases for the input (co-kernel) and output (range) spaces of W_{k×l}, and D_{r×r} is a diagonal matrix of positive decreasing 'singular values'. The decomposition is unique when the singular values are distinct, and can be computed stably and reliably in time O(k l min(k, l)). The matrix D of singular values can be absorbed into either U or V to give a decomposition of the projection/point form P X. (I absorb it into V to form X.)
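For concreteness, the rank-4 SVD factorization step might look as follows in numpy (a minimal sketch under the absorb-D-into-V convention of the text; the function name and test data are my own):

```python
import numpy as np

def factorize_rank4(W):
    """Factor the rescaled measurement matrix W (3m x n) into
    'projections' P (3m x 4) and 'points' X (4 x n) via truncated SVD,
    absorbing the singular values D into V to form X."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    P = U[:, :4]                    # orthonormal basis for the range of W
    X = np.diag(s[:4]) @ Vt[:4, :]  # D absorbed into V
    return P, X

# Any nonsingular 4x4 T gives an equally valid factorization:
# (P @ T_inv) @ (T @ X) reproduces the same W, i.e. reconstruction is
# only defined up to a 3D projective transformation.
```

On a noiseless rank-4 W the product P @ X reproduces W exactly; on noisy data it is the closest rank-4 approximation in the least-squares sense.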
The SVD has been used by Tomasi, Kanade and Poelman [17, 11] for their affine (orthographic and paraperspective) reconstruction techniques. The current application can be viewed as a generalization of these methods to projective reconstruction. The projective case leads to slightly larger matrices (3m × n rank 4 as opposed to 2m × n rank 3), but is actually simpler than the affine case as there is no need to subtract translation terms or apply nonlinear constraints to guarantee the orthogonality of the projection matrices.

Ideally, one would like to find reconstructions in time O(mn) (the size of the input data). SVD is a factor of O(min(3m, n)) slower than this, which can be significant if there are many points and images. Although SVD is probably near-optimal for full-rank matrices, rank r matrices can be factorized in 'output sensitive' time O(mnr). I have experimented with one such 'fixed rank' method, and find it to be almost as accurate as SVD and significantly faster for large problems. The method repeatedly sweeps the matrix, at each sweep guessing and subtracting a column-vector that 'explains' as much as possible of the residual error in the matrix columns. A rank r matrix is factorized in r sweeps. When the matrix is not exactly of rank r the guesses are not quite optimal and it is useful to include further sweeps (say 2r in total) and then SVD the matrix of extracted columns to estimate the best r combinations of them.
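The paper does not spell out its sweep rule, so the numpy sketch below uses one plausible greedy variant: each sweep extracts the largest remaining residual column as the next basis vector, deflates it from the residual, and after 2r sweeps the extracted columns are SVD'd to keep the best r combinations. Function and variable names are mine, not the paper's.

```python
import numpy as np

def fixed_rank_factorize(W, r, sweeps=None):
    """Approximate fixed-rank factorization W ~ U @ V in O(mnr)-style
    time. Greedy sketch: each sweep takes the largest residual column
    as a new basis direction and subtracts its contribution; the small
    matrix of extracted columns is then SVD'd to keep the best r
    combinations, as suggested in the text."""
    R = W.astype(float).copy()          # residual
    cols = []
    for _ in range(sweeps or 2 * r):
        j = int(np.argmax(np.linalg.norm(R, axis=0)))  # best residual column
        u = R[:, j]
        nrm = np.linalg.norm(u)
        if nrm < 1e-12:                  # residual already explained
            break
        u = u / nrm
        cols.append(u)
        R -= np.outer(u, u @ R)          # deflate this direction
    C = np.stack(cols, axis=1)
    Uc, _, _ = np.linalg.svd(C, full_matrices=False)
    U = Uc[:, :r]                        # rank-r column basis
    return U, U.T @ W                    # W ~ U @ (U^T W)
```

Each sweep costs one pass over the matrix, so the total cost scales with the data size times the rank, unlike a full SVD.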
2.2 Projective Depth Recovery
The above factorization techniques can only be used if a self-consistent set of projective depths λ_ip can be found. The key technical advance that makes this work possible is a practical method for estimating these using fundamental matrices and epipoles obtained from the image data. The full theory can be found in [18], which also describes how to use trivalent and quadrivalent matching tensors for depth recovery. Here we briefly sketch the fundamental matrix case. The image projections λ_ip x_ip = P_i X_p imply that the 6 × 5 matrix

    ( P_i   λ_ip x_ip )      ( P_i )
    ( P_j   λ_jp x_jp )   =  ( P_j )  ( I_{4×4}   X_p )

has rank at most 4, so all of its 5 × 5 minors vanish. Expanding by cofactors in the last column gives homogeneous linear equations in the components of λ_ip x_ip and λ_jp x_jp, with coefficients that are 4 × 4 determinants of projection matrix rows. These turn out to be the expressions for the fundamental matrix F_ij and epipole e_ji of camera j in image i in terms of projection matrix components [19, 4]. The result is the projective depth recovery equation:

    (F_ij x_jp) λ_jp  =  (e_ji ∧ x_ip) λ_ip        (1)
This says two things: (i) the epipolar line of x_jp in image i is the same as the line through the corresponding point x_ip and epipole e_ji (as is well known); (ii) with the correct projective depths and scalings for F_ij and e_ji, the two terms have exactly the same size. The equality is exact, not just up to scale. This is the new result that allows us to recover projective depths using fundamental matrices and epipoles. Analogous results based on higher order matching tensors can be found in [18].

It is straightforward to recover projective depths using (1). Each instance of it linearly relates the depths of a single 3D point in two images. By estimating a sufficient number of fundamental matrices and epipoles, we can amass a system of homogeneous linear equations that allows the complete set of depths for a given point to be found, up to an arbitrary overall scale factor. At a minimum, this can be done by selecting any set of m − 1 equations that link the m images into a single connected graph. With such a non-redundant set of equations the depths for each point p can be found trivially by chaining together the solutions for each image, starting from some arbitrary initial value such as λ_1p = 1. Solving the depth recovery equation in least squares gives a simple recursion relation for λ_ip in terms of λ_jp:

    λ_ip  :=  [ (e_ji ∧ x_ip) · (F_ij x_jp) / ‖e_ji ∧ x_ip‖² ] λ_jp
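The serial chain can be sketched as follows (a numpy sketch; the function name, the storage layout, and the index conventions chosen for `F` and `e` are my assumptions, not the paper's notation):

```python
import numpy as np

def chain_depths(x, F, e):
    """Serial-chain projective depth recovery via the least-squares
    recursion lambda_ip = [(e ^ x_ip).(F x_jp) / |e ^ x_ip|^2] lambda_jp
    with j = i-1.

    x : (m, n, 3) array of homogeneous image points x_ip
    F : list of m-1 fundamental matrices; F[i-1] plays the role of
        F_{i,i-1} (maps points of image i-1 to epipolar lines in image i)
    e : list of m-1 epipoles; e[i-1] plays the role of e_{i-1,i}
        (epipole of camera i-1 in image i)
    Returns the (m, n) depth matrix, normalized so lambda_1p = 1.
    """
    m, n, _ = x.shape
    lam = np.ones((m, n))
    for p in range(n):
        for i in range(1, m):
            cross = np.cross(e[i - 1], x[i, p])        # e_ji ^ x_ip
            num = cross @ (F[i - 1] @ x[i - 1, p])     # . (F_ij x_jp)
            lam[i, p] = (num / (cross @ cross)) * lam[i - 1, p]
    return lam
```

With exactly consistent F's and e's the two sides of equation (1) match exactly, so the recursion propagates the true depths; with noisy estimates it is the per-equation least-squares solution.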
If additional depth recovery equations are used, this simple recursion must be replaced by a redundant (and hence potentially more robust) homogeneous linear system. However, care is needed. The depth recovery equations are sensitive to the scale factors chosen for the F's and e's, and these can not be recovered directly from the image data. This is irrelevant when a single chain of equations is used, as rescalings of F and e affect all points equally and hence amount to rescalings of the corresponding projection matrices. However with redundant equations it is essential to choose a mutually self-consistent set of scales for the F's and e's. I will not describe this process here, except to note that the consistency condition is the Grassmann identity F_kj e_ij = e_ik ∧ e_jk [18].
It is still unclear what the best trade-off between economy and robustness is for depth recovery. This paper considers only two simple non-redundant choices: either the images are taken pairwise in sequence, F_21, F_32, ..., F_m,m−1, or all subsequent images are scaled in parallel from the first, F_21, F_31, ..., F_m1.
It might seem that long chains of rescalings would prove numerically unstable, but in practice depth recovery is surprisingly well conditioned. Both serial and parallel chains work very well despite their non-redundancy and chain length or reliance on a 'key' image. The two methods give similar results except when there are many (> 40) images, when the shorter chains of the parallel system become more robust. Both are stable even when epipolar point transfer is ill-conditioned (e.g. for a camera moving in a straight line, when the epipolar lines of different images coincide): the image observations act as stable 'anchors' for the transfer process.

Balancing: A further point is that with arbitrary choices of scale for the fundamental matrices and epipoles, the average size of the recovered depths might tend to increase or decrease exponentially during the solution-chaining process. Theoretically this is not a problem as the overall scales are arbitrary, but it could easily make the factorization phase numerically ill-conditioned. To counter this the recovered matrix of projective depths must be balanced after it has been built, by judicious overall row and column rescalings. The process is very simple. The image points are normalized on input, so ideally all of the scale factors λ_ip should have roughly the same order of magnitude, O(1) say. For each point the depths are estimated as above, and then: (i) each row (image) of the estimated depth matrix is rescaled to have length √n; (ii) each column (point) of the resulting matrix is rescaled to length √m. This process is repeated until it roughly converges, which happens very quickly (within 2–3 iterations).
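The balancing loop is simple enough to state directly (a numpy sketch; the function name and iteration count are mine):

```python
import numpy as np

def balance(lam, iters=3):
    """Alternately rescale rows (images) of the depth matrix to length
    sqrt(n) and columns (points) to length sqrt(m), so every entry ends
    up O(1). A few iterations suffice in practice."""
    m, n = lam.shape
    lam = lam.astype(float).copy()
    for _ in range(iters):
        lam *= (np.sqrt(n) / np.linalg.norm(lam, axis=1))[:, None]  # rows
        lam *= (np.sqrt(m) / np.linalg.norm(lam, axis=0))[None, :]  # columns
    return lam
```

Since each row or column is only multiplied by an overall scalar, the balanced matrix represents exactly the same projective reconstruction as the original depths.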
3 Line Reconstruction
3D lines can also be reconstructed using the above techniques. A line L can be represented by any two 3D points lying on it, say Y and Z. In image i, L projects to some image line l_i, and Y and Z project to image points y_i and z_i lying on l_i. The points { y_i | i = 1, ..., m } are in epipolar correspondence, so they can be used in the depth recovery equation (1) to reconstruct Y, and similarly for Z. The representatives Y and Z can be fixed implicitly by choosing y_1 and z_1 arbitrarily on l_1 in the first image, and using the epipolar constraint to transfer these to the corresponding points in the remaining images: y_i lies on both l_i and the epipolar line of y_1, so is located at their intersection.
In fact, epipolar transfer and depth recovery can be done in one step. Let y_i stand for the rescaled via-points P_i Y. Substitute these into equation (1), cross-product with l_i, expand, and simplify using l_i · y_i = 0:

    l_i ∧ (F_ij y_j)  =  l_i ∧ (e_ji ∧ y_i)
                      =  −(l_i · e_ji) y_i + (l_i · y_i) e_ji
                      =  −(l_i · e_ji) y_i        (2)

Up to a factor of −(l_i · e_ji), the intersection l_i ∧ (F_ij y_j) of l_i with the epipolar line of y_j automatically gives the correct projective depth for reconstruction. Hence, factorization-based line reconstruction can be implemented by choosing a suitable (widely spaced) pair of via-points on each line in the first image, and then chaining together instances of equation (2) to find the corresponding, correctly scaled via-points in the other images. The required fundamental matrices can not be found directly from line matches, but they can be estimated from point matches, or from the trilinear line matching constraints (trivalent tensor) [6, 14, 4, 19, 18]. Alternatively, the trivalent tensor can be used directly: in tensorial notation [18], the trivalent via-point transfer equation is l_{B_k} G_{C_j}^{A_i B_k} y^{C_j} = (l_{B_k} e_j^{B_k}) y^{A_i}.
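Equation (2) reduces via-point transfer to a single cross product; a numpy sketch (the function name is mine, and F_ij, l_i are assumed to be given):

```python
import numpy as np

def transfer_via_point(l_i, F_ij, y_j):
    """Transfer a via-point y_j into image i using equation (2):
    intersect line l_i with the epipolar line F_ij y_j. The result is
    the via-point in image i, already carrying its projective depth
    (up to the fixed overall factor -(l_i . e_ji))."""
    return np.cross(l_i, F_ij @ y_j)
```

By the properties of the cross product, the transferred point automatically lies on both l_i and the epipolar line of y_j, which is exactly the intersection construction described in the text.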
As with points, redundant equations may be included if and only if a self-consistent normalization is chosen for the fundamental matrices and epipoles. For numerical stability, it is essential to balance the resulting via-points (i.e. depth estimates). This works with the 3m × 2n_lines matrix W of via-points, iteratively rescaling all coordinates of each image (triple of rows) and all coordinates of each line (pair of columns) until an approximate equilibrium is reached, where the overall mean square size of each coordinate is O(1) in each case. To ensure that the via-points representing each line are on average well separated, I also orthonormalize the two 3m-component column vectors for each line with respect to one another. The via-point equations (2) are linear and hence invariant with respect to this, but it does of course change the 3D representatives Y and Z recovered for each line.
4 Implementation
This section summarizes the complete algorithm for factorization-based 3D projective reconstruction from image points and lines, and discusses a few important implementation details and variants. The algorithm goes as follows:
0) Extract and match points and lines across all images.
1) Standardize all image coordinates (see below).
2) Estimate a set of fundamental matrices and epipoles sufficient to chain all the images together (e.g. using point matches).
3) For each point, estimate the projective depths using equation (1). Build and balance the depth matrix λ_ip, and use it to build the rescaled point measurement matrix W.
4) For each line choose two via-points and transfer them to the other images using the transfer equations (2). Build and balance the rescaled line via-point matrix.
5) Combine the line and point measurement matrices into a 3m × (n_points + 2 n_lines) data matrix and factorize it using either SVD or the fixed-rank method. Recover 3D projective structure (point and via-point coordinates) and motion (projection matrices) from the factorization.
6) Un-standardize the projection matrices (see below).
