(Open Access) Uncalibrated 1D projective camera and 3D affine reconstruction of lines (1997) | Long Quan

HAL Id: inria-00590078

https://hal.inria.fr/inria-00590078

Submitted on 5 May 2011

Uncalibrated 1D Projective Camera and 3D Ane

Reconstruction of Lines

Long Quan

To cite this version:

Long Quan. Uncalibrated 1D Projective Camera and 3D Ane Reconstruction of Lines. IEEE

Conference on Computer Vision and Pattern Recognition (CVPR ’97), Jun 1997, San Juan, Puerto

Rico. pp.60–65, �10.1109/CVPR.1997.609298�. �inria-00590078�

Uncalibrated 1D Projective Camera and 3D Affine Reconstruction of Lines

Long QUAN

CNRS-GRAVIR-INRIA

ZIRST – 655 avenue de l’Europe

38330 Montbonnot, France

Email: Long.Quan@inrialpes.fr

Abstract

We describe a linear algorithm to recover 3D affine shape/motion from line correspondences over three views with uncalibrated affine cameras. The key idea is the introduction of a one-dimensional projective camera. This converts the 3D affine reconstruction of "lines" into 2D projective reconstruction of "points". Using the full tensorial representation of three uncalibrated 1D views, we prove that the 3D affine reconstruction of lines from minimal data is unique up to a re-ordering of the views. 3D affine line reconstruction can be performed by properly rescaling image coordinates instead of using projection matrices. The algorithm is validated on both simulated and real image sequences.

1. Introduction

Using line segments instead of points as features has attracted the attention of many researchers [11, 2, 29, 28, 27, 1] for various tasks such as pose estimation, stereo and structure from motion. In this paper, we are interested in structure from motion using line correspondences across multiple images. A minimum of three views is essential for this, whereas two views suffice for point correspondences. In the case of calibrated perspective cameras, the main results on structure from line correspondences were established in [11, 22, 2]: with at least six line correspondences over three views, nonlinear algorithms are possible; with at least thirteen lines over three views, a linear algorithm is possible. The basic idea of the thirteen-line linear algorithm is similar to that of the eight-point algorithm [12]: it is based on the introduction of a set of redundant intermediate parameters. This provides a very heavy over-parametrization of the problem that definitely leads to the instability of the algorithm reported in [11]. The thirteen-line algorithm was extended to the uncalibrated camera case in [7, 27]. The situation for the uncalibrated camera case might be expected to be better, as more free parameters are needed. However, the 27 tensor components that are introduced as intermediate parameters are still subject to 9 complicated algebraic constraints. The algorithm can hardly be stable. A subsequent nonlinear optimization step is almost unavoidable to refine the solution [2, 11, 22, 7].

In parallel, there has been a lot of work [23, 26, 20, 16, 17, 9, 10, 8, 14, 25] on structure from motion with simplified camera models varying from orthographic projections via weak and para-perspective to affine cameras, almost exclusively for point features. These simplified camera models provide a good approximation to perspective projection when the depth of the object is small compared to the viewing distance. More importantly, they expose the ambiguities that arise when perspective effects diminish. In such cases, it is not only easier to use these simplified models but also advisable to do so, as by explicitly eliminating the ambiguities from the algorithm, one avoids computing parameters that are inherently ill-conditioned. Another important advantage of working with uncalibrated affine cameras is that the reconstruction is affine, rather than projective as with uncalibrated projective cameras.

Motivated on the one hand by the lack of satisfactory line-based algorithms for projective cameras and on the other by the fact that the affine camera is a good model for many practical cases, we investigated the properties of line projection by affine cameras and proposed a linear algorithm [18, 19] for affine structure from line correspondences.

This paper is an extension of our previous work, in which the key advance of introducing a one-dimensional projective camera was made. The previous work concentrated on the redundant data case to accommodate a factorization scheme for lines. We were unable to solve for the reconstruction ambiguity. In this paper, we use the same theoretical framework but concentrate on the minimal data case. Instead of using a projection matrix representation for reconstruction as in the previous work, we rely on a tensorial representation of multi-views with one-dimensional cameras. A complete analysis of the joint projection matrix reveals the important role of the "epipoles" which, although redundant with respect to the trilinear tensor, play a central role in disambiguating the reconstruction. This new development allows us to finally prove that 3D affine reconstruction of lines with the minimal data is unique up to a re-ordering of views. Subsequently, a reconstruction algorithm based on the rescaling of image coordinates is proposed and validated on both simulated and real images.

Throughout the paper, tensors and matrices are denoted in upper case boldface, vectors in lower case boldface and scalars in either plain letters or lower case Greek.

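2. Review of the affine camera model for lines

As far as perspective (pin-hole) cameras are concerned, the projection of a point $\mathbf{x} = (x, y, z, t)^T$ of $P^3$ to a point $\mathbf{u} = (u, v, w)^T$ of $P^2$ can be described by a $3\times4$ homogeneous projection matrix $\mathbf{P}$:

$$ \lambda\,\mathbf{u} = \mathbf{P}_{3\times4}\,\mathbf{x}. \qquad (1) $$

For a restricted class of camera models, by setting the third row of the perspective camera $\mathbf{P}$ to $(0, 0, 0, \lambda)$, we obtain the affine camera initially introduced by Mundy and Zisserman in [15]

$$ \mathbf{A}_{3\times4} = \begin{pmatrix} p_{11} & p_{12} & p_{13} & p_{14} \\ p_{21} & p_{22} & p_{23} & p_{24} \\ 0 & 0 & 0 & p_{34} \end{pmatrix} = \left(\begin{array}{c|c} \begin{matrix} \mathbf{M}_{2\times3} \\ \mathbf{0}_{1\times3} \end{matrix} & \mathbf{t}_{3\times1} \end{array}\right). \qquad (2) $$

This is the uncalibrated affine camera which encompasses all the uncalibrated versions of the orthographic, weak perspective and paraperspective camera models.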

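Now consider a line in $R^3$ through a point $\mathbf{x}_0$ with direction $\mathbf{d}_x$, that is, the set of points $\mathbf{x}_0 + \mu\,\mathbf{d}_x$. The affine camera $\mathbf{A}_{3\times4}$ projects this to an image line

$$ \mathbf{A}_{3\times4}\,(\mathbf{x}_0 + \mu\,\mathbf{d}_x) = (\mathbf{M}_{2\times3}\,\mathbf{x}_0 + \mathbf{t}_0) + \mu\,\mathbf{M}_{2\times3}\,\mathbf{d}_x = \mathbf{u}_0 + \mu\,\mathbf{d}_u, $$

with direction

$$ \lambda\,\mathbf{d}_u = \mathbf{M}_{2\times3}\,\mathbf{d}_x, \qquad (3) $$

passing through the image point $\mathbf{u}_0 = \mathbf{M}_{2\times3}\,\mathbf{x}_0 + \mathbf{t}_0$, where $\mathbf{t}_0$ denotes the image translation part of $\mathbf{t}$.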

Equation (3) describes a linear mapping between directions of 3D lines and those of 2D lines. It can be derived even more directly using projective geometry, by considering that the line with direction $\mathbf{d}_x$ is the point at infinity $\mathbf{x}_\infty = (\mathbf{d}_x^T, 0)^T$ of $P^3$ and the line with direction $\mathbf{d}_u$ is the point at infinity $\mathbf{u}_\infty = (\mathbf{d}_u^T, 0)^T$ of $P^2$.

Comparing Equation (3) with Equation (1), which is a projection from $P^3$ to $P^2$, we see that Equation (3) is nothing but a projective projection from $P^2$ to $P^1$ if we consider the 3D and 2D directions of lines as 2D and 1D projective points. This means that the affine reconstruction of lines with a two-dimensional affine camera is equivalent to the projective reconstruction of points with a one-dimensional projective camera!
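As a minimal sketch of Equation (3), the plain-Python snippet below projects two points of a 3D line with an affine camera and checks that the direction of the image line depends only on the 2×3 block and on the 3D direction, not on the base point or the translation. The camera block `M`, the translations and the direction `d` are hypothetical values chosen for illustration, and the camera is assumed to be normalized so that a point maps to `M X + t`.

```python
def affine_project(M, t, X):
    """Project the 3D point X with an affine camera: u = M X + t (assumed normalized form)."""
    return [sum(M[r][c] * X[c] for c in range(3)) + t[r] for r in range(2)]

def image_direction(M, t, x0, d):
    """Direction of the image of the 3D line x0 + mu * d, taken from two of its points."""
    p0 = affine_project(M, t, x0)
    p1 = affine_project(M, t, [x0[c] + d[c] for c in range(3)])
    return [p1[r] - p0[r] for r in range(2)]

# Hypothetical camera and line, not taken from the paper.
M = [[2, 0, 1], [1, 3, -1]]   # 2x3 block of the affine camera
d = [1, 2, 3]                 # direction of the 3D line

dir_a = image_direction(M, [5, -7], [0, 0, 0], d)
dir_b = image_direction(M, [40, 9], [4, -1, 6], d)   # other base point and translation
M_d = [sum(M[r][c] * d[c] for c in range(3)) for r in range(2)]

print(dir_a, dir_b, M_d)   # [5, 4] [5, 4] [5, 4]
```

Only the 2×3 block enters the result, which is why the directions can be treated as points seen by a 1D projective camera.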

There have been many recent works [3, 5, 24, 13, 4, 6, 21, 22] on projective reconstruction and the geometry of multi-views of two-dimensional uncalibrated cameras. Particularly, the tensorial formalism developed by Triggs [24] is very interesting and powerful. We are now extending this study to the case of the one-dimensional camera.

3. Uncalibrated one-dimensional camera

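First, rewrite Equation (3) in the following form:

$$ \lambda\,\mathbf{u} = \mathbf{M}_{2\times3}\,\mathbf{x}, \qquad (4) $$

in which we use $\mathbf{u} = (u_1, u_2)^T$ and $\mathbf{x} = (x_1, x_2, x_3)^T$ instead of $\mathbf{d}_u$ and $\mathbf{d}_x$ to stress that we are dealing with "points" in the projective spaces $P^1$ and $P^2$ rather than line directions in the vector spaces $R^2$ and $R^3$. This exactly describes a one-dimensional projective camera which projects a point $\mathbf{x}$ in $P^2$ onto a point $\mathbf{u}$ in $P^1$.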

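We now examine the matching constraints between multiple views of the same point. There is a constraint only for the case of 3 views.

Let the three views of the same point $\mathbf{x}$ be given as follows:

$$ \lambda\,\mathbf{u} = \mathbf{M}\,\mathbf{x}, \qquad \lambda'\,\mathbf{u}' = \mathbf{M}'\,\mathbf{x}, \qquad \lambda''\,\mathbf{u}'' = \mathbf{M}''\,\mathbf{x}. \qquad (5) $$

These can be rewritten in matrix form as

$$ \begin{pmatrix} \mathbf{M} & \mathbf{u} & \mathbf{0} & \mathbf{0} \\ \mathbf{M}' & \mathbf{0} & \mathbf{u}' & \mathbf{0} \\ \mathbf{M}'' & \mathbf{0} & \mathbf{0} & \mathbf{u}'' \end{pmatrix} \begin{pmatrix} \mathbf{x} \\ -\lambda \\ -\lambda' \\ -\lambda'' \end{pmatrix} = \mathbf{0}, \qquad (6) $$

which is the basic reconstruction equation for a one-dimensional camera. The vector $(\mathbf{x}, -\lambda, -\lambda', -\lambda'')^T$ cannot be zero, and so

$$ \begin{vmatrix} \mathbf{M} & \mathbf{u} & \mathbf{0} & \mathbf{0} \\ \mathbf{M}' & \mathbf{0} & \mathbf{u}' & \mathbf{0} \\ \mathbf{M}'' & \mathbf{0} & \mathbf{0} & \mathbf{u}'' \end{vmatrix} = 0. \qquad (7) $$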

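The expansion of this determinant produces a trilinear constraint of three views

$$ \sum_{i,j,k=1}^{2} T_{ijk}\, u_i\, u'_j\, u''_k = 0, \qquad (8) $$

or in short $\mathbf{T}_{2\times2\times2}\,\mathbf{u}\,\mathbf{u}'\,\mathbf{u}'' = 0$, where $\mathbf{T}_{2\times2\times2} = (T_{ijk})$ is a $2\times2\times2$ homogeneous tensor whose components $T_{ijk}$ are $3\times3$ minors of the following $6\times3$ joint projection matrix:

$$ \begin{pmatrix} \mathbf{M} \\ \mathbf{M}' \\ \mathbf{M}'' \end{pmatrix}. \qquad (9) $$

The components of the tensor can be made explicit as

$$ T_{ijk} = [\,\bar{i}\ \bar{j}'\ \bar{k}''\,], \quad \text{for } i, j, k = 1, 2, $$

where the bracket $[\,i\ j'\ k''\,]$ denotes the $3\times3$ minor of the $i$-th, $j'$-th and $k''$-th row vectors of the above joint projection matrix, and the bar in $\bar{i}$, $\bar{j}$ and $\bar{k}$ denotes the mapping $(1, 2) \mapsto (2, -1)$; that is, index 1 selects row 2, and index 2 selects row 1 with its sign reversed.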
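The construction of the tensor from minors can be sketched in a few lines of plain Python. The three 2×3 camera matrices and the test points below are hypothetical integers chosen for illustration; the helper names are not from the paper. The snippet builds $T_{ijk}$ as signed 3×3 minors of the joint projection matrix with the bar mapping $(1,2) \mapsto (2,-1)$ and checks the trilinear constraint (8) exactly on projected points.

```python
def det3(a, b, c):
    """Determinant of the 3x3 matrix with rows a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

def dual_row(M, i):
    """Bar mapping (1, 2) -> (2, -1) on rows, with 0-based index i."""
    return M[1] if i == 0 else [-v for v in M[0]]

def trilinear_tensor(M1, M2, M3):
    """T[i][j][k] = [bar-i, bar-j', bar-k''], a signed 3x3 minor of the joint matrix."""
    return [[[det3(dual_row(M1, i), dual_row(M2, j), dual_row(M3, k))
              for k in range(2)] for j in range(2)] for i in range(2)]

def project(M, x):
    """1D projective camera: u = M x, defined up to scale."""
    return [sum(M[r][c] * x[c] for c in range(3)) for r in range(2)]

def trilinear_residual(T, u1, u2, u3):
    """Left-hand side of the trilinear constraint (8)."""
    return sum(T[i][j][k] * u1[i] * u2[j] * u3[k]
               for i in range(2) for j in range(2) for k in range(2))

# Hypothetical cameras and points of P^2, not taken from the paper.
M1 = [[1, 0, 0], [0, 1, 0]]
M2 = [[2, 1, 3], [1, -1, 2]]
M3 = [[1, 4, -2], [3, 0, 5]]
points = [[1, 2, 3], [-2, 5, 1], [4, 0, -3], [7, -1, 2]]

T = trilinear_tensor(M1, M2, M3)
residuals = [trilinear_residual(T, project(M1, x), project(M2, x), project(M3, x))
             for x in points]
print(T)          # [[[1, -4], [1, 7]], [[5, -6], [5, 14]]]
print(residuals)  # [0, 0, 0, 0]
```

Integer inputs keep the arithmetic exact, so the residuals are exactly zero rather than merely small.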

It can be easily seen that any constraint obtained by adding further views reduces to a trilinearity. This proves the uniqueness of the trilinear constraint. Moreover, the $2\times2\times2$ homogeneous tensor $\mathbf{T}_{2\times2\times2}$ has $7 = 2\times2\times2 - 1$ d.o.f., so it is a minimal parametrization of three views, since three views have exactly $3\times(2\times3-1) - (3\times3-1) = 7$ d.o.f., up to a projective transformation in $P^2$.

Each correspondence over three views gives one linear constraint on the tensor components $T_{ijk}$. With at least 7 points in $P^1$, the tensor components $T_{ijk}$ can be estimated linearly.
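The linear estimation can be illustrated as follows. This is only a sketch under stated assumptions: the cameras and points are hypothetical, the data are noise-free integers, and the null vector is found by exact Gauss-Jordan elimination over rationals (with noisy data one would normally use a least-squares solution such as an SVD instead). Each correspondence contributes one row of coefficients $u_i u'_j u''_k$, and the tensor is a non-zero vector annihilated by all rows.

```python
from fractions import Fraction

def project(M, x):
    """1D projective camera: u = M x, defined up to scale."""
    return [sum(M[r][c] * x[c] for c in range(3)) for r in range(2)]

def constraint_row(u1, u2, u3):
    """Coefficients of T_ijk (i, j, k in lexicographic order) in the constraint (8)."""
    return [Fraction(u1[i] * u2[j] * u3[k])
            for i in range(2) for j in range(2) for k in range(2)]

def null_vector(rows):
    """One exact non-zero solution v of rows * v = 0, by Gauss-Jordan elimination."""
    A = [list(r) for r in rows]
    n = len(A[0])
    pivots = []                      # pivot column of each reduced row
    r = 0
    for col in range(n):
        p = next((i for i in range(r, len(A)) if A[i][col] != 0), None)
        if p is None:
            continue
        A[r], A[p] = A[p], A[r]
        A[r] = [v / A[r][col] for v in A[r]]
        for i in range(len(A)):
            if i != r and A[i][col] != 0:
                f = A[i][col]
                A[i] = [a - f * b for a, b in zip(A[i], A[r])]
        pivots.append(col)
        r += 1
        if r == len(A):
            break
    free = [c for c in range(n) if c not in pivots]
    if not free:
        raise ValueError("only the trivial solution exists")
    v = [Fraction(0)] * n
    v[free[-1]] = Fraction(1)        # fix the homogeneous scale
    for row, col in enumerate(pivots):
        v[col] = -A[row][free[-1]]
    return v

# Hypothetical cameras and seven points of P^2, not taken from the paper.
M1 = [[1, 0, 0], [0, 1, 0]]
M2 = [[2, 1, 3], [1, -1, 2]]
M3 = [[1, 4, -2], [3, 0, 5]]
points = [[1, 2, 3], [-2, 5, 1], [4, 0, -3], [7, -1, 2],
          [0, 1, 1], [3, 3, -1], [1, -4, 2]]

rows = [constraint_row(project(M1, x), project(M2, x), project(M3, x)) for x in points]
T_flat = null_vector(rows)           # entries ordered as T_111, T_112, T_121, ..., T_222
fit = [sum(a * b for a, b in zip(row, T_flat)) for row in rows]

# Residual of the constraint on an eighth point that was not used for the estimate.
x_new = [2, 3, 5]
held_out = sum(a * b for a, b in zip(
    constraint_row(project(M1, x_new), project(M2, x_new), project(M3, x_new)), T_flat))
print(T_flat, held_out)
```

When the seven correspondences are in general position the null space is one-dimensional, so the estimate is the tensor up to scale and the held-out residual vanishes as well.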

At this point, we have obtained a remarkable result that for the one-dimensional projective camera, the trilinear tensor encapsulates exactly the information needed for projective reconstruction in $P^2$. Namely, it is the unique matching constraint, it minimally parametrizes the three views and it can be estimated linearly. Contrast this to the 2D image case in which the multilinear constraints are algebraically redundant and the linear estimation is only an approximation based on over-parametrization.

3.1. 2D projective reconstruction by rescaling

According to Triggs [24], the projective reconstruction can be viewed as being equivalent to the rescaling of the image points in $P^2$. We have just proven that recovering the directions of affine lines in 3D space is equivalent to 2D projective reconstruction from one-dimensional projective images. Therefore, a reconstruction of the directions of 3D affine lines can be obtained by rescaling the direction vectors of image lines, viewed as points of $P^1$.

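For each 1D image point over the three views (cf. Equation (5)), the scale factors $\lambda$, $\lambda'$ and $\lambda''$, taken individually, are arbitrary. However, taken as a whole, $(\lambda, \lambda', \lambda'')^T$ encodes the projective structure of the points $\mathbf{x}$ in $P^2$. One way to explicitly recover the scale factors $(\lambda, \lambda', \lambda'')^T$ is to notice that the rescaled image coordinates $(\lambda\mathbf{u}, \lambda'\mathbf{u}', \lambda''\mathbf{u}'')^T$ should lie in the joint image, or alternatively to observe the following matrix identity:

$$ \begin{pmatrix} \mathbf{M} & \lambda\mathbf{u} \\ \mathbf{M}' & \lambda'\mathbf{u}' \\ \mathbf{M}'' & \lambda''\mathbf{u}'' \end{pmatrix} = \begin{pmatrix} \mathbf{M} \\ \mathbf{M}' \\ \mathbf{M}'' \end{pmatrix} \begin{pmatrix} \mathbf{I}_{3\times3} & \mathbf{x} \end{pmatrix}. $$

The rank of the left matrix is therefore at most 3. All $4\times4$ minors vanish. Expanding by cofactors in the last column gives homogeneous linear equations in the components of $\lambda\mathbf{u}$, $\lambda'\mathbf{u}'$ and $\lambda''\mathbf{u}''$ with coefficients that are $3\times3$ minors of the joint projection matrix. For instance, taking both rows of the first view together with one row from each of the other two views gives

$$ \lambda\,\mathbf{T}_{\cdot jk}\mathbf{u} + \overline{(\lambda'\mathbf{u}')}\;\bar{\mathbf{e}}''^{\,T}_1 - \bar{\mathbf{e}}'_1\;\overline{(\lambda''\mathbf{u}'')}^{\,T} = \mathbf{0}, \qquad (10) $$

where $\mathbf{T}_{\cdot jk}\mathbf{u}$ is for $\sum_i T_{ijk} u_i$, a $2\times2$ matrix, and the bar on a vector denotes $\bar{\mathbf{v}} = (v_2, -v_1)^T$.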

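There are two types of minors: those involving three views with one row from each view, and those involving two views with two rows from one view and one from the other. The first type gives the 8 components of the tensor $\mathbf{T}_{2\times2\times2}$ and the second type gives the 12 components of the "epipoles" $\mathbf{e}_2, \mathbf{e}_3, \mathbf{e}'_1, \mathbf{e}'_3, \mathbf{e}''_1, \mathbf{e}''_2$, where the subscript names the view whose projection center is projected and the primes name the view that observes it. The epipoles are defined by analogy with the 2D camera case, as the projection of one projection center onto another view.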
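The count of the two types of minors can be checked by brute-force enumeration. The short sketch below labels the six rows of the joint projection matrix by (view, row) and classifies every choice of three rows; the labels are purely illustrative.

```python
from itertools import combinations

# Six rows of the joint projection matrix, labelled (view, row within the view).
rows = [(view, r) for view in range(3) for r in range(2)]

three_view, two_view = [], []
for triple in combinations(rows, 3):
    views = {view for view, _ in triple}
    # Three rows cannot come from a single view, so a triple spans 2 or 3 views.
    (three_view if len(views) == 3 else two_view).append(triple)

print(len(three_view), len(two_view))   # 8 12
```

The 8 three-view minors are the tensor components, and the 12 two-view minors are the components of the six epipoles, two per epipole.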

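At present we only know $T_{ijk}$; the epipoles are still unknown. To find the rescaling factors for projective reconstruction, we need to solve for the epipoles. One way to proceed is as follows. Taking $\mathbf{x}$ to be the projection center $\mathbf{o}'$ of the second view, and projecting into the three views, Equation (10) reduces to

$$ \lambda\,\mathbf{T}_{\cdot jk}\mathbf{e}_2 = \bar{\mathbf{e}}'_1\;\overline{(\lambda''\mathbf{e}''_2)}^{\,T}, $$

with $\bar{\mathbf{v}} = (v_2, -v_1)^T$ as before. As $\bar{\mathbf{e}}'_1\,\bar{\mathbf{e}}''^{\,T}_2$ has rank 1, so does $\mathbf{T}_{\cdot jk}\mathbf{e}_2$. Its $2\times2$ determinant must vanish, i.e.

$$ |\mathbf{T}_{\cdot jk}\mathbf{e}_2| = 0. $$

As each entry of the $2\times2$ matrix is homogeneous linear in $\mathbf{e}_2 = (u_1, u_2)^T$, the expansion of $|\mathbf{T}_{\cdot jk}\mathbf{e}_2|$ gives a homogeneous quadratic

$$ \alpha u_1^2 + \beta u_1 u_2 + \gamma u_2^2 = 0, \qquad (11) $$

where $\alpha, \beta, \gamma$ are known in terms of $T_{ijk}$.

Doing the same thing with the projection center $\mathbf{o}''$ of the third view gives

$$ \lambda\,\mathbf{T}_{\cdot jk}\mathbf{e}_3 = -\,\overline{(\lambda'\mathbf{e}'_3)}\;\bar{\mathbf{e}}''^{\,T}_1 $$

and hence $|\mathbf{T}_{\cdot jk}\mathbf{e}_3| = 0$. In other words, it leads to exactly the same quadratic equation (11) with $\mathbf{e}_3$ replacing $\mathbf{e}_2$. The two solutions of the quadratic (11) are $\mathbf{e}_2$ and $\mathbf{e}_3$; only the ordering remains ambiguous.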
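The role of the quadratic (11) can be illustrated numerically. In the sketch below the cameras are hypothetical integers, the tensor is built from minors of the joint projection matrix, and the coefficients are obtained by expanding the determinant of the 2×2 matrix $\sum_i T_{ijk} u_i$. The epipoles of the second and third projection centers in the first view, computed directly from the cameras, are then checked to be the two roots.

```python
import math

def det3(a, b, c):
    """Determinant of the 3x3 matrix with rows a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

def cross(a, b):
    """Cross product; for the two rows of a 2x3 camera it spans the kernel (the center)."""
    return [a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]]

def dual_row(M, i):
    """Bar mapping (1, 2) -> (2, -1) on rows, with 0-based index i."""
    return M[1] if i == 0 else [-v for v in M[0]]

def trilinear_tensor(M1, M2, M3):
    return [[[det3(dual_row(M1, i), dual_row(M2, j), dual_row(M3, k))
              for k in range(2)] for j in range(2)] for i in range(2)]

def project(M, x):
    return [sum(M[r][c] * x[c] for c in range(3)) for r in range(2)]

def epipole_quadratic(T):
    """Coefficients of det(u1 * T[0] + u2 * T[1]) as a quadratic form in (u1, u2)."""
    P, Q = T[0], T[1]                      # the 2x2 slices T_1jk and T_2jk
    alpha = P[0][0] * P[1][1] - P[0][1] * P[1][0]
    beta = (P[0][0] * Q[1][1] + Q[0][0] * P[1][1]
            - P[0][1] * Q[1][0] - Q[0][1] * P[1][0])
    gamma = Q[0][0] * Q[1][1] - Q[0][1] * Q[1][0]
    return alpha, beta, gamma

# Hypothetical cameras, not taken from the paper.
M1 = [[1, 0, 0], [0, 1, 0]]
M2 = [[2, 1, 3], [1, -1, 2]]
M3 = [[1, 4, -2], [3, 0, 5]]

alpha, beta, gamma = epipole_quadratic(trilinear_tensor(M1, M2, M3))
e2 = project(M1, cross(M2[0], M2[1]))      # second center seen in the first view
e3 = project(M1, cross(M3[0], M3[1]))      # third center seen in the first view
values = [alpha * e[0] ** 2 + beta * e[0] * e[1] + gamma * e[1] ** 2 for e in (e2, e3)]

# Roots as ratios u1 / u2 (assumes alpha != 0 and a non-negative discriminant).
disc = beta ** 2 - 4 * alpha * gamma
ratios = sorted((-beta + s * math.sqrt(disc)) / (2 * alpha) for s in (-1, 1))

print((alpha, beta, gamma), e2, e3, values)   # (11, 75, 100) [5, -1] [20, -11] [0, 0]
```

The quadratic alone does not say which root belongs to which view, which is the two-fold ordering ambiguity discussed in the text.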

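The other epipoles are easily obtained: $\mathbf{e}'_1$ and $\mathbf{e}''_2$ by factorizing the matrix $\mathbf{T}_{\cdot jk}\mathbf{e}_2$, and $\mathbf{e}'_3$ and $\mathbf{e}''_1$ by factorizing $\mathbf{T}_{\cdot jk}\mathbf{e}_3$. If the first solution set is obtained with one assignment of the two roots of (11) to $\mathbf{e}_2$ and $\mathbf{e}_3$, the reordering gives the second solution set.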

Once all the epipoles have been recovered, the scale factors of the image "points" for 3D direction reconstruction can easily be recovered by solving the linear homogeneous equation (10).
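A small sketch of this last step is given below. It rests on several assumptions that are not from the paper: the cameras and the point are hypothetical integers, the epipoles are computed directly from the cameras with the scale fixed by $\mathbf{e}'_1 = \mathbf{M}'(\mathbf{m}_1 \times \mathbf{m}_2)$ and $\mathbf{e}''_1 = \mathbf{M}''(\mathbf{m}_1 \times \mathbf{m}_2)$, and the linear equations are written in the form $\lambda\,\mathbf{T}_{\cdot jk}\mathbf{u} + \overline{(\lambda'\mathbf{u}')}\,\bar{\mathbf{e}}''^{\,T}_1 - \bar{\mathbf{e}}'_1\,\overline{(\lambda''\mathbf{u}'')}^{\,T} = \mathbf{0}$ with $\bar{\mathbf{v}} = (v_2, -v_1)^T$. The observed image points are deliberately given arbitrary scales, and the recovered factors bring them back to a common scale.

```python
def det3(a, b, c):
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]]

def dual_row(M, i):
    """Bar mapping (1, 2) -> (2, -1) on rows, with 0-based index i."""
    return M[1] if i == 0 else [-v for v in M[0]]

def trilinear_tensor(M1, M2, M3):
    return [[[det3(dual_row(M1, i), dual_row(M2, j), dual_row(M3, k))
              for k in range(2)] for j in range(2)] for i in range(2)]

def project(M, x):
    return [sum(M[r][c] * x[c] for c in range(3)) for r in range(2)]

def dual(v):
    """Bar on a 2-vector: (v1, v2) -> (v2, -v1)."""
    return [v[1], -v[0]]

def scale_factor_system(T, e1_in_view2, e1_in_view3, u1, u2, u3):
    """One row of coefficients of (lambda, lambda', lambda'') per index pair (j, k)."""
    return [[T[0][j][k] * u1[0] + T[1][j][k] * u1[1],
             dual(u2)[j] * dual(e1_in_view3)[k],
             -dual(e1_in_view2)[j] * dual(u3)[k]]
            for j in range(2) for k in range(2)]

# Hypothetical cameras and point, not taken from the paper.
M1 = [[1, 0, 0], [0, 1, 0]]
M2 = [[2, 1, 3], [1, -1, 2]]
M3 = [[1, 4, -2], [3, 0, 5]]
x = [1, 2, 3]

T = trilinear_tensor(M1, M2, M3)
center1 = cross(M1[0], M1[1])                       # kernel of M1
e1_in_view2, e1_in_view3 = project(M2, center1), project(M3, center1)

# Observed image points: true projections multiplied by arbitrary unknown scales.
u1 = [2 * v for v in project(M1, x)]
u2 = [-1 * v for v in project(M2, x)]
u3 = [3 * v for v in project(M3, x)]

rows = scale_factor_system(T, e1_in_view2, e1_in_view3, u1, u2, u3)
# The system has rank 2; its null vector is the cross product of two independent rows
# (assumed here to be the first two).
scales = cross(rows[0], rows[1])
rescaled = [[scales[n] * v for v in u] for n, u in enumerate((u1, u2, u3))]

print(scales)     # [-1530, 3060, -1020]
print(rescaled)   # [[-3060, -6120], [-39780, -15300], [-9180, -55080]]
```

The rescaled coordinates equal the true projections (1, 2), (13, 5) and (3, 18) times the single common factor -3060, so stacking them recovers the joint image point up to one global scale.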

3.2. Retrieving normal forms for projection matrices

The geometry of the three views is most conveniently and completely represented by the projection matrices associated with each view. In the previous section, the trilinear tensor was expressed in terms of the projection matrices. Now we seek a map from the trilinear tensor representation back to the projection matrix representation of the three views.

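Without loss of generality, we can always take the following normal forms for the 3 projection matrices

$$ \mathbf{M} = \begin{pmatrix} \mathbf{I}_{2\times2} & \mathbf{0} \end{pmatrix}, \qquad \mathbf{M}' = \begin{pmatrix} \mathbf{A}_{2\times2} & \mathbf{c} \end{pmatrix}, \qquad \mathbf{M}'' = \begin{pmatrix} \mathbf{D}_{2\times2} & \mathbf{f} \end{pmatrix}. \qquad (12) $$

It is straightforward to verify that the projection center of the first view is $\mathrm{Ker}(\mathbf{M}) = (0, 0, 1)^T$, so that $\mathbf{e}'_1 = \mathbf{c}$ and $\mathbf{e}''_1 = \mathbf{f}$.

Now, the trilinear tensor $(T_{ijk})$ can be exhibited as

$$ T_{ijk} = (-1)^{j+k}\,\bigl(c_{3-j}\, d_{3-k,\,i} - a_{3-j,\,i}\, f_{3-k}\bigr), \qquad (13) $$

where $a_{ji}$, $d_{ki}$, $c_j$ and $f_k$ denote the entries of $\mathbf{A}$, $\mathbf{D}$, $\mathbf{c}$ and $\mathbf{f}$.

As $\mathbf{c}$ and $\mathbf{f}$ are known, $\mathbf{A}$ and $\mathbf{D}$ can be solved linearly from the eight homogeneous equations of (13).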
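As a sanity check on the normal forms, the sketch below compares the tensor computed from 3×3 minors of the joint projection matrix with the closed form $T_{ijk} = (-1)^{j+k}(c_{3-j}\,d_{3-k,i} - a_{3-j,i}\,f_{3-k})$ for cameras $\mathbf{M} = (\mathbf{I}\ \mathbf{0})$, $\mathbf{M}' = (\mathbf{A}\ \mathbf{c})$, $\mathbf{M}'' = (\mathbf{D}\ \mathbf{f})$. The numeric entries are hypothetical, and the index convention of the closed form is the one assumed in this sketch (indices are 0-based in the code).

```python
def det3(a, b, c):
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

def dual_row(M, i):
    """Bar mapping (1, 2) -> (2, -1) on rows, with 0-based index i."""
    return M[1] if i == 0 else [-v for v in M[0]]

def tensor_from_minors(M1, M2, M3):
    return [[[det3(dual_row(M1, i), dual_row(M2, j), dual_row(M3, k))
              for k in range(2)] for j in range(2)] for i in range(2)]

def tensor_from_normal_form(A, c, D, f):
    """Closed form for M = (I 0), M' = (A c), M'' = (D f); 1 - j plays the role of 3 - j."""
    return [[[(-1) ** (j + k) * (c[1 - j] * D[1 - k][i] - A[1 - j][i] * f[1 - k])
              for k in range(2)] for j in range(2)] for i in range(2)]

# Hypothetical normal-form entries, not taken from the paper.
A, c = [[2, 1], [1, -1]], [3, 2]
D, f = [[1, 4], [3, 0]], [-2, 5]

M1 = [[1, 0, 0], [0, 1, 0]]
M2 = [A[0] + [c[0]], A[1] + [c[1]]]
M3 = [D[0] + [f[0]], D[1] + [f[1]]]

T_minors = tensor_from_minors(M1, M2, M3)
T_closed = tensor_from_normal_form(A, c, D, f)
print(T_minors == T_closed, T_closed)   # True [[[1, -4], [1, 7]], [[5, -6], [5, 14]]]
```

Each component is linear in the entries of A and D once c and f are fixed, which is what makes the recovery of the projection matrices a linear problem.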

Note that in our previous work [18], we recovered the projection matrices nonlinearly without knowing the epipoles, whereas here we recover them linearly using the epipoles.

4. Uncalibrated translations and affine shape

To recover the full affine structure of the lines, we still need to find the vectors $\mathbf{t}_{3\times1}$ of the affine cameras defined in (2). These represent the image translation and magnification components of the camera. Recall that line correspondences from two views do not impose any constraints on camera motion: the minimum number of views required is three. The recovery of the uncalibrated translations is essentially linear once the uncalibrated rotations have been recovered. A detailed linear algorithm is developed in our previous work [18, 19].

The final reconstruction step of lines can be easily formulated as a subspace selection and solved by SVD [18, 19].

5. Affine-structure-from-lines theorem

In view of the results obtained above, we can establish the following.

For the recovery of affine shape and affine motion from line correspondences with an uncalibrated affine camera, the minimum number of views needed is three and the minimum number of lines required is seven for a linear solution. The recovery is unique up to a re-ordering of the views.

This result can be compared with that of Koenderink and Van Doorn [9] for affine structure with a minimum of two views and five points.

6. Experimental results

The algorithm presented in this paper has been validated with both simulated and real image sequences. Due to lack of space, only an experiment based on real images will be presented.

A Fujinon/Photometrics CCD camera is used to acquire a sequence of images of a box of size […]. The image resolution is 576 × 384. A Canny-like edge detector is first applied to each image. The contour points are then linked and fitted to line segments by least squares. Line correspondences across three views are selected by hand. A total of 46 lines is selected, as shown in Figure 1.

The reconstruction algorithm generates infinite 3D lines. To find 3D line segments, we reproject the 3D lines into one

References

Shape and motion from image streams under orthography: a factorization method

A computer algorithm for reconstructing a scene from two projections

Model-based object pose in 25 lines of code

What can be seen in three dimensions with an uncalibrated stereo rig

Geometric invariance in computer vision
