CVPR'01, IEEE International Conference on Computer Vision and Pattern Recognition, Hawaii, USA, pp. 287-292, vol. 1, December 2001.
The 3D Line Motion Matrix and Alignment of Line Reconstructions
Adrien Bartoli Peter Sturm
INRIA Rhône-Alpes, 655, av. de l'Europe
38334 St. Ismier cedex, France. first.last@inria.fr
Abstract
We study the problem of aligning two 3D line reconstructions expressed in Plücker line coordinates.
We introduce the 6×6 3D line motion matrix that acts on Plücker coordinates in projective, affine or Euclidean
space. We characterize its algebraic properties and its re-
lation to the usual 4×4 point motion matrix, and propose
various methods for estimating 3D motion from line corre-
spondences, based on image-related and 3D cost functions.
We assess the quality of the different estimation methods us-
ing simulated data and real images.
1. Introduction
The goal of this paper is to align two reconstructions of
3D lines (figure 1). The recovered motions can be used in
many areas of computer vision, including tracking and mo-
tion segmentation, visual servoing and self-calibration.
Lines are widely used for tracking [5, 17], for visual ser-
voing [1] or for pose estimation [8] and their reconstruction
has been well studied (see e.g. [2] for image detection, [12]
for matching and [13, 14, 15] for structure and motion).
There are three intrinsic difficulties to motion estimation
from 3D line correspondences, even in Euclidean space.
Firstly, there is no global minimal parameterization for lines, i.e. none that represents their 4 degrees of freedom by 4 global parameters. Secondly, there is no universally agreed error metric
for comparing lines. Thirdly, depending on the representation, it may be non-trivial to transfer a line between two
different bases.
In this paper, we address the problem of motion com-
putation using projective line reconstructions. This is the
most general case, so our results can easily be specialized
to affine and Euclidean spaces. See [18] for a review of
previous work on the Euclidean case.
In each of these spaces, motion is usually represented by
4×4 matrices (homography, affinity or rigid displacement),
with different numbers of parameters. See [6] for more de-
tails. This representation is well-suited to points and planes.
We call it the usual motion matrix.
This work was supported by the project IST-1999-10756, VISIRE.
One way to represent 3D lines is to use Plücker coordinates. These are consistent in that they do not depend on the specific points or planes used to define the line. On the other hand, transferring a line between bases is difficult (one must either recover two points lying on it, transfer these and form their Plücker coordinates, or transform the 4×4 skew-symmetric Plücker matrix representation). The problem with the Plücker matrix representation is that it is quadratic in the transformation, which therefore cannot be estimated linearly from line matches.
Figure 1. Our problem is to estimate the motion of the cameras between two corresponding line reconstructions. (The figure shows camera sets 1 and 2 observing a rigid scene, yielding line reconstructions 1 and 2 related by the motion.)
To overcome this, we derive a motion representation that is well adapted to Plücker coordinates in that it transfers them linearly between bases. The transformation is represented by a 6×6 matrix that we call the 3D line motion
matrix. We characterize the algebraic properties of this in
terms of the usual motion matrix. The expressions obtained

were previously known in the Euclidean case [10, 18]. We
give a means of extracting the usual motion matrix from
the 3D line motion matrix and show how to correct a gen-
eral 6×6 matrix so that it represents a motion (compare this
with the case of the fundamental matrix estimation using
the 8-point algorithm: the obtained 3×3 matrix is corrected
so that its smallest singular value becomes zero [16]).
Using this representation, we derive several estimators
for 3D motion from line reconstructions. The motion allows lines to be transferred and reprojected from the first reconstruction onto the images of the second one. Optimization
criteria can therefore be expressed in image-related quanti-
ties, in terms of the actual and reprojected lines in the sec-
ond set of images.
Our two first methods are based on algebraic distances
between reprojected lines and either actual lines or their
end-points. A third method is based on direct comparison of Plücker coordinates. A 6×6 matrix is recovered linearly, then corrected so that it exactly represents a motion.
A fourth method uses a more physically meaningful cri-
terion based on orthogonal distances between reprojected
lines and actual end-points. This requires non-linear opti-
mization techniques that need an initialization provided by
a linear method. To avoid the use of non-linear optimiza-
tion while keeping the criterion, we devise a method that
quasi-linearly optimizes it and that does not require a sepa-
rate initialization.
§2 gives some preliminaries and our notation. We intro-
duce the 3D line motion matrix in §3 and show how this can
be used to estimate the motion between two reconstructions
of 3D lines in §4. We validate our methods on both sim-
ulated data and real images in §§5 and 6 respectively, and
give our conclusions and perspectives in §7.
2. Preliminaries and Notations
We make no formal distinction between coordinate vectors and physical entities. Equality up to a non-null scale factor is denoted by $\sim$, transposition and transposed inverse by $^\top$ and $^{-\top}$, and the skew-symmetric $3\times 3$ matrix associated with the cross product by $[\cdot]_\times$, i.e. $[\mathbf{v}]_\times\,\mathbf{q} = \mathbf{v}\times\mathbf{q}$. Vectors are typeset using bold fonts (L, l) and matrices using sans-serif fonts (H, A, D). Everything is represented in homogeneous coordinates. Bars represent inhomogeneous parts, e.g. $\mathbf{M}^\top \sim (\bar{\mathbf{M}}^\top \ m)$.
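As a concrete aid, the cross-product matrix $[\mathbf{v}]_\times$ can be written as a small NumPy helper; the function name `skew` is ours, not the paper's:

```python
import numpy as np

def skew(v):
    """Return the skew-symmetric 3x3 matrix [v]_x such that skew(v) @ q == v x q."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

v, q = np.array([1.0, 2.0, 3.0]), np.array([-2.0, 0.5, 4.0])
print(np.allclose(skew(v) @ q, np.cross(v, q)))  # True
```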
Plücker line coordinates. Given two 3D points $\mathbf{M}^\top \sim (\bar{\mathbf{M}}^\top \ m)$ and $\mathbf{N}^\top \sim (\bar{\mathbf{N}}^\top \ n)$, one can form the Plücker matrix representing the line joining them by:
$$\mathsf{L} \sim \mathbf{M}\mathbf{N}^\top - \mathbf{N}\mathbf{M}^\top.$$
This is a skew-symmetric rank-2 $4\times 4$ matrix [6]. The Plücker coordinates $\mathbf{L}^\top \sim (\mathbf{a}^\top \ \mathbf{b}^\top)$ of the line are its 6 different (up to sign) off-diagonal entries, written as a vector. There are many ways of arranging them. We choose the following:
$$\mathbf{a} = \bar{\mathbf{M}} \times \bar{\mathbf{N}}, \qquad \mathbf{b} = m\bar{\mathbf{N}} - n\bar{\mathbf{M}}, \qquad (1)$$
i.e. $\mathsf{L} \sim \begin{pmatrix} [\mathbf{a}]_\times & \mathbf{b} \\ -\mathbf{b}^\top & 0 \end{pmatrix}$. The constraint $\det \mathsf{L} = 0$ corresponds to $\mathbf{a}^\top\mathbf{b} = 0$.
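Equation (1) can be sketched numerically. The helper below (our naming, not the paper's) builds Plücker coordinates from two homogeneous points and checks the constraint $\mathbf{a}^\top\mathbf{b} = 0$ as well as the independence of the coordinates (up to scale) from the points chosen on the line:

```python
import numpy as np

def plucker(M, N):
    """Plucker coordinates L = (a, b) of the line through homogeneous points M, N,
    with a = Mbar x Nbar and b = m*Nbar - n*Mbar, as in equation (1)."""
    Mb, m = M[:3], M[3]
    Nb, n = N[:3], N[3]
    return np.concatenate([np.cross(Mb, Nb), m * Nb - n * Mb])

rng = np.random.default_rng(0)
M, N = rng.standard_normal(4), rng.standard_normal(4)
L = plucker(M, N)
a, b = L[:3], L[3:]

print(abs(a @ b))                 # ~0: the Plucker constraint a^T b = 0
# Another point pair on the same line gives the same coordinates up to scale:
L2 = plucker(2.0 * M - 3.0 * N, N)
print(np.allclose(L2, 2.0 * L))   # True
```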
Standard motion representation. Motions in projective, affine and Euclidean spaces are usually represented by $4\times 4$ matrices. In the general projective case, the matrices are unconstrained, while in the affine and Euclidean cases they have the following forms, where $\mathsf{R}$ is a $3\times 3$ rotation matrix:
$$\text{homography } \mathsf{H} \sim \begin{pmatrix} \bar{\mathsf{H}} & \mathbf{h}_1 \\ \mathbf{h}_2^\top & h \end{pmatrix}, \qquad \text{affinity } \mathsf{A} \sim \begin{pmatrix} \bar{\mathsf{A}} & \mathbf{t} \\ \mathbf{0}^\top & 1 \end{pmatrix}, \qquad \text{displacement } \mathsf{D} \sim \begin{pmatrix} \mathsf{R} & \mathbf{t} \\ \mathbf{0}^\top & 1 \end{pmatrix}.$$
3. The 3D Line Motion Matrix
In this section, we define the 3D line motion matrix in
the projective case, and then specialize it to the affine and
Euclidean cases.
Proposition 1 The Plücker coordinates of a line, expressed in two different bases, are linearly linked. The $6\times 6$ matrix $\tilde{\mathsf{H}}$ describing the transformation in the projective case is called the 3D line homography matrix and can be parameterized as:
$$\tilde{\mathsf{H}} \sim \begin{pmatrix} \det(\bar{\mathsf{H}})\,\bar{\mathsf{H}}^{-\top} & [\mathbf{h}_1]_\times\bar{\mathsf{H}} \\ -\bar{\mathsf{H}}[\mathbf{h}_2]_\times & h\bar{\mathsf{H}} - \mathbf{h}_1\mathbf{h}_2^\top \end{pmatrix},$$
where $\mathsf{H}$ is the usual $4\times 4$ homography matrix for points. If $\mathbf{L}^\top \sim (\mathbf{a}^\top \ \mathbf{b}^\top)$ are the Plücker coordinates of a line (i.e. $\mathbf{a}^\top\mathbf{b} = 0$), then $\tilde{\mathsf{H}}\mathbf{L}$ are the Plücker coordinates of the transformed line.
Proof: consider a line with coordinates $\mathbf{L}_1^\top \sim (\mathbf{a}_1^\top \ \mathbf{b}_1^\top)$ defined by two points $\mathbf{M}_1$ and $\mathbf{N}_1$ in the first projective basis, and coordinates $\mathbf{L}_2^\top$ defined by the points $\mathbf{M}_2 = \mathsf{H}\mathbf{M}_1$ and $\mathbf{N}_2 = \mathsf{H}\mathbf{N}_1$ in the second projective basis. Expanding the expressions for $\mathbf{a}_2$ and $\mathbf{b}_2$ according to the definition of Plücker coordinates (1) gives respectively the $3\times 6$ upper and lower parts of $\tilde{\mathsf{H}}$:
$$\begin{aligned}
\mathbf{a}_2 &= \bar{\mathbf{M}}_2 \times \bar{\mathbf{N}}_2 = (\bar{\mathsf{H}}\bar{\mathbf{M}}_1 + m_1\mathbf{h}_1) \times (\bar{\mathsf{H}}\bar{\mathbf{N}}_1 + n_1\mathbf{h}_1)\\
&= \det(\bar{\mathsf{H}})\,\bar{\mathsf{H}}^{-\top}(\bar{\mathbf{M}}_1 \times \bar{\mathbf{N}}_1) + [\mathbf{h}_1]_\times\bar{\mathsf{H}}\,(m_1\bar{\mathbf{N}}_1 - n_1\bar{\mathbf{M}}_1)\\
&= \det(\bar{\mathsf{H}})\,\bar{\mathsf{H}}^{-\top}\mathbf{a}_1 + [\mathbf{h}_1]_\times\bar{\mathsf{H}}\,\mathbf{b}_1,\\
\mathbf{b}_2 &= m_2\bar{\mathbf{N}}_2 - n_2\bar{\mathbf{M}}_2\\
&= (\mathbf{h}_2^\top\bar{\mathbf{M}}_1)\,\bar{\mathsf{H}}\bar{\mathbf{N}}_1 - (\mathbf{h}_2^\top\bar{\mathbf{N}}_1)\,\bar{\mathsf{H}}\bar{\mathbf{M}}_1 + \big(n_1(\mathbf{h}_2^\top\bar{\mathbf{M}}_1) - m_1(\mathbf{h}_2^\top\bar{\mathbf{N}}_1)\big)\,\mathbf{h}_1 + h\,\bar{\mathsf{H}}(m_1\bar{\mathbf{N}}_1 - n_1\bar{\mathbf{M}}_1)\\
&= -\bar{\mathsf{H}}[\mathbf{h}_2]_\times\mathbf{a}_1 - \mathbf{h}_1\mathbf{h}_2^\top\mathbf{b}_1 + h\,\bar{\mathsf{H}}\,\mathbf{b}_1.
\end{aligned}$$
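The proposition can be checked numerically. The sketch below (helper names are ours; the block signs follow the reconstruction above) builds $\tilde{\mathsf{H}}$ from a random $4\times 4$ homography and verifies that it maps the Plücker coordinates of a line exactly to those of the line through the transformed points:

```python
import numpy as np

def skew(v):
    return np.array([[0.0, -v[2], v[1]], [v[2], 0.0, -v[0]], [-v[1], v[0], 0.0]])

def plucker(M, N):
    return np.concatenate([np.cross(M[:3], N[:3]), M[3] * N[:3] - N[3] * M[:3]])

def line_motion_matrix(H):
    """6x6 3D line homography matrix built from the 4x4 point homography H,
    following the block structure of Proposition 1."""
    Hb, h1 = H[:3, :3], H[:3, 3]
    h2, h = H[3, :3], H[3, 3]
    return np.block([[np.linalg.det(Hb) * np.linalg.inv(Hb).T, skew(h1) @ Hb],
                     [-Hb @ skew(h2), h * Hb - np.outer(h1, h2)]])

rng = np.random.default_rng(1)
H = rng.standard_normal((4, 4))            # generic projective motion
M, N = rng.standard_normal(4), rng.standard_normal(4)

L1 = plucker(M, N)                         # line in the first basis
L2 = plucker(H @ M, H @ N)                 # same line after moving the points
print(np.allclose(line_motion_matrix(H) @ L1, L2))  # True
```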
Corollary 1 In affine and Euclidean coordinates, the 3D line motion matrix takes the forms:
$$\tilde{\mathsf{A}} \sim \begin{pmatrix} \det(\bar{\mathsf{A}})\,\bar{\mathsf{A}}^{-\top} & [\mathbf{t}]_\times\bar{\mathsf{A}} \\ \mathsf{0} & \bar{\mathsf{A}} \end{pmatrix} \qquad \text{and} \qquad \tilde{\mathsf{D}} \sim \begin{pmatrix} \mathsf{R} & [\mathbf{t}]_\times\mathsf{R} \\ \mathsf{0} & \mathsf{R} \end{pmatrix}.$$
This result coincides with that obtained in [10, 18] in the Euclidean case.
Extracting the motion from the 3D line motion matrix. Given a $6\times 6$ 3D line motion matrix, one can extract the corresponding motion parameters, i.e. the usual $4\times 4$ motion matrix. An algorithm is given in table 1 for the projective case. In the presence of noise, $\tilde{\mathsf{H}}$ does not exactly satisfy the constraints, and steps 2-4 have to be carried out in a least-squares sense. From there, one can further improve the result by non-linear minimization of the Frobenius norm between the given line homography and the one corresponding to the recovered motion. This algorithm can be specialized by
Let $\tilde{\mathsf{H}}$ be subdivided into $3\times 3$ blocks as:
$$\tilde{\mathsf{H}} \sim \begin{pmatrix} \tilde{\mathsf{H}}_{11} & \tilde{\mathsf{H}}_{12} \\ \tilde{\mathsf{H}}_{21} & \tilde{\mathsf{H}}_{22} \end{pmatrix}.$$
1. $\bar{\mathsf{H}}$: compute $\bar{\mathsf{H}} = \sqrt{|\det\tilde{\mathsf{H}}_{11}|}\;\tilde{\mathsf{H}}_{11}^{-\top}$, up to sign;
2. $\mathbf{h}_1$: compute $[\mathbf{h}_1]_\times = \tilde{\mathsf{H}}_{12}\,\bar{\mathsf{H}}^{-1}$;
3. $\mathbf{h}_2$: compute $[\mathbf{h}_2]_\times = -\bar{\mathsf{H}}^{-1}\tilde{\mathsf{H}}_{21}$;
4. $h$: compute $h$ as $h\,\mathsf{I}_{3\times 3} = (\tilde{\mathsf{H}}_{22} + \mathbf{h}_1\mathbf{h}_2^\top)\,\bar{\mathsf{H}}^{-1}$.
Table 1. Extracting the point homography from the 3D line homography matrix.
considering the special structure of the 3D line motion ma-
trix in the affine and Euclidean cases.
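Steps 1-4 of table 1 can be sketched as follows for the noise-free case (helper names are ours; the sign in step 3 follows the convention used above). The motion is recovered exactly, up to the global sign ambiguity of step 1:

```python
import numpy as np

def skew(v):
    return np.array([[0.0, -v[2], v[1]], [v[2], 0.0, -v[0]], [-v[1], v[0], 0.0]])

def vee(S):
    """Inverse of skew: recover v from [v]_x."""
    return np.array([S[2, 1], S[0, 2], S[1, 0]])

def line_motion_matrix(H):
    Hb, h1, h2, h = H[:3, :3], H[:3, 3], H[3, :3], H[3, 3]
    return np.block([[np.linalg.det(Hb) * np.linalg.inv(Hb).T, skew(h1) @ Hb],
                     [-Hb @ skew(h2), h * Hb - np.outer(h1, h2)]])

def extract_homography(G):
    """Steps 1-4 of table 1 (noise-free case, G an exact line motion matrix)."""
    G11, G12, G21, G22 = G[:3, :3], G[:3, 3:], G[3:, :3], G[3:, 3:]
    Hb = np.sqrt(abs(np.linalg.det(G11))) * np.linalg.inv(G11).T      # step 1
    h1 = vee(G12 @ np.linalg.inv(Hb))                                 # step 2
    h2 = vee(-np.linalg.inv(Hb) @ G21)                                # step 3
    h = np.trace((G22 + np.outer(h1, h2)) @ np.linalg.inv(Hb)) / 3.0  # step 4
    return np.block([[Hb, h1[:, None]], [h2[None, :], np.array([[h]])]])

rng = np.random.default_rng(2)
H = rng.standard_normal((4, 4))
H_rec = extract_homography(line_motion_matrix(H))
# Recovered up to the global sign ambiguity of step 1:
print(min(np.linalg.norm(H_rec - H), np.linalg.norm(H_rec + H)))  # ~0
```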
4. Aligning Two Line Reconstructions
We now describe how the 3D line motion matrix can be used to align two sets of $n$ corresponding 3D lines expressed in Plücker coordinates. We examine the projective case, but the method can also be used for affine or Euclidean frames. We assume that the two sets of cameras are independently weakly calibrated, i.e. their projection matrices are known up to a 3D homography, so that a projective basis is attached to each set [9]. Lines can be projectively reconstructed in these two bases. Our goal is to align these 3D lines, i.e. to find the projective motion between the two bases using the line reconstructions.
General estimation scheme. For the reasons mentioned in the introduction, we have chosen to use image-based cost functions. Alternatively, we could use an algebraic distance between Plücker coordinates to linearly estimate the motion using 3D lines (see [3] in the case of points). This estimator is called "Lin3D". Estimation is performed by finding $\arg\min_{\tilde{\mathsf{H}}} \mathcal{C}$, where $\mathcal{C}$ is the cost function considered. The scale ambiguity is removed by using the additional constraint $\|\tilde{\mathsf{H}}\|^2 = 1$. Non-linear optimization is performed directly on the motion parameters (the entries of $\mathsf{H}$), whereas the other estimators determine $\tilde{\mathsf{H}}$ first, then recover the motion using algorithm 1.
Our cost functions are expressed in terms of observed image lines or their end-points, and reprojected lines in the second set of images. They are therefore non-symmetric, taking into account only the error in the second set of images. We derive a perspective projection matrix for 3D lines expressed in Plücker coordinates and a joint projection matrix mapping a 3D line to a set of image lines in the second set of images.
If end-points are not available they can be hallucinated,
e.g. by intersecting the image lines with the image bound-
aries. The linear and quasi-linear methods need at least 9
lines to solve for the motion while the non-linear one needs
4 but requires an initial guess.
Perspective projection matrix for lines. With our choice of Plücker coordinates (1), the image projection of a line [6] becomes the linear transformation $\tilde{\mathsf{P}} \sim \big(\det(\bar{\mathsf{P}})\,\bar{\mathsf{P}}^{-\top} \ \ [\mathbf{p}]_\times\bar{\mathsf{P}}\big)_{3\times 6}$, where $\mathsf{P} \sim (\bar{\mathsf{P}} \ \ \mathbf{p})$ is the perspective camera matrix. This result can be easily demonstrated by finding the image line joining the projections of two points on the line.
Let $\mathsf{P}_j$ be the projection matrices of the $m$ images corresponding to the second reconstruction. We define the joint projection matrix for lines as:
$$\mathcal{P}^\top = \big(\tilde{\mathsf{P}}_1^\top \ \cdots \ \tilde{\mathsf{P}}_m^\top\big).$$
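A quick numerical check of this projection (our helper names, not the paper's code): the image line obtained by applying the $3\times 6$ matrix to the Plücker coordinates passes through the projections of any two points on the 3D line:

```python
import numpy as np

def skew(v):
    return np.array([[0.0, -v[2], v[1]], [v[2], 0.0, -v[0]], [-v[1], v[0], 0.0]])

def plucker(M, N):
    return np.concatenate([np.cross(M[:3], N[:3]), M[3] * N[:3] - N[3] * M[:3]])

def line_projection(P):
    """3x6 matrix mapping Plucker coordinates to an image line,
    P ~ (Pbar p) being the usual 3x4 camera matrix."""
    Pb, p = P[:, :3], P[:, 3]
    return np.hstack([np.linalg.det(Pb) * np.linalg.inv(Pb).T, skew(p) @ Pb])

rng = np.random.default_rng(3)
P = rng.standard_normal((3, 4))
M, N = rng.standard_normal(4), rng.standard_normal(4)

l = line_projection(P) @ plucker(M, N)   # projected 3D line
x, y = P @ M, P @ N                      # projected points on the line
print(abs(x @ l), abs(y @ l))            # both ~0: the image line contains them
```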

Linear estimation 1. Our first alignment method Lin1 directly uses the line equations in the images. End-points need not be available. We define an algebraic measure of distance between two image lines $\mathbf{l}$ and $\hat{\mathbf{l}}$ by $d^2(\mathbf{l}, \hat{\mathbf{l}}) = \|\mathbf{l} \times \hat{\mathbf{l}}\|^2$. This distance does not have any direct physical significance, but it is zero if the two lines are identical and simple in that it is bilinear. This distance induces the error criterion:
$$\mathcal{C}_1 = \sum_i \sum_j d^2(\mathbf{l}_{ij}, \hat{\mathbf{l}}_{ij}),$$
where $\mathbf{l}_{ij}$ is the $i$-th observed line in the $j$-th image and $\hat{\mathbf{l}}_{ij}$ the corresponding reprojection. Each term of the sum over $i$ can be written as $\|\mathsf{B}_i\,\mathcal{P}\,\tilde{\mathsf{H}}\,\mathbf{L}_i\|^2$, where $\mathsf{B}_i$ is a $3m\times 3m$ rank-$2m$ matrix defined as:
$$\mathsf{B}_i = \begin{pmatrix} [\mathbf{l}_{i1}]_\times & & \\ & \ddots & \\ & & [\mathbf{l}_{im}]_\times \end{pmatrix}.$$
These equations can be rearranged to form a linear system in the unknown entries of $\tilde{\mathsf{H}}$, where each line correspondence provides $3m$ equations. The system can be solved using SVD (Singular Value Decomposition) [11] to obtain a solution satisfying $\|\tilde{\mathsf{H}}\|^2 = 1$ as the null-vector of a $3mn\times 36$ matrix.
Linear estimation 2. Our second method Lin2 uses observed end-points in the second image set and the algebraic distance $d_a^2(\mathbf{x}, \mathbf{l}) = (\mathbf{x}^\top\mathbf{l})^2$ between an image point $\mathbf{x}$ and a line $\mathbf{l}$. This gives the criterion:
$$\mathcal{C}_2 = \sum_i \sum_j \big( d_a^2(\mathbf{x}_{ij}, \hat{\mathbf{l}}_{ij}) + d_a^2(\mathbf{y}_{ij}, \hat{\mathbf{l}}_{ij}) \big),$$
where $\mathbf{x}_{ij}$ and $\mathbf{y}_{ij}$ designate the end-points of the $i$-th line in the $j$-th image. Each term of the sum over $i$ can be written as $\|\mathsf{C}_i\,\mathcal{P}\,\tilde{\mathsf{H}}\,\mathbf{L}_i\|^2$, where
$$\mathsf{C}_i = \begin{pmatrix} \mathbf{x}_{i1}^\top & & \\ \mathbf{y}_{i1}^\top & & \\ & \ddots & \\ & & \mathbf{x}_{im}^\top \\ & & \mathbf{y}_{im}^\top \end{pmatrix}$$
is a full-rank $2m\times 3m$ matrix. These equations can be rearranged to form a linear system in the unknown entries of $\tilde{\mathsf{H}}$. Each line correspondence accounts for $2m$ equations. The system can be solved by SVD [11] of a $2mn\times 36$ matrix.
Non-linear estimation. Our third method NLin uses a physical cost function based on the orthogonal distance between reprojected 3D lines and their measured end-points [7], defined as $d^2(\mathbf{x}, \mathbf{l}) = \frac{(\mathbf{x}^\top\mathbf{l})^2}{l_1^2 + l_2^2}$:
$$\mathcal{C}_3 = \sum_i \sum_j \big( d^2(\mathbf{x}_{ij}, \hat{\mathbf{l}}_{ij}) + d^2(\mathbf{y}_{ij}, \hat{\mathbf{l}}_{ij}) \big).$$
This is non-linear in the image lines and consequently in the entries of $\tilde{\mathsf{H}}$, which implies the use of non-linear optimization techniques. The unknowns are minimally parameterized (we optimize directly over the entries of $\mathsf{H}$, not $\tilde{\mathsf{H}}$), so no subsequent correction is needed to recover the motion parameters.
Quasi-linear estimation. The drawbacks of non-linear optimization are that the implementation is complicated and the computational cost is high. For these reasons, we also developed a quasi-linear estimator QLin that minimizes the same cost function. Consider the cost functions $\mathcal{C}_2$ and $\mathcal{C}_3$. Both depend on the same data, measured end-points and reprojected lines, the former using an algebraic and the latter the orthogonal distance. We can relate these distances by:
$$d^2(\mathbf{x}, \mathbf{l}) = w_{\mathbf{l}}\, d_a^2(\mathbf{x}, \mathbf{l}) \quad \text{where} \quad w_{\mathbf{l}} = \frac{1}{l_1^2 + l_2^2}, \qquad (2)$$
and rewrite $\mathcal{C}_3$ as:
$$\mathcal{C}_3 = \sum_i \sum_j w_{\mathbf{l}_{ij}} \big( d_a^2(\mathbf{x}_{ij}, \hat{\mathbf{l}}_{ij}) + d_a^2(\mathbf{y}_{ij}, \hat{\mathbf{l}}_{ij}) \big).$$
The non-linearity is hidden in the weight factors $w_{\mathbf{l}_{ij}}$. If they were known, the criterion would be linear in the entries of $\tilde{\mathsf{H}}$. This leads to the following iterative algorithm. Weights, assumed unknown, are initialized to 1 and iteratively updated. The loop is ended when the weights, or equivalently the error, converge. The algorithm is summarized in table 2. It is a quasi-linear optimization that converges from the minimum of the algebraic error to that of the geometric one. It is simple to implement (as a loop over a linear method) and less sensitive to local minima than the non-linear method [4].
1. initialization: set $w_{\mathbf{l}_{ij}} = 1$;
2. estimation: estimate $\tilde{\mathsf{H}}$ using standard weighted least squares; the $6\times 6$ matrix obtained is corrected so that it represents a motion, see algorithm 1;
3. weighting: use $\tilde{\mathsf{H}}$ to update the weights $w_{\mathbf{l}_{ij}}$ according to equation (2);
4. iteration: iterate steps 2 and 3 until convergence (see text).
Table 2. Quasi-linear motion estimation from 3D line correspondences.

5. Results Using Simulated Data
We first compare our estimators using simulated data.
The test bench consists of four cameras that form two stereo
pairs observing a set of n 3D lines randomly chosen in a
sphere lying in the fields of view of all cameras. Lines are
projected onto the image planes, end-points are hallucinated
at the image boundaries and corrupted by additive Gaussian
noise, and the equations of the image lines are estimated
from these noisy end-points.
A canonical projective basis [9] is attached to each camera pair and used to reconstruct the lines in projective space.
We then compute the 3D homography between the two pro-
jective bases using the estimators given in §4. We assess
the quality of an estimated motion by measuring the RMS
(Root Mean Square) of the Euclidean reprojection errors
(orthogonal distances between reprojected lines and end-
points in the second image pair). This corresponds to the
criterion minimized by the non-linear and the quasi-linear
algorithms. We compute the median error over 100 trials.
Figure 2 shows the error as the level of added noise
varies. The non-linear method is initialized using the quasi-
linear one.

Figure 2. Comparison of reprojected error (pixel) versus added image noise standard deviation (pixel) for the different motion estimators (Lin3D, Lin1, Lin2, QLin, NLin).

We observe that the methods Lin3D (based on an algebraic distance between 3D Plücker coordinates), Lin1
and Lin2 perform worse than the others. This is due to the
fact that the criteria used in these methods are not physically meaningful and are biased compared to $\mathcal{C}_3$. Method QLin gives results close to those obtained using NLin. It is therefore a good compromise between the linear and non-linear methods, achieving good results while keeping simplicity of
implementation. However, we observed that in a few cases
(about 4%), the quasi-linear method does not enhance the
result obtained by Lin2 while NLin does. QLin estimates
more parameters than necessary and this may cause numer-
ical instabilities.
6. Results on Real Images
We also tested our algorithms using images taken with a
stereo rig, so that the epipolar geometry is the same for both
image pairs, see figure 3.

Figure 3. The two image pairs of a ship part used in the experiments, overlaid with actual lines. Note that the extracted end-points do not necessarily correspond.

We use the technique given in [16]
to estimate the fundamental matrix and define a canonical
reconstruction basis for each pair [9]. This also gives the
joint line projection matrix P. We track lines across images
by hand and projectively reconstruct them for each image
pair.
Motion estimation. We used the methods of §4 to esti-
mate the projective motion between the two reconstruction
bases, but since we have no 3D ground truth we will only
show the result of transferring the set of reconstructed lines
from the first to the second 3D frame, using the 3D line ho-
mography matrix, and reprojecting them. Figure 4 shows
these reprojections, which confirms that the non-linear and
quasi-linear methods achieve much better results than the
linear ones.
7. Conclusions and Perspectives
We addressed the problem of estimating the motion be-
tween two line reconstructions in the general projective
case. We used Plücker coordinates to represent 3D lines and showed that they can be transferred linearly between two reconstruction bases using a 6×6 3D line motion matrix. We investigated the algebraic properties of this matrix and showed how to extract the usual 4×4 motion matrix (i.e. homography, affinity or rigid displacement) from it.
We then proposed several 3D and image-based estima-
tors for the motion between two line reconstructions. Ex-
perimental results on both simulated and real data show
that the linear estimators perform worse (the residuals are
at least twice as large) than the non-linear ones, especially
