CVPR'01, IEEE International Conference on Computer Vision and Pattern Recognition, Hawaii, USA, pp. 287-292, vol. 1, December 2001.
The 3D Line Motion Matrix and Alignment of Line Reconstructions
Adrien Bartoli Peter Sturm
INRIA Rhône-Alpes, 655, av. de l'Europe
38334 St. Ismier cedex, France. first.last@inria.fr
Abstract
We study the problem of aligning two 3D line reconstructions expressed in Plücker line coordinates.
We introduce the 6×6 3D line motion matrix that acts on Plücker coordinates in projective, affine or Euclidean
space. We characterize its algebraic properties and its re-
lation to the usual 4×4 point motion matrix, and propose
various methods for estimating 3D motion from line corre-
spondences, based on image-related and 3D cost functions.
We assess the quality of the different estimation methods us-
ing simulated data and real images.
1. Introduction
The goal of this paper is to align two reconstructions of
3D lines (figure 1). The recovered motions can be used in
many areas of computer vision, including tracking and mo-
tion segmentation, visual servoing and self-calibration.
Lines are widely used for tracking [5, 17], for visual ser-
voing [1] or for pose estimation [8] and their reconstruction
has been well studied (see e.g. [2] for image detection, [12]
for matching and [13, 14, 15] for structure and motion).
There are three intrinsic difficulties to motion estimation
from 3D line correspondences, even in Euclidean space.
Firstly, there is no global minimal parameterization for lines, i.e. none that represents their 4 degrees of freedom by 4 global parameters. Secondly, there is no universally agreed error metric
for comparing lines. Thirdly, depending on the representation, it may be non-trivial to transfer a line between two
different bases.
In this paper, we address the problem of motion com-
putation using projective line reconstructions. This is the
most general case, so our results can easily be specialized
to affine and Euclidean spaces. See [18] for a review of
previous work on the Euclidean case.
In each of these spaces, motion is usually represented by
4×4 matrices (homography, affinity or rigid displacement),
with different numbers of parameters. See [6] for more de-
tails. This representation is well-suited to points and planes.
We call it the usual motion matrix.
This work was supported by the project IST-1999-10756, VISIRE.
One way to represent 3D lines is to use Plücker coordinates. These are consistent in that they do not depend on the specific points or planes used to define the line. On the other hand, transferring a line between bases is difficult (one must either recover two points lying on it, transfer these and form their Plücker coordinates, or transform the 4×4 skew-symmetric Plücker matrix representation). The problem with the Plücker matrix representation is that it is quadratic in the transformation, which therefore cannot be estimated linearly from line matches.
Figure 1. Our problem is to estimate the motion of the cameras between two corresponding line reconstructions. (The figure shows camera sets 1 and 2 observing a rigid scene, yielding line reconstructions 1 and 2 related by the motion.)
To overcome this, we derive a motion representation that is well adapted to Plücker coordinates in that it transfers them linearly between bases. The transformation is represented by a 6×6 matrix that we call the 3D line motion
matrix. We characterize the algebraic properties of this in
terms of the usual motion matrix. The expressions obtained

were previously known in the Euclidean case [10, 18]. We
give a means of extracting the usual motion matrix from
the 3D line motion matrix and show how to correct a gen-
eral 6×6 matrix so that it represents a motion (compare this
with the case of the fundamental matrix estimation using
the 8-point algorithm: the obtained 3×3 matrix is corrected
so that its smallest singular value becomes zero [16]).
Using this representation, we derive several estimators
for 3D motion from line reconstructions. The motion allows lines to be transferred and reprojected from the first reconstruction onto the images of the second one. Optimization
criteria can therefore be expressed in image-related quanti-
ties, in terms of the actual and reprojected lines in the sec-
ond set of images.
Our two first methods are based on algebraic distances
between reprojected lines and either actual lines or their
end-points. A third method is based on direct comparison of Plücker coordinates. A 6×6 matrix is recovered linearly, then corrected so that it exactly represents a motion.
A fourth method uses a more physically meaningful cri-
terion based on orthogonal distances between reprojected
lines and actual end-points. This requires non-linear opti-
mization techniques that need an initialization provided by
a linear method. To avoid the use of non-linear optimiza-
tion while keeping the criterion, we devise a method that
quasi-linearly optimizes it and that does not require a sepa-
rate initialization.
§2 gives some preliminaries and our notation. We intro-
duce the 3D line motion matrix in §3 and show how this can
be used to estimate the motion between two reconstructions
of 3D lines in §4. We validate our methods on both sim-
ulated data and real images in §§5 and 6 respectively, and
give our conclusions and perspectives in §7.
2. Preliminaries and Notations
We make no formal distinction between coordinate vectors and physical entities. Equality up to a non-null scale factor is denoted by $\sim$, transposition and transposed inverse by $^\top$ and $^{-\top}$, and the skew-symmetric $3\times 3$ matrix associated with the cross product by $[\cdot]_\times$, i.e. $[\mathbf{v}]_\times\,\mathbf{q} = \mathbf{v}\times\mathbf{q}$. Vectors are typeset using bold fonts (L, l) and matrices using sans-serif fonts (H, A, D). Everything is represented in homogeneous coordinates. Bars represent inhomogeneous parts, e.g. $\mathbf{M}^\top \sim (\bar{\mathbf{M}}^\top \ m)$.
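As a concrete aid, the cross-product matrix $[\mathbf{v}]_\times$ can be written as a small NumPy helper; the function name `skew` is ours, not the paper's:

```python
import numpy as np

def skew(v):
    """Return the skew-symmetric 3x3 matrix [v]_x such that skew(v) @ q == v x q."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

v, q = np.array([1.0, 2.0, 3.0]), np.array([-2.0, 0.5, 4.0])
print(np.allclose(skew(v) @ q, np.cross(v, q)))  # True
```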
Plücker line coordinates. Given two 3D points $\mathbf{M}^\top \sim (\bar{\mathbf{M}}^\top \ m)$ and $\mathbf{N}^\top \sim (\bar{\mathbf{N}}^\top \ n)$, one can form the Plücker matrix representing the line joining them by:
$$\mathsf{L} \sim \mathbf{M}\mathbf{N}^\top - \mathbf{N}\mathbf{M}^\top.$$
This is a skew-symmetric rank-2 $4\times 4$ matrix [6]. The Plücker coordinates $\mathbf{L}^\top \sim (\mathbf{a}^\top \ \mathbf{b}^\top)$ of the line are its 6 different (up to sign) off-diagonal entries, written as a vector. There are many ways of arranging them. We choose the following:
$$\mathbf{a} = \bar{\mathbf{M}} \times \bar{\mathbf{N}}, \qquad \mathbf{b} = m\bar{\mathbf{N}} - n\bar{\mathbf{M}}, \qquad (1)$$
i.e. $\mathsf{L} \sim \begin{pmatrix} [\mathbf{a}]_\times & \mathbf{b} \\ -\mathbf{b}^\top & 0 \end{pmatrix}$. The constraint $\det \mathsf{L} = 0$ corresponds to $\mathbf{a}^\top\mathbf{b} = 0$.
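Equation (1) can be sketched numerically. The helper below (our naming, not the paper's) builds Plücker coordinates from two homogeneous points and checks the constraint $\mathbf{a}^\top\mathbf{b} = 0$ as well as the independence of the coordinates (up to scale) from the points chosen on the line:

```python
import numpy as np

def plucker(M, N):
    """Plucker coordinates L = (a, b) of the line through homogeneous points M, N,
    with a = Mbar x Nbar and b = m*Nbar - n*Mbar, as in equation (1)."""
    Mb, m = M[:3], M[3]
    Nb, n = N[:3], N[3]
    return np.concatenate([np.cross(Mb, Nb), m * Nb - n * Mb])

rng = np.random.default_rng(0)
M, N = rng.standard_normal(4), rng.standard_normal(4)
L = plucker(M, N)
a, b = L[:3], L[3:]

print(abs(a @ b))                 # ~0: the Plucker constraint a^T b = 0
# Another point pair on the same line gives the same coordinates up to scale:
L2 = plucker(2.0 * M - 3.0 * N, N)
print(np.allclose(L2, 2.0 * L))   # True
```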
Standard motion representation. Motions in projective, affine and Euclidean spaces are usually represented by $4\times 4$ matrices. In the general projective case, the matrices are unconstrained, while in the affine and Euclidean cases they have the following forms, where $\mathsf{R}$ is a $3\times 3$ rotation matrix:
$$\text{homography } \mathsf{H} \sim \begin{pmatrix} \bar{\mathsf{H}} & \mathbf{h}_1 \\ \mathbf{h}_2^\top & h \end{pmatrix}, \qquad \text{affinity } \mathsf{A} \sim \begin{pmatrix} \bar{\mathsf{A}} & \mathbf{t} \\ \mathbf{0}^\top & 1 \end{pmatrix}, \qquad \text{displacement } \mathsf{D} \sim \begin{pmatrix} \mathsf{R} & \mathbf{t} \\ \mathbf{0}^\top & 1 \end{pmatrix}.$$
3. The 3D Line Motion Matrix
In this section, we define the 3D line motion matrix in
the projective case, and then specialize it to the affine and
Euclidean cases.
Proposition 1 The Plücker coordinates of a line, expressed in two different bases, are linearly linked. The $6\times 6$ matrix $\tilde{\mathsf{H}}$ describing the transformation in the projective case is called the 3D line homography matrix and can be parameterized as:
$$\tilde{\mathsf{H}} \sim \begin{pmatrix} \det(\bar{\mathsf{H}})\,\bar{\mathsf{H}}^{-\top} & [\mathbf{h}_1]_\times\bar{\mathsf{H}} \\ -\bar{\mathsf{H}}[\mathbf{h}_2]_\times & h\bar{\mathsf{H}} - \mathbf{h}_1\mathbf{h}_2^\top \end{pmatrix},$$
where $\mathsf{H}$ is the usual $4\times 4$ homography matrix for points. If $\mathbf{L}^\top \sim (\mathbf{a}^\top \ \mathbf{b}^\top)$ are the Plücker coordinates of a line (i.e. $\mathbf{a}^\top\mathbf{b} = 0$), then $\tilde{\mathsf{H}}\mathbf{L}$ are the Plücker coordinates of the transformed line.
Proof: consider a line with coordinates $\mathbf{L}_1^\top \sim (\mathbf{a}_1^\top \ \mathbf{b}_1^\top)$ defined by two points $\mathbf{M}_1$ and $\mathbf{N}_1$ in the first projective basis, and coordinates $\mathbf{L}_2^\top$ defined by the points $\mathbf{M}_2 = \mathsf{H}\mathbf{M}_1$ and $\mathbf{N}_2 = \mathsf{H}\mathbf{N}_1$ in the second projective basis. Expanding the expressions for $\mathbf{a}_2$ and $\mathbf{b}_2$ according to the definition of Plücker coordinates (1) gives respectively the $3\times 6$ upper and lower parts of $\tilde{\mathsf{H}}$:
$$\begin{aligned}
\mathbf{a}_2 &= \bar{\mathbf{M}}_2 \times \bar{\mathbf{N}}_2 = (\bar{\mathsf{H}}\bar{\mathbf{M}}_1 + m_1\mathbf{h}_1) \times (\bar{\mathsf{H}}\bar{\mathbf{N}}_1 + n_1\mathbf{h}_1)\\
&= \det(\bar{\mathsf{H}})\,\bar{\mathsf{H}}^{-\top}(\bar{\mathbf{M}}_1 \times \bar{\mathbf{N}}_1) + [\mathbf{h}_1]_\times\bar{\mathsf{H}}\,(m_1\bar{\mathbf{N}}_1 - n_1\bar{\mathbf{M}}_1)\\
&= \det(\bar{\mathsf{H}})\,\bar{\mathsf{H}}^{-\top}\mathbf{a}_1 + [\mathbf{h}_1]_\times\bar{\mathsf{H}}\,\mathbf{b}_1,\\
\mathbf{b}_2 &= m_2\bar{\mathbf{N}}_2 - n_2\bar{\mathbf{M}}_2\\
&= (\mathbf{h}_2^\top\bar{\mathbf{M}}_1)\,\bar{\mathsf{H}}\bar{\mathbf{N}}_1 - (\mathbf{h}_2^\top\bar{\mathbf{N}}_1)\,\bar{\mathsf{H}}\bar{\mathbf{M}}_1 + \big(n_1(\mathbf{h}_2^\top\bar{\mathbf{M}}_1) - m_1(\mathbf{h}_2^\top\bar{\mathbf{N}}_1)\big)\,\mathbf{h}_1 + h\,\bar{\mathsf{H}}(m_1\bar{\mathbf{N}}_1 - n_1\bar{\mathbf{M}}_1)\\
&= -\bar{\mathsf{H}}[\mathbf{h}_2]_\times\mathbf{a}_1 - \mathbf{h}_1\mathbf{h}_2^\top\mathbf{b}_1 + h\,\bar{\mathsf{H}}\,\mathbf{b}_1.
\end{aligned}$$
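The proposition can be checked numerically. The sketch below (helper names are ours; the block signs follow the reconstruction above) builds $\tilde{\mathsf{H}}$ from a random $4\times 4$ homography and verifies that it maps the Plücker coordinates of a line exactly to those of the line through the transformed points:

```python
import numpy as np

def skew(v):
    return np.array([[0.0, -v[2], v[1]], [v[2], 0.0, -v[0]], [-v[1], v[0], 0.0]])

def plucker(M, N):
    return np.concatenate([np.cross(M[:3], N[:3]), M[3] * N[:3] - N[3] * M[:3]])

def line_motion_matrix(H):
    """6x6 3D line homography matrix built from the 4x4 point homography H,
    following the block structure of Proposition 1."""
    Hb, h1 = H[:3, :3], H[:3, 3]
    h2, h = H[3, :3], H[3, 3]
    return np.block([[np.linalg.det(Hb) * np.linalg.inv(Hb).T, skew(h1) @ Hb],
                     [-Hb @ skew(h2), h * Hb - np.outer(h1, h2)]])

rng = np.random.default_rng(1)
H = rng.standard_normal((4, 4))            # generic projective motion
M, N = rng.standard_normal(4), rng.standard_normal(4)

L1 = plucker(M, N)                         # line in the first basis
L2 = plucker(H @ M, H @ N)                 # same line after moving the points
print(np.allclose(line_motion_matrix(H) @ L1, L2))  # True
```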
Corollary 1 In affine and Euclidean coordinates, the 3D line motion matrix takes the forms:
$$\tilde{\mathsf{A}} \sim \begin{pmatrix} \det(\bar{\mathsf{A}})\,\bar{\mathsf{A}}^{-\top} & [\mathbf{t}]_\times\bar{\mathsf{A}} \\ \mathsf{0} & \bar{\mathsf{A}} \end{pmatrix} \qquad \text{and} \qquad \tilde{\mathsf{D}} \sim \begin{pmatrix} \mathsf{R} & [\mathbf{t}]_\times\mathsf{R} \\ \mathsf{0} & \mathsf{R} \end{pmatrix}.$$
This result coincides with that obtained in [10, 18] in the Euclidean case.
Extracting the motion from the 3D line motion matrix. Given a $6\times 6$ 3D line motion matrix, one can extract the corresponding motion parameters, i.e. the usual $4\times 4$ motion matrix. An algorithm is given in table 1 for the projective case. In the presence of noise, $\tilde{\mathsf{H}}$ does not exactly satisfy the constraints, and steps 2-4 have to be carried out in a least-squares sense. From there, one can further improve the result by non-linear minimization of the Frobenius norm between the given line homography and the one corresponding to the recovered motion. This algorithm can be specialized by
Let $\tilde{\mathsf{H}}$ be subdivided into $3\times 3$ blocks as:
$$\tilde{\mathsf{H}} \sim \begin{pmatrix} \tilde{\mathsf{H}}_{11} & \tilde{\mathsf{H}}_{12} \\ \tilde{\mathsf{H}}_{21} & \tilde{\mathsf{H}}_{22} \end{pmatrix}.$$
1. $\bar{\mathsf{H}}$: compute $\bar{\mathsf{H}} = \sqrt{|\det\tilde{\mathsf{H}}_{11}|}\;\tilde{\mathsf{H}}_{11}^{-\top}$, up to sign;
2. $\mathbf{h}_1$: compute $[\mathbf{h}_1]_\times = \tilde{\mathsf{H}}_{12}\,\bar{\mathsf{H}}^{-1}$;
3. $\mathbf{h}_2$: compute $[\mathbf{h}_2]_\times = -\bar{\mathsf{H}}^{-1}\tilde{\mathsf{H}}_{21}$;
4. $h$: compute $h$ as $h\,\mathsf{I}_{3\times 3} = (\tilde{\mathsf{H}}_{22} + \mathbf{h}_1\mathbf{h}_2^\top)\,\bar{\mathsf{H}}^{-1}$.
Table 1. Extracting the point homography from the 3D line homography matrix.
considering the special structure of the 3D line motion ma-
trix in the affine and Euclidean cases.
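Steps 1-4 of table 1 can be sketched as follows for the noise-free case (helper names are ours; the sign in step 3 follows the convention used above). The motion is recovered exactly, up to the global sign ambiguity of step 1:

```python
import numpy as np

def skew(v):
    return np.array([[0.0, -v[2], v[1]], [v[2], 0.0, -v[0]], [-v[1], v[0], 0.0]])

def vee(S):
    """Inverse of skew: recover v from [v]_x."""
    return np.array([S[2, 1], S[0, 2], S[1, 0]])

def line_motion_matrix(H):
    Hb, h1, h2, h = H[:3, :3], H[:3, 3], H[3, :3], H[3, 3]
    return np.block([[np.linalg.det(Hb) * np.linalg.inv(Hb).T, skew(h1) @ Hb],
                     [-Hb @ skew(h2), h * Hb - np.outer(h1, h2)]])

def extract_homography(G):
    """Steps 1-4 of table 1 (noise-free case, G an exact line motion matrix)."""
    G11, G12, G21, G22 = G[:3, :3], G[:3, 3:], G[3:, :3], G[3:, 3:]
    Hb = np.sqrt(abs(np.linalg.det(G11))) * np.linalg.inv(G11).T      # step 1
    h1 = vee(G12 @ np.linalg.inv(Hb))                                 # step 2
    h2 = vee(-np.linalg.inv(Hb) @ G21)                                # step 3
    h = np.trace((G22 + np.outer(h1, h2)) @ np.linalg.inv(Hb)) / 3.0  # step 4
    return np.block([[Hb, h1[:, None]], [h2[None, :], np.array([[h]])]])

rng = np.random.default_rng(2)
H = rng.standard_normal((4, 4))
H_rec = extract_homography(line_motion_matrix(H))
# Recovered up to the global sign ambiguity of step 1:
print(min(np.linalg.norm(H_rec - H), np.linalg.norm(H_rec + H)))  # ~0
```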
4. Aligning Two Line Reconstructions
We now describe how the 3D line motion matrix can be used to align two sets of $n$ corresponding 3D lines expressed in Plücker coordinates. We examine the projective case, but the method can also be used for affine or Euclidean frames. We assume that the two sets of cameras are independently weakly calibrated, i.e. their projection matrices are known up to a 3D homography, so that a projective basis is attached to each set [9]. Lines can be projectively reconstructed in these two bases. Our goal is to align these 3D lines, i.e. to find the projective motion between the two bases using the line reconstructions.
General estimation scheme. For the reasons mentioned in the introduction, we have chosen to use image-based cost functions. Alternatively, we could use an algebraic distance between Plücker coordinates to linearly estimate the motion using 3D lines (see [3] in the case of points). This estimator is called "Lin3D". Estimation is performed by finding $\arg\min_{\tilde{\mathsf{H}}} \mathcal{C}$, where $\mathcal{C}$ is the cost function considered. The scale ambiguity is removed by using the additional constraint $\|\tilde{\mathsf{H}}\|^2 = 1$. Non-linear optimization is performed directly on the motion parameters (the entries of $\mathsf{H}$), whereas the other estimators determine $\tilde{\mathsf{H}}$ first, then recover the motion using algorithm 1.
Our cost functions are expressed in terms of observed image lines or their end-points, and reprojected lines in the second set of images. They are therefore non-symmetric, taking into account only the error in the second set of images. We derive a perspective projection matrix for 3D lines expressed in Plücker coordinates and a joint projection matrix mapping a 3D line to a set of image lines in the second set of images.
If end-points are not available they can be hallucinated,
e.g. by intersecting the image lines with the image bound-
aries. The linear and quasi-linear methods need at least 9
lines to solve for the motion while the non-linear one needs
4 but requires an initial guess.
Perspective projection matrix for lines. With our choice of Plücker coordinates (1), the image projection of a line [6] becomes the linear transformation $\tilde{\mathsf{P}} \sim \big(\det(\bar{\mathsf{P}})\,\bar{\mathsf{P}}^{-\top} \ \ [\mathbf{p}]_\times\bar{\mathsf{P}}\big)_{3\times 6}$, where $\mathsf{P} \sim (\bar{\mathsf{P}} \ \ \mathbf{p})$ is the perspective camera matrix. This result can be easily demonstrated by finding the image line joining the projections of two points on the line.
Let $\mathsf{P}_j$ be the projection matrices of the $m$ images corresponding to the second reconstruction. We define the joint projection matrix for lines as:
$$\mathcal{P}^\top = \big(\tilde{\mathsf{P}}_1^\top \ \cdots \ \tilde{\mathsf{P}}_m^\top\big).$$
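A quick numerical check of this projection (our helper names, not the paper's code): the image line obtained by applying the $3\times 6$ matrix to the Plücker coordinates passes through the projections of any two points on the 3D line:

```python
import numpy as np

def skew(v):
    return np.array([[0.0, -v[2], v[1]], [v[2], 0.0, -v[0]], [-v[1], v[0], 0.0]])

def plucker(M, N):
    return np.concatenate([np.cross(M[:3], N[:3]), M[3] * N[:3] - N[3] * M[:3]])

def line_projection(P):
    """3x6 matrix mapping Plucker coordinates to an image line,
    P ~ (Pbar p) being the usual 3x4 camera matrix."""
    Pb, p = P[:, :3], P[:, 3]
    return np.hstack([np.linalg.det(Pb) * np.linalg.inv(Pb).T, skew(p) @ Pb])

rng = np.random.default_rng(3)
P = rng.standard_normal((3, 4))
M, N = rng.standard_normal(4), rng.standard_normal(4)

l = line_projection(P) @ plucker(M, N)   # projected 3D line
x, y = P @ M, P @ N                      # projected points on the line
print(abs(x @ l), abs(y @ l))            # both ~0: the image line contains them
```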

Linear estimation 1. Our first alignment method Lin1 directly uses the line equations in the images. End-points need not be available. We define an algebraic measure of distance between two image lines $\mathbf{l}$ and $\hat{\mathbf{l}}$ by $d^2(\mathbf{l}, \hat{\mathbf{l}}) = \|\mathbf{l} \times \hat{\mathbf{l}}\|^2$. This distance does not have any direct physical significance, but it is zero if the two lines are identical and simple in that it is bilinear. This distance induces the error criterion:
$$\mathcal{C}_1 = \sum_i \sum_j d^2(\mathbf{l}_{ij}, \hat{\mathbf{l}}_{ij}),$$
where $\mathbf{l}_{ij}$ is the $i$-th observed line in the $j$-th image and $\hat{\mathbf{l}}_{ij}$ the corresponding reprojection. Each term of the sum over $i$ can be written as $\|\mathsf{B}_i\,\mathcal{P}\,\tilde{\mathsf{H}}\,\mathbf{L}_i\|^2$, where $\mathsf{B}_i$ is a $3m\times 3m$ rank-$2m$ matrix defined as:
$$\mathsf{B}_i = \begin{pmatrix} [\mathbf{l}_{i1}]_\times & & \\ & \ddots & \\ & & [\mathbf{l}_{im}]_\times \end{pmatrix}.$$
These equations can be rearranged to form a linear system in the unknown entries of $\tilde{\mathsf{H}}$, where each line correspondence provides $3m$ equations. The system can be solved using SVD (Singular Value Decomposition) [11] to obtain a solution satisfying $\|\tilde{\mathsf{H}}\|^2 = 1$ as the null-vector of a $3mn\times 36$ matrix.
Linear estimation 2. Our second method Lin2 uses observed end-points in the second image set and the algebraic distance $d_a^2(\mathbf{x}, \mathbf{l}) = (\mathbf{x}^\top\mathbf{l})^2$ between an image point $\mathbf{x}$ and a line $\mathbf{l}$. This gives the criterion:
$$\mathcal{C}_2 = \sum_i \sum_j \big( d_a^2(\mathbf{x}_{ij}, \hat{\mathbf{l}}_{ij}) + d_a^2(\mathbf{y}_{ij}, \hat{\mathbf{l}}_{ij}) \big),$$
where $\mathbf{x}_{ij}$ and $\mathbf{y}_{ij}$ designate the end-points of the $i$-th line in the $j$-th image. Each term of the sum over $i$ can be written as $\|\mathsf{C}_i\,\mathcal{P}\,\tilde{\mathsf{H}}\,\mathbf{L}_i\|^2$, where
$$\mathsf{C}_i = \begin{pmatrix} \mathbf{x}_{i1}^\top & & \\ \mathbf{y}_{i1}^\top & & \\ & \ddots & \\ & & \mathbf{x}_{im}^\top \\ & & \mathbf{y}_{im}^\top \end{pmatrix}$$
is a full-rank $2m\times 3m$ matrix. These equations can be rearranged to form a linear system in the unknown entries of $\tilde{\mathsf{H}}$. Each line correspondence accounts for $2m$ equations. The system can be solved by SVD [11] of a $2mn\times 36$ matrix.
Non-linear estimation. Our third method NLin uses a physical cost function based on the orthogonal distance between reprojected 3D lines and their measured end-points [7], defined as $d^2(\mathbf{x}, \mathbf{l}) = \frac{(\mathbf{x}^\top\mathbf{l})^2}{l_1^2 + l_2^2}$:
$$\mathcal{C}_3 = \sum_i \sum_j \big( d^2(\mathbf{x}_{ij}, \hat{\mathbf{l}}_{ij}) + d^2(\mathbf{y}_{ij}, \hat{\mathbf{l}}_{ij}) \big).$$
This is non-linear in the image lines and consequently in the entries of $\tilde{\mathsf{H}}$, which implies the use of non-linear optimization techniques. The unknowns are minimally parameterized (we optimize directly over the entries of $\mathsf{H}$, not $\tilde{\mathsf{H}}$), so no subsequent correction is needed to recover the motion parameters.
Quasi-linear estimation. The drawbacks of non-linear optimization are that the implementation is complicated and the computational cost is high. For these reasons, we also developed a quasi-linear estimator QLin that minimizes the same cost function. Consider the cost functions $\mathcal{C}_2$ and $\mathcal{C}_3$. Both depend on the same data, measured end-points and reprojected lines, the former using an algebraic and the latter the orthogonal distance. We can relate these distances by:
$$d^2(\mathbf{x}, \mathbf{l}) = w_{\mathbf{l}}\, d_a^2(\mathbf{x}, \mathbf{l}) \quad \text{where} \quad w_{\mathbf{l}} = \frac{1}{l_1^2 + l_2^2}, \qquad (2)$$
and rewrite $\mathcal{C}_3$ as:
$$\mathcal{C}_3 = \sum_i \sum_j w_{\mathbf{l}_{ij}} \big( d_a^2(\mathbf{x}_{ij}, \hat{\mathbf{l}}_{ij}) + d_a^2(\mathbf{y}_{ij}, \hat{\mathbf{l}}_{ij}) \big).$$
The non-linearity is hidden in the weight factors $w_{\mathbf{l}_{ij}}$. If they were known, the criterion would be linear in the entries of $\tilde{\mathsf{H}}$. This leads to the following iterative algorithm. Weights, assumed unknown, are initialized to 1 and iteratively updated. The loop is ended when the weights, or equivalently the error, converge. The algorithm is summarized in table 2. It is a quasi-linear optimization that converges from the minimum of the algebraic error to that of the geometric one. It is simple to implement (as a loop over a linear method) and less sensitive to local minima than the non-linear method [4].
1. initialization: set $w_{\mathbf{l}_{ij}} = 1$;
2. estimation: estimate $\tilde{\mathsf{H}}$ using standard weighted least squares; the $6\times 6$ matrix obtained is corrected so that it represents a motion, see algorithm 1;
3. weighting: use $\tilde{\mathsf{H}}$ to update the weights $w_{\mathbf{l}_{ij}}$ according to equation (2);
4. iteration: iterate steps 2 and 3 until convergence (see text).
Table 2. Quasi-linear motion estimation from 3D line correspondences.

5. Results Using Simulated Data
We first compare our estimators using simulated data.
The test bench consists of four cameras that form two stereo
pairs observing a set of n 3D lines randomly chosen in a
sphere lying in the fields of view of all cameras. Lines are
projected onto the image planes, end-points are hallucinated
at the image boundaries and corrupted by additive Gaussian
noise, and the equations of the image lines are estimated
from these noisy end-points.
A canonical projective basis [9] is attached to each camera pair and used to reconstruct the lines in projective space.
We then compute the 3D homography between the two pro-
jective bases using the estimators given in §4. We assess
the quality of an estimated motion by measuring the RMS
(Root Mean Square) of the Euclidean reprojection errors
(orthogonal distances between reprojected lines and end-
points in the second image pair). This corresponds to the
criterion minimized by the non-linear and the quasi-linear
algorithms. We compute the median error over 100 trials.
Figure 2 shows the error as the level of added noise
varies. The non-linear method is initialized using the quasi-
linear one.

Figure 2. Comparison of reprojected error (pixel) versus added image noise standard deviation (pixel) for the different motion estimators (Lin3D, Lin1, Lin2, QLin, NLin).

We observe that the methods Lin3D (based on an algebraic distance between 3D Plücker coordinates), Lin1
and Lin2 perform worse than the others. This is due to the
fact that the criteria used in these methods are not physically meaningful and are biased compared to $\mathcal{C}_3$. Method QLin gives results close to those obtained using NLin. It is therefore a good compromise between the linear and non-linear methods, achieving good results while keeping simplicity of
implementation. However, we observed that in a few cases
(about 4%), the quasi-linear method does not enhance the
result obtained by Lin2 while NLin does. QLin estimates
more parameters than necessary and this may cause numer-
ical instabilities.
6. Results on Real Images
We also tested our algorithms using images taken with a
stereo rig, so that the epipolar geometry is the same for both
image pairs, see figure 3.

Figure 3. The two image pairs of a ship part used in the experiments, overlaid with actual lines. Note that the extracted end-points do not necessarily correspond.

We use the technique given in [16]
to estimate the fundamental matrix and define a canonical
reconstruction basis for each pair [9]. This also gives the
joint line projection matrix P. We track lines across images
by hand and projectively reconstruct them for each image
pair.
Motion estimation. We used the methods of §4 to esti-
mate the projective motion between the two reconstruction
bases, but since we have no 3D ground truth we will only
show the result of transferring the set of reconstructed lines
from the first to the second 3D frame, using the 3D line ho-
mography matrix, and reprojecting them. Figure 4 shows
these reprojections, which confirms that the non-linear and
quasi-linear methods achieve much better results than the
linear ones.
7. Conclusions and Perspectives
We addressed the problem of estimating the motion be-
tween two line reconstructions in the general projective
case. We used Plücker coordinates to represent 3D lines and showed that they can be transferred linearly between two reconstruction bases using a 6×6 3D line motion matrix. We investigated the algebraic properties of this matrix and showed how to extract the usual 4×4 motion matrix (i.e. homography, affinity or rigid displacement) from it.
We then proposed several 3D and image-based estima-
tors for the motion between two line reconstructions. Ex-
perimental results on both simulated and real data show
that the linear estimators perform worse (the residuals are
at least twice as large) than the non-linear ones, especially
