Proceedings ArticleDOI

Detection of the intersection lines in multiplanar environments: Application to real-time estimation of the camera-scene geometry

01 Dec 2008-pp 1-4

Detection of the Intersection Lines in Multiplanar Environments:
Application to Real-Time Estimation of the Camera-Scene Geometry
Gilles Simon and Marie-Odile Berger
LORIA - Nancy-Université - INRIA Nancy Grand-Est
{gsimon,berger}@loria.fr
Abstract
This paper describes an integrated system for building a multiplanar model of the scene as the camera is localized on the fly. The core of this system is a robust and accurate procedure for detecting the intersection line between two planes. User cues are used to assist the system in the mapping tasks. Synthetic results and a long video demonstrate the relevance of the method.
1 Introduction
Recent years have seen the emergence of vision-based algorithms for performing localization and mapping in unknown environments [3, 2, 6]. These techniques tend to avoid the need for pre-calibrated environments in real-time applications such as augmented reality. However, they are still very sensitive to data-association errors, which can irretrievably corrupt the maps generated by incremental systems.

The method we propose differs in several ways from standard works. Firstly, we consider multiplanar environments (urban or indoor) and aim at building planar surfaces instead of clouds of points. Planar surfaces are natural supports for object insertion and are easy to track when well textured [7]. Secondly, user cues are used to assist the system in the mapping tasks, and visual information is provided to help the user make informed decisions about the scene.

The core of our system is an accurate and robust procedure for detecting the projection of the intersection line between two planes based on their apparent motion. Projections of the intersection lines are intermediate results toward the Euclidean reconstruction of the planes. Most of all, their computation can be visually assessed by the user, which is of great interest for preventing map corruption.
2 Preliminaries
We first set out some theoretical results that will be useful. Suppose we have two images, $I_1$ and $I_2$, of a scene consisting of two planes, $\pi_1$ and $\pi_2$. We further assume that the related homography matrices, $H_1$ and $H_2$, from $I_1$ to $I_2$ are also known. Using the duality between points and lines in the plane, we deduce from [5] the following result.

Result 1. The 3 × 3 matrix $T = H_1^T H_2^{-T}$ is a homology, which admits a pencil of globally fixed lines intersecting at the epipole $e$ and a distinct fixed line corresponding to the projection $l$ of the intersection line between $\pi_1$ and $\pi_2$.

$l$ is therefore the eigenvector associated with the simple eigenvalue of $T$. However, algebraic computation of the eigenvectors of $T$ is very unstable in practice as $T$ is a non-symmetric matrix. Particle filtering will therefore be used to detect $l$ from several subsequent images. The following result will help to distinguish between the two kinds of fixed lines:

Result 2. Any point on line $l$ is fixed by $T^{-T}$, while any point on a line passing through $e$ is generally transformed by $T^{-T}$ to a distinct point on the same line.

The first assertion is straightforward. The second one is illustrated in Fig. 1 ($c_1$ and $c_2$ are the camera centers): $p$ is generally distinct from $p' = T^{-T} p$, unless $\pi_1 = \pi_2$ or $p$ is on line $l$.
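Results 1 and 2 can be checked numerically on a toy configuration. The sketch below is only an illustration under stated assumptions: it takes the line homology to be $T = H_1^T H_2^{-T}$ (so that points are mapped by $T^{-T} = H_1^{-1} H_2$), uses normalized image coordinates, and builds each homography as $R + t\,n^T/d$ for a plane $n \cdot X = d$ expressed in the first camera frame. The camera motion, the two planes and all function names are hypothetical and do not come from the paper.

```python
import math

def transpose(A):
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]

def matvec(A, v):
    return [sum(A[r][k] * v[k] for k in range(3)) for r in range(3)]

def inverse3(A):
    # Inverse of a 3x3 matrix via the adjugate.
    (a, b, c), (d, e, f), (g, h, i) = A
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[x / det for x in row] for row in adj]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def unit(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def plane_homography(R, t, n, d):
    # Homography (up to scale) induced by the plane n.X = d, assuming X2 = R X1 + t.
    return [[R[r][c] + t[r] * n[c] / d for c in range(3)] for r in range(3)]

# Hypothetical synthetic scene: a ground plane and a wall seen by a moving camera.
angle = math.radians(10.0)
R = [[math.cos(angle), 0.0, math.sin(angle)],
     [0.0, 1.0, 0.0],
     [-math.sin(angle), 0.0, math.cos(angle)]]
t = [0.5, 0.1, 0.2]
n1, d1 = [0.0, 1.0, 0.0], 1.5   # plane 1: y = 1.5
n2, d2 = [0.0, 0.0, 1.0], 4.0   # plane 2: z = 4

H1 = plane_homography(R, t, n1, d1)
H2 = plane_homography(R, t, n2, d2)

T = matmul(transpose(H1), transpose(inverse3(H2)))  # acts on lines of image 1
G = matmul(inverse3(H1), H2)                        # acts on points: G = T^{-T}

# Projection of the 3D intersection line in image 1 (here the line y = 0.375).
l = [d2 * n1[k] - d1 * n2[k] for k in range(3)]
l_mapped = matvec(T, l)
parallel_residual = max(abs(x) for x in cross(unit(l_mapped), unit(l)))

def map_point(p):
    q = matvec(G, p)
    return [q[0] / q[2], q[1] / q[2], 1.0]

p_on_l = [0.2, 0.375, 1.0]          # lies on l, should be fixed
p_off_l = [0.2, 0.0, 1.0]           # not on l, should move
epipole = matvec(inverse3(H1), t)   # vertex of the homology in image 1
```

In this configuration `parallel_residual` is at rounding level, `map_point(p_on_l)` returns the same point, and `map_point(p_off_l)` slides along the line joining `p_off_l` to `epipole`, which is the behaviour that separates $l$ from the pencil of fixed lines through $e$.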
Figure 1. Fixed lines of the homology.

Figure 2. Images of a synthetic sequence (frames #1, #10 and #20).
3 Particle filtering of the intersection line
We now assume we are able to compute several pairs of homographies $H_1^i$, $H_2^i$ between $I_1$ and subsequent images $I_i$. Temporal consistency can therefore be exploited to get an accurate and robust computation of line $l$. We use particle filtering (PF), which is a well-known technique for implementing a recursive Bayesian filter by Monte Carlo simulations [1]. The key idea is to represent the required posterior density function $p(x_i \mid z_{1:i})$, where $z_{1:i}$ is the set of all available measurements up to time $i$, by a set of random samples with associated weights $w_i^j$, and to compute estimates based on these samples and weights:

$$p(x_i \mid z_{1:i}) \approx \sum_{j=1}^{N} w_i^j \, \delta(x_i - x_i^j), \qquad \sum_{j=1}^{N} w_i^j = 1. \qquad (1)$$

We implement the generic PF according to the framework described in [1]. Resampling is used whenever a significant degeneracy is observed (i.e., when the effective sample size $N_{eff}$ falls below some threshold $N_t$). Our implementation has the following characteristics:

(i) particles are homogeneous coordinates of lines. The first mode of distribution (1) is taken as the estimated line $l$ (in image 1) at time $i$. Initially, the particles are uniformly distributed inside the largest ellipse $E$ contained in the image (Fig. 2, first frame);

(ii) the prior $p(x_i \mid x_{i-1})$ is the normal distribution centered at $x_{i-1}$ with covariance matrix $V$ ($V = \mathrm{diag}^2([0.01, 0.01, 5])$ in our experiments); the importance density is the prior;

(iii) the likelihood density at time $i$ is given by:

$$p(z_i \mid x_i) = p(z_i^g \mid x_i)\, p(z_i^p \mid x_i),$$

where $z_i^g$ and $z_i^p$ are (assumed independent) geometric and (resp.) photometric measurements we now detail.
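The filtering loop described above can be sketched as follows. This is a minimal illustration of a generic particle filter in the spirit of [1], not the authors' implementation: the importance density is the Gaussian prior, the effective sample size is estimated as $1 / \sum_j (w_i^j)^2$, and systematic resampling is used as one possible resampling scheme (the paper does not say which one it uses). The standard deviations `(0.01, 0.01, 5.0)` assume that $\mathrm{diag}^2$ denotes squared standard deviations. All function names are hypothetical.

```python
import math
import random

def effective_sample_size(weights):
    # Usual estimate of N_eff for normalized weights.
    return 1.0 / sum(w * w for w in weights)

def systematic_resample(particles, weights, rng):
    # Draw N particles with probability proportional to their weights,
    # then reset all weights to 1/N.
    n = len(particles)
    u0 = rng.random() / n
    j = 0
    cumulative = weights[0]
    resampled = []
    for m in range(n):
        u = u0 + m / n
        while u >= cumulative and j < n - 1:
            j += 1
            cumulative += weights[j]
        resampled.append(list(particles[j]))
    return resampled, [1.0 / n] * n

def pf_step(particles, weights, likelihood, sigmas, n_threshold, rng):
    # Prediction: sample each particle from the Gaussian prior centered on it.
    predicted = [[x + rng.gauss(0.0, s) for x, s in zip(p, sigmas)]
                 for p in particles]
    # Update: since the importance density is the prior, the new weight is
    # the old weight multiplied by the likelihood.
    new_weights = [w * likelihood(p) for w, p in zip(weights, predicted)]
    total = sum(new_weights)
    if total <= 0.0:
        new_weights = [1.0 / len(predicted)] * len(predicted)
    else:
        new_weights = [w / total for w in new_weights]
    # Resample only when degeneracy is significant (N_eff < N_t).
    if effective_sample_size(new_weights) < n_threshold:
        predicted, new_weights = systematic_resample(predicted, new_weights, rng)
    return predicted, new_weights

# Toy usage with a hypothetical likelihood peaked around a target line.
rng = random.Random(0)
target = [0.0, 1.0, -120.0]
particles = [[rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(-240, 0)]
             for _ in range(200)]
weights = [1.0 / 200] * 200

def toy_likelihood(p):
    return math.exp(-sum((a - b) ** 2 for a, b in zip(p, target)) / (2 * 30.0 ** 2))

particles, weights = pf_step(particles, weights, toy_likelihood,
                             (0.01, 0.01, 5.0), n_threshold=200, rng=rng)
```

Setting `n_threshold` equal to the number of particles means that resampling is triggered at essentially every step, since the estimated $N_{eff}$ reaches $N$ only when all weights are equal.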
Geometric likelihood. Measuring “how fixed” a line is when transformed by $T$ provides a geometric measure of the likelihood of the related particle. In order not to confuse the two kinds of fixed lines of $T$, and according to Result 2, we actually measure the fixity of some points on the line. In practice, we found it sufficiently discriminating to measure the fixity of the intersection points $p_1$ and $p_2$ of the line with the ellipse $E$:

$$p(z_i^g \mid x_i) = \exp\left(-\frac{D^2}{2\sigma_g^2}\right), \qquad D = \frac{1}{2} \sqrt{\sum_{k=1}^{2} \left\| z(p_k) - z(T^{-T} p_k) \right\|^2},$$

where $\|\cdot\|$ denotes the L2 norm of a vector, $z([x, y, z]^T) = [x/z, y/z]^T$ and $\sigma_g = 3$ in our experiments.
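A minimal sketch of this geometric likelihood is given below, assuming that the point transformation $T^{-T}$ is supplied as a 3 × 3 matrix and that the two intersection points $p_1$, $p_2$ of the particle with the ellipse $E$ have already been computed in homogeneous coordinates. The function names are hypothetical.

```python
import math

def dehomogenize(p):
    # z([x, y, z]^T) = [x/z, y/z]^T
    return (p[0] / p[2], p[1] / p[2])

def matvec(A, v):
    return [sum(A[r][k] * v[k] for k in range(3)) for r in range(3)]

def geometric_likelihood(point_map, p1, p2, sigma_g=3.0):
    # point_map is assumed to be T^{-T}; p1 and p2 are homogeneous image points.
    squared_sum = 0.0
    for p in (p1, p2):
        before = dehomogenize(p)
        after = dehomogenize(matvec(point_map, p))
        squared_sum += (before[0] - after[0]) ** 2 + (before[1] - after[1]) ** 2
    distance = 0.5 * math.sqrt(squared_sum)
    return math.exp(-distance ** 2 / (2.0 * sigma_g ** 2))

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
shift_3px = [[1.0, 0.0, 3.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
p1, p2 = [10.0, 20.0, 1.0], [200.0, 40.0, 1.0]

fixed_score = geometric_likelihood(identity, p1, p2)
moved_score = geometric_likelihood(shift_3px, p1, p2)
```

Points that are left fixed give a likelihood of 1, while a 3-pixel displacement of both points gives $D^2 = 4.5$ and a likelihood of $\exp(-0.25)$ for $\sigma_g = 3$.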
Photometric likelihood. Accuracy and convergence of the PF can be increased by also measuring the distance of the particles to the highest gradients of the image. This is done by Sobel filtering, hysteresis thresholding and line detection using a fast Hough Transform (HT). A significant pruning is obtained by computing a single global HT updated from frame to frame by transferring the line candidates of image $i$ to image 1, using the homography ${H_k^i}^T$, $k = 1$ or $k = 2$: in doing so, only the projections of the lines on plane $\pi_k$ contribute to the local maxima of the HT (other lines are transferred to unstable coordinates of the HT). Finally, we keep the lines $m_i^j$ corresponding to the $M$ greatest local maxima of the HT ($M = 100$ in our experiments). This leads to:

$$p(z_i^p \mid x_i) = \exp\left(-\frac{D^2}{2\sigma_p^2}\right), \qquad D = \frac{1}{2} \min_{j=1,\dots,M} \sqrt{\sum_{k=1}^{2} (m_i^j \mid p_k)^2}, \qquad (2)$$

where $(\cdot \mid \cdot)$ denotes the dot product, $m_i^j$ is expressed in the form $[\cos(\theta), \sin(\theta), \rho]^T$ and $\sigma_p = \sigma_g$ in our experiments. This measure benefits from the robustness of the HT and can therefore tolerate partial occlusions of the intersection line.
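The two operations involved here, transferring a line candidate to image 1 and evaluating equation (2), can be sketched as follows. The sketch assumes that $H$ maps points of image 1 to image $i$, so that a line of image $i$ is brought back to image 1 by $H^T$. It also assumes that lines are stored as $[\cos\theta, \sin\theta, -\rho]$ for the line $x\cos\theta + y\sin\theta = \rho$ and that points have a third coordinate equal to 1, so that the dot product is a signed point-to-line distance. This sign convention and the function names are choices made for the example.

```python
import math

def transfer_line_to_first_image(H, line):
    # Lines map with the transpose: m_1 = H^T m_i, where H maps image-1
    # points to image-i points. The result is rescaled so that its first
    # two components have unit norm.
    m = [sum(H[k][c] * line[k] for k in range(3)) for c in range(3)]
    scale = math.hypot(m[0], m[1])
    return [x / scale for x in m]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def photometric_likelihood(hough_lines, p1, p2, sigma_p=3.0):
    # Equation (2): half the smallest root-sum-square distance between the
    # particle end points (p1, p2) and any of the retained Hough lines.
    best = min(math.sqrt(dot(m, p1) ** 2 + dot(m, p2) ** 2) for m in hough_lines)
    distance = 0.5 * best
    return math.exp(-distance ** 2 / (2.0 * sigma_p ** 2))

# Toy usage: image i is image 1 shifted by 4 pixels in y.
H = [[1.0, 0.0, 0.0], [0.0, 1.0, 4.0], [0.0, 0.0, 1.0]]
line_in_image_i = [0.0, 1.0, -10.0]              # the line y = 10 in image i
line_in_image_1 = transfer_line_to_first_image(H, line_in_image_i)  # y = 6

hough_lines = [line_in_image_1, [1.0, 0.0, -300.0]]
on_edge = photometric_likelihood(hough_lines, [0.0, 6.0, 1.0], [50.0, 6.0, 1.0])
near_edge = photometric_likelihood(hough_lines, [0.0, 9.0, 1.0], [50.0, 9.0, 1.0])
```

Taking the minimum over the retained lines means that a particle is rewarded as soon as it coincides with any strong image edge, which is why this term is combined with the geometric likelihood rather than used alone.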
4 Euclidean reconstruction
Knowing the projection $l$ in $I_1$ of the intersection line between two planes $\pi_1$ and $\pi_2$ and a pair of related homographies $H_1^i$ and $H_2^i$ allows the planes to be reconstructed in the view coordinate system. Here and in the rest of the paper, we assume that the intrinsic parameters of the camera are known and that the image coordinates are affine-transformed using the inverse intrinsic matrix [4].

When the equation of one plane (say $\pi_1$) and the camera motion $R$, $t$ between $I_1$ and $I_i$ are known, computation of $\pi_2$ is straightforward: as shown in Fig. 1, $\pi_2$ belongs to a sheaf of planes passing through the 3D intersection line between $\pi_1$ and the plane passing through $l$ and the camera center $c_1$. This is algebraically expressed as:

$$\Pi_2 = \Pi_1 + \lambda\, [l^T\ 0]^T, \qquad (3)$$

where $\Pi_1 = [n_1^T, d_1]^T$ and $\Pi_2 = [n_2^T, d_2]^T$ are the equation vectors of the planes expressed in the first view coordinate system. In the common case where $\pi_2$ is orthogonal to $\pi_1$, we directly obtain $\Pi_2$ using $\lambda = -1/(n_1 \mid l)$ (this follows from $(n_1 \mid n_2) = 0$ with $n_2 = n_1 + \lambda l$ and $n_1$ of unit norm). In the general case, $\lambda$ can be determined by performing a 1-parameter LMS optimization of the transfer error of $n \geq 4$ points $v_i$ on $\pi_2$ (for instance, the vertices of the planar region). Indeed, any value of $\lambda$ induces a homography $H(\lambda) = d_2(\lambda) R + t\, n_2(\lambda)^T$ [4], providing transfer errors $\| z(H(\lambda) v_i) - z(H_2 v_i) \|$.
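As a small worked example of equation (3) in the orthogonal case, the sketch below computes $\Pi_2$ from $\Pi_1$ and $l$. It assumes that $n_1$ has unit norm and takes $\lambda = -1/(n_1 \mid l)$, which is the value that makes $n_2 = n_1 + \lambda l$ orthogonal to $n_1$. The numerical values (a ground plane at distance 1.5 and an image line $y = 0.375$ in normalized coordinates) and the function names are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def orthogonal_lambda(n1, line):
    # Value of lambda that makes n2 = n1 + lambda * l orthogonal to n1,
    # assuming n1 is a unit vector.
    return -1.0 / dot(n1, line)

def second_plane(plane1, line, lam):
    # Equation (3): Pi_2 = Pi_1 + lambda * [l^T, 0]^T.
    normal = [plane1[k] + lam * line[k] for k in range(3)]
    return normal + [plane1[3]]

plane1 = [0.0, 1.0, 0.0, 1.5]   # Pi_1 = [n1^T, d1]^T, with n1 of unit norm
line = [0.0, 4.0, -1.5]         # projection l of the intersection line (y = 0.375)

lam = orthogonal_lambda(plane1[:3], line)
plane2 = second_plane(plane1, line, lam)
```

Here `lam` equals -0.25 and `plane2` equals `[0.0, 0.0, 0.375, 1.5]`, a plane of constant depth that is orthogonal to the ground plane. The resulting normal is not of unit norm; it can be rescaled together with the last component if a normalized equation is needed.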
When no information is available about the camera-planes geometry, structure and motion are computed using a higher-degree optimization that requires an initial estimate. It is shown in [4] that the simultaneous estimate of the camera motion and the plane pose corresponding to a homography has in general two physical solutions, which can be obtained using SVD. As we know two homographies, this twofold ambiguity can be removed by finding the common solution for camera motion: this provides initial values for $R$, $t$ and the structure $n_1$, $n_2$, $d_2$ of the planes ($d_1$ determines the scale of the scene and is set to the assumed value of the height of the camera in $I_1$). These values are then refined by performing a 9- or 8-parameter optimization (the parameters are $R$, $t$, $n_1$ and $\lambda$ when the angle between $\pi_1$ and $\pi_2$ is unknown) of the transfer error of $n \geq 4$ points on $\pi_1$ and $m \geq 4$ points on $\pi_2$ using the Levenberg-Marquardt algorithm. It is shown in Section 5 that a 9-parameter optimization converges faster and more accurately than an 11-parameter optimization that does not exploit the knowledge of line $l$.
5 Synthetic results
Filtering parameters. Synthetic tests have been performed in order to assess the effects of the number of particles $N$ and the resampling threshold $N_t$ on the convergence of the PF. An 80-frame sequence was used in which the camera followed a circular path while pointing toward a horizontal and a vertical plane (Fig. 2). Gaussian noise of standard deviation 0.3 was added to the coordinates of the points used to compute the inter-image homographies. Fig. 2 shows examples of particle distributions obtained in three images of the sequence. Figure 3 shows the mean number of frames (over 100 tests) needed to reach convergence, for $N$ varying from 20 to 1000 and $N_t$ from 0 to $N$. Convergence is always reached, even for small values of $N$ (except for $N_t = 0$, due to the degeneracy problem). However, convergence is faster for high values of $N$ and when $N_t$ is closer to $N$. These results led us to use $N_t = N = 1000$ in our real experiments.

Figure 3. Convergence of the PF. [Plot axes: convergence rate (nb frames), nb particles (N), 1000*Nt/N.]

Figure 4. Structure and motion errors. [Two plots against #frame: error on n1 (rad) and error on tx; curves: initialization, 11-parameters optimization, 9-parameters optimization.]
Euclidean reconstruction. Figure 4 shows the errors obtained on the normal to the horizontal plane and the x-coordinate of the camera translation when computing structure and motion between the first frame of the synthetic sequence and the next 50 frames. Errors are shown for the SVD, the 11-parameter optimization and the 9-parameter optimization (in that case the intersection line is extracted from a HT in the first frame). This figure shows that the 9-parameter optimization can substantially improve the accuracy, especially when the baseline is small (except for too small baselines, which lead to unstable results). Moreover, the mean number of iterations of the Levenberg-Marquardt algorithm is 3.9 for the 9-parameter optimization, against 6.7 for the 11-parameter optimization.
6 On-the-fly map building
The previous theoretical results have been used to design an integrated system for building a multiplanar model of the scene as the camera is localized on the fly. The main characteristic of this system is that the user is able to assist the mapping tasks. In particular, visual assessments of the filtered intersection lines greatly reduce map corruption. An intuitive interface controlled by only four keys is used to define blobs, indicate the PF convergence, and validate or invalidate the initial and further Euclidean reconstructions. All these interactions can be done on the fly while the camera is moving. As no mouse interaction is needed, the interface is particularly well adapted to AR applications that run on modern devices such as PDAs or mobile phones equipped with digital cameras.

Figure 5. On-the-fly map building.

A “ground and wall” type of scene is considered: two circles are displayed, one on the bottom half and the other on the top half of the screen, that allow the user to “capture” blobs on the ground plane and (resp.) the vertical planes (see Fig. 5, top frame). These blobs are tracked using the method presented in [7]. Initially, plane poses and camera motion are computed from a vertical and a horizontal blob by filtering their intersection line and performing the SVD + 8-parameter optimization. Then the camera pose is updated in real time using the existing planes in the map. When a new horizontal blob is defined, it is back-projected using the known equation of the horizontal plane in the camera coordinate system. When a new vertical blob is defined, the intersection line with the ground plane is filtered. If this line is aligned with another intersection line in the map, merging with the related plane is proposed. Otherwise, a new blob is added to the map using equation (3). Keyframes with SIFT features are saved during the process (upon user request and each time a blob is added to the map), and these are used upon user request for global relocalization and bundle adjustments of the poses of all the planes in the map (in the spirit of [6]).
This system has been used in a two-room scene (Fig. 5). Computation rates were about 12 Hz in tracking mode and 8 Hz in tracking + filtering mode on a Dell Precision 390 PC, 2.93 GHz. A hand-held Sony DFW-VL500 camera was used at a resolution of 320x240. The session lasted about 10 minutes: a video is associated with the paper (see footnote 1) showing the most interesting parts of this session. 13 blobs defined on 6 different planes were successfully reconstructed. 3D virtual objects were automatically added in the middle of each blob. As one can see in the video, these appear firmly anchored in the scene, and camera tracking performs well despite erratic motions of the hand-held camera. Table 1 shows the error angles obtained between the first vertical plane added to the map and the other vertical planes (plane numbers are those shown in the video).

#frame  N_K  N_B  Event           Error angles (deg)
                                  (1,2)   (1,3)   (1,4)   (1,5)
675     3    3    P2 added        -10.3
891     7    3    Bundle (0.3 s)  1.5
2710    14   10   P3 added        1.5     -1.5
2954    16   11   P4 added        1.5     -1.5    12.6
5648    17   12   P5 added        1.5     -1.5    12.6    7.2
7842    18   13   Bundle (2.1 s)  0.9     -3.2    0.6     7.3
Table 1. Error angles between the planes.
7 Conclusion
We presented a method for tracking and mapping in multiplanar environments, which has been validated on both synthetic and real-size (spatially and temporally) experiments. This method may be extended to allow detection and reconstruction of other kinds of features, such as the edges of the scene. Using the system in larger environments (a complete level of a building, for instance) would also require improvements: in order to keep a reasonable rate of exploration, the system should be allowed to perform local bundle adjustments as in [6].
References
[1] S. Arulampalam, S. Maskell, N. Gordon, and T. Clapp. A tutorial on particle filters for on-line non-linear/non-Gaussian Bayesian tracking. IEEE Transactions on Signal Processing, 50(2):174–188, Feb. 2002.
[2] A. J. Davison, I. D. Reid, N. D. Molton, and O. Stasse. MonoSLAM: Real-time single camera SLAM. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(6):1052–1067, June 2007.
[3] E. Eade and T. Drummond. Scalable monocular SLAM. In CVPR ’06: Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 469–476, Washington, DC, USA, 2006. IEEE Computer Society.
[4] O. Faugeras and F. Lustman. Motion and structure from motion in a piecewise planar environment. Rapport de recherche 856, INRIA, 1988.
[5] B. Johansson. View synthesis and 3D reconstruction of piecewise planar scenes using intersection lines between the planes. In ICCV, pages 54–59, 1999.
[6] G. Klein and D. Murray. Parallel tracking and mapping for small AR workspaces. In Proc. Sixth IEEE and ACM International Symposium on Mixed and Augmented Reality (ISMAR’07), Nara, Japan, November 2007.
[7] F. Vigueras, M.-O. Berger, and G. Simon. Iterative multi-planar camera calibration: Improving stability using model selection. In E. Association, editor, Vision, Video and Graphics (VVG’03), Bath, UK, Jul 2003.

1. http://www.loria.fr/~gsimon/icpr08