Revisiting the PnP Problem: A Fast, General and Optimal Solution

TLDR
In this paper, a non-iterative O(n) solution was proposed for the perspective n-point problem, which is fast, generally applicable and globally optimal, using the Gröbner basis technique.
Abstract
In this paper, we revisit the classical perspective-n-point (PnP) problem, and propose the first non-iterative O(n) solution that is fast, generally applicable and globally optimal. Our basic idea is to formulate the PnP problem into a functional minimization problem and retrieve all its stationary points by using the Gröbner basis technique. The novelty lies in a non-unit quaternion representation to parameterize the rotation and a simple but elegant formulation of the PnP problem into an unconstrained optimization problem. Interestingly, the polynomial system arising from its first-order optimality condition assumes two-fold symmetry, a nice property that can be utilized to improve speed and numerical stability of a Gröbner basis solver. Experiment results have demonstrated that, in terms of accuracy, our proposed solution is definitely better than the state-of-the-art O(n) methods, and even comparable with the reprojection error minimization method.

Yinqiang Zheng
Yubin Kuang
Shigeki Sugimoto
Kalle Åström
Masatoshi Okutomi
Department of Mechanical and Control Engineering, Tokyo Institute of Technology, JAPAN
{zheng,shige}@ok.ctrl.titech.ac.jp mxo@ctrl.titech.ac.jp
Centre for Mathematical Sciences, Lund University, SWEDEN
{yubin,kalle}@maths.lth.se
1. Introduction
Given n (n ≥ 3) 3D reference points in the object framework and their corresponding 2D projections, determining the orientation and position of a fully calibrated perspective camera is known as the perspective-n-point (PnP) problem [9]. It has widespread applications in augmented reality, incremental structure-from-motion, robot localization and so on. Considering its importance, it is no surprise that, in the past few decades, a large number of works have addressed this problem. Unfortunately, to the best of our knowledge, there does not exist a fast (preferably, real-time) and globally optimal solution, which is accurate and applicable to a PnP problem with any point number n (n ≥ 3), any 3D point configuration and arbitrary camera pose.
1.1. Literature Review
The minimal P3P problem has been systematically investigated in the literature, such as [5] and the recent work [12]. In practice, a P3P solution is usually used in combination with RANSAC [4] to remove outliers. Considering that data redundancy generally contributes to improving accuracy, most existing works on PnP focus on overconstrained cases with more than three points. To properly account for the latest progress, we would like to roughly categorize them into two groups: the multi-stage methods and the direct minimization methods.
Typically, the multi-stage methods first estimate the coordinates of some (or all) points in the camera framework, and transform the PnP problem into the 3D-3D absolute pose problem, for which there exist closed-form solutions [21]. There are a few works, like [11], dedicated to P4P or P5P, whose application is limited due to the restriction on the number of points. To relax this restriction, two linear solutions were presented in [18] and [1], with respective computational complexity O(n^5) and O(n^8), which are, at least in theory, applicable to general PnP with n ≥ 4. However, they are inaccurate when n is small, due to ignoring some nonlinear constraints. Conversely, when n is large, they are slow because of their high complexity.
Lepetit et al.[14] introduced several virtual control
points to represent the 3D reference points, and successful-
ly reduced the complexity to O(n). As pointed out in [15],
its accuracy is low for slightly redundant cases with n = 4
or n = 5, due to its underlying linearization scheme. To im-
prove accuracy, Li et al.[15] proposed another non-iterative
O(n) solution, which estimates the coordinates of two spe-
cial endpoints and ignores only one nonlinear constraint.
Without estimating point coordinates, the well-known
direct linear transformation (DLT) method [9] is also a
multi-stage method, since it first determines the projec-
tion matrix and then extracts the calibration parameters and
the camera pose. In the scenario of a calibrated camera,
the DLT method is quite inaccurate due to overlooking the
known calibration parameters.
To sum up, all the aforementioned multi-stage methods are usually poor in accuracy, due primarily to ignoring some nonlinear correlation. This is especially true for a PnP problem with a few points, in which the accuracy loss can hardly be compensated by data redundancy. In addition, without a clearly defined objective function, these methods do not assume overall global optimality, even supposing that there exists an optimal solution at each stage.
2013 IEEE International Conference on Computer Vision
1550-5499/13 $31.00 © 2013 IEEE
DOI 10.1109/ICCV.2013.291
In contrast, the direct minimization methods are charac-
teristic of minimizing a properly defined error function, ei-
ther in the image space or the object space, while taking into
consideration all nonlinear constraints. It is widely known
that minimizing the reprojection error is the best criterion,
which leads to a challenging nonconvex fractional program-
ming problem. Olsson et al.[17] proposed a branch-and-
bound method to retrieve its global optimum. Unfortunate-
ly, it can rarely be used in practice due to its tremendous
computational cost.
As a trade-off, some direct minimization methods [6, 16] minimize instead certain algebraic error functions via local optimization techniques. For example, Lu et al. [16] developed an orthogonal iteration method to directly minimize the object space error, while Garro et al. [6] offered an alternating minimization method to minimize an algebraic error defined in the image space. However, these local optimization based methods suffer from the risk of getting trapped in local minima, and provide poor results when they indeed do so. Schweighofer and Pinz [19] partially addressed the problem of multiple local minima with a planar target, but failed to provide a general solution.
The work [20] tried to avoid local minima by relaxing the PnP problem into a semidefinite programme (SDP). The major drawback is that the relaxation is usually not tight. It is also inappropriate for real-time applications due to the dependence on an off-the-shelf SDP solver, in spite of its O(n) complexity.
The above direct minimization methods [6, 16, 17, 20] share another shortcoming: they return only a single solution, which might not correspond to the true camera pose in case of multiple solutions.
To resolve these drawbacks, Hesch and Roumeliotis [10] developed a direct least-squares (DLS) method with complexity O(n), in which all stationary points are retrieved by solving the polynomial system derived from its first-order optimality condition via the resultant technique. Unfortunately, they parameterized rotation by using the Cayley representation, which is degenerate in all cases of 180 degree rotations around the x-, y- and z-axis (see footnote 1). The accuracy deteriorates seriously when the camera pose approaches these singularities.
As pointed out in [15], in addition to the number of points n, the configuration of the 3D reference points plays a critical role as well. A desirable PnP solution should be able to handle all 3D point configurations, including the ordinary-3D, the planar and the quasi-singular (near-planar or near-linear) configuration. However, some existing methods, like [14, 20], handle the ordinary-3D and the planar configuration separately, and thus tend to be inaccurate in the in-between quasi-singular case. Additionally, works such as [19] are dedicated to the planar case, and are not applicable at all to the other two configurations.
Footnote 1: On the project page, Hesch and Roumeliotis provided a remedy by solving DLS three times under different rotated 3D points. Since the computational time would be tripled, this remedy is not attractive, especially considering that DLS itself is not very fast. In addition, such a remedy harms the theoretical elegance of global optimality.
1.2. Overview of the Proposed Solution
In this paper, we propose the first non-iterative O(n) solution that is fast, globally optimal and universally applicable. Our basic idea is to formulate the PnP problem into a minimization problem and retrieve all its stationary points by using the Gröbner basis technique. It is therefore a direct minimization method. By using an unusual non-unit quaternion representation to parameterize rotation, we formulate the PnP problem into an unconstrained optimization problem. The polynomial system arising from its first-order optimality condition is simpler than the one obtained with the standard unit quaternion parameterization. More interestingly, this polynomial system is of odd degree, thus assuming two-fold symmetry, a nice property that can be utilized to improve speed and numerical stability of a Gröbner basis solver. Being globally optimal, our proposed solution successfully conquers the problem of local optimality (or even divergence) that might upset a local optimization based method. It is capable of retrieving all solutions, when multiple solutions indeed exist. Unlike [10], our solution does not suffer from any degeneracy of camera pose.
Experiment results have demonstrated that, in terms of
accuracy, our proposed solution is definitely better than the
examined state-of-the-art methods. Actually, although our
cost function is only algebraically meaningful, its accuracy
is even comparable to that of the reprojection error mini-
mization method. Theoretically speaking, the computation-
al complexity of our solution is O(n). However, we have
empirically observed that its computational time keeps al-
most constant even with thousands of points. Therefore, the
proposed solution is universally applicable to any PnP prob-
lem, irrespective of the 3D point configuration, the camera
pose and the number of points.
2. Mathematical Formulation
Throughout this paper, matrices, vectors and scalars are
denoted by using capital letters, bold lowercase letters and
plain lowercase letters, respectively. One exception is that
the capital letter T represents matrix or vector transpose.
All vectors are column vectors by default.

2.1. Preliminaries of the PnP Problem
Given n 3D reference points $\mathbf{q}_i = [x_i, y_i, z_i]^T$, $i = 1, 2, \cdots, n$, in the object reference framework, and their corresponding projections $\mathbf{p}_i = [u_i, v_i, 1]^T$, the PnP problem aims to retrieve the rotation matrix $R$ and the translation vector $\mathbf{t}$, accounting for camera orientation and position, respectively. Considering that the perspective camera has been calibrated, we simply assume that the projections $\mathbf{p}_i$ are measured in the normalized homogeneous image coordinate framework. The perspective imaging equation reads
$$\lambda_i \mathbf{p}_i = R\mathbf{q}_i + \mathbf{t}, \quad i = 1, 2, \cdots, n, \tag{1}$$
where $\lambda_i$ denotes the depth factor of the i-th point.
2.2. Rotation Parameterization
A critical issue is how to parameterize the rotation matrix $R$, such that the orthonormal constraint $RR^T = I$ and the determinant constraint $\det(R) = 1$ could be satisfied.
There are various parameterization methods for $R$, such as the Euler angle, rotation axis-angle, Cayley and unit quaternion parameterization. To facilitate global optimization via polynomial system solving, we advocate instead the non-unit quaternion parameterization, which is free of any trigonometric function. Specifically, letting $s = a^2 + b^2 + c^2 + d^2$, the non-unit quaternion parameterization reads
$$R = \frac{1}{s}\begin{bmatrix} a^2+b^2-c^2-d^2 & 2bc-2ad & 2bd+2ac \\ 2bc+2ad & a^2-b^2+c^2-d^2 & 2cd-2ab \\ 2bd-2ac & 2cd+2ab & a^2-b^2-c^2+d^2 \end{bmatrix}, \tag{2}$$
where a, b, c, d are the four unknown parameters. It is straightforward to verify that the parameterization in Eq.(2) satisfies $RR^T = I$ and $\det(R) = 1$.
At first sight, the above parameterization is not attractive at all. First of all, one has to make sure that s is strictly positive, i.e., a, b, c, d are not simultaneously zero, so as to avoid the singularity of dividing by zero. Secondly, Eq.(2) introduces a fractional term 1/s, which is difficult to handle.
Fortunately, we have recognized that a, b, c, d in Eq.(2) assume scale and sign ambiguity, i.e., R(a, b, c, d) = R(ka, kb, kc, kd), for any nonzero k. It is possible to exploit this property to resolve the concerns on the non-unit quaternion representation.
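The properties claimed for Eq.(2) can be checked numerically. The following is a minimal Python sketch, assuming NumPy is available; the function name `rotation_from_quaternion` is hypothetical and used only for illustration. It builds R from an arbitrary non-zero (a, b, c, d), then checks orthonormality, unit determinant, and invariance under scaling by a non-zero (here negative) factor k.

```python
import numpy as np

def rotation_from_quaternion(a, b, c, d):
    """Rotation matrix of Eq.(2) from a non-unit quaternion (a, b, c, d)."""
    s = a * a + b * b + c * c + d * d
    if s == 0.0:
        raise ValueError("a, b, c, d must not be simultaneously zero")
    return np.array([
        [a*a + b*b - c*c - d*d, 2*b*c - 2*a*d,         2*b*d + 2*a*c],
        [2*b*c + 2*a*d,         a*a - b*b + c*c - d*d, 2*c*d - 2*a*b],
        [2*b*d - 2*a*c,         2*c*d + 2*a*b,         a*a - b*b - c*c + d*d],
    ]) / s

rng = np.random.default_rng(0)
w = rng.normal(size=4)                          # arbitrary, not normalised
R = rotation_from_quaternion(*w)
R_scaled = rotation_from_quaternion(*(-3.7 * w))  # k = -3.7 (illustrative choice)

print("R R^T = I      :", np.allclose(R @ R.T, np.eye(3)))
print("det(R) = 1     :", np.isclose(np.linalg.det(R), 1.0))
print("R(kw) = R(w)   :", np.allclose(R, R_scaled))
```

Because the scale of (a, b, c, d) does not affect R, one degree of freedom is left free, which is what the next section uses to absorb the unknown average depth.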
2.3. The Unconstrained Minimization Problem
Since the absolute scale of a, b, c, d in Eq.(2) is arbitrary, we can fix it by using the reciprocal of the average depth, i.e.,
$$s = \frac{1}{\frac{1}{n}\sum_{i=1}^{n}\lambda_i} = \frac{1}{\bar{\lambda}}.$$
Due to the chirality condition [9], the average depth $\bar{\lambda}$ is strictly positive. The possibility of dividing by zero has thus been naturally avoided.
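Now we multiply both sides of Eq.(1) by $\frac{1}{\bar{\lambda}}$ and obtain the following equation
$$\hat{\lambda}_i \begin{bmatrix} u_i \\ v_i \\ 1 \end{bmatrix} = \begin{bmatrix} \mathbf{r}_1^T \\ \mathbf{r}_2^T \\ \mathbf{r}_3^T \end{bmatrix} \mathbf{q}_i + \begin{bmatrix} \hat{t}_1 \\ \hat{t}_2 \\ \hat{t}_3 \end{bmatrix}, \quad i = 1, 2, \cdots, n, \tag{3}$$
in which $\hat{\lambda}_i = \frac{\lambda_i}{\bar{\lambda}}$, $[\hat{t}_1, \hat{t}_2, \hat{t}_3]^T = \frac{1}{\bar{\lambda}}\mathbf{t}$, and
$$\begin{bmatrix} \mathbf{r}_1^T \\ \mathbf{r}_2^T \\ \mathbf{r}_3^T \end{bmatrix} = \begin{bmatrix} a^2+b^2-c^2-d^2 & 2bc-2ad & 2bd+2ac \\ 2bc+2ad & a^2-b^2+c^2-d^2 & 2cd-2ab \\ 2bd-2ac & 2cd+2ab & a^2-b^2-c^2+d^2 \end{bmatrix}. \tag{4}$$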
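It is straightforward to recognize that
$$\sum_{i=1}^{n} \hat{\lambda}_i = n\,\frac{\sum_{i=1}^{n}\lambda_i}{\sum_{i=1}^{n}\lambda_i} = n. \tag{5}$$
In addition, from Eq.(3), we have
$$\hat{\lambda}_i = \mathbf{r}_3^T \mathbf{q}_i + \hat{t}_3, \quad i = 1, 2, \cdots, n. \tag{6}$$
By combining Eq.(5) and Eq.(6), we can express $\hat{t}_3$ via
$$\hat{t}_3 = 1 - \mathbf{r}_3^T \Big(\frac{1}{n}\sum_{i=1}^{n}\mathbf{q}_i\Big) = 1 - \mathbf{r}_3^T \bar{\mathbf{q}}, \tag{7}$$
where $\bar{\mathbf{q}}$ represents the centroid of the 3D points.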
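After plugging $\hat{\lambda}_i = 1 + \mathbf{r}_3^T(\mathbf{q}_i - \bar{\mathbf{q}}) = 1 + \mathbf{r}_3^T \tilde{\mathbf{q}}_i$ back into Eq.(3), we have the following equation
$$(1 + \mathbf{r}_3^T \tilde{\mathbf{q}}_i) \begin{bmatrix} u_i \\ v_i \end{bmatrix} = \begin{bmatrix} \mathbf{r}_1^T \\ \mathbf{r}_2^T \end{bmatrix} \mathbf{q}_i + \begin{bmatrix} \hat{t}_1 \\ \hat{t}_2 \end{bmatrix}, \quad i = 1, 2, \cdots, n, \tag{8}$$
where $\tilde{\mathbf{q}}_i$ denotes the i-th 3D point after centralization.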
Until now, we have implicitly assumed that the projections are noise-free. Due to noise, Eq.(8) could not be completely satisfied in general. Therefore, we directly minimize the sum of the squared error as our cost function
$$\min_{a,b,c,d,\hat{t}_1,\hat{t}_2} \sum_{i=1}^{n} \big[(1 + \mathbf{r}_3^T \tilde{\mathbf{q}}_i)u_i - \mathbf{r}_1^T \mathbf{q}_i - \hat{t}_1\big]^2 + \sum_{i=1}^{n} \big[(1 + \mathbf{r}_3^T \tilde{\mathbf{q}}_i)v_i - \mathbf{r}_2^T \mathbf{q}_i - \hat{t}_2\big]^2. \tag{9}$$
Although it is only an algebraic error, as will be demon-
strated in the experiment section, its accuracy is very close
to that of minimizing the reprojection error, i.e. the gold-
standard in multiview geometry [9].
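Before actually solving Eq.(9), we can easily project out $\hat{t}_1$ and $\hat{t}_2$ in closed form as follows
$$\hat{t}_1 = \bar{u} + \mathbf{r}_3^T \Big(\frac{1}{n}\sum_{i=1}^{n} u_i \tilde{\mathbf{q}}_i\Big) - \mathbf{r}_1^T \bar{\mathbf{q}}, \qquad \hat{t}_2 = \bar{v} + \mathbf{r}_3^T \Big(\frac{1}{n}\sum_{i=1}^{n} v_i \tilde{\mathbf{q}}_i\Big) - \mathbf{r}_2^T \bar{\mathbf{q}}, \tag{10}$$
in which $[\bar{u}, \bar{v}]^T$ is the centroid of the image projections in the normalized image coordinate system.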
Now letting $\boldsymbol{\alpha} = [1, a^2, ab, ac, ad, b^2, bc, bd, c^2, cd, d^2]^T$ and plugging Eq.(10) into Eq.(9), we rewrite the cost function into the matrix form
$$\min_{a,b,c,d} f(a, b, c, d) = \|M\boldsymbol{\alpha}\|_2^2 = \boldsymbol{\alpha}^T M^T M \boldsymbol{\alpha}, \tag{11}$$
where $M$ is a 2n×11 matrix that can be constructed by using $\mathbf{p}_i$ and $\mathbf{q}_i$.
Eq.(11) is our ultimate optimization problem, which includes neither trigonometric functions nor constraints. In addition, it suffers from no degeneracy of camera pose, and is independent of the 3D point configuration.
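The construction of M is not spelled out above, so the following Python sketch shows one way to assemble it, under the assumption that substituting Eq.(10) into Eq.(9) gives the u-residual $(u_i - \bar{u}) + \mathbf{r}_3^T(u_i\tilde{\mathbf{q}}_i - \frac{1}{n}\sum_j u_j\tilde{\mathbf{q}}_j) - \mathbf{r}_1^T\tilde{\mathbf{q}}_i$, and likewise for v. The names `PAIRS`, `K`, `alpha` and `build_M` are hypothetical helpers, not taken from the authors' code. The sketch then checks, on synthetic noise-free data, that the cost vanishes at the true quaternion once its scale is fixed to s = 1/(average depth).

```python
import numpy as np

# Monomial order of alpha: [1, a^2, ab, ac, ad, b^2, bc, bd, c^2, cd, d^2]
PAIRS = [(0, 0), (0, 1), (0, 2), (0, 3), (1, 1),
         (1, 2), (1, 3), (2, 2), (2, 3), (3, 3)]

def alpha(w):
    return np.array([1.0] + [w[i] * w[j] for i, j in PAIRS])

# K maps alpha to the nine entries of Eq.(4), stacked row by row (r1, r2, r3).
K = np.zeros((9, 11))
K[0, [1, 5, 8, 10]] = [1, 1, -1, -1]   # a^2 + b^2 - c^2 - d^2
K[1, [6, 4]] = [2, -2]                 # 2bc - 2ad
K[2, [7, 3]] = [2, 2]                  # 2bd + 2ac
K[3, [6, 4]] = [2, 2]                  # 2bc + 2ad
K[4, [1, 5, 8, 10]] = [1, -1, 1, -1]   # a^2 - b^2 + c^2 - d^2
K[5, [9, 2]] = [2, -2]                 # 2cd - 2ab
K[6, [7, 3]] = [2, -2]                 # 2bd - 2ac
K[7, [9, 2]] = [2, 2]                  # 2cd + 2ab
K[8, [1, 5, 8, 10]] = [1, -1, -1, 1]   # a^2 - b^2 - c^2 + d^2

def build_M(q, uv):
    """2n x 11 matrix with ||M alpha||^2 equal to Eq.(9) after eliminating t1, t2."""
    n = q.shape[0]
    qt = q - q.mean(axis=0)                    # centred points q~_i
    u, v = uv[:, 0], uv[:, 1]
    uq, vq = u[:, None] * qt, v[:, None] * qt
    A = np.zeros((2 * n, 9))                   # residual is linear in (r1, r2, r3)
    A[:n, 0:3] = -qt
    A[:n, 6:9] = uq - uq.mean(axis=0)
    A[n:, 3:6] = -qt
    A[n:, 6:9] = vq - vq.mean(axis=0)
    M = A @ K                                  # first column is zero at this point
    M[:n, 0] = u - u.mean()                    # constant terms multiply alpha[0] = 1
    M[n:, 0] = v - v.mean()
    return M

# Synthetic, noise-free test scene (all values are illustrative).
rng = np.random.default_rng(1)
n = 10
q = rng.uniform(-2.0, 2.0, size=(n, 3))
w_unit = rng.normal(size=4)
w_unit /= np.linalg.norm(w_unit)
R = (K @ alpha(w_unit)).reshape(3, 3)          # unit quaternion, so s = 1
t = np.array([0.3, -0.2, 8.0])
cam = q @ R.T + t                              # points in the camera frame
lam = cam[:, 2]                                # depths, all positive here
uv = cam[:, :2] / lam[:, None]                 # normalized image coordinates

M = build_M(q, uv)
w_true = w_unit / np.sqrt(lam.mean())          # enforce s = 1 / average depth
f_true = np.sum((M @ alpha(w_true)) ** 2)
print("M shape:", M.shape)
print("cost at the true quaternion:", f_true)
```

Note that only the 11×11 matrix $M^TM$ is needed afterwards, which is why the size of the polynomial system does not depend on n.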
2.4. Relation to Existing Works
In the previous section, we have used the non-unit
quaternion to parameterize the rotation, and fixed its scale
by using the reciprocal of the average depth. Actually, we
are able to interpret some existing works in terms of how to
fix the scale of Eq.(2).
The unit quaternion was used in [20]. This amounts to fixing the scale in Eq.(2) by using the unit norm constraint $a^2 + b^2 + c^2 + d^2 = 1$. According to [20], the PnP problem can be formulated into a constrained optimization problem
$$\min_{a,b,c,d} \hat{\boldsymbol{\alpha}}^T \hat{M}^T \hat{M} \hat{\boldsymbol{\alpha}}, \quad \text{s.t.} \quad a^2 + b^2 + c^2 + d^2 = 1, \tag{12}$$
in which $\hat{M}$ is a 2n×10 data matrix and $\hat{\boldsymbol{\alpha}}$ equals $\boldsymbol{\alpha}$ after removing the first element.
Similar to our formulation, it does not suffer from any degeneracy. To find its global optimum, [20] used convex relaxation techniques, which usually provide an approximate solution, rather than the guaranteed global optimum.
The Cayley parameterization was used in [10]. Through some basic operations, one can verify that the Cayley parameterization is the same as R(1, b, c, d), that is, fixing the scale of Eq.(2) by using a = 1. The major advantage is that the polynomial system in [10] is simpler to solve, since there remain only three variables. Unfortunately, this scheme is unstable in case of near-Cayley-degenerate rotations (a ≈ 0), and not applicable at all in case of Cayley-degenerate rotations (a = 0).
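The equivalence between the Cayley parameterization and R(1, b, c, d) can be checked numerically. The sketch below assumes the common convention $R = (I - S)^{-1}(I + S)$, where S is the skew-symmetric matrix of (b, c, d); the exact convention used in [10] is not stated here, so this is an assumption, and the function names are hypothetical.

```python
import numpy as np

def rotation_from_quaternion(a, b, c, d):
    """Rotation matrix of Eq.(2) from a non-unit quaternion (a, b, c, d)."""
    s = a * a + b * b + c * c + d * d
    return np.array([
        [a*a + b*b - c*c - d*d, 2*b*c - 2*a*d,         2*b*d + 2*a*c],
        [2*b*c + 2*a*d,         a*a - b*b + c*c - d*d, 2*c*d - 2*a*b],
        [2*b*d - 2*a*c,         2*c*d + 2*a*b,         a*a - b*b - c*c + d*d],
    ]) / s

def cayley_rotation(b, c, d):
    """Cayley transform (I - S)^{-1} (I + S), S = skew matrix of (b, c, d)."""
    S = np.array([[0.0, -d,   c],
                  [d,   0.0, -b],
                  [-c,  b,   0.0]])
    I = np.eye(3)
    return np.linalg.solve(I - S, I + S)

rng = np.random.default_rng(0)
b, c, d = rng.normal(size=3)
R_cayley = cayley_rotation(b, c, d)
R_quat = rotation_from_quaternion(1.0, b, c, d)
print("Cayley equals R(1, b, c, d):", np.allclose(R_cayley, R_quat))

# Near-degenerate case: a rotation close to 180 degrees about the z-axis has
# quaternion (a, 0, 0, 1) with a close to 0; rescaling it to a = 1 makes the
# Cayley parameter d = 1 / a very large.
a_small = 1e-6
print("Cayley parameter d for a = 1e-6:", 1.0 / a_small)
```

The second part illustrates why fixing a = 1 becomes ill-conditioned as a approaches zero, whereas the non-unit quaternion of Eq.(2) places no restriction on a.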
3. Global Optimization Method
Global optimization has attracted a lot of attention in
multiview geometry, see [8] for a review. Such popular
techniques as branch-and-bound and convex relaxation are
usually time-consuming and only capable of retrieving one
(approximate) optimal solution. Here, we prefer to solve the
polynomial system of the first-order optimality condition of
Eq.(11), so as to identify all stationary points.
18 16 14 12 10 8 6 4 2 0
0
500
1000
1500
2000
2500
3000
Symmetric GB
Symmetric GB + Polish
Blind GB
DLS Resultant
(a) Random
18 16 14 12 10 8 6 4 2 0
0
500
1000
1500
2000
2500
3000
3500
Symmetric GB
Symmetric GB + Polish
Blind GB
DLS Resultant
(b) Near degenerate
18 16 14 12 10 8 6 4 2 0
0
500
1000
1500
2000
2500
3000
Symmetric GB
Symmetric GB + Polish
Blind GB
DLS Resultant
(c) Degenerate
Figure 1. Numerical stability of the polynomial system solvers.
The investigated solvers include the blind GB solver without utiliz-
ing symmetry (Blind GB), the GB solver using two-fold symme-
try (Symmetric GB), the Symmetric GB followed by one damped
Newton polishing step (Symmetric GB + Polish) and the resul-
tant based solver used in DLS [10]. We randomly generate 50
ordinary-3D points and simulate their noise-free projections. Fully
random, near-Cayley-degenerate and Cayley-degenerate rotations
are used in (a), (b) and (c), respectively. The horizontal axis shows
the log
10
value of the absolute error between the ground-truth unit-
norm quaternion and the estimated quaternion after normalization,
while the vertical axis shows the counts over 5,000 independent
runs. The instability problem of DLS in (b) and (c) is obvious.
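By calculating the derivative of Eq.(11) with respect to a, b, c, d, the first-order optimality condition reads
$$\frac{\partial f}{\partial a} = 0, \quad \frac{\partial f}{\partial b} = 0, \quad \frac{\partial f}{\partial c} = 0, \quad \frac{\partial f}{\partial d} = 0, \tag{13}$$
which is composed of four degree-three polynomials with respect to a, b, c, d.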
3.1. A Blind Gröbner Basis Solver
Although solving multivariate polynomial systems is challenging in general, the multiview geometry community has achieved much progress by means of the Gröbner basis (GB) technique [3]. Kukelova et al. [13] even developed an automatic generator of GB solvers, which facilitates the solving of polynomial systems arising from geometric computer vision problems. The basic procedure is first to determine the Gröbner bases and the monomial bases of the quotient ring under the graded reverse lexicographical ordering, and then to construct the elimination template that determines which polynomials from the ideal should be added so as to build the action matrix. Finally, the solutions to the original polynomial system are extracted from the eigen-factorization of the action matrix. The readers are referred to [13] for more details on the automatic generator and to [3] for general theories.
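To make the last step of this procedure concrete, here is a deliberately tiny Python illustration of recovering solutions from the eigenvalues of an action matrix. It is a univariate toy case chosen for this note, not the solver of [13]: for a single polynomial, the quotient ring has the monomial basis {1, x, x^2} and the action matrix of multiplication by x is the companion matrix.

```python
import numpy as np

# Toy polynomial p(x) = x^3 - 6x^2 + 11x - 6 = (x - 1)(x - 2)(x - 3).
# Monomial basis of the quotient ring: [1, x, x^2].
# Multiplication by x maps 1 -> x, x -> x^2, and x^2 -> x^3, where x^3 is
# reduced modulo p(x) to 6x^2 - 11x + 6.
action = np.array([
    [0.0,   1.0, 0.0],   # x * 1   = x
    [0.0,   0.0, 1.0],   # x * x   = x^2
    [6.0, -11.0, 6.0],   # x * x^2 = 6 - 11x + 6x^2  (mod p)
])

# At any root x0, the vector [1, x0, x0^2] satisfies action @ v = x0 * v,
# so the roots are the eigenvalues of the action matrix.
roots = np.sort(np.linalg.eigvals(action).real)
print("roots recovered from the action matrix:", roots)
```

In the multivariate setting the reduction step is what the elimination template provides, and the eigenvectors (evaluations of the basis monomials at each solution) are used to read off all unknowns.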
By using the automatic generator in [13], we have found that the polynomial system in Eq.(13) has at most 81 solutions. The size of the elimination template is 575×656, while that of the action matrix is 81×81. The generated GB solver takes about 37.2 milliseconds (ms).
One might be interested in solving the polynomial system arising from the first-order optimality condition of Eq.(12), so as to retrieve the guaranteed optimal solution(s). Due to the unit-norm constraint, a Lagrange multiplier has to be introduced, thus leading to a degree-three polynomial system with respect to five variables. The GB solver automatically generated by [13] is much more complex (e.g., the elimination template is of size 1523×1603) and thus much slower. This verifies the advantages of our non-unit quaternion parameterization and our unconstrained formulation.
3.2. Utilizing Two-Fold Symmetry
By carefully investigating Eq.(11), we have further noted that the polynomial system in Eq.(13) is of odd degree. Specifically, the polynomials in Eq.(13) include degree-three and degree-one monomials only. It therefore assumes two-fold symmetry, that is, $\mathbf{w} = \mathbf{0}$ is a trivial solution, and, if any non-zero $\mathbf{w}$ is a solution, so is $-\mathbf{w}$, where $\mathbf{w} = [a, b, c, d]^T$. Actually, instead of 81 solutions, there are at most 40 independent solutions to Eq.(13), which indicates the complexity of the PnP problem with general n.
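The two-fold symmetry follows from the fact that $\boldsymbol{\alpha}$ contains only a constant and degree-two monomials, so $\boldsymbol{\alpha}(-\mathbf{w}) = \boldsymbol{\alpha}(\mathbf{w})$ and the gradient $2J^TM^TM\boldsymbol{\alpha}$, with $J = \partial\boldsymbol{\alpha}/\partial\mathbf{w}$ linear in $\mathbf{w}$, is an odd function. A minimal Python sketch of this check is given below; the random matrix stands in for the data matrix M of Eq.(11), and the helper names are hypothetical.

```python
import numpy as np

# Monomial order of alpha: [1, a^2, ab, ac, ad, b^2, bc, bd, c^2, cd, d^2]
PAIRS = [(0, 0), (0, 1), (0, 2), (0, 3), (1, 1),
         (1, 2), (1, 3), (2, 2), (2, 3), (3, 3)]

def alpha(w):
    return np.array([1.0] + [w[i] * w[j] for i, j in PAIRS])

def jacobian(w):
    """11 x 4 Jacobian of alpha with respect to w; every entry is linear in w."""
    J = np.zeros((11, 4))
    for k, (i, j) in enumerate(PAIRS, start=1):
        J[k, i] += w[j]
        J[k, j] += w[i]          # for i == j this accumulates to 2 * w[i]
    return J

def cost(w, G):
    al = alpha(w)
    return al @ G @ al           # f = alpha^T (M^T M) alpha, Eq.(11)

def gradient(w, G):
    return 2.0 * jacobian(w).T @ G @ alpha(w)   # left-hand sides of Eq.(13)

rng = np.random.default_rng(2)
M = rng.normal(size=(20, 11))    # stand-in for the 2n x 11 data matrix
G = M.T @ M
w = rng.normal(size=4)

print("f(w) == f(-w)           :", np.isclose(cost(w, G), cost(-w, G)))
print("grad f(-w) == -grad f(w):", np.allclose(gradient(-w, G), -gradient(w, G)))
print("grad f(0) == 0          :", np.allclose(gradient(np.zeros(4), G), 0.0))
```

Since w and −w give the same rotation by Eq.(2), each symmetric pair of stationary points corresponds to a single camera pose.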
Very recently, Ask et al.[2] developed general tech-
niques to make use of p-fold symmetry (p = 2 in our prob-
lem) arising from some minimal problems. The basic idea is
to directly eliminate the trivial all-zero solution and careful-
ly generate new equations such that the symmetry could be
preserved. By using symmetry, the size of the elimination
template and that of the action matrix could be drastically
reduced, which in return improves computational speed and
numerical stability.
In [2], one has to solve an integer linear system to extract the symmetric solutions from the action matrix, which is very slow. When implementing the two-fold symmetry GB solver for Eq.(13), we have improved the solution extraction operation by using the problem structure. We refer the readers to our source code and the supplementary materials for the implementation details. With an elimination template of size 348×376 and an action matrix of size 40×40, our two-fold symmetry GB solver takes about 18.5 ms, about twice as fast as the blind GB solver. As shown in Fig.1, its numerical stability is also better than that of the blind version.
It is worth mentioning that the five-variable polynomial system from Eq.(12) does not assume full symmetry, because the introduced Lagrange multiplier is always positive. Actually, Eq.(13) is the first fully symmetric system, arising from a non-minimal problem, that we know of.
3.3. Solution Polishing and Extraction
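After obtaining all stationary points of Eq.(11), we can further improve the numerical stability by polishing them via a single damped Newton step. Specifically, assuming that $\mathbf{w}$ is a stationary point, we polish $\mathbf{w}$ through the updating rule $\mathbf{w} \leftarrow \mathbf{w} - \Delta\mathbf{w}$. The increment $\Delta\mathbf{w}$ is determined by $\Delta\mathbf{w} = \big(\frac{\partial^2 f}{\partial \mathbf{w}^2} + \mu I\big)^{-1} \frac{\partial f}{\partial \mathbf{w}}$, in which the damping factor $\mu$ is chosen such that $f(\mathbf{w} - \Delta\mathbf{w}) \leq f(\mathbf{w})$.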
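A single damped Newton step of this kind could be sketched in Python as follows. The way μ is chosen here (start small and multiply by ten until the cost does not increase) is an assumption made for illustration, since the schedule is not specified above; `polish` and the other helper names are hypothetical, and the matrix G is a synthetic stand-in for $M^TM$ built to have a known zero-cost point.

```python
import numpy as np

PAIRS = [(0, 0), (0, 1), (0, 2), (0, 3), (1, 1),
         (1, 2), (1, 3), (2, 2), (2, 3), (3, 3)]

def alpha(w):
    return np.array([1.0] + [w[i] * w[j] for i, j in PAIRS])

def jacobian(w):
    J = np.zeros((11, 4))
    for k, (i, j) in enumerate(PAIRS, start=1):
        J[k, i] += w[j]
        J[k, j] += w[i]
    return J

def cost(w, G):
    al = alpha(w)
    return al @ G @ al

def gradient(w, G):
    return 2.0 * jacobian(w).T @ G @ alpha(w)

def hessian(w, G):
    J = jacobian(w)
    H = J.T @ G @ J
    Ga = G @ alpha(w)
    for k, (i, j) in enumerate(PAIRS, start=1):
        H[i, j] += Ga[k]         # second derivatives of the monomial w_i * w_j
        H[j, i] += Ga[k]
    return 2.0 * H

def polish(w, G, mu0=1e-8, max_tries=20):
    """One damped Newton step; mu is increased until the cost does not increase."""
    g, H, f0 = gradient(w, G), hessian(w, G), cost(w, G)
    mu = mu0
    for _ in range(max_tries):
        try:
            step = np.linalg.solve(H + mu * np.eye(4), g)
        except np.linalg.LinAlgError:
            step = None          # singular system, try a larger damping factor
        if step is not None and cost(w - step, G) <= f0:
            return w - step
        mu *= 10.0
    return w                     # no acceptable step found, keep the input

# Synthetic problem with a known minimiser w_star (illustrative values only).
rng = np.random.default_rng(3)
w_star = rng.normal(size=4)
a_star = alpha(w_star)
M = rng.normal(size=(20, 11))
M -= np.outer(M @ a_star, a_star) / (a_star @ a_star)   # force M @ alpha(w_star) = 0
G = M.T @ M

w0 = w_star + 1e-4 * rng.normal(size=4)   # slightly inaccurate stationary point
w1 = polish(w0, G)
print("cost before polishing:", cost(w0, G))
print("cost after polishing :", cost(w1, G))
```

The step only involves the fixed-size 11×11 matrix G and a 4×4 linear solve, so its cost does not grow with the number of points.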
As shown in Fig.1, the polishing strategy can drastically improve the numerical precision, although only one damped Newton step is used. In addition, the computational cost is almost negligible, because the dimension of $M^TM$ in Eq.(11) is fixed.
After the polishing step, we only retain those real and physically feasible stationary points with positive definite Hessian, i.e. minima. When n ≥ 6, the PnP problem has a unique solution in general. Therefore, we choose the stationary point with the smallest objective value in Eq.(11) as the final solution.
In the slightly redundant scenarios with 4 ≤ n ≤ 5, the situation is a little more complicated. We have observed a few extreme cases, in which two widely different stationary points have almost the same objective value, yet the objective value of the correct stationary point is even slightly larger. Therefore, when 4 ≤ n ≤ 5, we return all remaining minima to the end user, who might be able to choose the correct one by using, e.g., motion coherence in a tracking scenario. In the minimal n = 3 case, we use the same strategy.
4. Experiment Results
In this section, we experimentally investigate our optimal solution to the PnP problem, referred to as OPnP, and compare it with the state-of-the-art solutions. For the ordinary-3D case and the quasi-singular case, we consider two multi-stage methods, including EPnP+GN together with a few Gauss-Newton steps [14] and RPnP [15], as well as three direct minimization based methods, including the direct least-squares solution (DLS) [10], the approximate optimal solution by using SDP convex relaxation (SDP) [20] and the popular iterative method by Lu et al. [16], denoted by LHM in short. For the planar case, we include in the comparison EPnP without Gauss-Newton steps, RPnP, DLS and SDP. In addition, the iterative method in [19] specialized to the planar case is also considered, which is denoted by SP+LHM. The authors of DLS [10] provided a remedy to conquer the degeneracy of the Cayley representation by solving DLS three times under differently rotated 3D points. We include this remedy (DLS+++) in the comparison as well.
Considering that our objective function is only algebraically meaningful, it might be of great interest to compare it with the reprojection error minimization method. Unfortunately, the branch-and-bound method in [17] is very slow. In addition, it returns a single solution, which might be totally different from the ground-truth when 4 ≤ n ≤ 5. Therefore, we minimize the reprojection error by using the Levenberg-Marquardt method, starting from the solution(s) from OPnP. We denote it by OPnP+LM.
We implement our OPnP solution in MATLAB (see footnote 2), which
Footnote 2: Our source code and scripts to reproduce all results are available at https://sites.google.com/site/yinqiangzheng/.
References
Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography (journal article)
Multiple view geometry in computer vision (book)
EPnP: An Accurate O(n) Solution to the PnP Problem (journal article)
Least-squares estimation of transformation parameters between two point patterns (journal article)