Image Registration with Uncalibrated Cameras in Hybrid Vision Systems
Datong Chen, Jie Yang
School of Computer Science
Carnegie Mellon University
Pittsburgh, PA 15213
Abstract
This paper addresses the problem of robustly registering images between perspective and omnidirectional cameras in a hybrid vision system (HVS). The nonlinearity introduced into an HVS by omnidirectional cameras poses challenges for computing pixel correspondences among images. In previous HVSs, cameras must be calibrated before registration can be performed. In this paper, we propose a non-linear approach for registering images in an HVS without requiring calibration of the cameras. We first discuss the homographies between omnidirectional and perspective images under a local planar assumption. We then propose a robust patch-level registration algorithm that exploits a constraint on large 3D spatial planes. The proposed approach enables an HVS to be used in applications that require quick deployment or active cameras. Experimental results demonstrate the feasibility of the proposed approach.
1 Introduction
Recent demands for video surveillance over large areas have stimulated research interest in camera networks. A hybrid vision system (HVS) is a camera network that consists of omnidirectional and perspective cameras. Such a system takes advantage of the large field of view of omnidirectional cameras and the higher resolution of perspective cameras. For example, Chen et al. [3] proposed an HVS architecture in which an omnidirectional camera was mounted on the ceiling at the center of a large room and several perspective cameras were mounted on the surrounding side walls. The omnidirectional camera not only provides a good reference for the cameras in the network but also minimizes the possibility of occlusions during tracking. The perspective cameras can capture more detailed information at higher resolution. However, the nonlinearity introduced into an HVS by omnidirectional cameras poses challenges for many existing computer vision techniques, including the computation of pixel correspondences between omnidirectional and perspective views.
In the previous literature, correspondences between higher-resolution side-view images and lower-resolution top-view images are computed using a 2D perspective homography [4]. This approach requires pre-calibration of the intrinsic parameters of the omnidirectional camera. The most commonly used omnidirectional camera is a catadioptric camera, which is composed of a perspective camera and a mirror and provides a single effective viewpoint [10]. The calibration of a catadioptric camera has been addressed by many researchers [7, 2, 9, 13]. After an omnidirectional camera is calibrated, the 2D perspective homography assumes that the scene in front of each camera is planar and registers perspective-view images using a warped omnidirectional-view image as the reference [8, 4].
A major drawback of the 2D approach is that the calibration step involves manual interaction or specially designed calibration tags with specific patterns or shapes. This limits the applicability of an HVS when quick deployment is required or auto-zooming cameras are employed. In addition, the existing registration methods can produce large errors because the scene is not planar. Some efforts have been made to provide 3D information in an HVS by calibrating the extrinsic parameters among cameras. Sturm analyzed catadioptric cameras and perspective cameras within a common scene [14]. Chen et al. proposed a manual solution based on pre-measured points in real 3D space [3]. Stereo methods [1, 15, 5, 11, 8] were proposed for object detection and reconstruction in an HVS. Calibration of the omnidirectional camera is required by all of these methods.
In this paper, we propose an automatic approach to image registration with uncalibrated cameras in an HVS. We first discuss the geometric correspondence of a planar object between a perspective camera and a catadioptric camera and give homography matrices in both directions. We then propose an algorithm for registering an image from a perspective camera to a catadioptric image under the assumption that local image patches are projections of planar surfaces. In the proposed algorithm, non-linear 2D registration is performed at a local patch level. A robust estimation methodology is proposed for propagating the homography of a patch to its neighborhood. We demonstrate the feasibility of the proposed method through experiments.

Figure 1: An illustration of an HVS in an indoor public environment, with one catadioptric camera and several perspective cameras.
2 Mathematical models of an HVS
Let us consider a simple HVS that consists of only one catadioptric camera and several perspective cameras. Fig. 1 illustrates such a system installed in an indoor public environment. We will limit our discussion to such an HVS in the rest of this paper, though the results can be extended to more complex systems.
2.1 A catadioptric camera model
A commercial catadioptric camera can be modeled as a combination of a paraboloid mirror and lenses (see Fig. 2). To exploit the optical characteristics of a catadioptric system in the spatial domain, we can follow the process by which a catadioptric camera acquires an optical signal from a spatial point P = (X, Y, Z)^T, as shown in Figure 2. Without loss of generality, let us select the focus of the paraboloid mirror as the origin O_c. The signal from point P = (X, Y, Z)^T is first reflected at P_m = (X_m, Y_m, Z_m)^T on the mirror and then projected onto the image plane at p = (x, y, Z_c)^T. To simplify this projection process, we assume the camera is focused on a virtual focal plane F. The mirror point P_m is first transformed onto the focal plane F at the point P_f = (X_f, Y_f, Z_f)^T and then projected onto the image plane, which can be modeled by the following equations:

P_m = α_P P,   P_f = P_m + T_f,   p = R_I P_f,    (1)

where the scale factor α_P is a function of the spatial point P, T_f = (0, 0, Z_f − Z_m)^T denotes the translation between the focal plane F and the mirror, and R_I models the perspective projection from the focal plane to the image plane.
Figure 2: A model of a catadioptric camera (paraboloid mirror, virtual focal plane F, image plane, and pin-hole).
The focal plane Z = Z_f has only one parameter. In such a paraboloid-mirror-based catadioptric system, two parameters, α_P and T_f, depend on the 3D spatial point. The other parameters, R_I and Z_f, consist only of constant values.
Furthermore, for a point P_m = (X_m, Y_m, Z_m)^T on the paraboloidal mirror, the paraboloid can be described as:

Z_m = f − (1/(4f)) (X_m^2 + Y_m^2),    (2)

where f is the focal length of the mirror.
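As a concrete illustration of Eqs. 1 and 2, the following sketch projects a 3D point through the paraboloid-mirror model by solving for the point-dependent scale α_P. The values of f and Z_f and the identity choice for R_I are placeholders for illustration only, not values from the paper.

```python
import numpy as np

def project_catadioptric(P, f=1.0, Z_f=2.0):
    """Project a 3D point P = (X, Y, Z) through the paraboloid-mirror model
    of Eqs. (1)-(2). Illustrative only: f, Z_f and the identity R_I are
    placeholder choices."""
    X, Y, Z = P
    r2 = X**2 + Y**2
    # Find alpha_P such that P_m = alpha_P * P lies on the mirror (Eq. 2):
    #   alpha*Z = f - (alpha^2 * r2) / (4f)
    a, b, c = r2 / (4.0 * f), Z, -f               # a*alpha^2 + b*alpha + c = 0
    alpha = (-b + np.sqrt(b**2 - 4*a*c)) / (2*a) if a > 0 else -c / b
    P_m = alpha * np.asarray(P, dtype=float)       # reflection point on the mirror
    T_f = np.array([0.0, 0.0, Z_f - P_m[2]])       # translation to the focal plane F
    P_f = P_m + T_f                                # point on the virtual focal plane
    R_I = np.eye(3)                                # focal-plane-to-image projection (identity here)
    p = R_I @ P_f
    return p, alpha

p, alpha_P = project_catadioptric(np.array([1.0, 0.5, -2.0]))
```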
2.2 A perspective camera model
There are many different approaches to modeling a perspective camera. In this paper, we use the linear pin-hole model. In this model, the geometric relationship between a spatial point P̂ = (X̂, Ŷ, Ẑ)^T and its projection p̂ = (f_1 x̂, f_1 ŷ, f_1)^T on the image plane can be modeled as:

Ẑ p̂ = P̂.    (3)

To simplify the discussion in this paper, we assume that the principal point is located at the image center, the aspect ratio of the optical axis is 1, and the focal length f_1 (which is a scalar of the system) equals 1. However, the results in this paper can be extended to more complex linear pin-hole models.

2.3 Corresponding points in an HVS
An advantage of an HVS is that the catadioptric camera can provide a global view of a scene. Therefore, the calibration of most HVSs relies on corresponding point pairs between a perspective image and the catadioptric image. Let us assume that a spatial point P under the catadioptric camera coordinate system corresponds to a point P̂ under the coordinate system of a perspective camera. The transform between this point pair can be defined as:

P̂ = R_e P + T_e.    (4)

Substituting Eq. 1 and 3 into 4, we have

Ẑ p̂ = (R p + T) / α_P + T_e,    (5)

where R = R_e R_I^{-1}, T = −R_e T_f, and T_e are homography-related parameters which need to be estimated, and p and p̂ are the projections of the spatial point on the catadioptric and perspective image planes, respectively. In general, these homography-related parameters cannot be computed, since both Ẑ and α_P contain unknown depth information about the spatial point. In the next sections, we give a specific solution for estimating these homography-related parameters.
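The algebra of Eqs. 4 and 5 can be made explicit with a small sketch. Note that it assumes R_e, T_e, R_I, T_f and the depth-dependent scale α_P are all known, which is precisely the information the remainder of the paper avoids having to estimate explicitly; the function name is illustrative.

```python
import numpy as np

def catadioptric_to_perspective(p, alpha_P, R_e, T_e, R_I, T_f):
    """Map a catadioptric image point p to its perspective-image counterpart
    via Eq. (5), assuming all parameters are known (illustration only)."""
    R = R_e @ np.linalg.inv(R_I)            # R = R_e R_I^{-1}
    T = -R_e @ T_f                          # T = -R_e T_f
    P_hat = (R @ p + T) / alpha_P + T_e     # right-hand side of Eq. (5), i.e. Z_hat * p_hat
    return P_hat / P_hat[2]                 # normalize by Z_hat (pin-hole model, f_1 = 1)
```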
3 Homography of a planar object in an HVS
Without loss of generality, we can assume that most local regions in the images represent planar surfaces in the scene. The homography from a catadioptric image to a perspective image can be modeled using a 3 × 4 matrix, as proposed in [14]:

p̂ = (x̂, ŷ, 1)^T = H_1 (x, y, x^2 + y^2 − 1/4, 1)^T.    (6)
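For illustration, applying the 3 × 4 homography of Eq. 6 amounts to lifting a catadioptric pixel to the quadratic coordinates above and multiplying; the sketch below assumes H_1 has already been estimated, and the constant in the lifted coordinate follows the reconstruction of Eq. 6 given here.

```python
import numpy as np

def apply_H1(H1, p):
    """Apply the 3x4 homography of Eq. (6): lift a catadioptric pixel (x, y)
    to (x, y, x^2 + y^2 - 1/4, 1) and map it to a perspective pixel."""
    x, y = p
    lifted = np.array([x, y, x**2 + y**2 - 0.25, 1.0])
    q = H1 @ lifted
    return q[:2] / q[2]                     # back to inhomogeneous image coordinates
```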
The homography from a perspective image to a catadioptric image, which will be used in the registration task, is a little more complex. Suppose that a target object has a planar surface WP + b = 0 under the catadioptric coordinate system. Representing this surface using pixels on the catadioptric image plane defined by Eq. 1, we have (1/α_P) W R_e^{-1}(R p + T) + b = 0. This is a useful constraint on the scale factor α_P:

α_P = − W R_e^{-1} (R p + T) / b.    (7)

Substituting Eq. 7 into Eq. 5, we obtain a 3 × 6 homography matrix:

p = H_2 (x̂^2, ŷ^2, x̂ŷ, x̂, ŷ, 1)^T.    (8)

This homography actually has a constraint similar to Eq. 6. However, due to the ambiguity when mapping a perspective image back to the catadioptric surface, we do not have a "linear" equation. We can search for the homography matrix of Eq. 8 under the constraint of Eq. 6. The algorithm can be briefly described as:

1. Initialize H_2^t;
2. Compute all the corresponding points p^t = H_2^t p̂^t;
3. Register p^t back to the perspective image to obtain p̂^{t+1};
4. Update H_2^{t+1} = H_2^t + λ correlation(p̂^t, p̂^{t+1});
5. Loop to step 2 until the stop condition is satisfied.
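A sketch of the forward map of Eq. 8 is shown below; the 3 × 6 matrix H_2 is taken as given here, whereas in the paper it is obtained by the iterative search of steps 1-5, whose correlation-based update is not spelled out in detail.

```python
import numpy as np

def lift_perspective(p_hat):
    """Lifted coordinates used by the 3x6 homography of Eq. (8)."""
    x, y = p_hat
    return np.array([x**2, y**2, x*y, x, y, 1.0])

def apply_H2(H2, p_hat):
    """Map a perspective pixel to the catadioptric image with Eq. (8).
    H2 is a 3x6 matrix, assumed already estimated for this sketch."""
    q = H2 @ lift_perspective(p_hat)
    return q[:2] / q[2]
```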
4 Patch-level image registration
Estimating the homography matrix of a planar surface is a traditional image registration task [6]. The difficulty arises because the original scene captured by the cameras is not planar, as assumed in a 2D approach. To address this problem, we divide an image from a perspective camera into small patches B_i and assume that each patch corresponds to a planar surface in 3D space. The image registration is therefore performed at the patch level.

To address this patch-level registration, we propose an algorithm consisting of three main iterative steps: patch selection, patch registration, and homography propagation, which is outlined as follows:
Algorithm of robust homography propagation at a patch level

1. Partition a perspective image into n partitions (patches) and label all the patches as unregistered;
2. Select an unregistered patch B with the highest variance;
3. Register the patch B using the technique described in the last section;
4. Propagate the homography of patch B to its unregistered neighbors that are located in the same 3D spatial plane;
5. If there are unregistered patches, go to step 2; otherwise, end.
In this algorithm, we partition an image into patches of the same size. The algorithm iterates over steps 2 to 4 until all the patches are registered. The details of steps 3 and 4 are discussed in the following subsections.
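A structural sketch of this outer loop is given below, under the assumption that register_patch and propagate stand in for the Haar-space registration of Section 4.1 and the robust propagation of Section 4.2; both names and the regular grid partition are illustrative.

```python
import numpy as np

def patch_level_registration(image, n_rows, n_cols, register_patch, propagate):
    """Structural sketch of the patch-level algorithm above.

    `register_patch` and `propagate` are placeholders for the Haar-space
    registration (Section 4.1) and the robust propagation (Section 4.2).
    """
    h, w = image.shape[:2]
    ph, pw = h // n_rows, w // n_cols
    patches = {(i, j): image[i*ph:(i+1)*ph, j*pw:(j+1)*pw]
               for i in range(n_rows) for j in range(n_cols)}
    unregistered = set(patches)             # step 1: all patches start unregistered
    homographies = {}
    while unregistered:
        # step 2: pick the unregistered patch with the highest intensity variance
        seed = max(unregistered, key=lambda k: patches[k].var())
        homographies[seed] = register_patch(patches[seed])        # step 3
        unregistered.discard(seed)
        # step 4: propagate the seed homography to unregistered 8-neighbors
        propagate(seed, homographies, patches, unregistered)
        # step 5: loop until every patch is registered
    return homographies
```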
4.1 Registration in a Haar feature space
Patch registration (step 3) is performed in a Haar feature space. Haar wavelets are chosen because they can model texture at different scales and can be computed very efficiently. The Haar wavelets decompose a given image patch B into four sub-bands: a low-frequency band B^l, a vertical high-frequency band B^v, a horizontal high-frequency band B^h, and a diagonal high-frequency band B^d.

Figure 3: An image patch and its Haar decomposition. The four bands in the image on the right correspond to B^l (top-left), B^v (top-right), B^h (bottom-left), and B^d (bottom-right).
Figure 3 illustrates the Haar decomposition of a large image patch. For a point p = (x, y) in the image patch B, its Haar feature values are defined as:
B^l_{x,y} = (1/4)(B_{2x,2y} + B_{2x,2y+1} + B_{2x+1,2y} + B_{2x+1,2y+1}),
B^v_{x,y} = (1/4)(B_{2x,2y} − B_{2x,2y+1} + B_{2x+1,2y} − B_{2x+1,2y+1}),
B^h_{x,y} = (1/4)(B_{2x,2y} + B_{2x,2y+1} − B_{2x+1,2y} − B_{2x+1,2y+1}),
B^d_{x,y} = (1/4)(B_{2x,2y} − B_{2x,2y+1} − B_{2x+1,2y} + B_{2x+1,2y+1}).
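These four sub-bands can be computed with simple strided sums and differences; the sketch below assumes a grayscale patch with even height and width.

```python
import numpy as np

def haar_subbands(B):
    """One-level Haar decomposition of an image patch B, following the four
    formulas above (B is assumed grayscale with even height and width)."""
    B = np.asarray(B, dtype=float)
    a = B[0::2, 0::2]   # B_{2x,2y}
    b = B[0::2, 1::2]   # B_{2x,2y+1}
    c = B[1::2, 0::2]   # B_{2x+1,2y}
    d = B[1::2, 1::2]   # B_{2x+1,2y+1}
    Bl = (a + b + c + d) / 4.0   # low-frequency band
    Bv = (a - b + c - d) / 4.0   # vertical high-frequency band
    Bh = (a + b - c - d) / 4.0   # horizontal high-frequency band
    Bd = (a - b - c + d) / 4.0   # diagonal high-frequency band
    return Bl, Bv, Bh, Bd
```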
The registration task is to minimize the following objective function with respect to the homography matrix H and the global photometric parameters θ = (a_j, b_j):

f(H, θ) = Σ_{j ∈ {l,v,h,d}} Σ_{p̂ ∈ B} (B^j_{p̂} − a_j I^j_{H p̂} − b_j)^2,    (9)

where I^j is the Haar feature image from the catadioptric camera. This minimization involves non-linear constraints and can be solved with the Levenberg-Marquardt technique [6, 12].
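A minimal sketch of this minimization, using a generic Levenberg-Marquardt least-squares solver, is given below. The parameterization of H, the warping-and-sampling of the catadioptric Haar images, and the function names are assumptions of the sketch rather than details given in the paper.

```python
import numpy as np
from scipy.optimize import least_squares

def register_patch_haar(patch_bands, warp_and_sample, h0, theta0):
    """Minimize the objective of Eq. (9) with Levenberg-Marquardt.

    patch_bands     : dict of the four Haar bands of patch B, e.g. {'l': Bl, ...}
    warp_and_sample : callable (band_name, h) -> catadioptric Haar values I^j
                      sampled at H p for every pixel p of the patch (the lifting
                      and interpolation inside it are not shown here)
    h0, theta0      : numpy vectors with the initial homography parameters and
                      the photometric parameters (a_l..a_d, b_l..b_d)
    """
    bands = ('l', 'v', 'h', 'd')

    def residuals(params):
        h = params[:h0.size]
        a = params[h0.size:h0.size + 4]
        b = params[h0.size + 4:]
        res = []
        for j, name in enumerate(bands):
            pred = a[j] * warp_and_sample(name, h) + b[j]   # a_j I^j(Hp) + b_j
            res.append(patch_bands[name].ravel() - pred.ravel())
        return np.concatenate(res)

    x0 = np.concatenate([h0, theta0])
    sol = least_squares(residuals, x0, method='lm')          # Levenberg-Marquardt
    return sol.x[:h0.size], sol.x[h0.size:]
```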
4.2 Robust homography propagation
The homography propagation step (step 4) propagates the homography H obtained from the registration of patch B in step 3 to its neighbors. From Eq. 7, we can observe that a homography matrix contains the factors W and b associated with a spatial plane. Therefore, patches coming from the same 3D spatial plane should share the same homography. Firstly, we use H to initialize a set of homographies S_H = {H_1 = H} and set the seed homography H* = H. Then, we register the unregistered patches among the 8 neighbors of the patch B. The seed homography H* is used as the initialization in the Levenberg-Marquardt based registration algorithm. The resulting homographies are added to the set S_H = {H_1, ..., H_n}. The new seed homography H*, with entries h*_ij, is then updated as:

h*_ij = (Σ_{k=1}^{n} β^k_ij h^k_ij) / (Σ_{k=1}^{n} β^k_ij),
where

β^k_ij = 1 if |h^k_ij − m_ij| ≤ δ σ_ij, and 0 otherwise,

and the mean m_ij is defined as:

m_ij = argmin_m r^m_ij,   with   r^m_ij = Med_{k=1,...,n} (h^k_ij − m)^2.

Med denotes the median operator. The variance σ_ij is computed as:

σ_ij = 1.48 × (1 + 5/(n − 1)) √(r_ij^{m_ij}).

The threshold δ is a tradeoff between precision and robustness of the registration performance.
We iteratively propagate the registration to the 8 neighbors of each newly registered patch until the weights β^k_ij of all the newly registered patches are equal to zero.
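The entry-wise update can be sketched as follows; the LMedS location m_ij is approximated by searching over the sample values themselves, and the value of δ is only illustrative.

```python
import numpy as np

def update_seed_homography(H_list, delta=2.5):
    """Robust entry-wise update of the seed homography from a set of neighbor
    homographies, following the weighted scheme above.

    H_list : list of homography matrices of identical shape.
    delta  : precision/robustness threshold (illustrative value only).
    """
    H = np.stack(H_list)                        # shape (n, rows, cols)
    n = H.shape[0]
    H_seed = np.zeros(H.shape[1:])
    for idx in np.ndindex(H.shape[1:]):
        samples = H[(slice(None),) + idx]       # the n values h^k_ij
        # m_ij = argmin_m Med_k (h^k_ij - m)^2, searched over the samples themselves
        med_res = [np.median((samples - m) ** 2) for m in samples]
        m = samples[int(np.argmin(med_res))]
        sigma = 1.48 * (1.0 + 5.0 / max(n - 1, 1)) * np.sqrt(min(med_res))
        beta = (np.abs(samples - m) <= delta * sigma).astype(float)   # inlier weights
        H_seed[idx] = beta @ samples / beta.sum() if beta.sum() > 0 else m
    return H_seed
```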
This registration algorithm mainly focuses on registering large background planes such as walls and the ground floor. Small foreground objects are usually not planar enough and have too low a resolution; therefore, their registration results can be noisier.
5 Experimental results
The proposed approach is evaluated on images obtained from a catadioptric camera and a perspective camera. Fig. 4 shows two of these images: (a) a top-view image from a Cyclovision catadioptric camera; (b) a side-view image from a SONY perspective camera. The catadioptric image has a resolution of 640 × 480. The resolution of the perspective image is 800 × 600.
During the patch-level registration process, we first estimate the translation and scale parameters, then the global photometric parameters, and finally the homography. In Fig. 5, we display the registration results for different methods and patch sizes: (a) image registration results using the 2D perspective homography; (b) image registration results using the non-linear homography with only one partition; (c) image registration results using the non-linear homography with 2 × 2 partitions; (d) image registration results using the non-linear homography with 4 × 4 partitions. The traditional 2D perspective registration does not work well without pre-calibrating and warping the catadioptric image. The proposed approach gives better registration results using the non-linear homography.

Figure 4: Experiment images: (a) a top-view image from a Cyclovision catadioptric camera; (b) a side-view image from a SONY perspective camera.
Compared with the result in (b), the result in (c) clearly shows that there are three 3D spatial “planes” in the scene: the wall combined with the foreground objects, the wall on the right, and the ground floor. When 16 partitions are used, the part consisting of the wall combined with the foreground objects is registered better. However, the registration of the patches on the wall on the right becomes noisy because some partitions do not contain enough texture.
To evaluate the registration results more precisely, we manually labeled 60 corresponding points {(p_i, p̂_i)} in the catadioptric image and the perspective image, which are shown in Fig. 6. With respect to this ground truth, the registration error is measured as the sum of the distances between the registered coordinates of the 60 points and their ground-truth coordinates:

E = Σ_{i=1}^{60} || p_i − H p̂_i ||.    (10)
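For illustration, the error of Eq. 10 can be accumulated as below, where warp stands for the estimated patch-level mapping from perspective to catadioptric coordinates (e.g., apply_H2 above with the appropriate per-patch homography).

```python
import numpy as np

def registration_error(points_cat, points_persp, warp):
    """Eq. (10): sum of distances between the labeled catadioptric points p_i
    and the warped perspective points H p_hat_i. `warp` is the estimated
    patch-level mapping (illustrative placeholder)."""
    return sum(np.linalg.norm(np.asarray(p) - np.asarray(warp(q)))
               for p, q in zip(points_cat, points_persp))
```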
Fig. 7 shows the registration errors for different numbers of partitions. We can observe that the registration error decreases as the local patch size decreases.
Figure 5: Image registration result comparisons: (a) image registration result using the 2D perspective homography; (b) image registration result using the proposed method with only 1 partition; (c) image registration result using the proposed method with 4 partitions; (d) image registration result using the proposed method with 16 partitions.
Figure 6: Corresponding points in the ground truth.
