
Distributed calibration of pan-tilt camera network using multi-layered belief propagation

TL;DR: A technique for distributed self-calibration of pan-tilt camera network using multi-layered belief propagation to obtain globally consistent estimates of the camera parameters for each camera with respect to a global world coordinate system.
Abstract: In this paper, we present a technique for distributed self-calibration of pan-tilt camera network using multi-layered belief propagation. Our goal is to obtain globally consistent estimates of the camera parameters for each camera with respect to a global world coordinate system. The network configuration changes with time as the cameras can pan and tilt. We also give a distributed algorithm for automatically finding which cameras have overlapping views at a certain point in time. We argue that using belief propagation it is sufficient to have correspondences between three cameras at a time for calibrating a larger set of (static) cameras with overlapping views. Our method gives an accurate and globally consistent estimate of the camera parameters of each camera in the network.

Summary (3 min read)

1. Introduction

  • The authors present a distributed algorithm for self-calibration of a pan-tilt camera network using multi-layered belief propagation.
  • As the cameras can pan and tilt, the camera network contains various mutually exclusive sub-networks, where all cameras in a sub-network view a common region.
  • The authors then propagate belief between sub-networks to obtain globally consistent and accurate estimates of the camera parameters for each camera in the network.
  • Automated surveillance requires that the camera network be calibrated with respect to a global WCS so that tasks such as 3D-tracking and recognition of objects, activities and events can be effectively performed.
  • The authors' distributed calibration also makes the system scalable: large camera networks spanning a wide geographical area would contain mutually exclusive sub-networks, so no communication or computation among the cameras of these sub-networks would be necessary for calibration.

3. Distributed calibration of pan-tilt camera network: an overview

  • The authors also assume that each camera has a processing unit attached with it and that there exists an underlying communication network such that each camera can communicate with every other camera.
  • In a pan-tilt camera network, there may exist many such mutually exclusive graphs at any point in time.
  • To perform multi-layered belief propagation between two graphs containing the same camera in different pan-tilt positions, the authors need to bring the cameras to their home (zero pan and zero tilt) position in both the graphs.
  • The authors also propose a protocol in Section 9, for aligning all the cameras’ home positions to a global WCS, to get a globally consistent estimate of the camera’s home position (zero pan, zero tilt position).
  • In the next section, the authors give a method for automatically finding correspondences between three images.

4. Automatically finding corresponding points

  • The authors propose a method for automatically finding corresponding points in three images.
  • But as the number of images increases, the error in correspondences also increases.
  • First, compute the SIFT features in all three images and then compute the SIFT matches between the pairs I1−I2, I1−I3 and I2−I3.
  • Figure 1 shows the common points found between three images taken by three different cameras.

5. Finding the graphs

  • Starting with the camera with the smallest number that does not belong to any graph currently, say Ci, find the camera with the next smallest number, say Cj , that has an overlap with Ci and which does not belong to any graph.
  • In general, there will be more than one graph in the pan-tilt camera network.
  • Moreover, each graph will be a complete graph.
  • In a wide area pan-tilt camera network it is possible that two sets of cameras are geographically so far apart that there will be no overlapping view between these two sets of cameras.

6. Camera calibration within a graph

  • The authors assume that the cameras in a graph, say Gk, remain static for a certain time period.
  • Thus, standard multi-camera self-calibration techniques can be used for calibrating the cameras within a graph.
  • The crucial point here is to automatically find multiview correspondences at each node.
  • The corresponding points between the nodes of Gik are found automatically as discussed in Section 4.
  • Belief propagation (discussed in Section 7) between the nodes of Gik gives a consistent estimate of the camera parameters for each camera in Gik.

7. Belief Propagation within a graph

  • For distributed calibration of cameras in a graph, say Gk, multi-camera self-calibration is carried out at each node, using the automatically found corresponding points.
  • As has been shown in [3], belief propagation can be directly applied on a graph which has cameras viewing a common scene as its nodes.
  • μ̃i,k and Σ̃i,k are the estimates of the camera parameters after belief propagation within graph Gk.
  • The covariance matrix is calculated based on the forward covariance propagation from bundle adjustment.

7.1. Multi-layered Belief Propagation

  • Since the graphs are dynamic and the same camera Ci can be a part of two graphs, say Gk−1 and Gk, in different pan-tilt orientations at different points in time, the authors perform belief propagation between graphs at each node, Ci, which is common in both Gk−1 and Gk.
  • Similarly, the authors can get to the pan-tilt position as: Pθφ = H−1 ∗ Phome.
  • In case the pan-tilt view of the camera does not have any overlap with the home position's view, a sequence of homographies can be used, again calculated automatically, as shown in Figure 2.
  • The home position is calculated in each graph using the image-to-image homography before applying the update equations for multi-layered belief propagation.

8. Forming new graphs

  • The multi-layered belief propagation mechanism can be utilized only if the graphs change across time.
  • Each camera will have information of all other cameras about the landmark they are viewing.
  • This also makes their system scalable, as the correspondences have to be calculated among only those cameras which view the same landmark and, in step 3, the messages have to be passed only between those cameras which can have overlapping views in some pan-tilt configuration.
  • In the current time period these cameras are not considered for calibration and therefore, remain idle.
  • In the next time period, they shall repeat the above protocol and become part of graphs with ≥ 3 nodes and hence, will be used for calibration and multi-layered belief propagation.

9. Aligning cameras to a global world coordinate system

  • The authors want the position and orientation of each camera’s home position with respect to a global WCS.
  • Moreover, belief propagation can be carried out only if all the cameras are aligned with respect to a common coordinate system in the world.
  • These two conditions establish a common coordinate system at the lowest numbered camera, say Ci, in each graph formed in the camera network.
  • All other cameras in Gj are aligned to this common coordinate system.
  • In case the global WCS is not pre-specified, the lowest numbered camera in the network may be assumed to be at the origin of the global WCS.

10. Results and Discussion

  • The authors use 6 SONY EVI-D70 PTZ cameras for their experiments.
  • If the authors randomly select one camera (all its parameters) from each node, for example, P1 from node C2, P2 from C3 and P3 from C1, then as seen in Figure 3(b) and (c) the reprojection error is high and varies based on which camera is selected from which node.
  • The reprojection error statistics are given in Table 1.
  • The authors consider five 3-cliques of the graph for calibrating this graph.
  • Multi-layered belief propagation at the nodes of the graph results in consistent and accurate camera parameters as seen in Figure 4.

11. Conclusion

  • The authors have presented a multi-layered belief propagation based distributed algorithm for self-calibration of a pan-tilt camera network.
  • The authors have shown that by using multi-layered belief propagation it is possible to get accurate and globally consistent estimates of the camera parameters for each pan-tilt camera in the network with respect to a global world coordinate system.
  • The authors' system does not require all the cameras to have overlapping views at all times.
  • The authors' method gives an accurate and globally consistent estimate of the camera parameters for the home position of each camera; using the method for automatically finding correspondences in two views, homographies between the home view and any pan/tilt view can be computed automatically.
  • Therefore, it is possible to obtain accurate and globally consistent camera parameters for any pan/tilt position of the pan-tilt cameras in the network with respect to a global world coordinate system.


Distributed Calibration of Pan-Tilt Camera Network using Multi-Layered Belief Propagation

Ayesha Choudhary¹, Gaurav Sharma², Santanu Chaudhury¹, Subhashis Banerjee¹
¹ Indian Institute of Technology, Delhi, India.
² University of Caen, France.
{ayesha, suban}@cse.iitd.ernet.in, santanuc@ee.iitd.ernet.in, gaurav.sharma@info.unicaen.fr
(The work was done when Gaurav Sharma was with the Multimedia Lab, Indian Institute of Technology, Delhi.)
Abstract

In this paper, we present a technique for distributed self-calibration of pan-tilt camera network using multi-layered belief propagation. Our goal is to obtain globally consistent estimates of the camera parameters for each camera with respect to a global world coordinate system. The network configuration changes with time as the cameras can pan and tilt. We also give a distributed algorithm for automatically finding which cameras have overlapping views at a certain point in time. We argue that using belief propagation it is sufficient to have correspondences between three cameras at a time for calibrating a larger set of (static) cameras with overlapping views. Our method gives an accurate and globally consistent estimate of the camera parameters of each camera in the network.
1. Introduction

In this paper, we present a distributed algorithm for self-calibration of a pan-tilt camera network using multi-layered belief propagation. The goal of our distributed calibration algorithm is to obtain a globally consistent and accurate estimate of each camera's parameters (intrinsic as well as extrinsic) with respect to a global world coordinate system (WCS). As the cameras can pan and tilt, the camera network contains various mutually exclusive sub-networks, where all cameras in a sub-network view a common region. For distributed calibration, we perform multi-camera self-calibration at each camera in a sub-network and apply belief propagation to obtain consistent camera parameters in each sub-network. We then propagate belief between sub-networks to obtain the globally consistent and accurate estimates of the camera parameters for each camera in the network.

In general, pan-tilt camera networks are well-suited for wide area surveillance. Automated surveillance requires that the camera network be calibrated with respect to a global WCS so that tasks such as 3D-tracking, recognition of objects, activities and events can be effectively performed. Moreover, this also requires that the camera parameters be consistent and accurate with respect to one another, which cannot be achieved by individually calibrating each camera. Self-calibration of a pan-tilt camera network is necessary as it is, in general, difficult and impractical to use an external calibration object.

Distributed calibration is advantageous for a pan-tilt camera network, as it is more robust against failures. In case of failure of a camera, the information can be retrieved from its neighbors. Moreover, unlike failure of the central server, which may lead to shutting down of the system, failure of a camera does not impact the complete network. Also, in case of distributed calibration, addition of new cameras to the network does not require re-calibration of the complete camera network. Our distributed calibration also makes the system scalable, as large camera networks spanning a wide geographical area would contain mutually exclusive sub-networks, and no communication and computation among the cameras of these sub-networks would be necessary for calibration. Therefore, in effect, cameras which do not view a common scene in any of their pan-tilt positions do not affect each other. Thus, our distributed algorithm calibrates the complete camera network by calibrating smaller sub-networks, making the system scalable.

Distributed calibration of the camera network may lead to inconsistencies in the estimation of the camera parameters, since these parameters are computed at each node of the network. We use belief propagation to leverage the information at each node of the camera network to arrive at a consistent and accurate estimate of the camera parameters of each camera in the network.

The configuration of a pan-tilt camera network is dynamic. The various sub-networks that exist in the system change across time, that is, cameras in different pan-tilt positions become a part of different sub-networks across time. Moreover, within a fixed time interval, a camera can be a part of only one sub-network. We give a technique to automatically find the sub-networks as well as a method to automatically control the cameras so that they become parts of different sub-networks across time, which is essential for propagating belief across various sub-networks. We discuss the related work in the next section.
2. Related Work

Multi-camera calibration is a well-studied problem in computer vision. Pan-tilt camera network calibration has also become an important area of research. Most of the multi-camera calibration methods are based on centralized processing. As camera networks are becoming larger, distributed algorithms are becoming a necessity. Recently, in [10], an online distributed algorithm has been proposed for cluster based calibration of a large wireless static camera network using features detected on known moving target objects. They assume that the intrinsic parameters are known and that each target object has known multiple distinctive features. In [7], 3D features and geographic hash tables are used, while in [5] object motion is used for calibration. Very recently, the authors in [4] have proposed a distributed algorithm for calibration of a camera sensor network, where they assume that one of the cameras is calibrated and use epipolar geometry based algorithms at each node to obtain its calibration parameters. They show that a globally consistent solution can be reached in a distributed manner by solving a set of linear equations.

In [1], a method for self-calibration of purely rotating cameras using the infinite homography constraint is proposed. Davis et al. [2] present a method for calibrating pan-tilt cameras and introduce a complete model of the pan-tilt rotation occurring around arbitrary axes. Both these methods are for calibrating a single camera and not for calibration of a pan-tilt camera network. The authors in [12] estimate both the internal and external parameters of a pan-tilt camera network without requiring any special calibration object. But their method is feature based and estimates the camera parameters by using the complete set of images captured at each pan-tilt-zoom configuration of the camera.

Radke et al. [3] give a distributed calibration method for a static camera network using belief propagation. They assume that the cameras form a graph where cameras are the nodes and an edge exists between the nodes if they have overlapping views. In their case, since the cameras are static, the configuration of the network does not change with time and the cameras form one connected graph. We extend this approach for distributed calibration of a pan-tilt camera network using multi-layered belief propagation. In our case, many mutually exclusive graphs exist at the same time and the same camera may belong to many different graphs across time. We also address the issue of automatically finding the various graphs in the system. In [3], they assume that the camera network forms a connected graph, whereas we give a method for automatically controlling the cameras to create connected graphs. Also, we propose the use of multi-layered belief propagation, first within a graph for a consistent measure of the camera parameters within the graph, and then between multiple graphs to get a consistent estimate of the camera parameters in the pan-tilt camera network.

The methods in [3, 10, 7, 4] are for distributed calibration of static camera networks, while we propose a technique for distributed calibration of a pan-tilt camera network. Moreover, unlike [4, 10], we do not require that the internal or external parameters of any camera be known, and we do not require any external calibration object. Also, unlike [12], our method does not consider every pan-tilt configuration of any camera in the network.
3. Distributed calibration of pan-tilt camera network: an overview

We assume that the camera network has N ≥ 3 cameras and each camera has a unique number n ∈ {1, 2, . . . , N} associated with it. We also assume that each camera has a processing unit attached with it and that there exists an underlying communication network such that each camera can communicate with every other camera. A sub-network in a pan-tilt camera network consists of cameras viewing a common area. The cameras which have overlapping views form a complete graph G = (V, E) where the cameras C_i ∈ V and there is an edge e_ij ∈ E between cameras C_i and C_j for all cameras in the graph. In a pan-tilt camera network, there may exist many such mutually exclusive graphs at any point in time. Moreover, if a camera pans and/or tilts, then it may cease to remain a part of one graph and become a part of another graph. In Section 5, we give a distributed algorithm for finding these graphs automatically.

We assume that the cameras remain in a certain pan-tilt position for a fixed period of time. During this time interval, the cameras in each graph are considered as static cameras. Corresponding points between the views of the cameras in each graph are found automatically and multi-camera self-calibration is performed at each node of the graph. It is well-known that finding automatic correspondences between multiple views is not an easy problem. We show that by using multi-layered belief propagation it is sufficient to have correspondences between only three cameras at a time for consistent calibration of a larger N > 3 static camera network. In Section 6, we give the method to calibrate a large N > 3 (static) camera network using multi-layered belief propagation by iteratively calibrating its 3-cliques. We discuss belief propagation and multi-layered belief propagation in Section 7 and discuss how multi-layered belief propagation is applied at each camera in the network. Since the information is combined from graphs containing the cameras in various pan-tilt configurations, it is unlikely that belief propagation will get stuck in a local minimum and hence, globally consistent estimates are achieved.

Figure 1. Example of common points found in three images. Note: all images are best viewed in color and at a high resolution.

In Section 8, we give a protocol for automatically controlling the cameras so that they become a part of various sub-networks across time, which is necessary for distributed calibration of the pan-tilt camera network. Otherwise, the network will remain divided into mutually exclusive sub-networks and there will be no exchange of information between various pan-tilt views of the same camera across time. To perform multi-layered belief propagation between two graphs containing the same camera in different pan-tilt positions, we need to bring the cameras to their home (zero pan and zero tilt) position in both the graphs. We show that the camera matrix for the home position of the camera can be computed by automatically finding pairwise correspondences to compute the homography or a sequence of homographies between the camera's pan-tilt view and the home view. We also propose a protocol in Section 9 for aligning all the cameras' home positions to a global WCS, to get a globally consistent estimate of the camera's home position (zero pan, zero tilt position). In the next section, we give a method for automatically finding correspondences between three images. The same method can be used for finding correspondences automatically between a pair of images.
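
To make this network model concrete, a minimal data-structure sketch follows (Python); all names here are our own illustrative choices, not from the paper:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class PTCamera:
    """A node of the pan-tilt camera network (field names illustrative)."""
    number: int                        # unique id n in {1, 2, ..., N}
    pan: float = 0.0                   # current pan angle; (0, 0) is 'home'
    tilt: float = 0.0
    graph_id: Optional[int] = None     # at most one graph per time interval

@dataclass
class CameraGraph:
    """A complete graph G = (V, E) of cameras with mutually overlapping views."""
    graph_id: int
    nodes: List[PTCamera] = field(default_factory=list)

    def edges(self):
        # complete graph: an edge e_ij exists for every pair of member cameras
        return [(a.number, b.number)
                for i, a in enumerate(self.nodes) for b in self.nodes[i + 1:]]
```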
4. Automatically finding corresponding points between three images

We propose a method for automatically finding corresponding points in three images. It can also be used to find correspondences in a pair of images or in more than three images. But as the number of images increases, the error in correspondences also increases. Let I_1, I_2 and I_3 be three images taken by three different cameras of the same scene. We perform the following steps to automatically find correspondences between the three images. First, compute the SIFT features in all three images and then compute the SIFT matches between the pairs I_1−I_2, I_1−I_3 and I_2−I_3. Next, find the common SIFT matches between these three pairs, denoted by X = {x_1, x_2, x_3} for points in I_1, I_2 and I_3 respectively. Further, refine these points by fitting fundamental matrices between pairs of images and taking points which are common in all the three images. This is done by first fitting fundamental matrices to the pairs, F_12 = {x_1, x_2}, F_13 = {x_1, x_3} and F_23 = {x_2, x_3}, and then finding the common points between the inliers of F_12, F_13 and F_23, say y_1, y_2 and y_3. If the number of points is ≥ 50, then we say that there exists overlap between the three images, and y_1, y_2 and y_3 are the correspondences in the three views. Figure 1 shows the common points found between three images taken by three different cameras.
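
As an illustration, this pipeline can be sketched with OpenCV as below. The ratio-test and RANSAC thresholds are our own illustrative choices; only the overall SIFT-match / fundamental-matrix-refinement structure and the 50-point overlap test come from the text:

```python
import cv2
import numpy as np

MIN_COMMON_POINTS = 50  # overlap threshold used in the paper

def ratio_matches(des_a, des_b, ratio=0.75):
    """Ratio-test SIFT matching; returns {index in a: index in b}."""
    out = {}
    for pair in cv2.BFMatcher(cv2.NORM_L2).knnMatch(des_a, des_b, k=2):
        if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance:
            out[pair[0].queryIdx] = pair[0].trainIdx
    return out

def three_view_correspondences(I1, I2, I3):
    sift = cv2.SIFT_create()
    kp, des = zip(*(sift.detectAndCompute(I, None) for I in (I1, I2, I3)))
    m12, m13, m23 = (ratio_matches(des[a], des[b])
                     for a, b in ((0, 1), (0, 2), (1, 2)))
    # common SIFT matches X = {x1, x2, x3}: consistent across all three pairs
    triples = [(i, m12[i], m13[i]) for i in m12
               if i in m13 and m23.get(m12[i]) == m13[i]]
    if len(triples) < 8:
        return False, None  # too few matches to fit fundamental matrices
    x1, x2, x3 = (np.float32([kp[v][t[v]].pt for t in triples])
                  for v in (0, 1, 2))
    # refine by fitting fundamental matrices pairwise; keep joint inliers
    masks = [cv2.findFundamentalMat(a, b, cv2.FM_RANSAC, 3.0)[1]
             for a, b in ((x1, x2), (x1, x3), (x2, x3))]
    keep = np.logical_and.reduce([m.ravel().astype(bool) for m in masks])
    y1, y2, y3 = x1[keep], x2[keep], x3[keep]
    return len(y1) >= MIN_COMMON_POINTS, (y1, y2, y3)
```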
5. Finding the graphs

We develop an algorithm to automatically find the graphs in the network. Starting with the camera with the smallest number that does not currently belong to any graph, say C_i, find the camera with the next smallest number, say C_j, that has an overlap with C_i and which does not belong to any graph. Form a graph G = (V, E) where V = {C_i, C_j} is the set of nodes and e_ij ∈ E is the edge between C_i and C_j. Incrementally, find all those cameras (by automatically finding the corresponding points) which have overlapping views with C_i and C_j and are not currently a part of any graph. Add them as nodes of G and add edges between all the nodes of G. Continue till either there is no camera in the system that does not belong to a graph, or no other camera has overlapping views with the nodes in graph G.

Repeat this with all the cameras in the network that are not a part of any graph. In general, there will be more than one graph in the pan-tilt camera network. Moreover, each graph will be a complete graph. A priori knowledge of the camera network topology can be used to reduce the amount of communication across cameras as well as the number of computations for SIFT matches. For example, in a wide area pan-tilt camera network it is possible that two sets of cameras are geographically so far apart that there will be no overlapping view between these two sets of cameras. Therefore, no communication or computation needs to be carried out between such mutually exclusive and distant camera sub-sets.
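
The following sketch captures this greedy procedure; `overlaps(group, cam)` is a placeholder for the overlap test built on the Section 4 correspondence method:

```python
def find_graphs(camera_numbers, overlaps):
    """Greedily partition cameras into complete graphs of mutually
    overlapping views (Section 5). `overlaps(group, cam)` is assumed to
    report whether `cam` has overlapping views with the cameras already
    in `group`, tested via automatically found correspondences."""
    free = sorted(camera_numbers)          # cameras not yet in any graph
    graphs = []
    while free:
        group = [free.pop(0)]              # smallest-numbered free camera C_i
        grew = True
        while grew:                        # incrementally grow the clique
            grew = False
            for cam in list(free):         # next smallest number first
                if overlaps(group, cam):
                    group.append(cam)
                    free.remove(cam)
                    grew = True
                    break
        graphs.append(group)               # node set of one complete graph
    return graphs
```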
6. Camera calibration within a graph

We assume that the cameras in a graph, say G_k, remain static for a certain time period. Thus, standard multi-camera self-calibration techniques can be used for calibrating the cameras within a graph. In a distributed system, multi-camera calibration is carried out at each node of the graph G_k. The crucial point here is to automatically find multi-view correspondences at each node. Since this is not an easy task, we show that it is possible to calibrate a graph of size N > 3 by calibrating its 3-cliques and using multi-layered belief propagation to reach a consistent estimate of the camera parameters of all the cameras in the graph.

We consider all possible 3-cliques of the graph G_k. Let G_k^i be the i-th 3-clique of G_k. The corresponding points between the nodes of G_k^i are found automatically as discussed in Section 4. A standard multi-camera self-calibration technique is used at each node of G_k^i to get estimates of the camera parameters of each camera in G_k^i. Belief propagation (discussed in Section 7) between the nodes of G_k^i gives a consistent estimate of the camera parameters for each camera in G_k^i. This is done for each of the 3-cliques of G_k, which will not be more than $\binom{n}{3}$ for a graph of size n. Therefore, there will be $\binom{n}{3}$ estimates of each camera after belief propagation is carried out within each 3-clique. Then, multi-layered belief propagation at each node of G_k is carried out between the estimates of the camera parameters of that node in the various (at most $\binom{n}{3}$) 3-cliques. If this procedure is carried out iteratively, then it is not necessary to calibrate all the $\binom{n}{3}$ 3-cliques. It is possible that a consistent estimate of the camera parameters for each camera in G_k can be reached with fewer 3-cliques than $\binom{n}{3}$. Thus, we are able to calibrate the complete graph of N > 3 cameras without knowing multi-view correspondences among all the nodes of the graph. Figure 4 shows a result of this technique for calibrating a graph of five cameras by using five 3-cliques of the graph. An important point to be noted here is that the camera matrices have to be aligned to a common WCS for this graph before propagating belief at a node between the subgraphs. The common WCS for this graph can be a predefined WCS or we can take the lowest numbered camera in the graph to be at the origin of the WCS.

Figure 2. These images are from one pan-tilt camera taken at different pan and tilt positions. To find the homography between (a) and (f), where (f) is the home position, we find a sequence of homographies: between (a) and (b), then between (b), (c) and (d), and then between (d), (e) and (f). The point correspondences for finding the homographies are automatically found as explained in the text.
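
The clique-by-clique procedure can be organized as in the sketch below. Here `calibrate_clique` and `fuse` are placeholders for, respectively, multi-camera self-calibration plus belief propagation inside one 3-clique (Section 7) and the per-node fusion of clique estimates; both are assumptions of this sketch, not code from the paper:

```python
from itertools import combinations

def calibrate_graph(nodes, calibrate_clique, fuse):
    """Calibrate a graph G_k with n > 3 cameras via its 3-cliques (Section 6).
    `calibrate_clique(clique)` should return {camera: (mu, Sigma)} estimates
    after belief propagation within the clique; `fuse(estimates)` combines a
    node's estimates from the different cliques it belongs to."""
    per_camera = {c: [] for c in nodes}
    for clique in combinations(sorted(nodes), 3):   # at most C(n, 3) cliques
        for cam, est in calibrate_clique(clique).items():
            per_camera[cam].append(est)
        # in practice the iteration can stop early, once the fused estimates
        # stop changing, so not all C(n, 3) cliques need to be calibrated
    return {cam: fuse(ests) for cam, ests in per_camera.items()}
```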
7. Belief Propagation within a graph

For distributed calibration of cameras in a graph, say G_k, multi-camera self-calibration is carried out at each node, using the automatically found corresponding points. Therefore, at each node C_i of G_k, we obtain an estimate of the camera parameters P^k_j for all j cameras in G_k. Let y_i be the true camera parameters for the i-th camera. Our aim is to find y_i from the estimates of the camera parameters computed at each node of G_k, using belief propagation. The estimates of the camera parameters of all cameras in G_k computed at each node are considered as the beliefs at each node. In general, the belief propagation algorithm is used for solving inference problems based on local message passing [11]. Each node updates its beliefs by using the estimates it receives from its neighbors in the form of "messages". These beliefs are iteratively updated until there is no change in the belief at a node. As has been shown in [3], belief propagation can be directly applied on a graph which has cameras viewing a common scene as its nodes. In this case, the update equations are of the form:

$$\tilde{\Sigma}_{i,k} \leftarrow \Big[\Sigma_{i,k}^{-1} + \sum_{j \in N(i,k)} \Sigma_{j,k}^{-1}\Big]^{-1}$$
$$\tilde{\mu}_{i,k} \leftarrow \tilde{\Sigma}_{i,k}\Big[\Sigma_{i,k}^{-1}\,\mu_{i,k} + \sum_{j \in N(i,k)} \Sigma_{j,k}^{-1}\,\mu_{j,k}\Big] \qquad (1)$$

Here, μ_{i,k} and Σ_{i,k} are the estimate and covariance of the camera parameters computed at the i-th camera C_i in the k-th graph, G_k. N(i, k) denotes the set of neighbors of camera C_i in graph G_k. Moreover, the i-th node C_i receives μ_{j,k} and Σ_{j,k} from C_j, its j-th neighbor, j ∈ N(i, k). μ̃_{i,k} and Σ̃_{i,k} are the estimates of the camera parameters after belief propagation within graph G_k. The covariance matrix is calculated based on the forward covariance propagation from bundle adjustment. We consider the diagonal terms of the covariance matrix only, resulting in it being a diagonal matrix which is positive definite and invertible. Moreover, we use all the 11 camera parameters [6] as the belief at a node.
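
Equation (1) is the standard information-form fusion of Gaussian beliefs, so it transcribes directly into NumPy. A sketch (ours), assuming each belief is the 11-parameter vector with its diagonal covariance:

```python
import numpy as np

def bp_update(mu_i, Sigma_i, neighbor_beliefs):
    """One update of Equation (1) at camera C_i in graph G_k.
    mu_i: length-11 vector of camera parameters estimated at C_i;
    Sigma_i: its (diagonal, hence invertible) covariance matrix;
    neighbor_beliefs: list of (mu_j, Sigma_j) messages from j in N(i, k)."""
    info = np.linalg.inv(Sigma_i)          # information (inverse covariance)
    weighted_mean = info @ mu_i
    for mu_j, Sigma_j in neighbor_beliefs:
        info_j = np.linalg.inv(Sigma_j)
        info += info_j
        weighted_mean += info_j @ mu_j
    Sigma_tilde = np.linalg.inv(info)
    mu_tilde = Sigma_tilde @ weighted_mean
    return mu_tilde, Sigma_tilde
```

Iterating this update at every node, with each node sending its current (μ, Σ) to its neighbors, continues until the beliefs stop changing.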
7.1. Multi-layered Belief Propagation

Since the graphs are dynamic and the same camera C_i can be a part of two graphs, say G_{k-1} and G_k, in different pan-tilt orientations at different points in time, we perform belief propagation between graphs at each node C_i which is common to both G_{k-1} and G_k. Here, the belief at C_i in G_{k-1} is the estimate of the camera matrix of C_i (after belief propagation within G_{k-1}) at its home position, obtained by using the homography between C_i's view in G_{k-1} and the image taken at the home position of C_i. Similarly, the belief at C_i in G_k is the estimate of the camera matrix of C_i (after belief propagation within G_k) at the home position, obtained using the homography between the view of C_i in G_k and the home view of C_i.

As is well known [6], two views of a camera in different pan-tilt positions are related by a 3 × 3 image-to-image homography. Therefore, we automatically compute the homography between the pan/tilt view and the home view of a camera by automatically finding corresponding points between the two images, using SIFT matches further refined by fitting fundamental matrices to the points obtained, as described in Section 4. This homography is then used to get the camera matrix of the home position from the camera matrix of the pan-tilt position. Let P_θφ be the camera matrix at pan θ and tilt φ position, P_home be the camera matrix at the home position, and H be the homography between the home view and the pan-tilt view. Then, if x = P_home X, x′ = P_θφ X and x = Hx′, we have P_home = H P_θφ. Similarly, we can get to the pan-tilt position as P_θφ = H⁻¹ P_home.
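
Once H (or a chain of homographies, each estimable from Section 4 style correspondences, e.g. with cv2.findHomography) is available, moving a camera matrix to or from the home position is a matrix product. A sketch, assuming each homography in the chain maps its view one step closer to the home view:

```python
import numpy as np
from functools import reduce

def to_home(P_pan_tilt, homography_chain):
    """P_home = H * P_thetaphi, where H is a single homography or the
    composition of a chain (Figure 2). Each element of `homography_chain`
    is a 3x3 image-to-image homography, ordered from the pan-tilt view
    towards the home view."""
    H = reduce(lambda acc, Hi: Hi @ acc, homography_chain, np.eye(3))
    return H @ P_pan_tilt                  # (3x3) @ (3x4) camera matrix

def to_pan_tilt(P_home, homography_chain):
    """Inverse direction: P_thetaphi = H^{-1} * P_home."""
    H = reduce(lambda acc, Hi: Hi @ acc, homography_chain, np.eye(3))
    return np.linalg.inv(H) @ P_home
```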
In case the pan-tilt view of the camera does not have any overlap with the home position's view, a sequence of homographies can be used, again calculated automatically, as shown in Figure 2. Let μ̃_{i,k} be the estimate of the camera parameters of C_i after belief propagation within graph G_k, where C_i is in pan θ_k and tilt φ_k position. A homography or a sequence of homographies is used to calculate the camera parameters for the home position of C_i, denoted by $P_{i_{home},k}$. These parameters, taken as a vector, are the belief at C_i in G_k, denoted by $\mu_{i_{home},k}$. Let $\tilde{\mu}^{k-1}_{i_{home}}$ and $\tilde{\Sigma}^{k-1}_{i_{home}}$ be the estimates of the camera parameters and the covariance matrix after the (k−1)-th iteration, at the home position of C_i, of multi-layered belief propagation between k−1 graphs containing C_i in different pan-tilt positions. The home position is calculated in each graph using the image-to-image homography before applying the update equations for multi-layered belief propagation. The belief is updated using Equations 2.

$$\tilde{\Sigma}^{k}_{i_{home}} \leftarrow \Big[\big(\tilde{\Sigma}^{k-1}_{i_{home}}\big)^{-1} + \Sigma^{-1}_{i_{home},k}\Big]^{-1}$$
$$\tilde{\mu}^{k}_{i_{home}} \leftarrow \tilde{\Sigma}^{k}_{i_{home}}\Big[\big(\tilde{\Sigma}^{k-1}_{i_{home}}\big)^{-1}\,\tilde{\mu}^{k-1}_{i_{home}} + \Sigma^{-1}_{i_{home},k}\,\mu_{i_{home},k}\Big] \qquad (2)$$

where $\tilde{\mu}^{k}_{i_{home}}$ denotes the estimate of the camera parameters and $\tilde{\Sigma}^{k}_{i_{home}}$ is the estimate of the covariance matrix of the home position of C_i after the k-th iteration.
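
Equation (2) has the same information-form structure, now fusing the running home-position estimate with the contribution of the latest graph. A NumPy sketch under the same assumptions as the Equation (1) sketch:

```python
import numpy as np

def multilayer_update(mu_prev, Sigma_prev, mu_home_k, Sigma_home_k):
    """One iteration of Equation (2) at the home position of C_i:
    (mu_prev, Sigma_prev) is the fused estimate after graphs 1..k-1;
    (mu_home_k, Sigma_home_k) is the belief from graph G_k, already
    mapped to the home position through the homography (chain)."""
    info_prev = np.linalg.inv(Sigma_prev)
    info_k = np.linalg.inv(Sigma_home_k)
    Sigma_new = np.linalg.inv(info_prev + info_k)
    mu_new = Sigma_new @ (info_prev @ mu_prev + info_k @ mu_home_k)
    return mu_new, Sigma_new
```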
8. Forming new graphs

The multi-layered belief propagation mechanism can be utilized only if the graphs change across time. We develop a protocol for automatically controlling the pan-tilt of the cameras so that the network configuration changes after a fixed time period. We define a set of landmarks L = {L_1, L_2, . . . , L_m} in the scene with respect to the global WCS. Initially, the graphs are found using the technique discussed in Section 5. Once the estimates of the camera parameters have been computed in each of these graphs by multi-camera self-calibration and belief propagation within each graph, these cameras are aligned to the global WCS. The camera parameter estimates after alignment are then used for controlling the cameras to form new graphs in the network. The protocol is:

1. For each camera, compute the pan-tilt rotations required to view all the landmarks. (It is possible that a camera may not be able to view all the landmarks; therefore, only those that are visible are considered.)

2. For each camera, rotate by the smallest pan-tilt angles such that it views a landmark other than the one it is currently viewing.

3. Send a message to all the other cameras about the new landmark that it is viewing. If it is known a priori that two cameras will never have overlapping views, they need not inform each other about the new landmark they are viewing, thereby reducing unnecessary communication.

4. Each camera will have information from all other cameras about the landmark they are viewing. It takes into consideration all the cameras, say set S, that are viewing the same landmark as itself.

5. For each camera, check whether the cameras in its set S form a graph by using the procedure given in Section 5.

This also makes our system scalable, as the correspondences have to be calculated among only those cameras which view the same landmark and, in step 3, the messages have to be passed only between those cameras which can have overlapping views in some pan-tilt configuration. In general, these will be much smaller in number compared to the size of the camera network. The above algorithm ensures that the graphs in the camera network change over time. This is essential because if the graphs remained static, since they are mutually exclusive, no information would be shared between the graphs and it would not be possible to calibrate the complete network. It is possible that there will be cameras which do not have overlapping views with any other camera, or graphs that have fewer than 3 cameras. In the current time period these cameras are not considered for calibration and therefore remain idle. In the next time period, they shall repeat the above protocol and become part of graphs with ≥ 3 nodes and hence will be used for calibration and multi-layered belief propagation.
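
In outline, the protocol could be driven by a routine like the sketch below; every callable (PTZ control, messaging, visibility tests) is a placeholder for infrastructure the paper assumes rather than specifies:

```python
def form_new_graphs(cameras, viewing_now, visible_landmarks, rotation_cost,
                    rotate_to, broadcast, find_graphs):
    """Sketch of the Section 8 re-pointing protocol (all names illustrative).
    viewing_now[c]: landmark camera c currently views;
    visible_landmarks(c): landmarks reachable by some pan-tilt of c (step 1);
    rotation_cost(c, L): size of the pan-tilt rotation needed to view L;
    find_graphs: the Section 5 procedure, applied per landmark (step 5)."""
    new_view = {}
    for c in cameras:
        options = [L for L in visible_landmarks(c) if L != viewing_now.get(c)]
        if not options:
            continue                          # camera idles this time period
        target = min(options, key=lambda L: rotation_cost(c, L))  # step 2
        rotate_to(c, target)
        broadcast(c, target)                  # step 3: announce new landmark
        new_view[c] = target
    # step 4: group cameras by the landmark they now view
    by_landmark = {}
    for c, L in new_view.items():
        by_landmark.setdefault(L, []).append(c)
    # step 5: within each group, test which cameras actually form graphs
    graphs = []
    for group in by_landmark.values():
        graphs.extend(find_graphs(group))
    # graphs with fewer than 3 nodes sit out until the next time period
    return [g for g in graphs if len(g) >= 3]
```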
9. Aligning cameras to a global world coordinate system

We want the position and orientation of each camera's home position with respect to a global WCS. Moreover, belief propagation can be carried out only if all the cameras are aligned with respect to a common coordinate system in the world. For the cameras to align themselves to a global
Citations
Journal ArticleDOI
TL;DR: In this study, the authors discuss various issues and problems in video analytics, proposed solutions and present some of the important current applications of video analytics.
Abstract: Video, rich in visual real-time content, is however, difficult to interpret and analyse. Video collections necessarily have large data volume. Video analytics strives to automatically discover patterns and correlations present in the large volume of video data, which can help the end-user to take informed and intelligent decisions as well as predict the future based on the patterns discovered across space and time. In this study, the authors discuss various issues and problems in video analytics, proposed solutions and present some of the important current applications of video analytics.

12 citations

Proceedings ArticleDOI
01 Mar 2016
TL;DR: This work proposes a camera network configuration that includes a stereo pair with known baseline separation, and analytically demonstrates Euclidean auto calibration of such network under mild conditions, and compares favorably with the well known Zhang and Pollefeys methods in terms of shape recovery.
Abstract: Metric auto calibration of a camera network from multiple views has been reported by several authors. Resulting 3D reconstruction recovers shape faithfully, but not scale. However, preservation of scale becomes critical in applications, such as multi-party telepresence, where multiple 3D scenes need to be fused into a single coordinate system. In this context, we propose a camera network configuration that includes a stereo pair with known baseline separation, and analytically demonstrate Euclidean auto calibration of such network under mild conditions. Further, we experimentally validate our theory using a four-camera network. Significantly, our method not only recovers scale, but also compares favorably with the well known Zhang and Pollefeys methods in terms of shape recovery.

6 citations




Book ChapterDOI
30 Jun 2015
TL;DR: This paper proposes a novel framework for real-time, distributed, multi-object tracking in a PTZ camera network with this capability and provides a tool to mark an object of interest such that the object is tracked at a certain size as it moves in the view of various cameras across space and time.
Abstract: A visual surveillance system should have the ability to view an object of interest at a certain size so that important information related to that object can be collected and analyzed as the object moves in the area observed by multiple cameras. In this paper, we propose a novel framework for real-time, distributed, multi-object tracking in a PTZ camera network with this capability. In our framework, the user is provided a tool to mark an object of interest such that the object is tracked at a certain size as it moves in the view of various cameras across space and time. The pan, tilt and zoom capabilities of the PTZ cameras are leveraged upon to ensure that the object of interest remains within the predefined size range as it is seamlessly tracked in the PTZ camera network. In our distributed system, each camera tracks the objects in its view using particle filter tracking and multi-layered belief propagation is used for seamlessly tracking objects across cameras.

4 citations

Proceedings ArticleDOI
12 Dec 2010
TL;DR: A novel probabilistic Latent Semantic Analysis based algorithm for pair-wise interaction recognition is proposed and presented as an application of the distributed composite event recognition framework, where the events are interactions between pairs of objects.
Abstract: In this paper, we propose a real-time distributed framework for composite event recognition in a calibrated pan-tilt camera network. A composite event comprises of events that occur simultaneously or sequentially at different locations across time. Distributed composite event recognition requires distributed multi-camera multi-object tracking and distributed multi-camera event recognition. We apply belief propagation to reach a consensus on the global identities of the objects in the pan-tilt camera network and to arrive at a consensus on the event recognized by multiple cameras simultaneously observing it. We propose a hidden Markov model based approach for composite event recognition. We also propose a novel probabilistic Latent Semantic Analysis based algorithm for pair-wise interaction recognition and present an application of our distributed composite event recognition framework, where the events are interactions between pairs of objects.

2 citations

References
Journal ArticleDOI
TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.
Abstract: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene. The features are invariant to image scale and rotation, and are shown to provide robust matching across a substantial range of affine distortion, change in 3D viewpoint, addition of noise, and change in illumination. The features are highly distinctive, in the sense that a single feature can be correctly matched with high probability against a large database of features from many images. This paper also describes an approach to using these features for object recognition. The recognition proceeds by matching individual features to a database of features from known objects using a fast nearest-neighbor algorithm, followed by a Hough transform to identify clusters belonging to a single object, and finally performing verification through least-squares solution for consistent pose parameters. This approach to recognition can robustly identify objects among clutter and occlusion while achieving near real-time performance.

46,906 citations

Book
01 Jan 2000
TL;DR: In this article, the authors provide comprehensive background material and explain how to apply the methods and implement the algorithms directly in a unified framework, including geometric principles and how to represent objects algebraically so they can be computed and applied.
Abstract: From the Publisher: A basic problem in computer vision is to understand the structure of a real world scene given several images of it. Recent major developments in the theory and practice of scene reconstruction are described in detail in a unified framework. The book covers the geometric principles and how to represent objects algebraically so they can be computed and applied. The authors provide comprehensive background material and explain how to apply the methods and implement the algorithms directly.

15,558 citations

01 Jan 2011
TL;DR: The Scale-Invariant Feature Transform (or SIFT) algorithm is a highly robust method to extract and consequently match distinctive invariant features from images that can then be used to reliably match objects in differing images.
Abstract: The Scale-Invariant Feature Transform (or SIFT) algorithm is a highly robust method to extract and consequently match distinctive invariant features from images. These features can then be used to reliably match objects in differing images. The algorithm was first proposed by Lowe [12] and further developed to increase performance, resulting in the classic paper [13] that served as the foundation for SIFT, which has played an important role in robotic and machine vision in the past decade.

14,708 citations


Proceedings Article
30 Jul 1999
TL;DR: This paper compares the marginals computed using loopy propagation to the exact ones in four Bayesian network architectures, including two real-world networks: ALARM and QMR, and finds that the loopy beliefs often converge and when they do, they give a good approximation to the correct marginals.
Abstract: Recently, researchers have demonstrated that "loopy belief propagation" -- the use of Pearl's polytree algorithm in a Bayesian network with loops -- can perform well in the context of error-correcting codes. The most dramatic instance of this is the near Shannon-limit performance of "Turbo Codes" -- codes whose decoding algorithm is equivalent to loopy belief propagation in a chain-structured Bayesian network. In this paper we ask: is there something special about the error-correcting code context, or does loopy propagation work as an approximate inference scheme in a more general setting? We compare the marginals computed using loopy propagation to the exact ones in four Bayesian network architectures, including two real-world networks: ALARM and QMR. We find that the loopy beliefs often converge and when they do, they give a good approximation to the correct marginals. However, on the QMR network, the loopy beliefs oscillated and had no obvious relationship to the correct posteriors. We present some initial investigations into the cause of these oscillations, and show that some simple methods of preventing them lead to the wrong results.

1,532 citations


"Distributed calibration of pan-tilt..." refers methods in this paper

  • ...In general, belief propagation algorithm is used for solving inference problems based on local message passing [11]....

    [...]

Frequently Asked Questions (8)
Q1. What contributions have the authors mentioned in the paper "Distributed calibration of pan-tilt camera network using multi-layered belief propagation"?

In this paper, the authors present a technique for distributed self-calibration of a pan-tilt camera network using multi-layered belief propagation.

The authors show that by using multi-layered belief propagation it is sufficient to have correspondences between only three cameras at a time for consistent calibration of a larger N > 3 static camera network. 

To perform multi-layered belief propagation between two graphs containing the same camera in different pan-tilt positions, the authors need to bring the cameras to their home (zero pan and zero tilt) position in both the graphs. 

The authors have shown that by using multi-layered belief propagation it is possible to get accurate and globally consistent estimates of the camera parameters for each pan-tilt camera in the network with respect to a global world coordinate system. 

Corresponding points between the views of the cameras in each graph are found automatically and multi-camera self-calibration is performed at each node of the graph.

A homography or a sequence of homographies is used to calculate the camera parameters for the home position of C_i, denoted by $P_{i_{home},k}$.

The authors show that the camera matrix for the home position of the camera can be computed by automatically finding pairwise correspondences to compute the homography or a sequence of homographies between the camera’s pan-tilt view and the home view. 

Very recently, authors in [4], have proposed a distributed algorithm for calibration of a camera sensor network, where they assume that one of the cameras is calibrated and use epipolar geometry based algorithms at each node to obtain its calibration parameters.