phase are very limited (duration of 5 min.). Furthermore, the
handling of severe occlusions is out of the scope of his paper.
The novelty of our approach is threefold. (1) We propose
an approach for background subtraction, derived from
improved Gaussian mixture models (GMMs), in which
the update of the background is achieved recursively. This
approach is combined with a motion detection procedure,
which can adapt robustly to illumination changes, maintain-
ing a high sensitivity to new incoming foreground objects.
(2) We also propose an algorithm able to deal with strong,
moving cast shadows. One of the evaluation datasets is
specifically shadow-oriented. (3) Finally, a new algorithm
able to tackle the problems raised by severe occlusions
among cars, and between cars and trucks is proposed.
We include experimental results with varying weather
conditions, on sunny days with moving directional shadows
and heavy traffic. We obtain vehicle counting and classifica-
tion results much better than those of ILD systems, which are
currently the most widely used systems for these types of
traffic measurements, while keeping the main advantages
of vision-based systems, i.e., not requiring the cumbersome
operation or installation of equipment at the roadside or the
need for additional technology such as laser scanners, tags,
or GPS.
2 Related Work
Robust background subtraction, shadow management, and
occlusion handling are the three main scientific contributions
of our work.
2.1 Background Subtraction
The main aim of this section is to provide a brief summary of
the state-of-the-art moving object detection methods based
on a reference image. The existing methods of background
subtraction can be divided into two categories:7 nonpara-
metric and parametric methods. Parametric approaches
use a series of parameters that determines the characteristics
of the statistical functions of the model, whereas nonpara-
metric approaches automate the selection of the model
parameters as a function of the observed data during training.
2.1.1 Nonparametric methods
The classification procedure is generally divided into two
parts: a training period and a detection period.
The nonparametric methods are efficient when the training
period is sufficiently long. During this period, the setting
up of a background model consists in saving the possible
states of a pixel (intensity, color, and so on).
Median value model. This adaptive model was developed
by Greenhill et al. in Ref. 8 for moving object extraction
during degraded illumination changes. Referring to the
different states of each pixel during a training period, a
background model is thus elaborated. The background is
continuously updated for every new frame so that a vector
of the median values (intensities, color, and so on) is built
from the last N/2 frames, where N is the number of frames
used during the training period. The classification back-
ground/object is simply obtained by thresholding the dis-
tance between the value of the pixel to classify and its
counterpart in the background model. In order to take into
account the illumination changes, the threshold considers
the width of the interval containing the pixel values.
This method based on the median operator is more robust
than that based on running average.
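As a minimal sketch of this class of model (the function names, buffer policy, and fixed threshold are illustrative choices, not Greenhill et al.'s implementation), the median background and the thresholded classification can be written as:

```python
import numpy as np

def update_median_background(frame_buffer, new_frame, N):
    """Keep a sliding buffer of the last N/2 frames and return the
    per-pixel median background built from it."""
    frame_buffer.append(new_frame)
    if len(frame_buffer) > N // 2:
        frame_buffer.pop(0)  # discard the oldest frame
    # per-pixel median over the buffered frames
    return np.median(np.stack(frame_buffer), axis=0)

def classify_foreground(frame, background, threshold):
    """Label a pixel as foreground when it deviates from the median
    background by more than the threshold."""
    return np.abs(frame.astype(float) - background) > threshold
```

In the paper's formulation the threshold is not fixed but adapted to the width of the interval of observed pixel values, which is what makes the model tolerant to illumination changes.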
Codebook. The codebook method is the most famous non-
parametric method. In Ref. 9, Kim et al. suggest modeling
the background based on a sequence of observations of each
pixel during a period of several minutes. Then, similar occur-
rences of a given pixel are represented according to a vector
called codeword. Two codewords are considered as different
if the distance, in the vectorial space, exceeds a given thresh-
old. A codebook, which is a set of codewords, is built for
every pixel. The classification background/object is based
on a simple difference between the current value of each
pixel and each of the corresponding codewords.
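The per-pixel matching logic can be sketched as follows; here the codewords are simple running means of similar observations, whereas Kim et al.'s codewords additionally carry brightness bounds and access statistics:

```python
import numpy as np

def train_codebook(observations, dist_threshold):
    """Build a codebook for one pixel from a training sequence.
    Each codeword is stored as [mean_vector, count]; an observation
    within dist_threshold of an existing codeword is merged into it,
    otherwise it starts a new codeword."""
    codebook = []
    for obs in observations:
        obs = np.asarray(obs, dtype=float)
        for cw in codebook:
            if np.linalg.norm(obs - cw[0]) <= dist_threshold:
                # merge into the matching codeword (running mean)
                cw[0] = (cw[0] * cw[1] + obs) / (cw[1] + 1)
                cw[1] += 1
                break
        else:
            codebook.append([obs, 1])
    return codebook

def is_background(pixel_value, codebook, dist_threshold):
    """Classify by differencing the current value against every
    codeword of the pixel's codebook."""
    v = np.asarray(pixel_value, dtype=float)
    return any(np.linalg.norm(v - cw[0]) <= dist_threshold
               for cw in codebook)
```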
2.1.2 Parametric methods
Most of the moving object extraction methods are based on
the temporal evolution of each pixel of the image. A
sequence of frames is used to build a background model
for every pixel. Intensity, color, or some texture characteris-
tics could be used for the pixel. The detection process con-
sists in independently classifying every pixel in the object/
background classes, according to the current observations.
Gaussian model. In Ref. 10, Wren et al. suggest adapting
the threshold on each pixel by modeling the intensity
distribution for every pixel with a Gaussian distribution.
This model could adapt to slow changes in the scene, like
progressive illumination changes. The background is
updated recursively thanks to an adaptive filter. Different
extensions of this model were developed by changing the
characteristics at pixel level. Gordon et al.11 represent each
pixel with four components: the three color components
and the depth.
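A minimal per-pixel version of this model, with an illustrative learning rate alpha and a k-sigma decision rule (the specific values are not Wren et al.'s), might look like:

```python
import numpy as np

def gaussian_bg_step(mean, var, frame, alpha=0.02, k=2.5):
    """One step of a per-pixel single-Gaussian background model.
    A pixel is foreground when it lies more than k standard
    deviations from the mean; the mean and variance are then
    updated recursively with learning rate alpha."""
    frame = frame.astype(float)
    foreground = np.abs(frame - mean) > k * np.sqrt(var)
    # recursive (running-average) update of the Gaussian parameters
    mean = (1 - alpha) * mean + alpha * frame
    var = (1 - alpha) * var + alpha * (frame - mean) ** 2
    return foreground, mean, var
```

The recursive update is what lets the model follow slow changes such as progressive illumination drift without storing past frames.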
Gaussian mixture model. An improvement of the pre-
vious model consists in modeling the temporal evolution
with a GMM. Stauffer and Grimson12,13 model the color of
each pixel with a Gaussian mixture. The number of
Gaussians must be adjusted according to the complexity
of the scene. In order to simplify calculations, the covariance
matrix is considered as diagonal because the three color
channels are taken into account independently. The GMM
model is updated at each iteration using the K-means algo-
rithm. Harville et al.14 suggest using GMM in a space com-
bining the depth and YUV space. They improve the method
by controlling the training rate according to the activity in the
scene. However, its response is very sensitive to sudden var-
iations of the background like global illumination changes. A
low training rate will produce numerous false detections dur-
ing an illumination change period, whereas a high training
rate will include moving objects in the background model.
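The per-pixel update can be sketched as follows for a gray-level pixel; the thresholds and the background-membership test here are simplified placeholders for the weight/sigma ranking used by Stauffer and Grimson:

```python
import numpy as np

def gmm_update(weights, means, variances, x, alpha=0.01, k=2.5):
    """One Stauffer-Grimson-style update step for a single pixel
    modeled by K Gaussians (scalar here; diagonal covariance in the
    color case). Matches x to a component within k sigma, updates
    that component, or replaces the least-probable one."""
    d = np.abs(x - means) / np.sqrt(variances)
    matched = d < k
    if matched.any():
        i = int(np.argmin(np.where(matched, d, np.inf)))
        # all weights decay; the matched component's weight grows
        weights = (1 - alpha) * weights
        weights[i] += alpha
        # matched component's mean and variance move toward x
        means[i] = (1 - alpha) * means[i] + alpha * x
        variances[i] = ((1 - alpha) * variances[i]
                        + alpha * (x - means[i]) ** 2)
        is_background = weights[i] > 0.2  # illustrative threshold
    else:
        # replace the least-probable component with a new Gaussian at x
        i = int(np.argmin(weights))
        means[i], variances[i], weights[i] = x, 30.0 ** 2, alpha
        weights /= weights.sum()
        is_background = False
    return weights, means, variances, is_background
```

The learning rate alpha plays the role of the training rate discussed above: raising it adapts faster to illumination changes but absorbs slow-moving objects into the background sooner.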
Markov model. In order to consider the temporal evolu-
tion of a pixel, the order of arrival of the gray levels on
this pixel is useful information. A solution consists in mod-
eling the gray level evolution for each pixel by a Markov
chain. Rittscher et al.15 use a Markov chain with three states:
object, background, and shadow. All the parameters of the
chain, initial, transition, and observation probabilities, are