Journal ArticleDOI

Detecting moving objects, ghosts, and shadows in video streams

TL;DR: A general-purpose method is proposed that combines statistical assumptions with the object-level knowledge of moving objects, apparent objects (ghosts), and shadows acquired in the processing of the previous frames to improve object segmentation and background update.
Abstract: Background subtraction methods are widely exploited for moving object detection in videos in many applications, such as traffic monitoring, human motion capture, and video surveillance. How to correctly and efficiently model and update the background model and how to deal with shadows are two of the most distinguishing and challenging aspects of such approaches. The article proposes a general-purpose method that combines statistical assumptions with the object-level knowledge of moving objects, apparent objects (ghosts), and shadows acquired in the processing of the previous frames. Pixels belonging to moving objects, ghosts, and shadows are processed differently in order to supply an object-based selective update. The proposed approach exploits color information for both background subtraction and shadow detection to improve object segmentation and background update. The approach proves fast, flexible, and precise in terms of both pixel accuracy and reactivity to background changes.

Summary (2 min read)

Introduction

  • Background subtraction methods are widely exploited for moving object detection in videos in many applications, such as traffic monitoring, human motion capture and video surveillance.
  • In particular, while the fast execution and flexibility in different scenarios should be considered basic requirements to be met, precision is another important goal.
  • The detection accuracy can be measured in terms of correctly and incorrectly classified pixels during normal conditions of the object’s motion (i.e. the “stationary background” case).
  • Most of the approaches use a statistical combination of frames to compute the background model (see Table I).
  • The main contribution of this proposal is the integration of knowledge of detected objects, shadows and ghosts in the segmentation process to enhance both object segmentation and background update.

II. DETECTING MOVING OBJECTS, GHOSTS AND SHADOWS

  • The first aim of their proposal is to detect real moving objects with high accuracy, limiting false negatives (object’s pixels that are not detected) as much as possible.
  • I^t(p) is the value of point p in the color space.
  • In their approach, after background subtraction, a set of points called foreground points is detected and then merged into labeled blobs according to their connectivity.
  • Therefore, in order to discriminate MVO shadows from ghost shadows, the authors use information about connectivity between objects and shadows.
  • In conclusion, by including Eq. 5 in Eq. 6, the background model remains unchanged for those points that belong to detected MVOs or their shadow.
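
The merging of foreground points into labeled blobs can be sketched as a connected-component labeling pass. This is only an illustration: the choice of 8-connectivity, the boolean-mask representation, and the name `label_blobs` are assumptions, not details given in the summary above.

```python
from collections import deque

def label_blobs(mask):
    """Group True pixels of a 2D boolean mask into blobs of 8-connected points."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited foreground point.
            blob, queue = set(), deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                blob.add((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            blobs.append(blob)
    return blobs

mask = [[True,  True,  False, False],
        [False, False, False, True],
        [False, False, False, True]]
blobs = label_blobs(mask)  # two blobs, each a set of (row, col) points
```

Each blob is returned as a set of coordinates, which makes later object-level tests (size, adjacency to other blobs) straightforward.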

III. SHADOW DETECTION

  • By shadow detection the authors mean the process of classification of foreground pixels as “shadow points” based on their appearance with respect to the reference frame, the background.
  • In fact, points belonging to both moving objects and shadows are detected by background subtraction by means of Eq. 7. [...] The lower bound α is used to define a maximum value for the darkening effect of shadows on the background, and is approximately proportional to the light source intensity.
  • A detailed comparison of this method with others proposed in the literature is reported in [17].
  • To demonstrate this, Fig. 3(c) shows a later frame of the same sequence (frame #230), where the person moves from the area.
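
A minimal sketch of a shadow test in this spirit, assuming foreground and background pixels are available as HSV triples (hue in degrees, saturation and value in [0, 1]). Only the lower bound α on the darkening ratio comes from the summary above. The upper bound `beta`, the thresholds `tau_s` and `tau_h`, the default values, and the exact form of the test are assumptions made for illustration.

```python
def hue_distance(h1, h2):
    """Angular distance between two hues given in degrees [0, 360)."""
    d = abs(h1 - h2)
    return min(d, 360 - d)

def is_shadow(pixel_hsv, bg_hsv, alpha=0.4, beta=0.9, tau_s=0.1, tau_h=60):
    """Heuristic shadow test: the pixel is darker than the background by a
    bounded factor and has similar saturation and hue.
    All thresholds are illustrative values, not taken from the paper."""
    h, s, v = pixel_hsv
    bh, bs, bv = bg_hsv
    if bv == 0:
        return False  # darkening ratio is undefined on a black background pixel
    ratio = v / bv
    return (alpha <= ratio <= beta
            and abs(s - bs) <= tau_s
            and hue_distance(h, bh) <= tau_h)

bg = (30.0, 0.50, 0.80)
shadowed = (32.0, 0.45, 0.50)   # darker, similar colour
dark_car = (210.0, 0.90, 0.10)  # much darker, different colour
results = (is_shadow(shadowed, bg), is_shadow(dark_car, bg))
```

With these illustrative thresholds, the first pixel passes the test and the second fails it because its darkening ratio falls below `alpha`.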

IV. RESULTS EVALUATIONS

  • In the following, the authors describe some relevant cases.
  • Moreover, results comparable with those of Sakbot cannot be achieved by adopting selectivity at pixel level only, which is the usual approach of excluding from the background update those pixels detected as in motion [1][8][2].
  • In fact, if the value of the detected foreground points is never used to update the background, the background will never be modified; consequently, the ghost will be detected forever (Fig. 4, first column, lower image).
  • Instead, the FP curves account for false positives, differing much depending on the different background reactivity.
  • This approach consequently allows fast detection of moving objects, which for many applications is performed in real time even on common PCs; this in turn allows successive higher-level tasks such as tracking and classification to be easily performed in real time.


© 2003 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in
any current or future media, including reprinting/republishing this material for advertising or promotional purposes,
creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of
this work in other works.

DETECTING MOVING OBJECTS, GHOSTS AND SHADOWS IN VIDEO STREAMS 1
Detecting Moving Objects, Ghosts and Shadows in Video Streams
Rita Cucchiara¹, Costantino Grana¹, Massimo Piccardi², Andrea Prati¹
Abstract
Background subtraction methods are widely exploited for moving object detection in videos in many applications, such as
traffic monitoring, human motion capture and video surveillance. How to correctly and efficiently model and update the background
model and how to deal with shadows are two of the most distinguishing and challenging aspects of such approaches. This work
proposes a general-purpose method which combines statistical assumptions with the object-level knowledge of moving objects,
apparent objects (ghosts) and shadows acquired in the processing of the previous frames. Pixels belonging to moving objects,
ghosts and shadows are processed differently in order to supply an object-based selective update. The proposed approach exploits
color information for both background subtraction and shadow detection to improve object segmentation and background update.
The approach proves fast, flexible and precise in terms of both pixel accuracy and reactivity to background changes.
Keywords
Background modeling, color segmentation, reactivity to changes, shadow detection, video surveillance, object-level knowledge
* Corresponding author.
¹ Dipartimento di Ingegneria dell’Informazione, Università di Modena e Reggio Emilia, Via Vignolese, 905 - 41100 Modena - Italy - phone:
+39-059-2056136 - fax: +39-059-2056126 - e-mail: {rita.cucchiara/andrea.prati}@unimo.it, grana@dsi.unimo.it
² Department of Computer Systems, Faculty of IT, University of Technology, Sydney - Broadway NSW 2007 - Australia - phone:
+61-2-9514-7942 - fax: +61-2-9514-1807 - e-mail: massimo@it.uts.edu.au
I. INTRODUCTION
DETECTION of moving objects in video streams is the first relevant step of information extraction in
many computer vision applications, including video surveillance, people tracking, traffic monitoring
and semantic annotation of videos. In these applications, robust tracking of objects in the scene calls for a
reliable and effective moving object detection that should be characterized by some important features: high
precision, with the two meanings of accuracy in shape detection and reactivity to changes in time; flexibility
in different scenarios (indoor, outdoor) or different light conditions; and efficiency, in order for detection to
be provided in real-time. In particular, while the fast execution and flexibility in different scenarios should
be considered basic requirements to be met, precision is another important goal. In fact, a precise moving
object detection makes tracking more reliable (the same object can be identified more reliably from frame to
frame if its shape and position are accurately detected) and faster (multiple hypotheses on the object’s identity
during time can be pruned more rapidly). In addition, if object classification is required by the application,
precise detection substantially supports correct classification.
In this work, we assume that the models of the target objects and their motion are unknown, so as to achieve
maximum application independence. In the absence of any a priori knowledge about target and environment,
the most widely adopted approach for moving object detection with fixed camera is based on background
subtraction [1][2][3][4][5][6][7][8][9]. An estimate of the background (often called a background model) is
computed and evolved frame by frame: moving objects in the scene are detected by the difference between
the current frame and the current background model. It is well known that background subtraction carries two
problems for the precision of moving object detection. The first problem is that the model should reflect the
real background as accurately as possible, to allow the system accurate shape detection of moving objects.
The detection accuracy can be measured in terms of correctly and incorrectly classified pixels during normal
conditions of the object’s motion (i.e. the “stationary background” case). The second problem is that the
background model should immediately reflect sudden scene changes such as the start or stop of objects, so as
to allow detection of only the actual moving objects with high reactivity (the “transient background” case).
If the background model is neither accurate nor reactive, background subtraction causes the detection of
false objects, often referred to as “ghosts” [1][3]. In addition, moving object segmentation with background
suppression is affected by the problem of shadows [4][10]. Indeed, we would like the moving object detection
to not classify shadows as belonging to foreground objects, since the appearance and geometrical properties
of the object can be distorted, which in turn affects many subsequent tasks such as object classification and
the assessment of moving object position (normally considered to be the shape centroid). Moreover, the
probability of object undersegmentation (where more than one object is detected as a single object) increases
due to connectivity via shadows between different objects.
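
As a minimal sketch of the background subtraction step described above, the following compares each pixel of the current frame with the background model and flags it as foreground when the difference is large. The per-pixel distance (largest absolute channel difference), the threshold value, and the function names are assumptions for illustration and are not fixed by the text so far.

```python
def pixel_distance(a, b):
    """Largest absolute per-channel difference between two RGB pixels (assumed metric)."""
    return max(abs(x - y) for x, y in zip(a, b))

def foreground_mask(frame, background, threshold):
    """Return a 2D list of booleans: True where the frame differs from the background model."""
    return [
        [pixel_distance(p, q) > threshold for p, q in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]

background = [[(10, 10, 10), (10, 10, 10)],
              [(10, 10, 10), (10, 10, 10)]]
frame = [[(12, 9, 10), (200, 180, 90)],
         [(10, 10, 10), (11, 10, 60)]]
mask = foreground_mask(frame, background, threshold=15)
```

If the background model is stale (for example, it still contains an object that has since left), the same test flags the uncovered area as foreground, which is how a ghost arises.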

Feature                              Systems
Statistics                           Minimum and maximum values [1]
                                     Median [11][12], *
                                     Single Gaussian [5][4][13]
                                     Multiple Gaussians [14][10][3]
                                     Eigenbackground approximation [15][6]
                                     Minimization of Gaussian differences [7]
Adaptivity                           [1][6][5][8][16][2], *
Selectivity                          [10][2][8][1], *
Shadow                               [4][10], *
Ghost                                [1][3], *
High-frequency illumination changes  Temporal filtering [14][15][6]
                                     Size filtering *
Sudden global illumination changes   [1], *
TABLE I
COMPARED BACKGROUND SUBTRACTION APPROACHES. OUR APPROACH IS REFERRED TO WITH *.
Many works in the literature propose solutions for efficient and reliable background
subtraction. Table I classifies the most relevant papers according to the features used. Most of the
approaches use a statistical combination of frames to compute the background model (see Table I). Some of
these approaches combine the current frame and previous models with recursive filtering (adaptivity
in Table I) to update the background model. Moreover, many authors propose to use pixel selectivity by
excluding from the background update process those pixels detected as in motion. Finally, problems carried
by shadows have been addressed [4][10][17]. In this paper we propose a novel, simple method that exploits
all these features, combining them so as to efficiently provide detection of moving objects, ghosts and shadows.
The main contribution of this proposal is the integration of knowledge of detected objects, shadows
and ghosts in the segmentation process to enhance both object segmentation and background update. The
resulting method proves to be accurate and reactive, and at the same time fast and flexible in the applications.
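
The two generic ingredients mentioned above, recursive filtering (adaptivity) and pixel selectivity, can be sketched together as follows. This is a hedged illustration on grey-level images: the running-average form, the weight `alpha`, and the function name are assumptions, and the sketch is not the update rule of any specific system in Table I.

```python
def update_background(background, frame, moving, alpha=0.1):
    """Blend the frame into the background with weight alpha (adaptivity),
    leaving unchanged the pixels flagged as moving (pixel selectivity)."""
    updated = []
    for bg_row, fr_row, mv_row in zip(background, frame, moving):
        updated.append([
            b if m else (1 - alpha) * b + alpha * f
            for b, f, m in zip(bg_row, fr_row, mv_row)
        ])
    return updated

background = [[100.0, 100.0]]
frame = [[120.0, 220.0]]
moving = [[False, True]]   # second pixel was detected as in motion
new_bg = update_background(background, frame, moving, alpha=0.5)
```

Here the first pixel moves halfway toward the new frame value, while the second keeps its old background value because it is excluded from the update.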
II. DETECTING MOVING OBJECTS, GHOSTS AND SHADOWS
The first aim of our proposal is to detect real moving objects with high accuracy, limiting false negatives
(object’s pixels that are not detected) as much as possible. The second aim is to extract pixels of moving
objects with the maximum responsiveness possible, avoiding detection of transient spurious objects, such as
cast shadows, static objects or noise.
To accomplish these aims, we propose a taxonomy of the objects of interest in the scene, using the following
definitions (see also Fig. 1):
Moving visual object (MVO): a set of connected points belonging to an object characterized by non-null motion.
Uncovered Background: the set of visible scene points currently not in motion.
Background (B): the computed model of the background.
Ghost (G): a set of connected points detected as in motion by means of background subtraction, but not
corresponding to any real moving object.
Shadow: a set of connected background points modified by a shadow cast over them by a moving object.
Shadows can be further classified as MVO shadow (MVO_SH), that is, a shadow connected with an MVO and
hence sharing its motion, and ghost shadow (G_SH), being a shadow not connected with any real MVO.
Static cast shadows are neither detected nor considered since they do not affect moving object segmentation
if background subtraction is used: in fact, static shadows are included in the background model. A ghost
shadow can be a shadow cast either by a ghost or an MVO: the shape and/or position of the MVO with
respect to the light source can lead to the shadow not being connected to the object that generates it.
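
The distinction between the two shadow classes rests on connectivity, which can be sketched as below. Blobs are assumed to be sets of (row, col) points, connectivity is assumed to mean 8-adjacency, and the function names and string labels are hypothetical choices for this illustration.

```python
def touches(blob_a, blob_b):
    """True if some point of blob_a is 8-adjacent to (or coincides with) a point of blob_b."""
    return any(
        (y + dy, x + dx) in blob_b
        for (y, x) in blob_a
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
    )

def classify_shadow(shadow_blob, mvo_blobs):
    """Label a shadow blob as an MVO shadow if it is connected to any MVO blob,
    otherwise as a ghost shadow."""
    if any(touches(shadow_blob, mvo) for mvo in mvo_blobs):
        return "MVO_SH"
    return "G_SH"

mvo = {(0, 0), (0, 1), (1, 0), (1, 1)}
attached_shadow = {(2, 1), (2, 2)}   # adjacent to the MVO
detached_shadow = {(5, 5), (5, 6)}   # far from any MVO
labels = [classify_shadow(s, [mvo]) for s in (attached_shadow, detached_shadow)]
```

Note that, as the paragraph above points out, a detached shadow may still have been cast by a real MVO; the connectivity test only decides which label it receives.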
Our proposal makes use of the explicit knowledge of all five categories above for a precise segmentation
and an effective background model update. We call our approach Sakbot (Statistical And Knowledge-Based
ObjecT detection) since it exploits statistics and knowledge of the segmented objects to improve both
background modeling and moving object detection. Sakbot is depicted in Fig. 1, reporting the aforementioned
taxonomy. Sakbot’s processing is the first step for different further processes, such as object classification,
tracking, video annotation and so on.
Let us call p a point of the video frame at time t (I^t). I^t(p) is the value of point p in the color space. Since
images are acquired by standard color cameras or decompressed from videos with standard formats, the basic

Citations
Journal ArticleDOI
TL;DR: This survey reviews recent trends in video-based human capture and analysis, as well as discussing open problems for future research to achieve automatic visual analysis of human movement.
Abstract: This survey reviews advances in human motion capture and analysis from 2000 to 2006, following a previous survey of papers up to 2000 [T.B. Moeslund, E. Granum, A survey of computer vision-based human motion capture, Computer Vision and Image Understanding, 81(3) (2001) 231-268.]. Human motion capture continues to be an increasingly active research area in computer vision with over 350 publications over this period. A number of significant research advances are identified together with novel methodologies for automatic initialization, tracking, pose estimation, and movement recognition. Recent research has addressed reliable tracking and pose estimation in natural scenes. Progress has also been made towards automatic understanding of human actions and behavior. This survey reviews recent trends in video-based human capture and analysis, as well as discussing open problems for future research to achieve automatic visual analysis of human movement.

2,738 citations


Cites background or methods from "Detecting moving objects, ghosts, a..."

  • ...Year First author Initialisation Tracking Pose estimation Recognition 2003 Allen [15] 2003 Azoz * [22] 2003 Babu [23] 2003 Barron [28] * 2003 Buxton [48] 2003 Capellades [52] * 2003 Carranza * * [53] 2003 Cheung * * [59] 2003 Chowdhury [64] 2003 Chu * [65] 2003 Comaniciu [67] 2003 Cucchiara [69] 2003 Davis [79] 2003 Demirdjian * [87] 2003 Demirdjian * [89] 2003 Efros [94] 2003 Elgammal [95] 2003 Elgammal [96] 2003 Elgammal [99] 2003 Eng [101] * 2003 Foster * [110] 2003 Gerard * [114] 2003 Gonzalez [121] * 2003 Herda * [141] 2003 Jepson [177] 2003 Koschan [197] 2003 Krahnstoever [200] * * 2003 Liebowitz * [219] 2003 Masoud [231] 2003 Mikić * * [238] 2003 Mitchelson [241] 2003 Mitchelson * [242] 2003 Mittal [244] 2003 Moeslund * * [245] 2003 Moeslund * [249] 2003 Moeslund * [250] 2003 Monnet [256] 2003 Parameswaran [277] 2003 Plänkers * [289] 2003 Polat [290] 2003 Prati [293] 2003 Shah [325] * * 2003 Shakhnarovich [326] 2003 Sidenbladh * [333] * 2003 Sminchisescu * [343] 2003 Sminchisescu * [344] 2003 Song [350] * * 2003 Starck [352] * 2003 Störring [357] 2003 Vasvani [375] 2003 Vecchio [376] 2003 Viola [381] 2003 Wang [387] 2003 Wang [388] * * 2003 Wang * * [389] 2003 Wang * [390] 2003 Wang [391] 2003 Wu [398] 2003 Yang [405] 2003 Zhao [419] 2003 Zhong [423] ∑ Total=61 5 22 20 14...

    [...]

  • ...Using standard filtering techniques based on connected component analysis, size, median filter, morphology, and proximity can improve the result [69,96,128,232,408,420]....

    [...]

  • ...Classifiers have been based on color, gradients [232], flow information [69], and hysteresis thresholding [101]....

    [...]

  • ...[69] use only one value to represent each background pixel, but still good results (and speed) can be obtained due to advanced classification and updating....

    [...]

  • ..., YUV [394], HSV [69] and normalized RGB [232], since this allows for detecting shadow-pixels wrongly classified as object-pixels [293]....

    [...]

Proceedings ArticleDOI
10 Oct 2004
TL;DR: A review of the main methods and an original categorisation based on speed, memory requirements and accuracy can effectively guide the designer to select the most suitable method for a given application in a principled way.
Abstract: Background subtraction is a widely used approach for detecting moving objects from static cameras. Many different methods have been proposed over the recent years and both the novice and the expert can be confused about their benefits and limitations. In order to overcome this problem, this paper provides a review of the main methods and an original categorisation based on speed, memory requirements and accuracy. Such a review can effectively guide the designer to select the most suitable method for a given application in a principled way. Methods reviewed include parametric and non-parametric background density estimates and spatial correlation approaches.

2,346 citations


Cites background from "Detecting moving objects, ghosts, a..."

  • ...Cucchiara et al. in [4] argued that such a median value provides an adequate background model even if the n frames are subsampled with respect to the original frame rate by a factor of 10. In addition, [4] proposed to compute the median on a special set of values containing the last n, sub-sampled frames and w times the last computed median value....

    [...]
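
The median scheme summarized in this excerpt can be sketched as follows: for each pixel, take the median of the last n sub-sampled frame values together with w copies of the previous background value. Grey-level frames, the function name, and the sample values are assumptions for illustration; the excerpt does not say how colour values are ordered.

```python
from statistics import median

def median_background(recent_frames, last_background, w=1):
    """Per-pixel median over recent (sub-sampled) frames plus w copies of the
    previous background value. Frames are 2D lists of grey levels."""
    rows, cols = len(last_background), len(last_background[0])
    return [
        [
            median([f[r][c] for f in recent_frames] + [last_background[r][c]] * w)
            for c in range(cols)
        ]
        for r in range(rows)
    ]

frames = [[[10, 200]], [[12, 50]], [[11, 52]]]   # three 1x2 frames
previous = [[10, 51]]
bg = median_background(frames, previous, w=2)
```

Repeating the previous background value w times acts as a weight: a larger w makes the model more stable, while a smaller w lets recent frames dominate.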

Journal ArticleDOI
TL;DR: Efficiency figures show that the proposed technique for motion detection outperforms recent and proven state-of-the-art methods in terms of both computation speed and detection rate.
Abstract: This paper presents a technique for motion detection that incorporates several innovative mechanisms. For example, our proposed technique stores, for each pixel, a set of values taken in the past at the same location or in the neighborhood. It then compares this set to the current pixel value in order to determine whether that pixel belongs to the background, and adapts the model by choosing randomly which values to substitute from the background model. This approach differs from those based upon the classical belief that the oldest values should be replaced first. Finally, when the pixel is found to be part of the background, its value is propagated into the background model of a neighboring pixel. We describe our method in full details (including pseudo-code and the parameter values used) and compare it to other background subtraction techniques. Efficiency figures show that our method outperforms recent and proven state-of-the-art methods in terms of both computation speed and detection rate. We also analyze the performance of a downscaled version of our algorithm to the absolute minimum of one comparison and one byte of memory per pixel. It appears that even such a simplified version of our algorithm performs better than mainstream techniques.

1,777 citations


Cites background from "Detecting moving objects, ghosts, a..."

  • ...A variant consists of including, in the background, groups of connected foreground pixels that have been found static for a long time, as in [69]....

    [...]

Journal ArticleDOI
TL;DR: It is demonstrated that trackers can be evaluated objectively by survival curves, Kaplan Meier statistics, and Grubs testing, and it is found that in the evaluation practice the F-score is as effective as the object tracking accuracy (OTA) score.
Abstract: There is a large variety of trackers, which have been proposed in the literature during the last two decades with some mixed success. Object tracking in realistic scenarios is a difficult problem, therefore, it remains a most active area of research in computer vision. A good tracker should perform well in a large number of videos involving illumination changes, occlusion, clutter, camera motion, low contrast, specularities, and at least six more aspects. However, the performance of proposed trackers have been evaluated typically on less than ten videos, or on the special purpose datasets. In this paper, we aim to evaluate trackers systematically and experimentally on 315 video fragments covering above aspects. We selected a set of nineteen trackers to include a wide variety of algorithms often cited in literature, supplemented with trackers appearing in 2010 and 2011 for which the code was publicly available. We demonstrate that trackers can be evaluated objectively by survival curves, Kaplan Meier statistics, and Grubs testing. We find that in the evaluation practice the F-score is as effective as the object tracking accuracy (OTA) score. The analysis under a large variety of circumstances provides objective insight into the strengths and weaknesses of trackers.

1,604 citations


Cites background from "Detecting moving objects, ghosts, a..."

  • ...For static scenes, background intensity representation is an old solution [83], [84] only suited for standardized circumstances, improved with background intensity prediction with simple statistics [85]–[87]....

    [...]

Proceedings ArticleDOI
18 Jan 2004
TL;DR: This paper compares various background subtraction algorithms for detecting moving vehicles and pedestrians in urban traffic video sequences, considering approaches varying from simple techniques such as frame differencing and adaptive median filtering, to more sophisticated probabilistic modeling techniques.
Abstract: Identifying moving objects from a video sequence is a fundamental and critical task in many computer-vision applications. A common approach is to perform background subtraction, which identifies moving objects from the portion of a video frame that differs significantly from a background model. There are many challenges in developing a good background subtraction algorithm. First, it must be robust against changes in illumination. Second, it should avoid detecting non-stationary background objects such as swinging leaves, rain, snow, and shadow cast by moving objects. Finally, its internal background model should react quickly to changes in background such as starting and stopping of vehicles. In this paper, we compare various background subtraction algorithms for detecting moving vehicles and pedestrians in urban traffic video sequences. We consider approaches varying from simple techniques such as frame differencing and adaptive median filtering, to more sophisticated probabilistic modeling techniques. While complicated techniques often produce superior performance, our experiments show that simple techniques such as adaptive median filtering can produce good results with much lower computational complexity.

794 citations

References
Proceedings ArticleDOI
23 Jun 1999
TL;DR: This paper discusses modeling each pixel as a mixture of Gaussians and using an on-line approximation to update the model, resulting in a stable, real-time outdoor tracker which reliably deals with lighting changes, repetitive motions from clutter, and long-term scene changes.
Abstract: A common method for real-time segmentation of moving regions in image sequences involves "background subtraction", or thresholding the error between an estimate of the image without moving objects and the current image. The numerous approaches to this problem differ in the type of background model used and the procedure used to update the model. This paper discusses modeling each pixel as a mixture of Gaussians and using an on-line approximation to update the model. The Gaussian, distributions of the adaptive mixture model are then evaluated to determine which are most likely to result from a background process. Each pixel is classified based on whether the Gaussian distribution which represents it most effectively is considered part of the background model. This results in a stable, real-time outdoor tracker which reliably deals with lighting changes, repetitive motions from clutter, and long-term scene changes. This system has been run almost continuously for 16 months, 24 hours a day, through rain and snow.

7,660 citations


"Detecting moving objects, ghosts, a..." refers methods in this paper

  • ...Background subtraction methods are widely exploited for moving object detection in videos in many applications, such as traffic monitoring, human motion capture and video surveillance....

    [...]

Journal ArticleDOI
TL;DR: Pfinder is a real-time system for tracking people and interpreting their behavior that uses a multiclass statistical model of color and shape to obtain a 2D representation of head and hands in a wide range of viewing conditions.
Abstract: Pfinder is a real-time system for tracking people and interpreting their behavior. It runs at 10 Hz on a standard SGI Indy computer, and has performed reliably on thousands of people in many different physical locations. The system uses a multiclass statistical model of color and shape to obtain a 2D representation of head and hands in a wide range of viewing conditions. Pfinder has been successfully used in a wide range of applications including wireless interfaces, video databases, and low-bandwidth coding.

4,280 citations

Journal ArticleDOI
TL;DR: This paper focuses on motion tracking and shows how one can use observed motion to learn patterns of activity in a site and create a hierarchical binary-tree classification of the representations within a sequence.
Abstract: Our goal is to develop a visual monitoring system that passively observes moving objects in a site and learns patterns of activity from those observations. For extended sites, the system will require multiple cameras. Thus, key elements of the system are motion tracking, camera coordination, activity classification, and event detection. In this paper, we focus on motion tracking and show how one can use observed motion to learn patterns of activity in a site. Motion segmentation is based on an adaptive background subtraction method that models each pixel as a mixture of Gaussians and uses an online approximation to update the model. The Gaussian distributions are then evaluated to determine which are most likely to result from a background process. This yields a stable, real-time outdoor tracker that reliably deals with lighting changes, repetitive motions from clutter, and long-term scene changes. While a tracking system is unaware of the identity of any object it tracks, the identity remains the same for the entire tracking sequence. Our system leverages this information by accumulating joint co-occurrences of the representations within a sequence. These joint co-occurrence statistics are then used to create a hierarchical binary-tree classification of the representations. This method is useful for classifying sequences, as well as individual instances of activities in a site.

3,631 citations


"Detecting moving objects, ghosts, a..." refers background in this paper

  • ...…reliable and effective moving object detection that should be characterized by some important features: high precision, with the two meanings of accuracy in shape detection and reactivity to changes in time; flexibility in different scenarios (indoor, outdoor) or different light conditions; and…...

    [...]

Journal ArticleDOI
TL;DR: W/sup 4/ employs a combination of shape analysis and tracking to locate people and their parts and to create models of people's appearance so that they can be tracked through interactions such as occlusions.
Abstract: W/sup 4/ is a real time visual surveillance system for detecting and tracking multiple people and monitoring their activities in an outdoor environment. It operates on monocular gray-scale video imagery, or on video imagery from an infrared camera. W/sup 4/ employs a combination of shape analysis and tracking to locate people and their parts (head, hands, feet, torso) and to create models of people's appearance so that they can be tracked through interactions such as occlusions. It can determine whether a foreground region contains multiple people and can segment the region into its constituent people and track them. W/sup 4/ can also determine whether people are carrying objects, and can segment objects from their silhouettes, and construct appearance models for them so they can be identified in subsequent frames. W/sup 4/ can recognize events between people and objects, such as depositing an object, exchanging bags, or removing an object. It runs at 25 Hz for 320/spl times/240 resolution images on a 400 MHz dual-Pentium II PC.

2,870 citations


"Detecting moving objects, ghosts, a..." refers background in this paper

  • ...Selectivity [10][2][8][1], * Shadow [4][10], * Ghost [1][3], * High-frequency • Temporal filtering [14][15][6] illumination changes • Size filtering * Sudden global [1], * illumination changes...

    [...]

  • ...false objects, often referred to as “ghosts” [1][3]....

    [...]

  • ...Feature Systems Statistics • Minimum and maximum values [1] • Median [11][12], * • Single Gaussian [5][4][13] • Multiple Gaussians [14][10][3] • Eigenbackground approximation [15][6] • Minimization of Gaussian differences [7]...

    [...]

  • ...usual approach that excludes from the background update pixels detected as in motion [1][8][2]....

    [...]

Book ChapterDOI
26 Jun 2000
TL;DR: A novel non-parametric background model that can handle situations where the background of the scene is cluttered and not completely static but contains small motions such as tree branches and bushes is presented.
Abstract: Background subtraction is a method typically used to segment moving regions in image sequences taken from a static camera by comparing each new frame to a model of the scene background. We present a novel non-parametric background model and a background subtraction approach. The model can handle situations where the background of the scene is cluttered and not completely static but contains small motions such as tree branches and bushes. The model estimates the probability of observing pixel intensity values based on a sample of intensity values for each pixel. The model adapts quickly to changes in the scene which enables very sensitive detection of moving targets. We also show how the model can use color information to suppress detection of shadows. The implementation of the model runs in real-time for both gray level and color imagery. Evaluation shows that this approach achieves very sensitive detection with very low false alarm rates.

2,432 citations

Frequently Asked Questions (12)
Q1. What are the contributions mentioned in the paper "Detecting moving objects, ghosts and shadows in video streams" ?

This work proposes a general-purpose method which combines statistical assumptions with the object-level knowledge of moving objects, apparent objects ( ghosts ) and shadows acquired in the processing of the previous frames. 

The shadow detection algorithm the authors have defined in Sakbot aims to prevent moving cast shadows being misclassified as moving objects (or parts of them), thus improving the background update and reducing the undersegmentation problem. 

The lower bound α is used to define a maximum value for the darkening effect of shadows on the background, and is approximately proportional to the light source intensity. 

Sakbot’s processing is the first step for different further processes, such as object classification, tracking, video annotation and so on. 

If feedback from the tracking level to the object detection level could be exploited, it is likely that the object classification could be improved by verification of temporal consistency. 

Optical flow computation is a highly time-consuming process; however, the authors compute it only when and where necessary, that is only on the blobs resulting from background subtraction (thus a small percentage of image points). 

The authors call their approach Sakbot (Statistical And Knowledge-Based ObjecT detection) since it exploits statistics and knowledge of the segmented objects to improve both background modeling and moving object detection. 

The average optical flow computed over all the pixels of an MVO blob is the figure the authors use to discriminate between MVOs and ghosts: in fact, MVOs should have significant motion, while ghosts should have a near-to-zero average optical flow since their motion is only apparent. 
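
The decision described here can be sketched as a threshold on the mean flow magnitude of a blob. The optical flow field is assumed to be already computed and given as a mapping from points to displacement vectors; the threshold `min_motion`, the function names, and the sample values are hypothetical.

```python
from math import hypot

def average_flow(blob, flow):
    """Mean optical-flow magnitude over the points of a blob.
    `flow` maps (row, col) -> (dx, dy) displacement in pixels per frame."""
    return sum(hypot(*flow[p]) for p in blob) / len(blob)

def classify_blob(blob, flow, min_motion=0.5):
    """Return 'MVO' if the blob shows significant average motion, else 'ghost'."""
    return "MVO" if average_flow(blob, flow) >= min_motion else "ghost"

walker = {(0, 0), (0, 1)}
parked = {(3, 3), (3, 4)}
flow = {(0, 0): (3.0, 4.0), (0, 1): (0.0, 1.0),
        (3, 3): (0.0, 0.0), (3, 4): (0.1, 0.0)}
labels = (classify_blob(walker, flow), classify_blob(parked, flow))
```

Because the flow is only needed on blob points, the lookup is restricted to the blobs produced by background subtraction rather than the whole image.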

In order to discriminate MVO shadows from ghost shadows, the authors use information about connectivity between objects and shadows. 

A ghost shadow can be a shadow cast either by a ghost or an MVO: the shape and/or position of the MVO with respect to the light source can lead to the shadow not being connected to the object that generates it. 

As an example, the use of a statistic background, using only BS in equation 6 (Fig. 4, second column, lower image), almost correctly updates the new background only after about forty frames, even with still considerable errors (the black area). 

Until about frame #100 (Fig. 4, second column, upper image), the moving object still substantially covers the area where it was stopped, preventing separation from its forming ghost.