Book Chapter

Non-parametric Model for Background Subtraction

TL;DR: A novel non-parametric background model that can handle situations where the background of the scene is cluttered and not completely static but contains small motions such as tree branches and bushes is presented.
Abstract: Background subtraction is a method typically used to segment moving regions in image sequences taken from a static camera by comparing each new frame to a model of the scene background. We present a novel non-parametric background model and a background subtraction approach. The model can handle situations where the background of the scene is cluttered and not completely static but contains small motions such as tree branches and bushes. The model estimates the probability of observing pixel intensity values based on a sample of intensity values for each pixel. The model adapts quickly to changes in the scene which enables very sensitive detection of moving targets. We also show how the model can use color information to suppress detection of shadows. The implementation of the model runs in real-time for both gray level and color imagery. Evaluation shows that this approach achieves very sensitive detection with very low false alarm rates.
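
The per-pixel probability estimate described in the abstract can be sketched for a single gray-level pixel as follows. This is a minimal illustration, not the authors' implementation: it assumes a Gaussian kernel with a fixed, hand-picked bandwidth `sigma` and a fixed `threshold`, and the function names and sample values are hypothetical.

```python
import math


def kde_probability(x, samples, sigma):
    """Kernel density estimate of intensity x from recent samples of one pixel.

    Assumes a Gaussian kernel with fixed bandwidth sigma (an assumption of
    this sketch; the abstract does not specify the kernel or its bandwidth).
    """
    norm = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    total = sum(norm * math.exp(-0.5 * ((x - s) / sigma) ** 2) for s in samples)
    return total / len(samples)


def is_foreground(x, samples, sigma=5.0, threshold=1e-3):
    """Flag x as foreground when its estimated probability is below threshold."""
    return kde_probability(x, samples, sigma) < threshold


# Hypothetical history for one pixel: a swaying branch makes the pixel
# alternate between a dark (about 50) and a bright (about 120) intensity.
history = [50, 52, 49, 121, 119, 51, 120, 50, 118, 122]

print(is_foreground(51, history))   # value near the dark mode
print(is_foreground(120, history))  # value near the bright mode
print(is_foreground(200, history))  # value far from both modes
```

Because the density is built directly from the samples, both intensity modes of the moving background receive high probability without fitting any parametric distribution; only the value far from every sample falls below the threshold.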

Citations
Journal Article
TL;DR: The convergence of a recursive mean shift procedure to the nearest stationary point of the underlying density function is proved, establishing its utility in detecting the modes of the density.
Abstract: A general non-parametric technique is proposed for the analysis of a complex multimodal feature space and to delineate arbitrarily shaped clusters in it. The basic computational module of the technique is an old pattern recognition procedure: the mean shift. For discrete data, we prove the convergence of a recursive mean shift procedure to the nearest stationary point of the underlying density function and, thus, its utility in detecting the modes of the density. The relation of the mean shift procedure to the Nadaraya-Watson estimator from kernel regression and the robust M-estimators of location is also established. Algorithms for two low-level vision tasks, discontinuity-preserving smoothing and image segmentation, are described as applications. In these algorithms, the only user-set parameter is the resolution of the analysis, and either gray-level or color images are accepted as input. Extensive experimental results illustrate their excellent performance.
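
The recursive mean shift procedure mentioned in this abstract can be illustrated in one dimension. The sketch below is an assumption-laden toy, not the paper's algorithm as published: it uses a Gaussian kernel, scalar data, and a hypothetical function name `mean_shift_mode` with hand-picked `bandwidth` and `tol`.

```python
import math


def mean_shift_mode(start, data, bandwidth, tol=1e-6, max_iter=500):
    """Iterate the mean shift update from start until the estimate stops moving.

    Each step replaces the estimate with the kernel-weighted mean of the data
    around it (Gaussian kernel assumed for this sketch).
    """
    x = float(start)
    for _ in range(max_iter):
        weights = [math.exp(-0.5 * ((x - d) / bandwidth) ** 2) for d in data]
        new_x = sum(w * d for w, d in zip(weights, data)) / sum(weights)
        if abs(new_x - x) < tol:
            return new_x
        x = new_x
    return x


# Hypothetical data with two clusters, centred near 1.0 and near 5.0.
data = [1.0, 1.2, 0.8, 1.1, 0.9, 5.0, 5.2, 4.8, 5.1, 4.9]

low_mode = mean_shift_mode(0.5, data, bandwidth=0.5)
high_mode = mean_shift_mode(4.0, data, bandwidth=0.5)
print(round(low_mode, 3), round(high_mode, 3))
```

Starting points on either side of the gap climb to different modes, which is the behaviour that makes the procedure usable for clustering: each data point is assigned to the mode its own iteration converges to.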

11,727 citations

Journal Article
TL;DR: This article reviews state-of-the-art tracking methods, classifies them into categories, identifies new trends, and discusses important issues related to tracking, including the use of appropriate image features, selection of motion models, and detection of objects.
Abstract: The goal of this article is to review the state-of-the-art tracking methods, classify them into different categories, and identify new trends. Object tracking, in general, is a challenging problem. Difficulties in tracking objects can arise due to abrupt object motion, changing appearance patterns of both the object and the scene, nonrigid object structures, object-to-object and object-to-scene occlusions, and camera motion. Tracking is usually performed in the context of higher-level applications that require the location and/or shape of the object in every frame. Typically, assumptions are made to constrain the tracking problem in the context of a particular application. In this survey, we categorize the tracking methods on the basis of the object and motion representations used, provide detailed descriptions of representative methods in each category, and examine their pros and cons. Moreover, we discuss the important issues related to tracking including the use of appropriate image features, selection of motion models, and detection of objects.

5,318 citations

Journal Article
TL;DR: A new approach toward target representation and localization, the central component in visual tracking of nonrigid objects, is proposed, which employs a metric derived from the Bhattacharyya coefficient as similarity measure, and uses the mean shift procedure to perform the optimization.
Abstract: A new approach toward target representation and localization, the central component in visual tracking of nonrigid objects, is proposed. The feature histogram-based target representations are regularized by spatial masking with an isotropic kernel. The masking induces spatially-smooth similarity functions suitable for gradient-based optimization, hence, the target localization problem can be formulated using the basin of attraction of the local maxima. We employ a metric derived from the Bhattacharyya coefficient as similarity measure, and use the mean shift procedure to perform the optimization. In the presented tracking examples, the new method successfully coped with camera motion, partial occlusions, clutter, and target scale variations. Integration with motion filters and data association techniques is also discussed. We describe only a few of the potential applications: exploitation of background information, Kalman tracking using motion models, and face tracking.
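
The similarity measure named in this abstract can be sketched for two discrete histograms. The coefficient is the sum over bins of the square root of the product of the two bin probabilities; the distance shown, `sqrt(1 - coefficient)`, is one commonly used metric derived from it. The function names and histogram values are hypothetical, and the kernel-weighted histogram construction described in the abstract is not reproduced here.

```python
import math


def normalize(hist):
    """Scale non-negative bin counts so that they sum to 1."""
    total = float(sum(hist))
    return [h / total for h in hist]


def bhattacharyya_coefficient(p, q):
    """Sum over bins of sqrt(p_u * q_u) for two normalized histograms."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


def bhattacharyya_distance(p, q):
    """Distance derived from the coefficient: sqrt(1 - coefficient)."""
    rho = bhattacharyya_coefficient(p, q)
    return math.sqrt(max(0.0, 1.0 - rho))  # clamp guards against rounding


# Hypothetical 4-bin color histograms of a target model and two candidates.
target = normalize([8, 4, 2, 2])
similar = normalize([7, 5, 2, 2])
different = normalize([1, 1, 6, 8])

print(bhattacharyya_distance(target, similar))
print(bhattacharyya_distance(target, different))
```

A tracker built on this measure would move the candidate window toward the location that minimizes the distance; the abstract states that mean shift is used for that optimization.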

4,996 citations


Cites background from "Non-parametric Model for Background..."

  • ...In controlled environments with fixed camera, additional geometric constraints (such as the expected scale) and background subtraction [24] can be exploited to improve the tracking process....

    [...]

Journal Article
TL;DR: W⁴ employs a combination of shape analysis and tracking to locate people and their parts and to create models of people's appearance so that they can be tracked through interactions such as occlusions.
Abstract: W⁴ is a real time visual surveillance system for detecting and tracking multiple people and monitoring their activities in an outdoor environment. It operates on monocular gray-scale video imagery, or on video imagery from an infrared camera. W⁴ employs a combination of shape analysis and tracking to locate people and their parts (head, hands, feet, torso) and to create models of people's appearance so that they can be tracked through interactions such as occlusions. It can determine whether a foreground region contains multiple people and can segment the region into its constituent people and track them. W⁴ can also determine whether people are carrying objects, and can segment objects from their silhouettes, and construct appearance models for them so they can be identified in subsequent frames. W⁴ can recognize events between people and objects, such as depositing an object, exchanging bags, or removing an object. It runs at 25 Hz for 320×240 resolution images on a 400 MHz dual-Pentium II PC.

2,870 citations


Cites background or methods from "Non-parametric Model for Background..."

  • ...However, one could replace it with a more robust but slow detection algorithm such as in [14], [10]....

    [...]

  • ...Other methods developed in our laboratory [21], [10] have more sensitivity in detecting foreground regions, but are computationally more intensive....

    [...]

  • ...[10] uses a nonparametric background model by estimating the probability of observing pixel intensity values based on a sample of intensity values for each pixel....

    [...]

Journal Article
TL;DR: This survey reviews recent trends in video-based human capture and analysis, as well as discussing open problems for future research to achieve automatic visual analysis of human movement.

2,738 citations


Cites background or methods from "Non-parametric Model for Background..."

  • ...(Table columns: Year, First author, Initialisation, Tracking, Pose estimation, Recognition.) 2003: Allen [15]; Azoz * [22]; Babu [23]; Barron [28] *; Buxton [48]; Capellades [52] *; Carranza * * [53]; Cheung * * [59]; Chowdhury [64]; Chu * [65]; Comaniciu [67]; Cucchiara [69]; Davis [79]; Demirdjian * [87]; Demirdjian * [89]; Efros [94]; Elgammal [95]; Elgammal [96]; Elgammal [99]; Eng [101] *; Foster * [110]; Gerard * [114]; Gonzalez [121] *; Herda * [141]; Jepson [177]; Koschan [197]; Krahnstoever [200] * *; Liebowitz * [219]; Masoud [231]; Mikić * * [238]; Mitchelson [241]; Mitchelson * [242]; Mittal [244]; Moeslund * * [245]; Moeslund * [249]; Moeslund * [250]; Monnet [256]; Parameswaran [277]; Plänkers * [289]; Polat [290]; Prati [293]; Shah [325] * *; Shakhnarovich [326]; Sidenbladh * [333] *; Sminchisescu * [343]; Sminchisescu * [344]; Song [350] * *; Starck [352] *; Störring [357]; Vasvani [375]; Vecchio [376]; Viola [381]; Wang [387]; Wang [388] * *; Wang * * [389]; Wang * [390]; Wang [391]; Wu [398]; Yang [405]; Zhao [419]; Zhong [423]. ∑ Total = 61; column totals: 5, 22, 20, 14...

    [...]

  • ...Using standard filtering techniques based on connected component analysis, size, median filter, morphology, and proximity can improve the result [69,96,128,232,408,420]....

    [...]

  • ...Elgammal et al. [96] use a kernel-based approach where they represent a background pixel by the individual pixels of the last N frames....

    [...]

  • ...The slow changes in the scene can be updated recursively by including the current pixel value into the model as a weighted combination [69,96,232,354]....

    [...]
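
The recursive update mentioned in the last excerpt, in which the current pixel value is folded into the model as a weighted combination, is often written as B ← (1 − α)·B + α·I. A minimal sketch follows; the learning rate `alpha`, the function name, and the pixel values are hypothetical and not taken from any of the cited papers.

```python
def update_background(background, frame, alpha=0.05):
    """Blend the current frame into the background estimate, pixel by pixel.

    alpha is an assumed learning rate: small values adapt slowly and resist
    absorbing moving objects, large values track scene changes quickly.
    """
    return [(1.0 - alpha) * b + alpha * f for b, f in zip(background, frame)]


# Hypothetical two-pixel image: the second pixel brightens from 100 to 120.
background = [100.0, 100.0]
frame = [100, 120]

for _ in range(50):
    background = update_background(background, frame, alpha=0.1)

print([round(b, 2) for b in background])
```

After repeated updates the second pixel's background estimate approaches the new value, which is how slow illumination changes are absorbed into the model.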

References
Proceedings Article
23 Jun 1999
TL;DR: This paper discusses modeling each pixel as a mixture of Gaussians and using an on-line approximation to update the model, resulting in a stable, real-time outdoor tracker which reliably deals with lighting changes, repetitive motions from clutter, and long-term scene changes.
Abstract: A common method for real-time segmentation of moving regions in image sequences involves "background subtraction", or thresholding the error between an estimate of the image without moving objects and the current image. The numerous approaches to this problem differ in the type of background model used and the procedure used to update the model. This paper discusses modeling each pixel as a mixture of Gaussians and using an on-line approximation to update the model. The Gaussian distributions of the adaptive mixture model are then evaluated to determine which are most likely to result from a background process. Each pixel is classified based on whether the Gaussian distribution which represents it most effectively is considered part of the background model. This results in a stable, real-time outdoor tracker which reliably deals with lighting changes, repetitive motions from clutter, and long-term scene changes. This system has been run almost continuously for 16 months, 24 hours a day, through rain and snow.
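
The on-line mixture update outlined in this abstract can be sketched for one gray-level pixel. This is a simplified illustration under stated assumptions, not the paper's exact procedure: the class name `PixelMixture` and every parameter value are hypothetical, a fixed learning rate `rho = alpha` is used for the mean and variance, a variance floor `min_var` is added for numerical safety, and components are ranked by weight divided by standard deviation.

```python
import math


class PixelMixture:
    """Simplified adaptive mixture of Gaussians for one gray-level pixel."""

    def __init__(self, k=3, alpha=0.05, init_var=900.0, min_var=4.0,
                 match_sigmas=2.5, background_ratio=0.7):
        self.k = k
        self.alpha = alpha
        self.init_var = init_var
        self.min_var = min_var
        self.match_sigmas = match_sigmas
        self.background_ratio = background_ratio
        self.components = []  # each entry is [weight, mean, variance]

    def _rank(self):
        # Most probable background first: high weight, low spread.
        self.components.sort(key=lambda c: c[0] / math.sqrt(c[2]), reverse=True)

    def update(self, x):
        """Fold value x into the mixture; return True if x looks like background."""
        self._rank()
        matched = None
        for c in self.components:
            if abs(x - c[1]) <= self.match_sigmas * math.sqrt(c[2]):
                matched = c
                break

        if matched is None:
            # No component explains x: start a new low-weight, wide component,
            # replacing the least probable one if the mixture is full.
            new = [self.alpha, float(x), self.init_var]
            if len(self.components) < self.k:
                self.components.append(new)
            else:
                self.components[-1] = new
        else:
            for c in self.components:
                hit = 1.0 if c is matched else 0.0
                c[0] = (1.0 - self.alpha) * c[0] + self.alpha * hit
            rho = self.alpha  # simplification: fixed learning rate
            matched[1] = (1.0 - rho) * matched[1] + rho * x
            diff = x - matched[1]
            matched[2] = max(self.min_var,
                             (1.0 - rho) * matched[2] + rho * diff * diff)

        total = sum(c[0] for c in self.components)
        for c in self.components:
            c[0] /= total

        if matched is None:
            return False
        # Background = top-ranked components up to and including the one at
        # which the cumulative weight first exceeds background_ratio.
        self._rank()
        cumulative = 0.0
        for c in self.components:
            cumulative += c[0]
            if c is matched:
                return True
            if cumulative > self.background_ratio:
                return False
        return False


model = PixelMixture()
labels = [model.update(100) for _ in range(50)]  # stable background value
jump = model.update(200)                         # sudden new value
print(labels[0], labels[-1], jump)
```

In this run the first frame is reported as foreground because the mixture is still empty, later frames at 100 are reported as background, and the jump to 200 is reported as foreground. If the value 200 persists, its component gains weight and is eventually absorbed into the background, which is the adaptation to long-term scene changes that the abstract describes.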

7,660 citations


"Non-parametric Model for Background..." refers background or methods in this paper

  • ...A comparison between the proposed model and a Gaussian mixture model [6, 7] was also presented....

    [...]

  • ...In this section we describe a set of experiments performed to compare the detection performance of the proposed background model as described in section 2 and a mixture of Gaussian model as described in [6, 7]....

    [...]

  • ...In [6, 7] a generalization to the previous approach was presented....

    [...]

Journal Article
TL;DR: Pfinder is a real-time system for tracking people and interpreting their behavior that uses a multiclass statistical model of color and shape to obtain a 2D representation of head and hands in a wide range of viewing conditions.
Abstract: Pfinder is a real-time system for tracking people and interpreting their behavior. It runs at 10 Hz on a standard SGI Indy computer, and has performed reliably on thousands of people in many different physical locations. The system uses a multiclass statistical model of color and shape to obtain a 2D representation of head and hands in a wide range of viewing conditions. Pfinder has been successfully used in a wide range of applications including wireless interfaces, video databases, and low-bandwidth coding.

4,280 citations


"Non-parametric Model for Background..." refers methods in this paper

  • ...This basic adaptive model is used in [1], also Kalman filtering for adaptation is used in [2,3,4]....

    [...]

Monograph
David Scott
17 Aug 1992

3,024 citations

Proceedings Article
01 Aug 1997
TL;DR: A mixture-of-Gaussians classification model for each pixel is learned using an unsupervised technique--an efficient, incremental version of EM, which identifies and eliminates shadows much more effectively than other techniques such as thresholding.
Abstract: "Background subtraction" is an old technique for finding moving objects in a video sequence--for example, cars driving on a freeway. The idea is that subtracting the current image from a time-averaged background image will leave only nonstationary objects. It is, however, a crude approximation to the task of classifying each pixel of the current image; it fails with slow-moving objects and does not distinguish shadows from moving objects. The basic idea of this paper is that we can classify each pixel using a model of how that pixel looks when it is part of different classes. We learn a mixture-of-Gaussians classification model for each pixel using an unsupervised technique--an efficient, incremental version of EM. Unlike the standard image-averaging approach, this automatically updates the mixture component for each class according to likelihood of membership; hence slow-moving objects are handled perfectly. Our approach also identifies and eliminates shadows much more effectively than other techniques such as thresholding. Application of this method as part of the Roadwatch traffic surveillance project is expected to result in significant improvements in vehicle identification and tracking.

1,003 citations


"Non-parametric Model for Background..." refers methods in this paper

  • ...In [5] a mixture of three Normal distributions was used to model the pixel value for traffic surveillance applications....

    [...]

Proceedings Article
23 Jun 1998
TL;DR: The authors describe a vision system that monitors activity in a site over extended periods of time, using a distributed set of sensors to cover the site and an adaptive tracker to detect multiple moving objects in the sensors.
Abstract: We describe a vision system that monitors activity in a site over extended periods of time. The system uses a distributed set of sensors to cover the site, and an adaptive tracker detects multiple moving objects in the sensors. Our hypothesis is that motion tracking is sufficient to support a range of computations about site activities. We demonstrate using the tracked motion data to calibrate the distributed sensors, to construct rough site models, to classify detected objects, to learn common patterns of activity for different object classes, and to detect unusual activities.

626 citations