Book Chapter

M2Tracker: A Multi-view Approach to Segmenting and Tracking People in a Cluttered Scene Using Region-Based Stereo

TLDR
The paper presents a system that segments, detects and tracks multiple people in a cluttered scene using multiple synchronized cameras located far from each other, together with a scheme that combines evidence gathered from different camera pairs using occlusion analysis to obtain globally optimal detection and tracking of objects.
Abstract
We present a system that is capable of segmenting, detecting and tracking multiple people in a cluttered scene using multiple synchronized cameras located far from each other. The system improves upon existing systems in several ways: (1) We do not assume that a foreground connected component belongs to only one object; rather, we segment the views taking into account color models for the objects and the background. This helps us not only to separate foreground regions belonging to different objects, but also to obtain better background regions than traditional background subtraction methods, since the algorithm also uses foreground color models. (2) The system is fully automatic and does not require manual input or initialization of any kind. (3) Instead of making decisions about object detection and tracking from a single view or camera pair, we collect evidence from each pair and combine it to obtain a decision at the end. This helps us obtain much better detection and tracking than traditional systems.

Several innovations help us tackle the problem. The first is the introduction of a region-based stereo algorithm that is capable of finding 3D points inside an object if we know the regions belonging to the object in two views. No exact point matching is required. This is especially useful in wide-baseline camera systems, where exact point matching is very difficult due to self-occlusion and a substantial change in viewpoint. The second contribution is the development of a scheme for setting priors for use in segmentation of a view using Bayesian classification. The scheme, which assumes knowledge of the approximate shape and location of objects, dynamically assigns priors for different objects at each pixel so that occlusion information is encoded in the priors. The third contribution is a scheme for combining evidence gathered from different camera pairs using occlusion analysis so as to obtain globally optimal detection and tracking of objects.

The system has been tested with different densities of people in the scene, which helps us determine the number of cameras required for a particular density of people.
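The second contribution, Bayesian classification with per-pixel priors, can be illustrated with a small sketch. This is not the authors' implementation: it assumes one-dimensional Gaussian intensity models in place of real color models, and the names `MODELS`, `gaussian_likelihood` and `classify_pixel`, as well as all numeric values, are hypothetical. It only shows how changing the priors at a pixel can change the label when two people have similar appearance.

```python
import math

# Hypothetical 1-D appearance models: label -> (mean intensity, std deviation).
# A real system would use color models; these numbers are illustrative only.
MODELS = {
    "background": (100.0, 20.0),
    "person_A": (60.0, 10.0),
    "person_B": (65.0, 10.0),
}


def gaussian_likelihood(value, mean, std):
    """Density of a 1-D Gaussian, standing in for a full color model."""
    z = (value - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def classify_pixel(value, models, priors):
    """Return (best_label, posteriors) for a single pixel.

    posterior(label) is proportional to prior(label) * likelihood(value | label).
    `priors` holds the prior probability of each label at this pixel.
    """
    scores = {
        label: priors[label] * gaussian_likelihood(value, mean, std)
        for label, (mean, std) in models.items()
    }
    total = sum(scores.values())
    if total == 0.0:
        raise ValueError("all labels have zero posterior mass at this pixel")
    posteriors = {label: score / total for label, score in scores.items()}
    best_label = max(posteriors, key=posteriors.get)
    return best_label, posteriors


if __name__ == "__main__":
    pixel = 62.0  # ambiguous: close to both person models
    # Assumed priors for a pixel where person_A is expected to occlude person_B.
    a_in_front = {"background": 0.1, "person_A": 0.8, "person_B": 0.1}
    # Assumed priors for a pixel where person_B is expected to be in front.
    b_in_front = {"background": 0.1, "person_A": 0.1, "person_B": 0.8}
    print(classify_pixel(pixel, MODELS, a_in_front)[0])
    print(classify_pixel(pixel, MODELS, b_in_front)[0])
```

In this toy setup the likelihoods of the two person models are nearly equal at the chosen pixel, so the label is decided by the priors. That is the role the abstract assigns to occlusion-aware priors, although how the paper actually computes them is not specified here.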

