Journal ArticleDOI

Consistent labeling of tracked objects in multiple cameras with overlapping fields of view

01 Oct 2003-IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE Computer Society)-Vol. 25, Iss: 10, pp 1355-1360
TL;DR: It is shown that, if the FOV lines are known, it is possible to disambiguate between multiple possibilities for correspondence, and once these lines are initialized, the homography between the views can also be recovered.
Abstract: We address the issue of tracking moving objects in an environment covered by multiple uncalibrated cameras with overlapping fields of view, typical of most surveillance setups. In such a scenario, it is essential to establish correspondence between tracks of the same object, seen in different cameras, to recover complete information about the object. We call this the problem of consistent labeling of objects when seen in multiple cameras. We employ a novel approach of finding the limits of field of view (FOV) of each camera as visible in the other cameras. We show that, if the FOV lines are known, it is possible to disambiguate between multiple possibilities for correspondence. We present a method to automatically recover these lines by observing motion in the environment. Furthermore, once these lines are initialized, the homography between the views can also be recovered. We present results on indoor and outdoor sequences containing persons and vehicles.
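
The FOV-line idea lends itself to a compact sketch. Assuming a FOV line of camera 2 is already known in camera 1's image as line coefficients (a, b, c) of a*x + b*y + c = 0 (a hypothetical representation; the function names below are illustrative, not from the paper), a side-of-line distance test resolves which camera-1 track corresponds to a track that just appeared in camera 2:

```python
import numpy as np

def side_of_line(point, line):
    """Signed distance (in pixels) of an image point from a FOV line
    given as coefficients (a, b, c) of a*x + b*y + c = 0."""
    a, b, c = line
    x, y = point
    return (a * x + b * y + c) / np.hypot(a, b)

def match_new_track(candidates_cam1, fov_line_of_cam2_in_cam1):
    """When a track first appears in camera 2, the corresponding
    camera-1 track should be the one nearest camera 2's FOV line,
    since that object has just crossed into camera 2's view."""
    dists = [abs(side_of_line(p, fov_line_of_cam2_in_cam1))
             for p in candidates_cam1]
    return int(np.argmin(dists))

# Hypothetical setup: camera 2's FOV boundary appears in camera 1
# as the vertical line x = 100, i.e. 1*x + 0*y - 100 = 0.
line = (1.0, 0.0, -100.0)
tracks_cam1 = [(30.0, 40.0), (102.0, 60.0), (250.0, 80.0)]
print(match_new_track(tracks_cam1, line))  # prints 1
```

The track at (102, 60) sits essentially on the FOV line, so it is selected; the same test on entry/exit events is what lets the lines themselves be learned from observed motion.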
Citations
Journal ArticleDOI
TL;DR: The goal of this article is to review the state-of-the-art tracking methods, classify them into different categories, and identify new trends to discuss the important issues related to tracking including the use of appropriate image features, selection of motion models, and detection of objects.
Abstract: The goal of this article is to review the state-of-the-art tracking methods, classify them into different categories, and identify new trends. Object tracking, in general, is a challenging problem. Difficulties in tracking objects can arise due to abrupt object motion, changing appearance patterns of both the object and the scene, nonrigid object structures, object-to-object and object-to-scene occlusions, and camera motion. Tracking is usually performed in the context of higher-level applications that require the location and/or shape of the object in every frame. Typically, assumptions are made to constrain the tracking problem in the context of a particular application. In this survey, we categorize the tracking methods on the basis of the object and motion representations used, provide detailed descriptions of representative methods in each category, and examine their pros and cons. Moreover, we discuss the important issues related to tracking including the use of appropriate image features, selection of motion models, and detection of objects.

5,318 citations


Cites methods from "Consistent labeling of tracked obje..."

  • ...2001; Cai and Aggarwal 1999] or computed automatically [Lee et al. 2000; Khan and Shah 2003] from the...


Journal ArticleDOI
TL;DR: It is demonstrated that the generative model can effectively handle occlusions in each time frame independently, even when the only data available comes from the output of a simple background subtraction algorithm and when the number of individuals is unknown a priori.
Abstract: Given two to four synchronized video streams taken at eye level and from different angles, we show that we can effectively combine a generative model with dynamic programming to accurately follow up to six individuals across thousands of frames in spite of significant occlusions and lighting changes. In addition, we also derive metrically accurate trajectories for each of them. Our contribution is twofold. First, we demonstrate that our generative model can effectively handle occlusions in each time frame independently, even when the only data available comes from the output of a simple background subtraction algorithm and when the number of individuals is unknown a priori. Second, we show that multiperson tracking can be reliably achieved by processing individual trajectories separately over long sequences, provided that a reasonable heuristic is used to rank these individuals and that we avoid confusing them with one another.

865 citations

Journal ArticleDOI
TL;DR: This paper reviews the recent development of relevant technologies from the perspectives of computer vision and pattern recognition, and discusses how to face emerging challenges of intelligent multi-camera video surveillance.

695 citations


Cites methods from "Consistent labeling of tracked obje..."

  • ...Khan and Shah (2003) propose a method to automatically recover FOV lines, which are the boundaries of the FOV of a camera in another camera views, by observing the motions of objects....


Proceedings ArticleDOI
13 Oct 2003
TL;DR: This work presents a novel approach for establishing object correspondence across non-overlapping cameras, which exploits the redundancy in the paths that people and cars tend to follow, e.g. roads, walkways or corridors, using motion trends and object appearance.
Abstract: Conventional tracking approaches assume proximity in space, time and appearance of objects in successive observations. However, observations of objects are often widely separated in time and space when viewed from multiple non-overlapping cameras. To address this problem, we present a novel approach for establishing object correspondence across non-overlapping cameras. Our multicamera tracking algorithm exploits the redundancy in paths that people and cars tend to follow, e.g. roads, walkways or corridors, by using motion trends and appearance of objects to establish correspondence. Our system does not require any inter-camera calibration; instead, the system learns the camera topology and path probabilities of objects using Parzen windows during a training phase. Once the training is complete, correspondences are assigned using the maximum a posteriori (MAP) estimation framework. The learned parameters are updated with changing trajectory patterns. Experiments with real-world videos are reported, which validate the proposed approach.
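
The train-then-MAP pipeline can be illustrated with a toy sketch. All names and numbers below are hypothetical, and a one-dimensional Gaussian Parzen window over inter-camera travel times stands in for the full kernel the paper learns over space, time, and velocity:

```python
import numpy as np

def parzen_density(x, samples, bandwidth=2.0):
    """Gaussian Parzen-window estimate of p(x) from training samples."""
    z = (x - np.asarray(samples, dtype=float)) / bandwidth
    return float(np.mean(np.exp(-0.5 * z**2)) / (bandwidth * np.sqrt(2 * np.pi)))

def map_correspondence(candidate_times, appearance_sims, training_times):
    """Score each candidate track from the first camera by
    p(travel time) * appearance similarity; return the MAP index."""
    scores = [parzen_density(t, training_times) * s
              for t, s in zip(candidate_times, appearance_sims)]
    return int(np.argmax(scores))

# Training phase: objects typically take about 5 s between the cameras.
training_times = [4.0, 5.0, 5.0, 6.0, 5.5]
# Two candidates: one exited 5.2 s ago, one 20 s ago.
print(map_correspondence([5.2, 20.0], [0.8, 0.9], training_times))  # prints 0
```

The candidate with the slightly lower appearance similarity still wins because its travel time is far more probable under the learned density, which is exactly the trade-off the MAP framework encodes.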

531 citations

Journal ArticleDOI
TL;DR: A planar homographic occupancy constraint is developed that fuses foreground likelihood information from multiple views, to resolve occlusions and localize people on a reference scene plane in the framework of plane to plane homologies.
Abstract: Occlusion and lack of visibility in crowded and cluttered scenes make it difficult to track individual people correctly and consistently, particularly in a single view. We present a multi-view approach to solving this problem. In our approach we neither detect nor track objects from any single camera or camera pair; rather evidence is gathered from all the cameras into a synergistic framework and detection and tracking results are propagated back to each view. Unlike other multi-view approaches that require fully calibrated views our approach is purely image-based and uses only 2D constructs. To this end we develop a planar homographic occupancy constraint that fuses foreground likelihood information from multiple views, to resolve occlusions and localize people on a reference scene plane. For greater robustness this process is extended to multiple planes parallel to the reference plane in the framework of plane to plane homologies. Our fusion methodology also models scene clutter using the Schmieder and Weathersby clutter measure, which acts as a confidence prior, to assign higher fusion weight to views with lesser clutter. Detection and tracking are performed simultaneously by graph cuts segmentation of tracks in the space-time occupancy likelihood data. Experimental results with detailed qualitative and quantitative analysis, are demonstrated in challenging multi-view, crowded scenes.
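
The occupancy constraint can be caricatured in a few lines. This is a toy sketch with nearest-neighbour warping and hypothetical names; the paper's actual pipeline adds multiple parallel planes, clutter-weighted fusion, and graph-cut segmentation:

```python
import numpy as np

def warp_to_reference(likelihood, H, out_shape):
    """Nearest-neighbour warp of a per-pixel foreground-likelihood map
    into reference-plane coordinates; H maps reference -> view pixels."""
    out = np.zeros(out_shape)
    h, w = likelihood.shape
    for yr in range(out_shape[0]):
        for xr in range(out_shape[1]):
            p = H @ np.array([xr, yr, 1.0])
            xi, yi = int(round(p[0] / p[2])), int(round(p[1] / p[2]))
            if 0 <= yi < h and 0 <= xi < w:
                out[yr, xr] = likelihood[yi, xi]
    return out

def occupancy(likelihoods, homographies, out_shape):
    """A ground-plane cell is occupied only if every view sees
    foreground there, so the warped likelihoods are multiplied."""
    fused = np.ones(out_shape)
    for L, H in zip(likelihoods, homographies):
        fused *= warp_to_reference(L, H, out_shape)
    return fused

# Toy example: two views already aligned with the reference plane
# (identity homographies); both see foreground only at cell (1, 1).
I = np.eye(3)
L1 = np.zeros((3, 3)); L1[1, 1] = 0.9
L2 = np.zeros((3, 3)); L2[1, 1] = 0.8
print(round(float(occupancy([L1, L2], [I, I], (3, 3))[1, 1]), 2))  # prints 0.72
```

The multiplicative fusion is what makes a phantom (foreground in one view only, e.g. behind an occluder) vanish on the reference plane.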

369 citations


Cites background from "Consistent labeling of tracked obje..."

  • ...In spite of the current body of knowledge, we believe monocular methods have limited ability to handle occlusions involving several objects, generally two or three, because the single viewpoint is intrinsically unable to observe the hidden areas....


References
Journal ArticleDOI
TL;DR: This paper focuses on motion tracking and shows how one can use observed motion to learn patterns of activity in a site and create a hierarchical binary-tree classification of the representations within a sequence.
Abstract: Our goal is to develop a visual monitoring system that passively observes moving objects in a site and learns patterns of activity from those observations. For extended sites, the system will require multiple cameras. Thus, key elements of the system are motion tracking, camera coordination, activity classification, and event detection. In this paper, we focus on motion tracking and show how one can use observed motion to learn patterns of activity in a site. Motion segmentation is based on an adaptive background subtraction method that models each pixel as a mixture of Gaussians and uses an online approximation to update the model. The Gaussian distributions are then evaluated to determine which are most likely to result from a background process. This yields a stable, real-time outdoor tracker that reliably deals with lighting changes, repetitive motions from clutter, and long-term scene changes. While a tracking system is unaware of the identity of any object it tracks, the identity remains the same for the entire tracking sequence. Our system leverages this information by accumulating joint co-occurrences of the representations within a sequence. These joint co-occurrence statistics are then used to create a hierarchical binary-tree classification of the representations. This method is useful for classifying sequences, as well as individual instances of activities in a site.
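
The per-pixel mixture model can be sketched for a single grayscale pixel. This is a simplified toy with illustrative parameter values; the full tracker runs such a model at every pixel in real time and additionally ranks components by weight and variance:

```python
import numpy as np

class PixelMoG:
    """Simplified adaptive mixture of Gaussians for one grayscale pixel."""
    def __init__(self, k=3, alpha=0.05, var0=225.0, match_sigmas=2.5):
        self.w = np.full(k, 1.0 / k)          # component weights
        self.mu = np.linspace(0.0, 255.0, k)  # component means
        self.var = np.full(k, var0)           # component variances
        self.alpha, self.match_sigmas, self.var0 = alpha, match_sigmas, var0

    def update(self, x):
        """Online update; True means x is explained by a dominant
        (background) component, False flags it as foreground."""
        matched = np.abs(x - self.mu) < self.match_sigmas * np.sqrt(self.var)
        if matched.any():
            i = int(np.argmax(matched * self.w))   # best matched component
            self.mu[i] += self.alpha * (x - self.mu[i])
            self.var[i] += self.alpha * ((x - self.mu[i]) ** 2 - self.var[i])
            self.w = (1 - self.alpha) * self.w
            self.w[i] += self.alpha
        else:
            i = int(np.argmin(self.w))             # replace weakest component
            self.mu[i], self.var[i] = x, self.var0
        self.w /= self.w.sum()
        return bool(matched.any() and self.w[i] > 1.0 / len(self.w))

pixel = PixelMoG()
for _ in range(50):          # pixel observes a stable value -> background
    pixel.update(100.0)
print(pixel.update(100.0), pixel.update(250.0))  # prints True False
```

A value seen repeatedly accumulates weight and becomes background, while a sudden outlier matches only a low-weight component (or none) and is flagged as foreground, which is how the model absorbs repetitive clutter motion without swallowing real objects.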

3,631 citations


Additional excerpts

  • ...camera-multiple-object tracking [ 13 , 15] is a problem that has received considerable attention in...


Journal ArticleDOI
01 Oct 2001
TL;DR: This paper presents an overview of the issues and algorithms involved in creating this semiautonomous, multicamera surveillance system and its potential to improve the situational awareness of security providers and decision makers.
Abstract: The Video Surveillance and Monitoring (VSAM) team at Carnegie Mellon University (CMU) has developed an end-to-end, multicamera surveillance system that allows a single human operator to monitor activities in a cluttered environment using a distributed network of active video sensors. Video understanding algorithms have been developed to automatically detect people and vehicles, seamlessly track them using a network of cooperating active sensors, determine their three-dimensional locations with respect to a geospatial site model, and present this information to a human operator who controls the system through a graphical user interface. The goal is to automatically collect and disseminate real-time information to improve the situational awareness of security providers and decision makers. The feasibility of real-time video surveillance has been demonstrated within a multicamera testbed system developed on the campus of CMU. This paper presents an overview of the issues and algorithms involved in creating this semiautonomous, multicamera surveillance system.

693 citations


"Consistent labeling of tracked obje..." refers background in this paper

  • ...Index Terms—Tracking, multiple cameras, multiperspective video, surveillance, camera handoff, sensor fusion....


Proceedings ArticleDOI
13 Oct 2003
TL;DR: This work presents a novel approach for establishing object correspondence across non-overlapping cameras, which exploits the redundancy in the paths that people and cars tend to follow, e.g. roads, walkways or corridors, using motion trends and object appearance.
Abstract: Conventional tracking approaches assume proximity in space, time and appearance of objects in successive observations. However, observations of objects are often widely separated in time and space when viewed from multiple non-overlapping cameras. To address this problem, we present a novel approach for establishing object correspondence across non-overlapping cameras. Our multicamera tracking algorithm exploits the redundancy in paths that people and cars tend to follow, e.g. roads, walkways or corridors, by using motion trends and appearance of objects to establish correspondence. Our system does not require any inter-camera calibration; instead, the system learns the camera topology and path probabilities of objects using Parzen windows during a training phase. Once the training is complete, correspondences are assigned using the maximum a posteriori (MAP) estimation framework. The learned parameters are updated with changing trajectory patterns. Experiments with real-world videos are reported, which validate the proposed approach.

531 citations


"Consistent labeling of tracked obje..." refers background in this paper

  • ...Most of the information needed can be extracted by observing motion over a period of time....


  • ...0162-8828/03/$17.00 ß 2003 IEEE Published by the IEEE Computer Society in the general surveillance scenario....

    [...]

Proceedings ArticleDOI
19 Oct 1998
TL;DR: A process is described for analysing the motion of a human target in a video stream, where a "star" skeleton is produced and two motion cues are determined: body posture, and cyclic motion of skeleton segments.
Abstract: In this paper a process is described for analysing the motion of a human target in a video stream. Moving targets are detected and their boundaries extracted. From these, a "star" skeleton is produced. Two motion cues are determined from this skeletonization: body posture, and cyclic motion of skeleton segments. These cues are used to determine human activities such as walking or running, and even potentially, the target's gait. Unlike other methods, this does not require an a priori human model, or a large number of "pixels on target". Furthermore, it is computationally inexpensive, and thus ideal for real-world video applications such as outdoor video surveillance.
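
The skeletonization step can be sketched as follows. In this toy version (hypothetical names; the paper works on extracted target boundaries), distances from the target centroid to its boundary are smoothed and local maxima mark the "star" extremities:

```python
import numpy as np

def star_skeleton(boundary, smooth=5):
    """Return the target centroid and the boundary points whose
    (smoothed) distance from the centroid is a local maximum --
    the extremal points of the 'star' skeleton."""
    boundary = np.asarray(boundary, dtype=float)
    centroid = boundary.mean(axis=0)
    d = np.linalg.norm(boundary - centroid, axis=1)
    # circular moving-average smoothing of the distance signal
    pad = np.concatenate([d[-smooth:], d, d[:smooth]])
    ds = np.convolve(pad, np.ones(smooth) / smooth, mode='same')[smooth:-smooth]
    n = len(ds)
    peaks = [i for i in range(n)
             if ds[i] > ds[i - 1] and ds[i] >= ds[(i + 1) % n]]
    return centroid, boundary[peaks]

# Toy silhouette: a closed curve with three lobes (three extremities).
theta = np.linspace(0.0, 2.0 * np.pi, 120, endpoint=False)
r = 10.0 + 5.0 * np.cos(3.0 * theta)
curve = np.stack([r * np.cos(theta), r * np.sin(theta)], axis=1)
centroid, tips = star_skeleton(curve)
print(len(tips))  # prints 3
```

Connecting the centroid to those tips yields the star; the angle of the lowest tips over time gives the cyclic leg-motion cue used to separate walking from running.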

464 citations


"Consistent labeling of tracked obje..." refers background in this paper

  • ...Results of experiments with both indoor and outdoor sequences were presented....


Proceedings ArticleDOI
05 Dec 2002
TL;DR: This method provides the solution to some of the common problems that are not addressed by most background subtraction algorithms, such as fast illumination changes, repositioning of static background objects, and initialization of background model with moving objects present in the scene.
Abstract: We present a background subtraction method that uses multiple cues to detect objects robustly in adverse conditions. The algorithm consists of three distinct levels, i.e., pixel level, region level and frame level. At the pixel level, statistical models of gradients and color are separately used to classify each pixel as belonging to background or foreground. In the region level, foreground pixels obtained from the color based subtraction are grouped into regions and gradient based subtraction is then used to make inferences about the validity of these regions. Pixel based models are updated based on decisions made at the region level. Finally, frame level analysis is performed to detect global illumination changes. Our method provides the solution to some of the common problems that are not addressed by most background subtraction algorithms, such as fast illumination changes, repositioning of static background objects, and initialization of background model with moving objects present in the scene.
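
The three-level structure can be caricatured with a minimal sketch. Names and thresholds are hypothetical, and the region level is simplified to a single whole-mask check in place of per-connected-component analysis:

```python
import numpy as np

def pixel_level(frame, bg_color, bg_grad, t_color=30.0, t_grad=20.0):
    """Pixel level: classify each pixel separately by color and by
    gradient magnitude against the background model."""
    fg_color = np.abs(frame - bg_color) > t_color
    gy, gx = np.gradient(frame)
    fg_grad = np.abs(np.hypot(gx, gy) - bg_grad) > t_grad
    return fg_color, fg_grad

def region_level(fg_color, fg_grad, min_support=0.1):
    """Region level (simplified to one region): keep the color-based
    foreground only if enough of it is backed by gradient evidence."""
    if not fg_color.any():
        return fg_color
    support = (fg_color & fg_grad).sum() / fg_color.sum()
    return fg_color if support >= min_support else np.zeros_like(fg_color)

def frame_level(fg_mask, max_fraction=0.5):
    """Frame level: an implausibly large foreground area signals a
    global illumination change rather than real objects."""
    return bool(fg_mask.mean() <= max_fraction)

bg = np.zeros((8, 8))
frame = bg.copy(); frame[2:4, 2:4] = 100.0   # a small bright object
fgc, fgg = pixel_level(frame, bg, np.zeros((8, 8)))
mask = region_level(fgc, fgg)
print(int(mask.sum()), frame_level(mask))  # prints 4 True
```

The gradient cue is what lets the region level reject color-only changes (such as a moved static object leaving a "ghost"), while the frame level guards the model against fast global illumination changes.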

462 citations


"Consistent labeling of tracked obje..." refers background in this paper

  • ...Results of experiments with both indoor and outdoor sequences were presented....
