Stable multi-target tracking in real-time surveillance video

doi:10.1109/CVPR.2011.5995667

Proceedings Article•DOI•

Stable multi-target tracking in real-time surveillance video

Ben Benfold¹, Ian Reid¹•Institutions (1)

University of Oxford¹

20 Jun 2011-pp 3457-3464

TL;DR: This work presents a multi-target tracking system that is designed specifically for the provision of stable and accurate head location estimates and uses a more principled approach based on a Minimal Description Length (MDL) objective which accurately models the affinity between observations.

Abstract: The majority of existing pedestrian trackers concentrate on maintaining the identities of targets; however, systems for remote biometric analysis or activity recognition in surveillance video often require stable bounding boxes around pedestrians rather than approximate locations. We present a multi-target tracking system that is designed specifically for the provision of stable and accurate head location estimates. By performing data association over a sliding window of frames, we are able to correct many data association errors and fill in gaps where observations are missed. The approach is multi-threaded and combines asynchronous HOG detections with simultaneous KLT tracking and Markov-Chain Monte-Carlo Data Association (MCMCDA) to provide guaranteed real-time tracking in high-definition video. Where previous approaches have used ad-hoc models for data association, we use a more principled approach based on a Minimal Description Length (MDL) objective which accurately models the affinity between observations. We demonstrate by qualitative and quantitative evaluation that the system is capable of providing precise location estimates for large crowds of pedestrians in real-time. To facilitate future performance comparisons, we make a new dataset with hand-annotated ground truth head locations publicly available.

Citations

Journal Article•DOI•

Visual Tracking: An Experimental Survey

Arnold W. M. Smeulders¹, Dung M. Chu¹, Rita Cucchiara², Simone Calderara², Afshin Dehghan³, Mubarak Shah³•Institutions (3)

University of Amsterdam¹, University of Modena and Reggio Emilia², University of Central Florida³

01 Jul 2014-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: It is demonstrated that trackers can be evaluated objectively by survival curves, Kaplan-Meier statistics, and Grubbs testing, and it is found that in evaluation practice the F-score is as effective as the object tracking accuracy (OTA) score.

Abstract: A large variety of trackers has been proposed in the literature during the last two decades, with mixed success. Object tracking in realistic scenarios is a difficult problem; therefore, it remains a most active area of research in computer vision. A good tracker should perform well in a large number of videos involving illumination changes, occlusion, clutter, camera motion, low contrast, specularities, and at least six more aspects. However, the performance of proposed trackers has typically been evaluated on fewer than ten videos, or on special-purpose datasets. In this paper, we aim to evaluate trackers systematically and experimentally on 315 video fragments covering the above aspects. We selected a set of nineteen trackers to include a wide variety of algorithms often cited in the literature, supplemented with trackers appearing in 2010 and 2011 for which the code was publicly available. We demonstrate that trackers can be evaluated objectively by survival curves, Kaplan-Meier statistics, and Grubbs testing. We find that in evaluation practice the F-score is as effective as the object tracking accuracy (OTA) score. The analysis under a large variety of circumstances provides objective insight into the strengths and weaknesses of trackers.

1,604 citations

Cites background from "Stable multi-target tracking in rea..."

...This category comprises three videos from [1] with a fast-moving motorbike in the desert, a low contrast recording of a car on the highway, and a car-chase; three videos from the 3DPeS dataset [28] with varying illumination conditions and with complex object motion; one video from the dataset in [107] with surveillance of multiple people; and three complex videos from YouTube....
[...]

Posted Content•

MOTChallenge 2015: Towards a Benchmark for Multi-Target Tracking

Laura Leal-Taixé, Anton Milan, Ian Reid, Stefan Roth, Konrad Schindler

08 Apr 2015-arXiv: Computer Vision and Pattern Recognition

TL;DR: With MOTChallenge, the work toward a novel multiple object tracking benchmark aimed to address issues of standardization, and the way toward a unified evaluation framework for a more meaningful quantification of multi-target tracking is described.

Abstract: In the recent past, the computer vision community has developed centralized benchmarks for the performance evaluation of a variety of tasks, including generic object and pedestrian detection, 3D reconstruction, optical flow, single-object short-term tracking, and stereo estimation. Despite potential pitfalls of such benchmarks, they have proved to be extremely helpful to advance the state of the art in the respective area. Interestingly, there has been rather limited work on the standardization of quantitative benchmarks for multiple target tracking. One of the few exceptions is the well-known PETS dataset, targeted primarily at surveillance applications. Despite being widely used, it is often applied inconsistently, for example involving using different subsets of the available data, different ways of training the models, or differing evaluation scripts. This paper describes our work toward a novel multiple object tracking benchmark aimed to address such issues. We discuss the challenges of creating such a framework, collecting existing and new data, gathering state-of-the-art methods to be tested on the datasets, and finally creating a unified evaluation system. With MOTChallenge we aim to pave the way toward a unified evaluation framework for a more meaningful quantification of multi-target tracking.

667 citations

Cites methods from "Stable multi-target tracking in rea..."

...9 yes static high cloudy [10] ADL-Rundle-1 30 1920x1080 500 (00:17) 32 9306 18....
[...]
...For the 4 sequences filmed using a static camera, AVG-TownCentre, PETS09-S2L1, PETS09-S2L2 and TUD-Stadtmitte, the calibration files from the sources [5], [10], [20] are used to compute a 2D homography between the image plane and the ground plane....
[...]
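The image-to-ground-plane mapping mentioned in the excerpt above can be illustrated with a minimal sketch. It assumes a 3x3 homography matrix is already known from calibration; the function name `apply_homography` and the matrix values are hypothetical and are not taken from any of the cited calibration files.

```python
def apply_homography(H, point):
    """Map an image point (x, y) to the ground plane with a 3x3 homography H.

    H is a row-major list of three rows; the result is obtained by a
    matrix-vector product in homogeneous coordinates followed by division
    by the third component.
    """
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    return (u / w, v / w)


# Hypothetical matrix with a mild perspective term in the last row.
H_example = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.001, 1.0],
]

ground_point = apply_homography(H_example, (100.0, 1000.0))
```

In a tracker evaluation, such a mapping would typically be applied to a reference point of each bounding box (for example the foot position) so that distances can be measured in ground-plane units rather than pixels.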

Journal Article•DOI•

Continuous Energy Minimization for Multitarget Tracking

Anton Milan, Stefan Roth, Konrad Schindler¹•Institutions (1)

ETH Zurich¹

01 Jan 2014-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: This work proposes an alternative formulation of multitarget tracking as minimization of a continuous energy that focuses on designing an energy that corresponds to a more complete representation of the problem, rather than one that is amenable to global optimization.

Abstract: Many recent advances in multiple target tracking aim at finding a (nearly) optimal set of trajectories within a temporal window. To handle the large space of possible trajectory hypotheses, it is typically reduced to a finite set by some form of data-driven or regular discretization. In this work, we propose an alternative formulation of multitarget tracking as minimization of a continuous energy. Contrary to recent approaches, we focus on designing an energy that corresponds to a more complete representation of the problem, rather than one that is amenable to global optimization. Besides the image evidence, the energy function takes into account physical constraints, such as target dynamics, mutual exclusion, and track persistence. In addition, partial image evidence is handled with explicit occlusion reasoning, and different targets are disambiguated with an appearance model. To nevertheless find strong local minima of the proposed nonconvex energy, we construct a suitable optimization scheme that alternates between continuous conjugate gradient descent and discrete transdimensional jump moves. These moves, which are executed such that they always reduce the energy, allow the search to escape weak minima and explore a much larger portion of the search space of varying dimensionality. We demonstrate the validity of our approach with an extensive quantitative evaluation on several public data sets.

616 citations

Journal Article•DOI•

Human motion trajectory prediction: a survey

Andrey Rudenko¹,², Luigi Palmieri¹, Michael Herman¹, Kris M. Kitani³, Dariu M. Gavrila⁴, Kai O. Arras¹•Institutions (4)

Bosch¹, Örebro University², Carnegie Mellon University³, Delft University of Technology⁴

07 Jun 2020-The International Journal of Robotics Research

TL;DR: This survey addresses the ability of intelligent autonomous systems to perceive, understand, and anticipate human behavior, which becomes increasingly important as the number of such systems in human environments grows.

Abstract: With growing numbers of intelligent autonomous systems in human environments, the ability of such systems to perceive, understand, and anticipate human behavior becomes increasingly important. Spec...

547 citations

Book Chapter•DOI•

GMCP-Tracker: global multi-object tracking using generalized minimum clique graphs

Amir Roshan Zamir, Afshin Dehghan, Mubarak Shah

07 Oct 2012

TL;DR: This work proposes an approach to data association which incorporates both motion and appearance in a global manner and utilizes Generalized Minimum Clique Graphs to solve the optimization problem of the data association method.

Abstract: Data association is an essential component of any human tracking system. The majority of current methods, such as bipartite matching, incorporate only a limited temporal locality of the sequence into the data association problem, which makes them inherently prone to ID switches and difficulties caused by long-term occlusion, cluttered background, and crowded scenes. We propose an approach to data association which incorporates both motion and appearance in a global manner. Unlike limited-temporal-locality methods which incorporate a few frames into the data association problem, we incorporate the whole temporal span and solve the data association problem for one object at a time, while implicitly incorporating the rest of the objects. In order to achieve this, we utilize Generalized Minimum Clique Graphs to solve the optimization problem of our data association method. Our proposed method yields a better formulated approach to data association which is supported by our superior results. Experiments show the proposed method makes significant improvements in tracking in the diverse sequences of Town Center [1], TUD-crossing [2], TUD-Stadtmitte [2], PETS2009 [3], and a new sequence called Parking Lot compared to the state-of-the-art methods.

411 citations

References

Proceedings Article•DOI•

Histograms of oriented gradients for human detection

Navneet Dalal¹, Bill Triggs¹•Institutions (1)

French Institute for Research in Computer Science and Automation¹

20 Jun 2005

TL;DR: It is shown experimentally that grids of histograms of oriented gradient (HOG) descriptors significantly outperform existing feature sets for human detection, and the influence of each stage of the computation on performance is studied.

Abstract: We study the question of feature sets for robust visual object recognition; adopting linear SVM based human detection as a test case. After reviewing existing edge and gradient based descriptors, we show experimentally that grids of histograms of oriented gradient (HOG) descriptors significantly outperform existing feature sets for human detection. We study the influence of each stage of the computation on performance, concluding that fine-scale gradients, fine orientation binning, relatively coarse spatial binning, and high-quality local contrast normalization in overlapping descriptor blocks are all important for good results. The new approach gives near-perfect separation on the original MIT pedestrian database, so we introduce a more challenging dataset containing over 1800 annotated human images with a large range of pose variations and backgrounds.
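The orientation-binning step named in the abstract can be sketched for a single cell as follows. This is a deliberately simplified illustration: it uses central-difference gradients, nine unsigned orientation bins, and magnitude-weighted votes, and it omits the vote interpolation and overlapping block normalization that the abstract identifies as important. The function name and the toy cell are assumptions made for the example.

```python
import math


def cell_orientation_histogram(cell, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations.

    `cell` is a 2D list of grayscale intensities. Border pixels are skipped
    because central differences need both neighbours.
    """
    rows, cols = len(cell), len(cell[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = cell[r][c + 1] - cell[r][c - 1]  # horizontal gradient
            gy = cell[r + 1][c] - cell[r - 1][c]  # vertical gradient
            magnitude = math.hypot(gx, gy)
            # Fold the signed angle into [0, 180) so opposite directions agree.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            index = min(int(angle // bin_width), bins - 1)
            hist[index] += magnitude
    return hist


# Toy cell containing a vertical edge: intensity changes only along x.
toy_cell = [[0, 0, 10, 10] for _ in range(4)]
toy_hist = cell_orientation_histogram(toy_cell)
```

For the toy cell, every interior gradient points along the x axis, so all votes fall into the first bin.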

31,952 citations

"Stable multi-target tracking in rea..." refers methods in this paper

...We make motion estimates by following corner features with pyramidal Kanade-Lucas-Tomasi (KLT) tracking [15, 8]....
[...]

Proceedings Article•

An iterative image registration technique with an application to stereo vision

Bruce D. Lucas¹, Takeo Kanade¹•Institutions (1)

Carnegie Mellon University¹

24 Aug 1981

TL;DR: In this paper, the spatial intensity gradient of the images is used to find a good match using a type of Newton-Raphson iteration, which can be generalized to handle rotation, scaling and shearing.

Abstract: Image registration finds a variety of applications in computer vision. Unfortunately, traditional image registration techniques tend to be costly. We present a new image registration technique that makes use of the spatial intensity gradient of the images to find a good match using a type of Newton-Raphson iteration. Our technique is faster because it examines far fewer potential matches between the images than existing techniques. Furthermore, this registration technique can be generalized to handle rotation, scaling and shearing. We show how our technique can be adapted for use in a stereo vision system.
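The gradient-based iteration described in the abstract can be illustrated in one dimension. The sketch below estimates a shift h such that F(x + h) matches G(x) by repeatedly applying a least-squares update built from the signal gradient. The function name `estimate_shift`, the Gaussian test signal, the sampling grid, and the finite-difference gradient are all illustrative assumptions, not the authors' implementation.

```python
import math


def estimate_shift(F, G, xs, iterations=20, eps=1e-4):
    """Estimate h such that F(x + h) approximates G(x) on the samples xs.

    Each iteration linearizes F around the current shift and solves the
    resulting least-squares problem for a correction to h.
    """
    h = 0.0
    for _ in range(iterations):
        numerator = 0.0
        denominator = 0.0
        for x in xs:
            # Central-difference approximation of the gradient of F at x + h.
            grad = (F(x + h + eps) - F(x + h - eps)) / (2.0 * eps)
            numerator += grad * (G(x) - F(x + h))
            denominator += grad * grad
        if denominator == 0.0:
            break  # flat signal: no gradient information to use
        h += numerator / denominator
    return h


def F(x):
    # Smooth test signal (assumed for the example).
    return math.exp(-((x - 5.0) ** 2) / 8.0)


def G(x):
    # The same signal displaced by a known amount.
    return F(x + 0.7)


samples = [0.5 * i for i in range(21)]
shift = estimate_shift(F, G, samples)
```

The update only evaluates the signals at the sample points for the current shift, rather than scoring every candidate displacement, which is the source of the efficiency claimed in the abstract.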

12,944 citations

Detection and Tracking of Point Features

Carlo Tomasi

01 Jan 1991

2,443 citations

Journal Article•DOI•

Evaluating multiple object tracking performance: the CLEAR MOT metrics

Keni Bernardin¹, Rainer Stiefelhagen¹•Institutions (1)

Karlsruhe Institute of Technology¹

01 Feb 2008-Eurasip Journal on Image and Video Processing

TL;DR: This work introduces two intuitive and general metrics to allow for objective comparison of tracker characteristics, focusing on their precision in estimating object locations, their accuracy in recognizing object configurations and their ability to consistently label objects over time.

Abstract: Simultaneous tracking of multiple persons in real-world environments is an active research field and several approaches have been proposed, based on a variety of features and algorithms. Recently, there has been a growing interest in organizing systematic evaluations to compare the various techniques. Unfortunately, the lack of common metrics for measuring the performance of multiple object trackers still makes it hard to compare their results. In this work, we introduce two intuitive and general metrics to allow for objective comparison of tracker characteristics, focusing on their precision in estimating object locations, their accuracy in recognizing object configurations and their ability to consistently label objects over time. These metrics have been extensively used in two large-scale international evaluations, the 2006 and 2007 CLEAR evaluations, to measure and compare the performance of multiple object trackers for a wide variety of tracking tasks. Selected performance results are presented and the advantages and drawbacks of the presented metrics are discussed based on the experience gained during the evaluations.

2,286 citations

"Stable multi-target tracking in rea..." refers background or methods in this paper

...Multiple Object Tracking Accuracy (MOTA) is a combined measure which takes into account false positives, false negatives and identity switches (see [5] for details)....
[...]
...The Multiple Object Tracking Precision (MOTP) measures the precision with which objects are located using the intersection of the estimated region with the ground truth region....
[...]
...A comprehensive evaluation of the realtime tracker on two different datasets is performed using the standard CLEAR MOT [5] evaluation criteria....
[...]
...Both resulted in a significant reduction in the MOTA and either the precision or recall....
[...]
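The two CLEAR MOT measures quoted in the excerpts above can be made concrete with a small sketch. It assumes that per-frame counts of misses, false positives, and identity switches are already available, and that matched estimate and ground-truth boxes are given as `(x1, y1, x2, y2)` tuples. The `FrameErrors` container and the choice of intersection-over-union as the overlap measure are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class FrameErrors:
    """Per-frame error counts (hypothetical container for this example)."""
    misses: int           # false negatives
    false_positives: int
    id_switches: int
    ground_truth: int     # number of ground-truth objects in the frame


def mota(frames):
    """MOTA = 1 - (misses + false positives + ID switches) / ground-truth objects."""
    total_gt = sum(f.ground_truth for f in frames)
    if total_gt == 0:
        raise ValueError("MOTA is undefined without ground-truth objects")
    errors = sum(f.misses + f.false_positives + f.id_switches for f in frames)
    return 1.0 - errors / total_gt


def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def motp(matches):
    """Mean overlap over all matched (estimate, ground truth) box pairs."""
    if not matches:
        raise ValueError("MOTP is undefined without matches")
    return sum(iou(est, gt) for est, gt in matches) / len(matches)


frames = [FrameErrors(1, 0, 0, 10), FrameErrors(0, 2, 1, 10)]
mota_score = mota(frames)
motp_score = motp([((0, 0, 2, 2), (0, 0, 2, 2)), ((0, 0, 2, 2), (1, 1, 3, 3))])
```

Because MOTA sums three kinds of error, it can become negative when a tracker produces more errors than there are ground-truth objects.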

Proceedings Article•DOI•

Learning to associate: HybridBoosted multi-target tracker for crowded scene

Yuan Li¹, Chang Huang¹, Ram Nevatia¹•Institutions (1)

University of Southern California¹

20 Jun 2009

TL;DR: A learning-based hierarchical approach of multi-target tracking from a single camera by progressively associating detection responses into longer and longer track fragments (tracklets) and finally the desired target trajectories by virtue of a HybridBoost algorithm.

Abstract: We propose a learning-based hierarchical approach of multi-target tracking from a single camera by progressively associating detection responses into longer and longer track fragments (tracklets) and finally the desired target trajectories. To define tracklet affinity for association, most previous work relies on heuristically selected parametric models; while our approach is able to automatically select among various features and corresponding non-parametric models, and combine them to maximize the discriminative power on training data by virtue of a HybridBoost algorithm. A hybrid loss function is used in this algorithm because the association of tracklet is formulated as a joint problem of ranking and classification: the ranking part aims to rank correct tracklet associations higher than other alternatives; the classification part is responsible to reject wrong associations when no further association should be done. Experiments are carried out by tracking pedestrians in challenging datasets. We compare our approach with state-of-the-art algorithms to show its improvement in terms of tracking accuracy.

637 citations

"Stable multi-target tracking in rea..." refers background in this paper

...The first group covers feed-forward systems which use only current and past observations to estimate the current state [7, 1, 2]. The second group covers data association based methods which also use future information to estimate the current state, allowing ambiguities to be more easily resolved at the cost of increased latency [13, 11, 12, 3, 20]....
[...]