Exploiting the circulant structure of tracking-by-detection with kernels

doi:10.1007/978-3-642-33765-9_50

Book Chapter•DOI•

João F. Henriques¹, Rui Caseiro¹, Pedro Martins¹, Jorge Batista¹•Institutions (1)

University of Coimbra¹

07 Oct 2012-pp 702-715

TL;DR: Using the well-established theory of circulant matrices, this work links tracking-by-detection to Fourier analysis, enabling extremely fast learning and detection with the Fast Fourier Transform; this can be done in the dual space of kernel machines as fast as with linear classifiers.

Abstract: Recent years have seen greater interest in the use of discriminative classifiers in tracking systems, owing to their success in object detection. They are trained online with samples collected during tracking. Unfortunately, the potentially large number of samples becomes a computational burden, which directly conflicts with real-time requirements. On the other hand, limiting the samples may sacrifice performance. Interestingly, we observed that, as we add more and more samples, the problem acquires circulant structure. Using the well-established theory of Circulant matrices, we provide a link to Fourier analysis that opens up the possibility of extremely fast learning and detection with the Fast Fourier Transform. This can be done in the dual space of kernel machines as fast as with linear classifiers. We derive closed-form solutions for training and detection with several types of kernels, including the popular Gaussian and polynomial kernels. The resulting tracker achieves performance competitive with the state-of-the-art, can be implemented with only a few lines of code and runs at hundreds of frames-per-second. MATLAB code is provided in the paper (see Algorithm 1).
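The closed-form training and detection steps mentioned in the abstract can be illustrated with a minimal 1-D sketch. It is not the paper's MATLAB code: the function names (`shift`, `gaussian_correlation`, `train`, `detect`), the toy signal, and the parameter values are all assumptions made for illustration, and the index and conjugation conventions are the ones stated in the docstrings, which may differ from the paper's. A small pure-Python radix-2 FFT stands in for a library FFT.

```python
import cmath
import math

def fft(a, inverse=False):
    """Recursive radix-2 FFT (unnormalised); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return [complex(a[0])]
    sign = 1 if inverse else -1
    even, odd = fft(a[0::2], inverse), fft(a[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + t, even[k] - t
    return out

def ifft(a):
    return [v / len(a) for v in fft(a, inverse=True)]

def shift(v, i):
    """Cyclic shift: element j of v moves to position (j + i) mod n."""
    n = len(v)
    return [v[(j - i) % n] for j in range(n)]

def gaussian_correlation(x, z, sigma):
    """c[m] = kappa(z, shift(x, m)) for every cyclic shift m, using one FFT pair.

    kappa(a, b) = exp(-||a - b||^2 / sigma^2) is the Gaussian kernel.
    """
    X, Z = fft(x), fft(z)
    # cross[m] = dot(z, shift(x, m)), computed as IFFT(Z * conj(X))
    cross = [v.real for v in ifft([b * a.conjugate() for a, b in zip(X, Z)])]
    energy = sum(v * v for v in x) + sum(v * v for v in z)
    return [math.exp(-max(0.0, energy - 2.0 * q) / sigma ** 2) for q in cross]

def train(x, y, sigma, lam):
    """Solve (K + lam*I) alpha = y, where K[i][j] = kappa(shift(x, i), shift(x, j)).

    K is circulant, so the solve reduces to an element-wise division.
    """
    K = fft(gaussian_correlation(x, x, sigma))
    Y = fft(y)
    return [v.real for v in ifft([b / (a + lam) for a, b in zip(K, Y)])]

def detect(alpha, x, z, sigma):
    """r[i] = f(shift(z, i)), with f(u) = sum_j alpha[j] * kappa(u, shift(x, j))."""
    C = fft(gaussian_correlation(x, z, sigma))
    A = fft(alpha)
    return [v.real for v in ifft([a * c.conjugate() for a, c in zip(A, C)])]

x = [0.0, 1.0, 3.0, 2.0, 5.0, 4.0, 7.0, 6.0]                  # toy 1-D "patch"
n = len(x)
y = [math.exp(-0.5 * min(i, n - i) ** 2) for i in range(n)]   # labels peaked at shift 0
alpha = train(x, y, sigma=1.0, lam=1e-2)

z = shift(x, 3)                                               # target moved by 3 samples
response = detect(alpha, x, z, sigma=1.0)
peak = max(range(n), key=lambda i: response[i])               # shift that re-aligns z with x
print(peak, (n - peak) % n)                                   # peak index, implied displacement
```

Under these conventions the peak index is the shift that brings `z` back into alignment with `x`, so the displacement of the target is its negative modulo `n`. A real tracker would work on 2-D windowed image patches and use a library FFT.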

Citations

Journal Article•DOI•

High-Speed Tracking with Kernelized Correlation Filters

João F. Henriques¹, Rui Caseiro¹, Pedro Martins¹, Jorge Batista¹•Institutions (1)

University of Coimbra¹

01 Mar 2015-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: A new kernelized correlation filter (KCF) is derived that, unlike other kernel algorithms, has the exact same complexity as its linear counterpart, together with a fast multi-channel extension of linear correlation filters called the dual correlation filter (DCF); both outperform top-ranking trackers such as Struck or TLD on a 50-video benchmark, despite being implemented in a few lines of code.

Abstract: The core component of most modern trackers is a discriminative classifier, tasked with distinguishing between the target and the surrounding environment. To cope with natural image changes, this classifier is typically trained with translated and scaled sample patches. Such sets of samples are riddled with redundancies—any overlapping pixels are constrained to be the same. Based on this simple observation, we propose an analytic model for datasets of thousands of translated patches. By showing that the resulting data matrix is circulant, we can diagonalize it with the discrete Fourier transform, reducing both storage and computation by several orders of magnitude. Interestingly, for linear regression our formulation is equivalent to a correlation filter, used by some of the fastest competitive trackers. For kernel regression, however, we derive a new kernelized correlation filter (KCF), that unlike other kernel algorithms has the exact same complexity as its linear counterpart. Building on it, we also propose a fast multi-channel extension of linear correlation filters, via a linear kernel, which we call dual correlation filter (DCF). Both KCF and DCF outperform top-ranking trackers such as Struck or TLD on a 50 videos benchmark, despite running at hundreds of frames-per-second, and being implemented in a few lines of code (Algorithm 1). To encourage further developments, our tracking framework was made open-source.

4,994 citations

Cites background from "Exploiting the circulant structure ..."

...A comprehensive review of tracking-by-detection is outside the scope of this article, but we refer the interested reader to two excellent and very recent surveys [1], [2]....
[...]
...Recall that our goal is to learn and detect over translated image patches efficiently....
[...]

Proceedings Article•DOI•

Online Object Tracking: A Benchmark

Yi Wu¹, Jongwoo Lim², Ming-Hsuan Yang¹•Institutions (2)

University of California, Merced¹, Hanyang University²

23 Jun 2013

TL;DR: Large scale experiments are carried out with various evaluation criteria to identify effective approaches for robust tracking and provide potential future research directions in this field.

Abstract: Object tracking is one of the most important components in numerous applications of computer vision. While much progress has been made in recent years with efforts on sharing code and datasets, it is of great importance to develop a library and benchmark to gauge the state of the art. After briefly reviewing recent advances of online object tracking, we carry out large scale experiments with various evaluation criteria to understand how these algorithms perform. The test image sequences are annotated with different attributes for performance evaluation and analysis. By analyzing quantitative results, we identify effective approaches for robust tracking and provide potential future research directions in this field.

3,828 citations

Cites methods from "Exploiting the circulant structure ..."

...Among the top 10 trackers, CSK has the highest speed where the proposed circulant structure plays a key role....
[...]
...2 CSK [27] H, T, DM DS Y M 362 CXT [18] H, BP, DM DS Y C 15....
[...]
...Recently the precision plot [6, 27] has been adopted to measure the overall tracking performance....
[...]

Journal Article•DOI•

Object Tracking Benchmark

Yi Wu¹, Jongwoo Lim², Ming-Hsuan Yang³•Institutions (3)

Nanjing University of Information Science and Technology¹, Hanyang University², University of California, Merced³

01 Sep 2015-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: An extensive evaluation of the state-of-the-art online object-tracking algorithms with various evaluation criteria is carried out to identify effective approaches for robust tracking and provide potential future research directions in this field.

Abstract: Object tracking has been one of the most important and active research areas in the field of computer vision. A large number of tracking algorithms have been proposed in recent years with demonstrated success. However, the set of sequences used for evaluation is often not sufficient or is sometimes biased for certain types of algorithms. Many datasets do not have common ground-truth object positions or extents, and this makes comparisons among the reported quantitative results difficult. In addition, the initial conditions or parameters of the evaluated tracking algorithms are not the same, and thus, the quantitative results reported in literature are incomparable or sometimes contradictory. To address these issues, we carry out an extensive evaluation of the state-of-the-art online object-tracking algorithms with various evaluation criteria to understand how these methods perform within the same framework. In this work, we first construct a large dataset with ground-truth object positions and extents for tracking and introduce the sequence attributes for the performance analysis. Second, we integrate most of the publicly available trackers into one code library with uniform input and output formats to facilitate large-scale performance evaluation. Third, we extensively evaluate the performance of 31 algorithms on 100 sequences with different initialization settings. By analyzing the quantitative results, we identify effective approaches for robust tracking and provide potential future research directions in this field.

2,974 citations

Proceedings Article•DOI•

Accurate scale estimation for robust visual tracking

Martin Danelljan¹, Gustav Häger¹, Fahad Shahbaz Khan¹, Michael Felsberg¹•Institutions (1)

Linköping University¹

01 Jan 2014

TL;DR: This paper presents a novel approach to robust scale estimation that can handle large scale variations in complex image sequences and shows promising results in terms of accuracy and efficiency.

Abstract: Robust scale estimation is a challenging problem in visual object tracking. Most existing methods fail to handle large scale variations in complex image sequences. This paper presents a novel appro ...

2,038 citations

Cites background or methods from "Exploiting the circulant structure ..."

...In recent years, tracking-by-detection methods [3, 9, 11, 19] have shown to provide excellent tracking performance....
[...]
...Given an image patch, the CSK tracker works by learning a kernelized least-squares classifier of the target appearance....
[...]
...Ours, ASLA [14], SCM [20], Struck [9], TLD [15], EDFT [6], L1APG [1], DFT [17], LOT [16], CSK [11], LSHT [10], CT [19]; Median OP 75....
[...]
...Most tracking-by-detection methods, such as the CSK and MOSSE, are limited to only estimating the target translation....
[...]
...Figure 2: Precision and success plots over all the 28 sequences. Precision plot (distance precision vs. location error threshold): Ours [0.745], Struck [0.659], ASLA [0.612], SCM [0.610], TLD [0.509], LSHT [0.508], EDFT [0.505], CSK [0.502], L1APG [0.472], LOT [0.467], DFT [0.441], CT [0.344]. Success plot (overlap precision vs. overlap threshold): Ours [0.549], ASLA [0.492], SCM [0.477], Struck [0.430], TLD [0.356], LSHT [0.354], EDFT [0.350], CSK [0.350], L1APG [0.350], LOT [0.339], DFT [0.329], CT [0.239]....
[...]

Proceedings Article•DOI•

Hierarchical Convolutional Features for Visual Tracking

Chao Ma¹, Jia-Bin Huang², Xiaokang Yang¹, Ming-Hsuan Yang³•Institutions (3)

Shanghai Jiao Tong University¹, University of Illinois at Urbana–Champaign², University of California, Merced³

07 Dec 2015

TL;DR: This paper adaptively learns correlation filters on each convolutional layer to encode the target appearance and hierarchically infers the maximum response of each layer to locate targets.

Abstract: Visual object tracking is challenging as target objects often undergo significant appearance changes caused by deformation, abrupt motion, background clutter and occlusion. In this paper, we exploit features extracted from deep convolutional neural networks trained on object recognition datasets to improve tracking accuracy and robustness. The outputs of the last convolutional layers encode the semantic information of targets and such representations are robust to significant appearance variations. However, their spatial resolution is too coarse to precisely localize targets. In contrast, earlier convolutional layers provide more precise localization but are less invariant to appearance changes. We interpret the hierarchies of convolutional layers as a nonlinear counterpart of an image pyramid representation and exploit these multiple levels of abstraction for visual tracking. Specifically, we adaptively learn correlation filters on each convolutional layer to encode the target appearance. We hierarchically infer the maximum response of each layer to locate targets. Extensive experimental results on a large-scale benchmark dataset show that the proposed algorithm performs favorably against state-of-the-art methods.

1,812 citations

References

Book•DOI•

Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond

Bernhard Schölkopf¹, Alexander J. Smola•Institutions (1)

Max Planck Society¹

01 Dec 2001

TL;DR: Learning with Kernels provides an introduction to SVMs and related kernel methods, covering all of the concepts that a reader equipped with some basic mathematical knowledge needs to enter the world of machine learning using theoretically well-founded yet easy-to-use kernel algorithms.

Abstract: From the Publisher: In the 1990s, a new type of learning algorithm was developed, based on results from statistical learning theory: the Support Vector Machine (SVM). This gave rise to a new class of theoretically elegant learning machines that use a central concept of SVMs (kernels) for a number of learning tasks. Kernel machines provide a modular framework that can be adapted to different tasks and domains by the choice of the kernel function and the base algorithm. They are replacing neural networks in a variety of fields, including engineering, information retrieval, and bioinformatics. Learning with Kernels provides an introduction to SVMs and related kernel methods. Although the book begins with the basics, it also includes the latest research. It provides all of the concepts necessary to enable a reader equipped with some basic mathematical knowledge to enter the world of machine learning using theoretically well-founded yet easy-to-use kernel algorithms and to understand and apply the powerful algorithms that have been developed over the last few years.

7,880 citations

Journal Article•DOI•

Object tracking: A survey

Alper Yilmaz¹, Omar Javed, Mubarak Shah²•Institutions (2)

Ohio State University¹, University of Central Florida²

25 Dec 2006-ACM Computing Surveys

TL;DR: The goal of this article is to review the state-of-the-art tracking methods, classify them into different categories, identify new trends, and discuss the important issues related to tracking, including the use of appropriate image features, selection of motion models, and detection of objects.

Abstract: The goal of this article is to review the state-of-the-art tracking methods, classify them into different categories, and identify new trends. Object tracking, in general, is a challenging problem. Difficulties in tracking objects can arise due to abrupt object motion, changing appearance patterns of both the object and the scene, nonrigid object structures, object-to-object and object-to-scene occlusions, and camera motion. Tracking is usually performed in the context of higher-level applications that require the location and/or shape of the object in every frame. Typically, assumptions are made to constrain the tracking problem in the context of a particular application. In this survey, we categorize the tracking methods on the basis of the object and motion representations used, provide detailed descriptions of representative methods in each category, and examine their pros and cons. Moreover, we discuss the important issues related to tracking including the use of appropriate image features, selection of motion models, and detection of objects.

5,318 citations

"Exploiting the circulant structure ..." refers methods in this paper

...[10] use branch-and-bound optimization to find the maximum of a classifier’s response without [...] (footnote 1: We refer the reader to 2 reviews: [8] is more in-depth, while [9, Sec....
[...]

Journal Article•DOI•

Incremental Learning for Robust Visual Tracking

David A. Ross¹, Jongwoo Lim², Ruei-Sung Lin³, Ming-Hsuan Yang²•Institutions (3)

University of Toronto¹, Honda², Motorola³

01 May 2008-International Journal of Computer Vision

TL;DR: A tracking method is presented that incrementally learns a low-dimensional subspace representation, efficiently adapting online to changes in the appearance of the target; the model update includes a method for correctly updating the sample mean and a forgetting factor to ensure less modeling power is expended fitting older observations.

Abstract: Visual tracking, in essence, deals with non-stationary image streams that change over time. While most existing algorithms are able to track objects well in controlled environments, they usually fail in the presence of significant variation of the object's appearance or surrounding illumination. One reason for such failures is that many algorithms employ fixed appearance models of the target. Such models are trained using only appearance data available before tracking begins, which in practice limits the range of appearances that are modeled, and ignores the large volume of information (such as shape changes or specific lighting conditions) that becomes available during tracking. In this paper, we present a tracking method that incrementally learns a low-dimensional subspace representation, efficiently adapting online to changes in the appearance of the target. The model update, based on incremental algorithms for principal component analysis, includes two important features: a method for correctly updating the sample mean, and a forgetting factor to ensure less modeling power is expended fitting older observations. Both of these features contribute measurably to improving overall tracking performance. Numerous experiments demonstrate the effectiveness of the proposed tracking algorithm in indoor and outdoor environments where the target objects undergo large changes in pose, scale, and illumination.

3,151 citations

"Exploiting the circulant structure ..." refers background in this paper

...4] and IVT [22] that track through scale....
[...]

Proceedings Article•DOI•

Visual object tracking using adaptive correlation filters

David S. Bolme¹, J. Ross Beveridge¹, Bruce A. Draper¹, Yui Man Lui¹•Institutions (1)

Colorado State University¹

13 Jun 2010

TL;DR: A new type of correlation filter is presented, the Minimum Output Sum of Squared Error (MOSSE) filter, which produces stable correlation filters when initialized using a single frame; occlusion is detected from the peak-to-sidelobe ratio, which enables the tracker to pause and resume where it left off when the object reappears.

Abstract: Although not commonly used, correlation filters can track complex objects through rotations, occlusions and other distractions at over 20 times the rate of current state-of-the-art techniques. The oldest and simplest correlation filters use simple templates and generally fail when applied to tracking. More modern approaches such as ASEF and UMACE perform better, but their training needs are poorly suited to tracking. Visual tracking requires robust filters to be trained from a single frame and dynamically adapted as the appearance of the target object changes. This paper presents a new type of correlation filter, a Minimum Output Sum of Squared Error (MOSSE) filter, which produces stable correlation filters when initialized using a single frame. A tracker based upon MOSSE filters is robust to variations in lighting, scale, pose, and nonrigid deformations while operating at 669 frames per second. Occlusion is detected based upon the peak-to-sidelobe ratio, which enables the tracker to pause and resume where it left off when the object reappears.
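As a rough illustration of the single-frame training and the peak-to-sidelobe check described in this abstract, the following 1-D sketch trains a regularised MOSSE-style filter from one signal and applies it to a shifted copy. The names (`train_mosse`, `respond`, `peak_to_sidelobe_ratio`), the regularisation value `eps`, the exclusion window, and the toy signal are assumptions for illustration only; the original tracker works on 2-D image windows with preprocessing and a running-average update that are omitted here.

```python
import cmath
import math

def fft(a, inverse=False):
    """Recursive radix-2 FFT (unnormalised); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return [complex(a[0])]
    sign = 1 if inverse else -1
    even, odd = fft(a[0::2], inverse), fft(a[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + t, even[k] - t
    return out

def ifft(a):
    return [v / len(a) for v in fft(a, inverse=True)]

def train_mosse(f, g, eps=1e-3):
    """Conjugate filter H* = (G . conj(F)) / (F . conj(F) + eps), element-wise."""
    F, G = fft(f), fft(g)
    return [gk * fk.conjugate() / (abs(fk) ** 2 + eps) for fk, gk in zip(F, G)]

def respond(h_conj, z):
    """Correlation output IFFT(Z . H*) for a new signal z."""
    return [v.real for v in ifft([zk * hk for zk, hk in zip(fft(z), h_conj)])]

def peak_to_sidelobe_ratio(resp, exclude=2):
    """(peak - sidelobe mean) / sidelobe std, ignoring samples near the peak."""
    n = len(resp)
    p = max(range(n), key=lambda i: resp[i])
    side = [resp[i] for i in range(n) if min((i - p) % n, (p - i) % n) > exclude]
    mean = sum(side) / len(side)
    std = math.sqrt(sum((v - mean) ** 2 for v in side) / len(side))
    return (resp[p] - mean) / std

n, centre, moved_by = 16, 8, 3
f = [0.5 ** j for j in range(n)]                                   # toy training signal
g = [math.exp(-((i - centre) ** 2) / (2 * 2.0 ** 2)) for i in range(n)]  # desired output
h_conj = train_mosse(f, g)

z = [f[(j - moved_by) % n] for j in range(n)]                      # same signal, shifted
resp = respond(h_conj, z)
peak = max(range(n), key=lambda i: resp[i])
print(peak - centre, peak_to_sidelobe_ratio(resp))                 # estimated shift, PSR
```

The peak of the response moves with the target, and a low peak-to-sidelobe ratio would be the cue to stop updating the filter; the threshold to use is application-dependent and not shown here.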

2,948 citations

"Exploiting the circulant structure ..." refers background or methods in this paper

...This is a kind of correlation filter that has been proposed recently, called Minimum Output Sum of Squared Error (MOSSE) [12, 15], with a single training image....
[...]
...The Minimum Output Sum of Squared Error (MOSSE) filter [12] has been shown to be competitive with the methods outlined before, but at a fraction of the complexity, and runs at impressive speeds....
[...]
...Also closely related are adaptive correlation filters, rooted on classical signal processing [15, 12]....
[...]
...It produces a linear classifier that does not make use of the Kernel Trick, so we can compute w explicitly, instead of implicitly as α. Plugging it into the KRLS equations, we obtain: [...] This is a kind of correlation filter that has been proposed recently, called Minimum Output Sum of Squared Error (MOSSE) [12, 15], with a single training image....
[...]
...We called it MOSSE2....
[...]

Book•

Toeplitz and circulant matrices

Robert M. Gray¹•Institutions (1)

Stanford University¹

01 Jan 1977

TL;DR: The fundamental theorems on the asymptotic behavior of eigenvalues, inverses, and products of banded Toeplitz matrices and Toeplitz matrices with absolutely summable elements are derived in a tutorial manner, in the hope of making these results available to engineers lacking either the background or endurance to attack the mathematical literature on the subject.

Abstract: The fundamental theorems on the asymptotic behavior of eigenvalues, inverses, and products of banded Toeplitz matrices and Toeplitz matrices with absolutely summable elements are derived in a tutorial manner. Mathematical elegance and generality are sacrificed for conceptual simplicity and insight in the hope of making these results available to engineers lacking either the background or endurance to attack the mathematical literature on the subject. By limiting the generality of the matrices considered, the essential ideas and results can be conveyed in a more intuitive manner without the mathematical machinery required for the most general cases. As an application the results are applied to the study of the covariance matrices and their factors of linear models of discrete time random processes.

2,404 citations

"Exploiting the circulant structure ..." refers background or methods in this paper

...There are a couple of different definitions of C(u) that we will find useful [19]....
[...]
...Since the product C(u)v represents convolution of vectors u and v [19], it can be computed in the Fourier domain, using...
[...]
...Some operations on matrices of the form C(u), like multiplication and inversion, can be done element-wise on the vectors u, if they are transformed to the Fourier domain [19]....
[...]
...The properties of circulant matrices make them particularly amenable to manipulation, since their sums, products and inverses are also circulant [19]....
[...]
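The circulant properties quoted in these excerpts can be checked numerically with a small sketch. It assumes one particular convention, namely that C(u) has u as its first column, so that C(u)v is the circular convolution of u and v; the paper's own definition of C(u) may differ by a transpose or a conjugate. The helper names (`dft`, `circulant`, `matvec`) are illustrative, and a naive O(n^2) DFT is used in place of an FFT to keep the example dependency-free.

```python
import cmath

def dft(a, inverse=False):
    """Naive discrete Fourier transform (a stand-in for an FFT routine)."""
    n = len(a)
    sign = 1 if inverse else -1
    out = [sum(a[j] * cmath.exp(sign * 2j * cmath.pi * j * k / n) for j in range(n))
           for k in range(n)]
    return [v / n for v in out] if inverse else out

def circulant(u):
    """Dense C(u) with first column u: entry (i, j) is u[(i - j) mod n]."""
    n = len(u)
    return [[u[(i - j) % n] for j in range(n)] for i in range(n)]

def matvec(m, v):
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]

u = [2.0, -1.0, 0.5, 3.0]
v = [1.0, 4.0, -2.0, 0.0]
U, V = dft(u), dft(v)

# Multiplication: C(u) v equals the inverse DFT of the element-wise product.
dense = matvec(circulant(u), v)
fast = [z.real for z in dft([a * b for a, b in zip(U, V)], inverse=True)]

# Inversion: solving C(u) w = v is an element-wise division in the Fourier domain
# (valid here because no DFT coefficient of u is zero).
w = [z.real for z in dft([b / a for a, b in zip(U, V)], inverse=True)]
check = matvec(circulant(u), w)

print(dense, fast)
print(check, v)
```

Both printed pairs should agree up to floating-point rounding, which is the sense in which multiplication and inversion become element-wise operations on the transformed vectors.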