Learning Policies for Adaptive Tracking with Deep Feature Cascades

doi:10.1109/ICCV.2017.21

pp. 105-114

TLDR

In this paper, the authors formulate the adaptive tracking problem as a decision-making process, and learn an agent to decide whether to locate objects with high confidence on an early layer, or continue processing subsequent layers of a network.

Abstract:

Visual object tracking is a fundamental and time-critical vision task. Recent years have seen many shallow tracking methods based on real-time pixel-based correlation filters, as well as deep methods that have top performance but need a high-end GPU. In this paper, we learn to improve the speed of deep trackers without losing accuracy. Our fundamental insight is to take an adaptive approach, where easy frames are processed with cheap features (such as pixel values), while challenging frames are processed with invariant but expensive deep features. We formulate the adaptive tracking problem as a decision-making process, and learn an agent to decide whether to locate objects with high confidence on an early layer, or continue processing subsequent layers of a network. This significantly reduces the feedforward cost for easy frames with distinct or slow-moving objects. We train the agent offline in a reinforcement learning fashion, and further demonstrate that learning all deep layers (so as to provide good features for adaptive tracking) can lead to near real-time average tracking speed of 23 fps on a single CPU while achieving state-of-the-art performance. Perhaps most tellingly, our approach provides a 100X speedup for almost 50% of the time, indicating the power of an adaptive approach.
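The early-exit idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the paper learns the stopping decision with reinforcement learning, whereas this sketch substitutes a fixed confidence threshold. The stage names (`pixels`, `conv2`, `conv5`), the relative costs, and the toy confidence model are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height)


@dataclass
class Stage:
    """One level of a feature cascade (all values here are illustrative)."""
    name: str
    cost: float  # hypothetical incremental compute cost of running this stage
    locate: Callable[[float], Tuple[Box, float]]  # frame -> (box, confidence)


def threshold_policy(threshold: float) -> Callable[[int, float], bool]:
    """Stand-in for the learned agent: stop once confidence is high enough."""
    def should_stop(depth: int, confidence: float) -> bool:
        return confidence >= threshold
    return should_stop


def track_frame(frame: float, stages: List[Stage],
                should_stop: Callable[[int, float], bool]) -> Tuple[Box, str, float]:
    """Run stages from cheapest to most expensive and exit as early as allowed."""
    if not stages:
        raise ValueError("at least one stage is required")
    spent = 0.0
    for depth, stage in enumerate(stages):
        box, confidence = stage.locate(frame)
        spent += stage.cost
        is_last = depth == len(stages) - 1
        # The last stage always answers; earlier stages answer only if the policy agrees.
        if is_last or should_stop(depth, confidence):
            return box, stage.name, spent
    raise AssertionError("unreachable: the last stage always returns")


def make_toy_stage(name: str, cost: float, robustness: float) -> Stage:
    """Toy model: a frame is just a difficulty in [0, 1]; robust stages lose less confidence."""
    def locate(difficulty: float) -> Tuple[Box, float]:
        confidence = max(0.0, 1.0 - difficulty * (1.0 - robustness))
        return (0.0, 0.0, 10.0, 10.0), confidence
    return Stage(name, cost, locate)


stages = [
    make_toy_stage("pixels", 1.0, 0.0),   # cheap, not invariant
    make_toy_stage("conv2", 10.0, 0.5),
    make_toy_stage("conv5", 100.0, 0.9),  # expensive, most invariant
]
policy = threshold_policy(0.8)

easy = track_frame(0.1, stages, policy)    # exits at the pixel stage
medium = track_frame(0.3, stages, policy)  # exits at the middle stage
hard = track_frame(0.9, stages, policy)    # runs the whole cascade
```

In this toy setup an easy frame costs 1 unit while a hard frame costs 111, which mirrors the abstract's point that the savings come from the frames that never reach the deep layers.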

Citations

Proceedings Article

A Twofold Siamese Network for Real-Time Object Tracking

Anfeng He, +3 more

TL;DR: The proposed SA-Siam, which adds a channel attention mechanism to its semantic branch, outperforms all other real-time trackers by a large margin on the OTB-2013/50/100 benchmarks.

Proceedings Article

Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking

Ning Wang, +3 more

TL;DR: This paper proposes a Siamese-like tracking pipeline that exploits the rich temporal context among successive frames, which existing trackers have largely overlooked. The transformer-assisted tracking framework is trained end to end.

Proceedings Article

Unsupervised Deep Tracking

Ning Wang, +5 more

TL;DR: The proposed unsupervised tracker achieves the baseline accuracy of fully supervised trackers, which require complete and accurate labels during training, and shows potential for leveraging unlabeled or weakly labeled data to further improve tracking accuracy.

Proceedings Article

Graph Convolutional Tracking

Junyu Gao, +2 more

TL;DR: GCT jointly incorporates two types of graph convolutional networks (GCNs) into a Siamese framework for target appearance modeling, and adopts a spatial-temporal GCN to model the structured representation of historical target exemplars.

Book Chapter

Learning Dynamic Memory Networks for Object Tracking

Tianyu Yang, +1 more

TL;DR: In this paper, a dynamic memory network is proposed to adapt the template to the target's appearance variations during tracking. An LSTM serves as the memory controller: its input is the search feature map, and its outputs are the control signals for reading from and writing to the memory block.

References

Proceedings Article

ImageNet Classification with Deep Convolutional Neural Networks

Alex Krizhevsky, +2 more

TL;DR: The authors achieve state-of-the-art ImageNet classification with a deep convolutional neural network consisting of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax.

Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

Karen Simonyan, +1 more

TL;DR: In this paper, the authors investigated the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting and showed that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 layers.

Book

Reinforcement Learning: An Introduction

Richard S. Sutton, +1 more

TL;DR: This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, which ranges from the history of the field's intellectual foundations to the most recent developments and applications.

Proceedings Article

Histograms of oriented gradients for human detection

Navneet Dalal, +1 more

TL;DR: It is shown experimentally that grids of histograms of oriented gradient (HOG) descriptors significantly outperform existing feature sets for human detection, and the influence of each stage of the computation on performance is studied.

Journal Article

ImageNet Large Scale Visual Recognition Challenge

Olga Russakovsky, +11 more

01 Dec 2015, International Journal of Computer Vision

TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) is a benchmark in object category classification and detection on hundreds of object categories and millions of images. It has been run annually from 2010 to present, attracting participation from more than fifty institutions.

Related Papers (5)

Learning Multi-domain Convolutional Neural Networks for Visual Tracking

Fully-Convolutional Siamese Networks for Object Tracking

Object Tracking Benchmark

Online Object Tracking: A Benchmark

ECO: Efficient Convolution Operators for Tracking