Fully-Convolutional Siamese Networks for Object Tracking

doi:10.1007/978-3-319-48881-3_56

Open Access, Book Chapter

- Vol. 9914, pp 850-865

TL;DR

A basic tracking algorithm is equipped with a novel fully-convolutional Siamese network trained end-to-end on the ILSVRC15 dataset for object detection in video and achieves state-of-the-art performance in multiple benchmarks.

Abstract:

The problem of arbitrary object tracking has traditionally been tackled by learning a model of the object’s appearance exclusively online, using as sole training data the video itself. Despite the success of these methods, their online-only approach inherently limits the richness of the model they can learn. Recently, several attempts have been made to exploit the expressive power of deep convolutional networks. However, when the object to track is not known beforehand, it is necessary to perform Stochastic Gradient Descent online to adapt the weights of the network, severely compromising the speed of the system. In this paper we equip a basic tracking algorithm with a novel fully-convolutional Siamese network trained end-to-end on the ILSVRC15 dataset for object detection in video. Our tracker operates at frame-rates beyond real-time and, despite its extreme simplicity, achieves state-of-the-art performance in multiple benchmarks.
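The abstract does not spell out how the Siamese network localizes the target. A minimal sketch follows, assuming the commonly described scheme in which the same embedding network is applied to a small exemplar image and a larger search image, and the two feature maps are cross-correlated to give a dense score map. The function names (`cross_correlation_score_map`, `locate_peak`), the array shapes, and the use of precomputed feature arrays in place of a trained network are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def cross_correlation_score_map(exemplar_feat, search_feat, bias=0.0):
    """Slide exemplar features over search features and return a score map.

    Both inputs are assumed to be (height, width, channels) feature maps
    produced by the same embedding network (hypothetical here).
    """
    h, w, c = exemplar_feat.shape
    H, W, C = search_feat.shape
    if c != C or h > H or w > W:
        raise ValueError("exemplar must fit inside search and share channels")

    scores = np.empty((H - h + 1, W - w + 1))
    for i in range(scores.shape[0]):
        for j in range(scores.shape[1]):
            # Inner product between the exemplar and one search window.
            window = search_feat[i:i + h, j:j + w, :]
            scores[i, j] = np.sum(window * exemplar_feat) + bias
    return scores


def locate_peak(score_map):
    """Return the (row, col) of the highest-scoring window."""
    return np.unravel_index(np.argmax(score_map), score_map.shape)


if __name__ == "__main__":
    # Toy data: place the exemplar features at offset (3, 5) in an empty map.
    exemplar = np.arange(1, 13, dtype=float).reshape(2, 2, 3)
    search = np.zeros((8, 8, 3))
    search[3:5, 5:7, :] = exemplar

    score_map = cross_correlation_score_map(exemplar, search)
    print(score_map.shape, locate_peak(score_map))
```

Because the score map is computed for every window in one pass, a single evaluation covers all translations of the target within the search region, which is one plausible reading of why such a tracker can run at high frame rates without online weight updates.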

Citations

Proceedings Article

High Performance Visual Tracking with Siamese Region Proposal Network

Bo Li, +4 more

TL;DR: The Siamese region proposal network (Siamese-RPN) is proposed, which is trained end-to-end offline with large-scale image pairs for visual object tracking and consists of a Siamese subnetwork for feature extraction and a region proposal subnetwork with a classification branch and a regression branch.

Proceedings Article

ECO: Efficient Convolution Operators for Tracking

Martin Danelljan, +3 more

TL;DR: This work revisits the core DCF formulation and introduces a factorized convolution operator, which drastically reduces the number of parameters in the model, and a compact generative model of the training sample distribution that significantly reduces memory and time complexity while providing better diversity of samples.

Posted Content

Exploring Simple Siamese Representation Learning

Xinlei Chen, +1 more

- 20 Nov 2020 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: Surprising empirical results are reported that simple Siamese networks can learn meaningful representations even using none of the following: (i) negative sample pairs, (ii) large batches, (iii) momentum encoders.

Proceedings Article

SiamRPN++: Evolution of Siamese Visual Tracking With Very Deep Networks

Bo Li, +5 more

TL;DR: This work proves that the core reason Siamese trackers still have an accuracy gap is the lack of strict translation invariance, and proposes a new model architecture that performs depth-wise and layer-wise aggregations, which not only improves accuracy but also reduces the model size.

Proceedings Article

End-to-End Representation Learning for Correlation Filter Based Tracking

Jack Valmadre, +4 more

TL;DR: In this paper, the Correlation Filter learner is interpreted as a differentiable layer in a deep neural network, which enables learning deep features that are tightly coupled to the correlation filter.

References

Proceedings Article

ImageNet Classification with Deep Convolutional Neural Networks

Alex Krizhevsky, +2 more

TL;DR: A deep convolutional neural network consisting of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax achieved state-of-the-art performance on ImageNet classification.

Proceedings Article

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

Sergey Ioffe, +1 more

TL;DR: Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.

Journal Article

ImageNet Large Scale Visual Recognition Challenge

Olga Russakovsky, +11 more

- 01 Dec 2015 -

International Journal of Computer Vision

TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) is a benchmark in object category classification and detection on hundreds of object categories and millions of images. It has been run annually from 2010 to present, attracting participation from more than fifty institutions.

Proceedings Article

Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification

Kaiming He, +3 more

TL;DR: A Parametric Rectified Linear Unit (PReLU) is proposed that improves model fitting with nearly zero extra computational cost and little overfitting risk; the resulting networks achieve a 4.94% top-5 test error on the ImageNet 2012 classification dataset.

Proceedings Article

FaceNet: A unified embedding for face recognition and clustering

Florian Schroff, +2 more

TL;DR: A system that directly learns a mapping from face images to a compact Euclidean space where distances directly correspond to a measure of face similarity, achieving state-of-the-art face recognition performance using only 128 bytes per face.

Related Papers (5)

Object Tracking Benchmark

Yi Wu, +2 more

- 01 Sep 2015 -

IEEE Transactions on Pattern Analysis an...

High-Speed Tracking with Kernelized Correlation Filters

João F. Henriques, +3 more

- 01 Mar 2015 -

IEEE Transactions on Pattern Analysis an...

Online Object Tracking: A Benchmark

ECO: Efficient Convolution Operators for Tracking

High Performance Visual Tracking with Siamese Region Proposal Network