Open Access Proceedings Article

SANet: Structure-Aware Network for Visual Tracking

Heng Fan, +1 more
pp. 2217-2224
TL;DR
SANet, as described in this paper, utilizes a recurrent neural network (RNN) to model object structure and incorporates it into a CNN to improve robustness to similar distractors, exploiting the fact that convolutional layers at different levels characterize the object from different perspectives.
Abstract
Convolutional neural networks (CNNs) have drawn increasing interest in visual tracking owing to their power in feature extraction. Most existing CNN-based trackers treat tracking as a classification problem. However, these trackers are sensitive to similar distractors because their CNN models mainly focus on inter-class classification. To address this problem, we use the self-structure information of the object to distinguish it from distractors. Specifically, we utilize a recurrent neural network (RNN) to model object structure and incorporate it into a CNN to improve its robustness to similar distractors. Considering that convolutional layers at different levels characterize the object from different perspectives, we use multiple RNNs to model object structure at the different levels respectively. Extensive experiments on three benchmarks, OTB100, TC-128 and VOT2015, show that the proposed algorithm outperforms other methods. Code is released at www.dabi.temple.edu/hbling/code/SANet/SANet.html.
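To make the multi-level design above concrete, here is a minimal PyTorch sketch of the general idea: an RNN summarizes the structure of the feature map at each convolutional level, and its output is fused with the pooled CNN features before classification. The GRU cells, the row-major spatial scan, mean pooling, concatenation-based fusion, and the `StructureAwareHead` name are assumptions made for illustration, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class StructureAwareHead(nn.Module):
    def __init__(self, channels, hidden=64):
        super().__init__()
        # one RNN per convolutional level; each reads its feature map as a
        # sequence of spatial positions (a simple row-major scan is assumed)
        self.rnns = nn.ModuleList([nn.GRU(c, hidden, batch_first=True) for c in channels])
        self.classifier = nn.Linear(sum(channels) + hidden * len(channels), 2)

    def forward(self, feature_maps):
        fused = []
        for fmap, rnn in zip(feature_maps, self.rnns):
            seq = fmap.flatten(2).transpose(1, 2)        # (B, H*W, C) spatial scan
            _, last = rnn(seq)                           # per-level structure summary
            fused.append(fmap.mean(dim=(2, 3)))          # pooled CNN appearance feature
            fused.append(last.squeeze(0))                # RNN structure feature
        return self.classifier(torch.cat(fused, dim=1))  # target vs. background scores

# Usage with two hypothetical feature levels:
# head = StructureAwareHead(channels=[256, 512])
# scores = head([torch.randn(1, 256, 14, 14), torch.randn(1, 512, 7, 7)])
```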


Citations
Journal Article

Deep visual tracking: Review and experimental comparison

TL;DR: The background of deep visual tracking is introduced, including the fundamental concepts of visual tracking and related deep learning algorithms, and the existing deep-learning-based trackers are categorized into three classes according to network structure, network function and network training.
Proceedings Article

Siamese Cascaded Region Proposal Networks for Real-Time Visual Tracking

TL;DR: C-RPN as discussed by the authors proposes a multi-stage tracking framework, which consists of a sequence of RPNs cascaded from deep high-level to shallow low-level layers in a Siamese network.
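As a rough illustration of the cascade described above, the sketch below filters and refines proposals stage by stage, with the stages ordered from deep, high-level features to shallow, low-level ones. The `score`, `refine`, and `threshold` members are hypothetical stand-ins for the real RPN components, not C-RPN's actual interface.

```python
def cascaded_rpn(anchors, stages):
    """Filter and refine anchors stage by stage; `stages` is assumed to be
    ordered from deep, high-level features to shallow, low-level ones."""
    proposals = anchors
    for rpn in stages:
        scored = [(rpn.score(box), rpn.refine(box)) for box in proposals]
        proposals = [box for s, box in scored if s > rpn.threshold]  # drop easy negatives early
    return proposals
```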
Proceedings Article

Graph Convolutional Tracking

TL;DR: The GCT jointly incorporates two types of Graph Convolutional Networks into a Siamese framework for target appearance modeling and adopts a spatial-temporal GCN to model the structured representation of historical target exemplars.
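For readers unfamiliar with graph convolutions, the snippet below shows a single generic graph-convolution step (neighbor averaging followed by a learned projection). How GCT builds its spatial-temporal graphs over target exemplars is not reproduced here, and the sizes in the example are arbitrary.

```python
import numpy as np

def gcn_layer(adj, feats, weights):
    """One graph-convolution step: average each node's neighbors' features,
    then apply a learned linear projection and a nonlinearity."""
    deg = adj.sum(axis=1, keepdims=True)
    return np.tanh((adj / deg) @ feats @ weights)

# Example: 4 exemplar parts, 8-d features, projected to 16-d
# h = gcn_layer(np.ones((4, 4)), np.random.randn(4, 8), np.random.randn(8, 16))
```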
Proceedings Article

Parallel Tracking and Verifying: A Framework for Real-Time and High Accuracy Visual Tracking

TL;DR: A parallel tracking and verifying (PTAV) framework is proposed, consisting of two components, a tracker T and a verifier V, working in parallel on two separate threads.
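A hedged sketch of the two-thread idea: a fast tracker processes every frame on the main thread while a verifier checks sampled results on a worker thread and issues corrections when it disagrees. The queue-based hand-off, the verify-every-N-frames policy, and the `track`/`accept`/`correct` interfaces are illustrative assumptions, not the paper's exact scheduling.

```python
import threading
import queue

def run_ptav(frames, tracker, verifier, verify_every=10):
    pending = queue.Queue()
    results = {}

    def verify_worker():
        while True:
            item = pending.get()
            if item is None:                      # shutdown signal
                break
            idx, frame, box = item
            if not verifier.accept(frame, box):   # verifier disagrees with the tracker
                results[idx] = verifier.correct(frame, box)

    worker = threading.Thread(target=verify_worker)
    worker.start()
    for idx, frame in enumerate(frames):
        results[idx] = tracker.track(frame)       # fast path: every frame
        if idx % verify_every == 0:
            pending.put((idx, frame, results[idx]))  # slow path: sampled frames
    pending.put(None)
    worker.join()
    return results
```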
Book Chapter

Real-Time MDNet

TL;DR: This work presents a fast and accurate visual tracking algorithm based on the multi-domain convolutional neural network (MDNet). It accelerates the feature extraction procedure and learns more discriminative models for instance classification, and it enhances the representation quality of the target and background by maintaining a high-resolution feature map with a large receptive field per activation.
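A dilated (atrous) convolution is one standard way to obtain the property mentioned above, a high-resolution feature map whose activations still have a large receptive field. The snippet below only demonstrates that property and makes no claim about the paper's exact layer configuration.

```python
import torch
import torch.nn as nn

# stride 1 with dilation: no downsampling, but a wider receptive field per activation
dilated = nn.Conv2d(256, 256, kernel_size=3, padding=2, dilation=2)
x = torch.randn(1, 256, 28, 28)
print(dilated(x).shape)  # torch.Size([1, 256, 28, 28]) -- spatial resolution preserved
```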
References
Proceedings Article

ImageNet Classification with Deep Convolutional Neural Networks

TL;DR: A large, deep convolutional neural network, consisting of five convolutional layers (some followed by max-pooling layers) and three fully-connected layers with a final 1000-way softmax, achieves state-of-the-art performance on ImageNet classification.
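The layer stack below matches that description (five convolutional layers, some followed by max-pooling, then three fully-connected layers ending in a 1000-way softmax). The channel widths and kernel sizes follow the commonly used AlexNet variant and should be read as assumptions rather than figures quoted from the paper; it expects 3x224x224 inputs.

```python
import torch.nn as nn

alexnet_like = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=11, stride=4, padding=2), nn.ReLU(),
    nn.MaxPool2d(3, stride=2),
    nn.Conv2d(64, 192, kernel_size=5, padding=2), nn.ReLU(),
    nn.MaxPool2d(3, stride=2),
    nn.Conv2d(192, 384, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(384, 256, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(256, 256, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(3, stride=2),
    nn.Flatten(),
    nn.Linear(256 * 6 * 6, 4096), nn.ReLU(),   # last feature map is 256x6x6 for 224x224 inputs
    nn.Linear(4096, 4096), nn.ReLU(),
    nn.Linear(4096, 1000),                     # 1000-way class scores
    nn.Softmax(dim=1),                         # final softmax over classes
)
```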
Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

TL;DR: In this paper, the authors investigated the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting and showed that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 layers.
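The configurations studied there are built by stacking small 3x3 convolutions between pooling stages, so depth can be increased simply by adding convolutions per block. The helper below sketches that pattern; the block counts shown for a 16-layer-style variant are illustrative, not the exact configuration tables from the paper.

```python
import torch.nn as nn

def vgg_stack(block_channels, convs_per_block):
    layers, in_ch = [], 3
    for out_ch, n in zip(block_channels, convs_per_block):
        for _ in range(n):
            layers += [nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1), nn.ReLU()]
            in_ch = out_ch
        layers.append(nn.MaxPool2d(2))   # halve resolution between blocks
    return nn.Sequential(*layers)

# A deeper variant simply uses more convolutions per block:
# features16 = vgg_stack([64, 128, 256, 512, 512], [2, 2, 3, 3, 3])
```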
Proceedings Article

Fully convolutional networks for semantic segmentation

TL;DR: The key insight is to build “fully convolutional” networks that take input of arbitrary size and produce correspondingly-sized output with efficient inference and learning.
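Because such a network contains only convolutional (and pooling) layers rather than a fixed-size fully-connected head, it accepts inputs of arbitrary spatial size and emits a correspondingly sized score map, as the toy example below demonstrates. The channel counts and the 21-class head are arbitrary choices for illustration.

```python
import torch
import torch.nn as nn

fcn_like = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 21, kernel_size=1),   # 1x1 convolution in place of a fixed-size classifier head
)
print(fcn_like(torch.randn(1, 3, 100, 150)).shape)  # torch.Size([1, 21, 100, 150])
print(fcn_like(torch.randn(1, 3, 64, 64)).shape)    # torch.Size([1, 21, 64, 64])
```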
Proceedings Article

Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation

TL;DR: R-CNN, as discussed by the authors, combines CNNs with bottom-up region proposals to localize and segment objects, and when labeled training data is scarce, supervised pre-training for an auxiliary task, followed by domain-specific fine-tuning, yields a significant performance boost.
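A hedged sketch of that pipeline: bottom-up proposals are cropped from the image, warped to the CNN input size, and scored by the network. `propose_regions`, `cnn.classify`, and the PIL-style `crop`/`resize` calls are hypothetical stand-ins, not the paper's actual components.

```python
def detect(image, propose_regions, cnn, input_size=(224, 224)):
    detections = []
    for box in propose_regions(image):              # bottom-up region proposals
        crop = image.crop(box).resize(input_size)   # warp each proposal to the CNN input size
        label, score = cnn.classify(crop)           # score the warped region
        if label != "background":
            detections.append((box, label, score))
    return detections
```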
Journal Article

Finding Structure in Time

TL;DR: A proposal along these lines, first described by Jordan (1986), is developed: recurrent links are used to provide networks with a dynamic memory, and a method for representing lexical categories and the type/token distinction is suggested.
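The "recurrent links as dynamic memory" idea can be shown in a few lines: context units hold a copy of the previous hidden state and feed back into the hidden layer at the next step, so each output depends on the whole history seen so far. The sizes, initialization, and tanh nonlinearity below are illustrative assumptions.

```python
import numpy as np

class ElmanCell:
    def __init__(self, n_in, n_hidden, seed=0):
        rng = np.random.default_rng(seed)
        self.w_in = rng.normal(scale=0.1, size=(n_hidden, n_in))
        self.w_ctx = rng.normal(scale=0.1, size=(n_hidden, n_hidden))
        self.context = np.zeros(n_hidden)           # the dynamic memory

    def step(self, x):
        h = np.tanh(self.w_in @ x + self.w_ctx @ self.context)
        self.context = h                            # copy hidden state back into context units
        return h

# Processing a sequence: each hidden state depends on everything seen so far.
# cell = ElmanCell(n_in=4, n_hidden=8)
# states = [cell.step(x) for x in np.eye(4)]
```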