Book Chapter

The Eighth Visual Object Tracking VOT2020 Challenge Results

Matej Kristan, +109 more
- pp 547-601
TLDR
A significant novelty is the introduction of a new VOT short-term tracking evaluation methodology and of segmentation ground truth in the VOT-ST2020 challenge; bounding boxes will no longer be used in the VOT-ST challenges.
Abstract
The Visual Object Tracking challenge VOT2020 is the eighth annual tracker benchmarking activity organized by the VOT initiative. Results of 58 trackers are presented; many are state-of-the-art trackers published at major computer vision conferences or in journals in recent years. The VOT2020 challenge was composed of five sub-challenges focusing on different tracking domains: (i) the VOT-ST2020 challenge focused on short-term tracking in RGB, (ii) the VOT-RT2020 challenge focused on “real-time” short-term tracking in RGB, (iii) the VOT-LT2020 challenge focused on long-term tracking, namely coping with target disappearance and reappearance, (iv) the VOT-RGBT2020 challenge focused on short-term tracking in RGB and thermal imagery, and (v) the VOT-RGBD2020 challenge focused on long-term tracking in RGB and depth imagery. Only the VOT-ST2020 datasets were refreshed. A significant novelty is the introduction of a new VOT short-term tracking evaluation methodology and of segmentation ground truth in the VOT-ST2020 challenge; bounding boxes will no longer be used in the VOT-ST challenges. A new VOT Python toolkit that implements all these novelties was introduced. The performance of the tested trackers typically exceeds standard baselines by a wide margin. The source code for most of the trackers is publicly available from the VOT page. The dataset, the evaluation kit, and the results are publicly available at the challenge website (http://votchallenge.net).
