Open Access, Posted Content

LaSOT: A High-quality Benchmark for Large-scale Single Object Tracking

TLDR
LaSOT is a high-quality benchmark for large-scale single object tracking, consisting of 1,400 sequences with more than 3.5M frames in total.
Abstract
In this paper, we present LaSOT, a high-quality benchmark for Large-scale Single Object Tracking. LaSOT consists of 1,400 sequences with more than 3.5M frames in total. Each frame in these sequences is carefully and manually annotated with a bounding box, making LaSOT, to the best of our knowledge, the largest densely annotated tracking benchmark. The average video length of LaSOT is more than 2,500 frames, and each sequence comprises various challenges deriving from the wild, where target objects may disappear from and reappear in the view. By releasing LaSOT, we expect to provide the community with a large-scale, dedicated, high-quality benchmark for both the training of deep trackers and the veritable evaluation of tracking algorithms. Moreover, considering the close connection between visual appearance and natural language, we enrich LaSOT with additional language specifications, aiming to encourage the exploration of natural linguistic features for tracking. A thorough experimental evaluation of 35 tracking algorithms on LaSOT is presented with detailed analysis, and the results demonstrate that there is still substantial room for improvement.


Citations
Journal Article

GOT-10k: A Large High-Diversity Benchmark for Generic Object Tracking in the Wild

TL;DR: GOT-10k is a large tracking database that offers unprecedentedly wide coverage of common moving objects in the wild. It is the first video trajectory dataset that uses the semantic hierarchy of WordNet to guide class population, which ensures comprehensive and relatively unbiased coverage of diverse moving objects.
Proceedings Article

ATOM: Accurate Tracking by Overlap Maximization

TL;DR: ATOM is a novel tracking architecture consisting of dedicated target estimation and classification components; the target estimation component is trained to predict the overlap between the target object and an estimated bounding box.
Proceedings Article

Siam R-CNN: Visual Tracking by Re-Detection

TL;DR: This work presents Siam R-CNN, a Siamese re-detection architecture which unleashes the full power of two-stage object detection approaches for visual object tracking, and combines this with a novel tracklet-based dynamic programming algorithm to model the full history of both the object to be tracked and potential distractor objects.
Proceedings Article

The Seventh Visual Object Tracking VOT2019 Challenge Results

Matej Kristan, +179 more
TL;DR: The Visual Object Tracking challenge VOT2019 is the seventh annual tracker benchmarking activity organized by the VOT initiative. Results of 81 trackers are presented; many are state-of-the-art trackers published at major computer vision conferences or in journals in recent years.
Proceedings Article

Siamese Cascaded Region Proposal Networks for Real-Time Visual Tracking

TL;DR: C-RPN is a multi-stage tracking framework consisting of a sequence of RPNs cascaded from deep high-level to shallow low-level layers in a Siamese network.