
Video tracking

About: Video tracking is a research topic. Over its lifetime, 37,017 publications have been published within this topic, receiving 735,989 citations.


Papers
Proceedings ArticleDOI
16 Jun 2012
TL;DR: An actionlet ensemble model is learnt to represent each action and to capture the intra-class variance, and novel features that are suitable for depth data are proposed.
Abstract: Human action recognition is an important yet challenging task. The recently developed commodity depth sensors open up new possibilities of dealing with this problem but also present some unique challenges. The depth maps captured by the depth cameras are very noisy, and the 3D positions of the tracked joints may be completely wrong if serious occlusions occur, which increases the intra-class variations in the actions. In this paper, an actionlet ensemble model is learnt to represent each action and to capture the intra-class variance. In addition, novel features that are suitable for depth data are proposed. They are robust to noise, invariant to translational and temporal misalignments, and capable of characterizing both the human motion and the human-object interactions. The proposed approach is evaluated on two challenging action recognition datasets captured by commodity depth cameras, and on another dataset captured by a MoCap system. The experimental evaluations show that the proposed approach achieves superior performance to state-of-the-art algorithms.

1,578 citations
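The abstract does not spell out the proposed features; as one illustration of the kind of translation-invariant skeleton feature the paper builds on, here is a minimal sketch that computes pairwise relative 3D joint positions from tracked skeleton data. The array shapes and the function name are assumptions for illustration; the paper's full pipeline (local occupancy patterns, Fourier temporal pyramid, actionlet mining) is omitted.

```python
import numpy as np

def relative_joint_features(skeleton):
    """Pairwise relative 3D joint positions over a frame sequence.

    skeleton: array of shape (T, J, 3) -- T frames, J tracked joints.
    Returns an array of shape (T, J*(J-1)//2, 3): the difference between
    every pair of joints, which cancels any global translation.
    """
    T, J, _ = skeleton.shape
    pairs = [(i, j) for i in range(J) for j in range(i + 1, J)]
    return np.stack([skeleton[:, i] - skeleton[:, j] for i, j in pairs], axis=1)

# Toy usage: 30 frames of a 20-joint skeleton.
seq = np.random.randn(30, 20, 3)
print(relative_joint_features(seq).shape)  # (30, 190, 3)
```

Because every pairwise difference cancels a global offset of the skeleton, the representation is invariant to translation; the robustness to temporal misalignment claimed in the abstract comes from the temporal pyramid step left out here.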

Book ChapterDOI
07 Oct 2012
TL;DR: A simple yet effective and efficient tracking algorithm with an appearance model based on features extracted from the multi-scale image feature space with a data-independent basis, which performs favorably against state-of-the-art algorithms on challenging sequences in terms of efficiency, accuracy and robustness.
Abstract: It is a challenging task to develop effective and efficient appearance models for robust object tracking due to factors such as pose variation, illumination change, occlusion, and motion blur. Existing online tracking algorithms often update models with samples from observations in recent frames. While much success has been demonstrated, numerous issues remain to be addressed. First, while these adaptive appearance models are data-dependent, there is not a sufficient amount of data for online algorithms to learn from at the outset. Second, online tracking algorithms often encounter the drift problem: as a result of self-taught learning, mis-aligned samples are likely to be added to the model and degrade it. In this paper, we propose a simple yet effective and efficient tracking algorithm with an appearance model based on features extracted from the multi-scale image feature space with a data-independent basis. Our appearance model employs non-adaptive random projections that preserve the structure of the image feature space of objects. A very sparse measurement matrix is adopted to efficiently extract the features for the appearance model. We compress samples of foreground targets and the background using the same sparse measurement matrix. The tracking task is formulated as binary classification via a naive Bayes classifier with online update in the compressed domain. The proposed compressive tracking algorithm runs in real time and performs favorably against state-of-the-art algorithms on challenging sequences in terms of efficiency, accuracy and robustness.

1,538 citations
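To make the compressed-domain idea concrete, here is a minimal sketch of a very sparse random measurement matrix and the per-dimension naive-Bayes scoring it feeds. The dimensions, sparsity parameter, and function names are illustrative assumptions, with generic arrays standing in for the paper's multi-scale filter responses; this is not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def sparse_measurement_matrix(n_features, n_compressed, s=None):
    """Very sparse random projection: entries are sqrt(s) * {+1, -1} with
    probability 1/(2s) each and 0 otherwise, so the matrix is mostly zero
    and projecting a feature vector is cheap."""
    if s is None:
        s = n_features // 4  # illustrative sparsity choice
    probs = [1 / (2 * s), 1 - 1 / s, 1 / (2 * s)]
    return rng.choice([np.sqrt(s), 0.0, -np.sqrt(s)],
                      size=(n_compressed, n_features), p=probs)

def naive_bayes_score(v, mu_pos, sig_pos, mu_neg, sig_neg):
    """Log-likelihood ratio under independent per-dimension Gaussians
    (foreground target vs background), as in compressed-domain tracking."""
    def log_gauss(x, mu, sig):
        return -0.5 * np.log(2 * np.pi * sig**2) - (x - mu) ** 2 / (2 * sig**2)
    return np.sum(log_gauss(v, mu_pos, sig_pos) - log_gauss(v, mu_neg, sig_neg))

# Compress a high-dimensional patch descriptor, then score it.
R = sparse_measurement_matrix(n_features=10_000, n_compressed=50)
v = R @ rng.random(10_000)  # 50-D compressed feature vector
print(naive_bayes_score(v, np.full(50, 0.1), np.ones(50),
                        np.zeros(50), np.ones(50)))
```

The same fixed matrix R compresses both foreground and background samples, so the classifier is learned entirely in the 50-dimensional compressed space; online updates of the Gaussian parameters are omitted here.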

Journal ArticleDOI
TL;DR: A framework for adaptive visual object tracking based on structured output prediction that outperforms state-of-the-art trackers on various benchmark videos and can easily incorporate additional features and kernels, resulting in increased tracking performance.
Abstract: Adaptive tracking-by-detection methods are widely used in computer vision for tracking arbitrary objects. Current approaches treat the tracking problem as a classification task and use online learning techniques to update the object model. However, for these updates to happen one needs to convert the estimated object position into a set of labelled training examples, and it is not clear how best to perform this intermediate step. Furthermore, the objective for the classifier (label prediction) is not explicitly coupled to the objective for the tracker (estimation of object position). In this paper, we present a framework for adaptive visual object tracking based on structured output prediction. By explicitly allowing the output space to express the needs of the tracker, we avoid the need for an intermediate classification step. Our method uses a kernelised structured output support vector machine (SVM), which is learned online to provide adaptive tracking. To allow our tracker to run at high frame rates, we (a) introduce a budgeting mechanism that prevents the unbounded growth in the number of support vectors that would otherwise occur during tracking, and (b) show how to implement tracking on the GPU. Experimentally, we show that our algorithm is able to outperform state-of-the-art trackers on various benchmark videos. Additionally, we show that we can easily incorporate additional features and kernels into our framework, which results in increased tracking performance.

1,507 citations
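As a sketch of what structured output prediction buys over a binary classifier, the snippet below scores whole candidate boxes directly and uses an overlap-based loss to couple learning to localisation quality. The kernelised SVM update and the support-vector budgeting described in the abstract are omitted, and every name here is illustrative, not the authors' API.

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two boxes given as (x, y, w, h)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2 = min(a[0] + a[2], b[0] + b[2])
    y2 = min(a[1] + a[3], b[1] + b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def predict_translation(score_fn, prev_box, offsets):
    """Structured prediction: pick the offset whose whole box scores
    highest, rather than thresholding per-sample binary labels."""
    boxes = [(prev_box[0] + dx, prev_box[1] + dy, prev_box[2], prev_box[3])
             for dx, dy in offsets]
    return max(boxes, key=score_fn)

def structured_loss(y_true, y_cand):
    """Loss used during learning: penalise candidates by overlap, which
    ties the training objective directly to localisation."""
    return 1.0 - iou(y_true, y_cand)

# Toy usage with a hand-written score favouring boxes near (12, 10).
box = predict_translation(lambda b: -abs(b[0] - 12) - abs(b[1] - 10),
                          prev_box=(10, 10, 40, 60),
                          offsets=[(dx, dy) for dx in (-2, 0, 2) for dy in (-2, 0, 2)])
print(box, structured_loss((12, 10, 40, 60), box))
```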

Proceedings ArticleDOI
01 Jun 2019
TL;DR: This work shows that the core reason Siamese trackers still have an accuracy gap is the lack of strict translation invariance, and proposes a new model architecture that performs depth-wise and layer-wise aggregations, which not only improves accuracy but also reduces model size.
Abstract: Siamese network based trackers formulate tracking as cross-correlation between convolutional features of the target template and the search region. However, Siamese trackers still have an accuracy gap compared with state-of-the-art algorithms, and they cannot take advantage of features from deep networks such as ResNet-50 or deeper. In this work we show that the core reason is the lack of strict translation invariance. Through comprehensive theoretical analysis and experimental validation, we break this restriction with a simple yet effective spatially aware sampling strategy and successfully train a ResNet-driven Siamese tracker with significant performance gain. Moreover, we propose a new model architecture to perform depth-wise and layer-wise aggregations, which not only further improves the accuracy but also reduces the model size. We conduct extensive ablation studies to demonstrate the effectiveness of the proposed tracker, which currently obtains the best results on four large tracking benchmarks: OTB2015, VOT2018, UAV123, and LaSOT. Our model will be released to facilitate further studies of this problem.

1,478 citations
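The depth-wise aggregation mentioned above correlates each template channel only with the matching channel of the search features. A minimal PyTorch sketch, with feature shapes chosen for illustration; the grouped-convolution trick is a common way to implement this and is not taken from the authors' released code.

```python
import torch
import torch.nn.functional as F

def depthwise_xcorr(search, template):
    """Depth-wise cross-correlation between Siamese feature maps.

    search:   (B, C, Hs, Ws) features of the search region
    template: (B, C, Ht, Wt) features of the target template
    returns:  (B, C, Hs-Ht+1, Ws-Wt+1) per-channel response maps
    """
    b, c, h, w = search.shape
    x = search.reshape(1, b * c, h, w)                    # fold batch into channels
    kernel = template.reshape(b * c, 1, *template.shape[2:])
    out = F.conv2d(x, kernel, groups=b * c)               # channel-by-channel correlation
    return out.reshape(b, c, out.shape[2], out.shape[3])

# Toy usage with ResNet-like channel depth.
resp = depthwise_xcorr(torch.randn(2, 256, 31, 31), torch.randn(2, 256, 7, 7))
print(resp.shape)  # torch.Size([2, 256, 25, 25])
```

Keeping the correlation per channel preserves far more information than collapsing to a single response map, and it uses the template itself as a grouped convolution kernel, so no extra parameters are introduced at this step.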

Proceedings ArticleDOI
01 Sep 2009
TL;DR: A model of dynamic social behavior, inspired by models developed for crowd simulation, is introduced, trained on videos recorded from a bird's-eye view at busy locations, and applied as a motion model for multi-person tracking from a vehicle-mounted camera.
Abstract: Object tracking typically relies on a dynamic model to predict the object's location from its past trajectory. In crowded scenarios a strong dynamic model is particularly important, because more accurate predictions allow for smaller search regions, which greatly simplifies data association. Traditional dynamic models predict the location of each target solely from its own history, without taking the remaining scene objects into account; collisions are resolved only when they happen. Such an approach ignores important aspects of human behavior: people are driven by their future destination, take their environment into account, anticipate collisions, and adjust their trajectories at an early stage in order to avoid them. In this work, we introduce a model of dynamic social behavior, inspired by models developed for crowd simulation. The model is trained on videos recorded from a bird's-eye view at busy locations, and applied as a motion model for multi-person tracking from a vehicle-mounted camera. Experiments on real sequences show that accounting for social interactions and scene knowledge improves tracking performance, especially during occlusions.

1,372 citations
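As a rough illustration of why a social motion model helps, the sketch below replaces plain constant-velocity prediction with a hand-written repulsion from each neighbour's predicted point of closest approach. This is a crude stand-in for the learned interaction model in the paper; all parameters, names, and the force form are invented for the example.

```python
import numpy as np

def predict_with_avoidance(pos, vel, others_pos, others_vel,
                           dt=0.4, horizon=4, k=1.5):
    """Roll a pedestrian forward under constant velocity plus a simple
    repulsion that anticipates collisions instead of reacting to them."""
    p, v = pos.copy(), vel.copy()
    for _ in range(horizon):
        force = np.zeros(2)
        for q, w in zip(others_pos, others_vel):
            rel_p, rel_v = p - q, v - w
            denom = rel_v @ rel_v
            # Time of closest approach along the current relative motion.
            t_star = max(0.0, -(rel_p @ rel_v) / denom) if denom > 1e-9 else 0.0
            d = rel_p + t_star * rel_v          # predicted minimum separation
            dist = np.linalg.norm(d)
            if dist > 1e-9:
                force += k * np.exp(-dist) * d / dist  # push away, decaying with distance
        v = v + dt * force
        p = p + dt * v
    return p, v

# Two pedestrians on a collision course: the predicted path bends aside early.
p_new, v_new = predict_with_avoidance(
    np.array([0.0, 0.0]), np.array([1.0, 0.0]),
    others_pos=[np.array([4.0, 0.2])], others_vel=[np.array([-1.0, 0.0])])
print(p_new, v_new)
```

Feeding such anticipatory predictions to a tracker shrinks the search region per target, which is exactly the data-association benefit the abstract describes.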


Network Information
Related Topics (5)
Feature (computer vision): 128.2K papers, 1.7M citations (94% related)
Feature extraction: 111.8K papers, 2.1M citations (94% related)
Image segmentation: 79.6K papers, 1.8M citations (92% related)
Convolutional neural network: 74.7K papers, 2M citations (90% related)
Image processing: 229.9K papers, 3.5M citations (88% related)
Performance Metrics
No. of papers in the topic in previous years:

Year    Papers
2023    400
2022    998
2021    952
2020    1,219
2019    1,304
2018    1,327