Proceedings Article

Computationally efficient deep tracker: Guided MDNet

Abstract: The main objective of this paper is to propose an improvement to the existing Multi-Domain Convolutional Neural Network tracker (MDNet), which is used to track an unknown object in a video stream. MDNet handles major tracking challenges such as fast motion, background clutter, out-of-view targets, and scale variation through offline training and online tracking. The Convolutional Neural Network (CNN) is pre-trained offline on many videos with ground truth to obtain a target representation in the network. During online tracking, MDNet evaluates a large number of windows sampled at random around the previous target location to estimate the target in the current frame, which makes tracking computationally expensive at test time. The major contribution of the paper is to give MDNet guided samples rather than random samples, so that the computation and time required by the CNN during tracking are greatly reduced. The proposed algorithm is evaluated on videos from the ALOV300++ and VOT datasets, and the results are compared with state-of-the-art trackers.
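
The cost argument in the abstract can be made concrete with a small sketch. The abstract does not say how the guided samples are produced, so the `center_hint` argument below is a purely hypothetical stand-in for whatever guidance the paper uses. The function name, the Gaussian spread values, and the sample counts (256 and 32) are likewise assumptions chosen for illustration, not figures taken from this page. Each candidate window costs one CNN evaluation, so fewer, better-placed candidates mean less work per frame.

```python
import numpy as np


def sample_candidates(prev_box, n, rng, trans_std=0.3, scale_std=0.5, center_hint=None):
    """Draw n candidate boxes (x, y, w, h) around the previous target box.

    Hypothetical helper: centers are Gaussian around the previous center
    (or around center_hint when one is supplied), and the box size is
    jittered by a random scale factor. All default values are assumptions.
    """
    x, y, w, h = prev_box
    # Use the previous box center unless an external hint is provided.
    cx, cy = (x + w / 2.0, y + h / 2.0) if center_hint is None else center_hint
    r = (w + h) / 2.0  # mean side length sets the translation spread
    dx = rng.normal(0.0, trans_std * r, size=n)
    dy = rng.normal(0.0, trans_std * r, size=n)
    scale = 1.05 ** rng.normal(0.0, scale_std, size=n)
    ws, hs = w * scale, h * scale
    # Convert sampled centers back to top-left corners.
    return np.stack([cx + dx - ws / 2.0, cy + dy - hs / 2.0, ws, hs], axis=1)


rng = np.random.default_rng(0)
prev_box = (100.0, 80.0, 40.0, 60.0)

# Unguided: many windows spread widely around the previous location.
random_boxes = sample_candidates(prev_box, n=256, rng=rng)

# Guided (assumed): fewer windows, tightly clustered around a hinted center.
guided_boxes = sample_candidates(
    prev_box, n=32, rng=rng, trans_std=0.1, center_hint=(130.0, 112.0)
)

print(random_boxes.shape, guided_boxes.shape)
```

In this sketch the guided variant scores one eighth as many windows per frame. Whether accuracy holds up at that budget depends entirely on the quality of the hint, which is what the paper's evaluation on ALOV300++ and VOT would need to establish.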
Citations
Journal Article
TL;DR: In this paper, the YOLOv3 pretraining model is used for ship detection, recognition, and counting in the context of intelligent maritime surveillance, timely ocean rescue, and computer-aided decision-making.
Abstract: Automatic ship detection, recognition, and counting are crucial for intelligent maritime surveillance, timely ocean rescue, and computer-aided decision-making. YOLOv3 pretraining model is used for ...

7 citations

Book Chapter
03 Jul 2019
TL;DR: A novel face recognition method for population search and criminal pursuit in smart cities and a cloud server architecture for face recognition in smart city environments are proposed.
Abstract: Face recognition technology can be applied to many aspects of a smart city, and the combination of face recognition and deep learning can bring new applications to public security. Deep learning machine vision technology and video-based image retrieval technology can quickly and easily address the problem of finding missing children and arresting criminal suspects. The main purpose of this paper is to propose a novel face recognition method for population search and criminal pursuit in smart cities. In large and medium-sized security systems, the most similar face images can be accurately retrieved from tens of millions of photos. Such storage requires a powerful information processing center for storing and processing a variety of information. To fundamentally support the safe operation of a large system, a cloud-based network architecture is considered and a smart city cloud computing data center is built. In addition, this paper proposes a cloud server architecture for face recognition in smart city environments.

1 citation

01 Jan 2018
TL;DR: Visual tracking is a computer vision problem where the task is to follow a target through a video sequence.
Abstract: Visual tracking is a computer vision problem where the task is to follow a target through a video sequence. Tracking has many important real-world applications in several fields such as autonomous v ...

Cites methods from "Computationally efficient deep trac..."

  • ...Approaches such as MDnet [37] and SiamFC [2] train their networks to output the location of the target....

    [...]

References
Book Chapter
06 Sep 2014
TL;DR: It is shown that the proposed multi-expert restoration scheme significantly improves the robustness of the base tracker, especially in scenarios with frequent occlusions and repetitive appearance variations.
Abstract: We propose a multi-expert restoration scheme to address the model drift problem in online tracking. In the proposed scheme, a tracker and its historical snapshots constitute an expert ensemble, where the best expert is selected to restore the current tracker when needed based on a minimum entropy criterion, so as to correct undesirable model updates. The base tracker in our formulation exploits an online SVM on a budget algorithm and an explicit feature mapping method for efficient model update and inference. In experiments, our tracking method achieves substantially better overall performance than 32 trackers on a benchmark dataset of 50 video sequences under various evaluation settings. In addition, in experiments with a newly collected dataset of challenging sequences, we show that the proposed multi-expert restoration scheme significantly improves the robustness of our base tracker, especially in scenarios with frequent occlusions and repetitive appearance variations.

1,174 citations


"Computationally efficient deep trac..." refers methods in this paper

  • ...(MUlti-Store Tracker) [20], MEEM (Multiple Expert Entropy minimization) [13], SAMF (Scale Adaptive with Multiple Features) [23], DSST (Discriminative Scale Space Tracker) [24] and KCF [19]....

    [...]

Proceedings Article
13 Jun 2010
TL;DR: It is shown that the performance of a binary classifier can be significantly improved by the processing of structured unlabeled data, and a theory that formulates the conditions under which P-N learning guarantees improvement of the initial classifier is proposed and validated on synthetic and real data.
Abstract: This paper shows that the performance of a binary classifier can be significantly improved by the processing of structured unlabeled data, i.e. data are structured if knowing the label of one example restricts the labeling of the others. We propose a novel paradigm for training a binary classifier from labeled and unlabeled examples that we call P-N learning. The learning process is guided by positive (P) and negative (N) constraints which restrict the labeling of the unlabeled set. P-N learning evaluates the classifier on the unlabeled data, identifies examples that have been classified in contradiction with structural constraints and augments the training set with the corrected samples in an iterative process. We propose a theory that formulates the conditions under which P-N learning guarantees improvement of the initial classifier and validate it on synthetic and real data. P-N learning is applied to the problem of on-line learning of object detector during tracking. We show that an accurate object detector can be learned from a single example and an unlabeled video sequence where the object may occur. The algorithm is compared with related approaches and state-of-the-art is achieved on a variety of objects (faces, pedestrians, cars, motorbikes and animals).

1,165 citations


"Computationally efficient deep trac..." refers methods in this paper

  • ...In the case of discriminative methods, the model is built using the target and the background simultaneously [5], [4], [7], [11]....

    [...]

Proceedings Article
07 Dec 2015
TL;DR: In this paper, a spatial regularization component is introduced in the learning to penalize correlation filter coefficients depending on their spatial location, which allows the correlation filters to be learned on a significantly larger set of negative training samples, without corrupting the positive samples.
Abstract: Robust and accurate visual tracking is one of the most challenging computer vision problems. Due to the inherent lack of training data, a robust approach for constructing a target appearance model is crucial. Recently, discriminatively learned correlation filters (DCF) have been successfully applied to address this problem for tracking. These methods utilize a periodic assumption of the training samples to efficiently learn a classifier on all patches in the target neighborhood. However, the periodic assumption also introduces unwanted boundary effects, which severely degrade the quality of the tracking model. We propose Spatially Regularized Discriminative Correlation Filters (SRDCF) for tracking. A spatial regularization component is introduced in the learning to penalize correlation filter coefficients depending on their spatial location. Our SRDCF formulation allows the correlation filters to be learned on a significantly larger set of negative training samples, without corrupting the positive samples. We further propose an optimization strategy, based on the iterative Gauss-Seidel method, for efficient online learning of our SRDCF. Experiments are performed on four benchmark datasets: OTB-2013, ALOV++, OTB-2015, and VOT2014. Our approach achieves state-of-the-art results on all four datasets. On OTB-2013 and OTB-2015, we obtain an absolute gain of 8.0% and 8.2% respectively, in mean overlap precision, compared to the best existing trackers.

1,160 citations

Journal Article
TL;DR: A framework for learning robust, adaptive, appearance models to be used for motion-based tracking of natural objects to provide robustness in the face of image outliers, while adapting to natural changes in appearance such as those due to facial expressions or variations in 3D pose.
Abstract: We propose a framework for learning robust, adaptive, appearance models to be used for motion-based tracking of natural objects. The model adapts to slowly changing appearance, and it maintains a natural measure of the stability of the observed image structure during tracking. By identifying stable properties of appearance, we can weight them more heavily for motion estimation, while less stable properties can be proportionately downweighted. The appearance model involves a mixture of stable image structure, learned over long time courses, along with two-frame motion information and an outlier process. An online EM-algorithm is used to adapt the appearance model parameters over time. An implementation of this approach is developed for an appearance model based on the filter responses from a steerable pyramid. This model is used in a motion-based tracking algorithm to provide robustness in the face of image outliers, such as those caused by occlusions, while adapting to natural changes in appearance such as those due to facial expressions or variations in 3D pose.

1,142 citations

Proceedings Article
07 Dec 2015
TL;DR: The results suggest that activations from the first layer provide superior tracking performance compared to the deeper layers, and show that the convolutional features provide improved results compared to standard hand-crafted features.
Abstract: Visual object tracking is a challenging computer vision problem with numerous real-world applications. This paper investigates the impact of convolutional features for the visual tracking problem. We propose to use activations from the convolutional layer of a CNN in discriminative correlation filter based tracking frameworks. These activations have several advantages compared to the standard deep features (fully connected layers). Firstly, they mitigate the need of task specific fine-tuning. Secondly, they contain structural information crucial for the tracking problem. Lastly, these activations have low dimensionality. We perform comprehensive experiments on three benchmark datasets: OTB, ALOV300++ and the recently introduced VOT2015. Surprisingly, different to image classification, our results suggest that activations from the first layer provide superior tracking performance compared to the deeper layers. Our results further show that the convolutional features provide improved results compared to standard hand-crafted features. Finally, results comparable to state-of-the-art trackers are obtained on all three benchmark datasets.

961 citations


"Computationally efficient deep trac..." refers background or methods in this paper

  • ...The proposed method has been compared with the state of art trackers like MDNet, DeepSRDCF (Deep Spatially Regularized Discriminative Correlation Filters) [22], MUSTer (MUlti-Store Tracker) [20], MEEM (Multiple Expert Entropy minimization) [13], SAMF (Scale Adaptive with Multiple Features) [23], DSST (Discriminative Scale Space Tracker) [24] and KCF [19]....

    [...]

  • ...Correlation filter trackers outperform many other trackers in terms of their computational efficiency and competitive performance [22], [8], [19], [20]....

    [...]