Showing papers by "Thomas Sikora" published in 2017


Proceedings ArticleDOI
01 Aug 2017
TL;DR: This work presents a tracking-by-detection algorithm which can compete with more sophisticated approaches at a fraction of the computational cost and shows with thorough experiments its potential using a wide range of object detectors.
Abstract: Tracking-by-detection is a common approach to multi-object tracking. With the ever-increasing performance of object detectors, the basis for a tracker becomes much more reliable. In combination with commonly higher frame rates, this shifts the challenges a successful tracker has to address. That shift enables the deployment of much simpler tracking algorithms which can compete with more sophisticated approaches at a fraction of the computational cost. We present such an algorithm and show its potential with thorough experiments using a wide range of object detectors. The proposed method can easily run at 100K fps while outperforming the state-of-the-art on the DETRAC vehicle tracking dataset.

497 citations
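
The abstract above does not spell out the tracking algorithm, so the following is only a minimal sketch of what a very simple tracking-by-detection loop can look like. It assumes, as an illustration and not as the authors' method, that detections are linked from frame to frame by greedy intersection-over-union (IoU) matching. The names `iou`, `Track`, `track_detections` and the threshold `min_iou` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), an assumed box format


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


@dataclass
class Track:
    track_id: int
    boxes: List[Box] = field(default_factory=list)


def track_detections(frames: List[List[Box]], min_iou: float = 0.5) -> List[Track]:
    """Greedy association (illustrative assumption): extend each active track
    with the unmatched detection that overlaps its last box the most."""
    active: List[Track] = []
    finished: List[Track] = []
    next_id = 0
    for detections in frames:
        unmatched = list(detections)
        still_active: List[Track] = []
        for t in active:
            last = t.boxes[-1]
            best = max(unmatched, key=lambda d: iou(last, d), default=None)
            if best is not None and iou(last, best) >= min_iou:
                t.boxes.append(best)
                unmatched.remove(best)
                still_active.append(t)
            else:
                finished.append(t)  # no sufficiently overlapping detection
        for d in unmatched:  # every leftover detection starts a new track
            still_active.append(Track(next_id, [d]))
            next_id += 1
        active = still_active
    return finished + active


frames = [
    [(0, 0, 10, 10), (50, 50, 60, 60)],
    [(1, 0, 11, 10), (51, 50, 61, 60)],
    [(2, 0, 12, 10)],
]
tracks = sorted(track_detections(frames), key=lambda t: t.track_id)
```

A loop like this uses no image information at all, only box geometry, which is one plausible reason such trackers can be very fast. It depends on the detector being reliable and the frame rate being high enough that consecutive boxes of the same object overlap.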


Proceedings ArticleDOI
01 Aug 2017
TL;DR: The AVSS2017 Challenge on Advanced Traffic Monitoring, held in conjunction with the International Workshop on Traffic and Street Surveillance for Safety and Security (IWT4S), evaluates state-of-the-art object detection and multi-object tracking algorithms in the context of traffic surveillance.
Abstract: The rapid advances of transportation infrastructure have led to a dramatic increase in the demand for smart systems capable of monitoring traffic and street safety. Fundamental to these applications are a community-based evaluation platform and benchmark for object detection and multi-object tracking. To this end, we organize the AVSS2017 Challenge on Advanced Traffic Monitoring, in conjunction with the International Workshop on Traffic and Street Surveillance for Safety and Security (IWT4S), to evaluate state-of-the-art object detection and multi-object tracking algorithms in the context of traffic surveillance. Submitted algorithms are evaluated using the large-scale UA-DETRAC benchmark and evaluation protocol. The benchmark, the evaluation toolkit and the algorithm performance are publicly available from the website http://detrac-db.rit.albany.edu.

80 citations


Proceedings ArticleDOI
14 Sep 2017
TL;DR: An evolutionary algorithm-based framework to automatically optimize the CNN structure by means of hyper-parameters is proposed and extended towards a joint optimization of a committee of CNNs to leverage specialization and cooperation among the individual networks.
Abstract: In a broad range of computer vision tasks, convolutional neural networks (CNNs) are one of the most prominent techniques due to their outstanding performance. Yet it is not trivial to find the best performing network structure for a specific application because it is often unclear how the network structure relates to the network accuracy. We propose an evolutionary algorithm-based framework to automatically optimize the CNN structure by means of hyper-parameters. Further, we extend our framework towards a joint optimization of a committee of CNNs to leverage specialization and cooperation among the individual networks. Experimental results show a significant improvement over the state-of-the-art on the well-established MNIST dataset for hand-written digits recognition.

76 citations
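
As a rough illustration of the idea in the abstract above, the sketch below runs a small evolutionary search over CNN hyper-parameters. Everything in it is an assumption made for illustration: the search space, the mutation operator, and above all `toy_fitness`, which stands in for training a network and measuring its validation accuracy. It does not reproduce the authors' framework and does not cover the committee extension.

```python
import random
from typing import Callable, Dict, List

Genome = Dict[str, int]

# Hypothetical hyper-parameter ranges (inclusive); not taken from the paper.
SEARCH_SPACE = {
    "num_layers": (1, 8),
    "num_filters": (8, 128),
    "kernel_size": (1, 7),
}


def random_genome(rng: random.Random) -> Genome:
    """Draw every hyper-parameter uniformly from its range."""
    return {k: rng.randint(lo, hi) for k, (lo, hi) in SEARCH_SPACE.items()}


def mutate(genome: Genome, rng: random.Random) -> Genome:
    """Copy the genome and resample one randomly chosen hyper-parameter."""
    child = dict(genome)
    key = rng.choice(sorted(SEARCH_SPACE))
    lo, hi = SEARCH_SPACE[key]
    child[key] = rng.randint(lo, hi)
    return child


def evolve(fitness: Callable[[Genome], float], generations: int = 30,
           population_size: int = 12, n_parents: int = 4, seed: int = 0) -> Genome:
    """Elitist loop: keep the best genomes, refill with mutated copies."""
    rng = random.Random(seed)
    population: List[Genome] = [random_genome(rng) for _ in range(population_size)]
    for _ in range(generations):
        parents = sorted(population, key=fitness, reverse=True)[:n_parents]
        children = [mutate(rng.choice(parents), rng)
                    for _ in range(population_size - n_parents)]
        population = parents + children
    return max(population, key=fitness)


def toy_fitness(g: Genome) -> float:
    """Stand-in for 'train the CNN and return validation accuracy'.
    Peaks at an arbitrary, made-up configuration (4 layers, 64 filters, 3x3)."""
    return -(abs(g["num_layers"] - 4)
             + abs(g["num_filters"] - 64) / 16.0
             + abs(g["kernel_size"] - 3))


best = evolve(toy_fitness)
```

In a real setting the fitness call is by far the most expensive step, since each evaluation means training a network, so population size and generation count would be chosen with that budget in mind.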


Proceedings ArticleDOI
01 Aug 2017
TL;DR: The baseline GMPHD filter and its extension are evaluated on the UA-DETRAC benchmark, showing that combining both methods leads to a higher recall and a better quality of object tracks at the cost of increased computational complexity and increased sensitivity to false positives.
Abstract: This work applies the Gaussian Mixture Probability Hypothesis Density (GMPHD) Filter to multi-object tracking in video data. In order to take advantage of additional visual information, Kernelized Correlation Filters (KCF) are evaluated as a possible extension of the GMPHD tracking-by-detection scheme to enhance its performance. The baseline GMPHD filter and its extension are evaluated on the UA-DETRAC benchmark, showing that combining both methods leads to a higher recall and a better quality of object tracks at the cost of increased computational complexity and increased sensitivity to false positives.

66 citations


Journal ArticleDOI
TL;DR: This work presents a novel feature using Lagrangian direction fields that is based on a spatio-temporal model and uses appearance, background motion compensation, and long-term motion information to detect violent scenes in video footage.
Abstract: Lagrangian theory provides a rich set of tools for analyzing non-local, long-term motion information in computer vision applications. Based on this theory, we present a specialized Lagrangian technique for the automated detection of violent scenes in video footage. We present a novel feature using Lagrangian direction fields that is based on a spatio-temporal model and uses appearance, background motion compensation, and long-term motion information. To ensure appropriate spatial and temporal feature scales, we apply an extended bag-of-words procedure in a late-fusion manner as a classification scheme on a per-video basis. We demonstrate that the temporal scale, captured by the Lagrangian integration time parameter, is crucial for violence detection and show how it correlates to the spatial scale of characteristic events in the scene. The proposed system is validated on multiple public benchmarks and non-public, real-world data from the London Metropolitan Police. Our experiments confirm that the inclusion of Lagrangian measures is a valuable cue for automated violence detection and increases the classification performance considerably compared with the state-of-the-art methods.

61 citations


Proceedings ArticleDOI
01 Jul 2017
TL;DR: The proposed framework, called Steered Mixture-of-Experts (SMoE), enables a multitude of processing tasks on light fields using a single unified Bayesian model that takes into account different regions of the scene, their edges, and their development along the spatial and disparity dimensions.
Abstract: The proposed framework, called Steered Mixture-of-Experts (SMoE), enables a multitude of processing tasks on light fields using a single unified Bayesian model. The underlying assumption is that light field rays are instantiations of a non-linear or non-stationary random process that can be modeled by piecewise stationary processes in the spatial domain. As such, it is modeled as a space-continuous Gaussian Mixture Model. Consequently, the model takes into account different regions of the scene, their edges, and their development along the spatial and disparity dimensions. Applications presented include light field coding, depth estimation, edge detection, segmentation, and view interpolation. The representation is compact, which allows for very efficient compression yielding state-of-the-art coding results for low bit-rates. Furthermore, due to the statistical representation, a vast amount of information can be queried from the model even without having to analyze the pixel values. This allows for “blind” light field processing and classification.

37 citations


Proceedings ArticleDOI
01 Mar 2017
TL;DR: A deeper analysis of the tolerance of the activation function is given through recycling color models in video sequences, yielding a high quality reconstruction over a considerable range of frames.
Abstract: We propose a novel approach for modeling and coding color in images and video. Luminance is locally linearly correlated with chrominance, so color can be predicted from the luma value. Using the Steered Mixture-of-Experts (SMoE) approach, the image is viewed as a stochastic process over 5 random variables: the 2-D pixel location, 1 luminance value and 2 chrominance values. We model this process as a continuous joint density function by fitting a K-modal 5-D Gaussian Mixture Model (GMM). As such, the chroma values are predicted as the expectation of the conditional density. To validate, the technique was integrated within JPEG, showing PSNR gains in the lower bitrate regions. A deeper analysis of the tolerance of the activation function is given through recycling color models in video sequences, yielding a high quality reconstruction over a considerable range of frames.

4 citations
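
The prediction step in the abstract above, chroma as the expectation of the conditional density of a GMM, has a standard closed form. The block below writes it out under an assumed split of the five variables: $\mathbf{x}$ collects the 2-D pixel location and the luma value, and $\mathbf{y}$ the two chroma values. The notation ($\pi_k$, $\boldsymbol{\mu}_k$, $\Sigma_k$, $w_k$, $\mathbf{m}_k$) is introduced here for illustration and is not taken from the paper.

```latex
\begin{align}
p(\mathbf{x}, \mathbf{y})
  &= \sum_{k=1}^{K} \pi_k \,
     \mathcal{N}\!\left(
       \begin{bmatrix} \mathbf{x} \\ \mathbf{y} \end{bmatrix};
       \begin{bmatrix} \boldsymbol{\mu}_{k,x} \\ \boldsymbol{\mu}_{k,y} \end{bmatrix},
       \begin{bmatrix} \Sigma_{k,xx} & \Sigma_{k,xy} \\
                       \Sigma_{k,yx} & \Sigma_{k,yy} \end{bmatrix}
     \right), \\
\mathbf{m}_k(\mathbf{x})
  &= \boldsymbol{\mu}_{k,y}
     + \Sigma_{k,yx} \Sigma_{k,xx}^{-1} \left( \mathbf{x} - \boldsymbol{\mu}_{k,x} \right), \\
w_k(\mathbf{x})
  &= \frac{\pi_k \, \mathcal{N}\!\left( \mathbf{x}; \boldsymbol{\mu}_{k,x}, \Sigma_{k,xx} \right)}
          {\sum_{j=1}^{K} \pi_j \, \mathcal{N}\!\left( \mathbf{x}; \boldsymbol{\mu}_{j,x}, \Sigma_{j,xx} \right)}, \\
\hat{\mathbf{y}}(\mathbf{x})
  &= \mathbb{E}\left[ \mathbf{y} \mid \mathbf{x} \right]
   = \sum_{k=1}^{K} w_k(\mathbf{x}) \, \mathbf{m}_k(\mathbf{x}).
\end{align}
```

Each component contributes a linear predictor $\mathbf{m}_k$ of chroma from location and luma, which matches the abstract's premise of a locally linear luma-chroma relation. The weights $w_k$ sum to one and blend these local predictors softly across the image.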


Proceedings ArticleDOI
01 Aug 2017
TL;DR: This work proposes two different post-detection filters designed to enhance the performance of custom person detectors that do not impose a high computational load and are not limited to a specific detector method.
Abstract: Tracking-by-detection is becoming increasingly popular for visual pedestrian tracking applications. However, it requires accurate and reliable detections in order to obtain good results. In this work, we propose two different post-detection filters designed to enhance the performance of custom person detectors. Using a popular deformable-parts-based pedestrian detector as a baseline, a detailed comparison over multiple test videos is performed and the gain of both algorithms is demonstrated. Further analysis shows that the improved detection outcomes also lead to improved tracking results. We therefore recommend the proposed post-detection filters, as they do not impose a high computational load and are not limited to a specific detector method.

4 citations


Proceedings ArticleDOI
14 Sep 2017
TL;DR: This work proposes a new evaluation metric which is focused on an end-user application case and an evaluation protocol which eliminates uncertainties in previous performance assessments and shows the features of the novel metric on multiple datasets proving its advantages over previously used measures.
Abstract: Scientific interest in automated abandoned object detection algorithms using visual information is high and many related systems have been published in recent years. However, most evaluation techniques rely only on statistical evaluation on the object level. For this reason, and because benchmarks commonly contain only a few abandoned objects and evaluation procedures are not standardized, an objective performance comparison between different methods is generally hard. We propose a new evaluation metric which is focused on an end-user application case and an evaluation protocol which eliminates uncertainties in previous performance assessments. Using two variants of an abandoned object detection method, we show the features of the novel metric on multiple datasets, proving its advantages over previously used measures.

2 citations


Proceedings ArticleDOI
01 Oct 2017
TL;DR: A motion saliency model that exploits motion features at the spatial level and an approach for considering global motion in the temporal dimension are proposed, leading to further improvements in the accuracy of video quality assessment.
Abstract: This work focuses on using motion information to improve video quality assessment algorithms. The goal is to bring computational video quality assessment into closer agreement with the subjective evaluation of video quality. We propose a motion saliency model that exploits motion features at the spatial level and an approach for considering global motion in the temporal dimension, leading to further improvements in the accuracy of video quality assessment. We evaluate our approaches by integrating them into existing objective quality models and by comparing them to related state-of-the-art video quality assessment methods.