German Conference on Pattern Recognition
About: The German Conference on Pattern Recognition is an academic conference. It publishes mainly in the areas of segmentation and computer science. Over its lifetime, the conference has published 392 papers, which have received 7,971 citations.
Papers
02 Sep 2014
TL;DR: A structured lighting system is presented for creating high-resolution stereo datasets of static indoor scenes with highly accurate ground-truth disparities, using novel techniques for efficient 2D subpixel correspondence search and self-calibration of cameras and projectors with modeling of lens distortion.
Abstract: We present a structured lighting system for creating high-resolution stereo datasets of static indoor scenes with highly accurate ground-truth disparities. The system includes novel techniques for efficient 2D subpixel correspondence search and self-calibration of cameras and projectors with modeling of lens distortion. Combining disparity estimates from multiple projector positions we are able to achieve a disparity accuracy of 0.2 pixels on most observed surfaces, including in half-occluded regions. We contribute 33 new 6-megapixel datasets obtained with our system and demonstrate that they present new challenges for the next generation of stereo algorithms.
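The 0.2-pixel accuracy above rests on subpixel disparity estimation. A common generic refinement (a sketch only, not the paper's actual 2D correspondence search; the function name and cost values are illustrative) fits a parabola through the matching cost at the best integer disparity and its two neighbours:

```python
import numpy as np

def subpixel_peak(costs, d_min):
    """Refine an integer disparity to subpixel precision by fitting a
    parabola through the matching cost at the best disparity and its
    two neighbours (a standard refinement, not the paper's exact method)."""
    d = int(np.argmin(costs))
    if d == 0 or d == len(costs) - 1:
        return float(d_min + d)  # no neighbours to fit against
    c_l, c_0, c_r = costs[d - 1], costs[d], costs[d + 1]
    denom = c_l - 2.0 * c_0 + c_r
    offset = 0.0 if denom == 0 else 0.5 * (c_l - c_r) / denom
    return d_min + d + offset

# Symmetric costs around index 2 give a zero subpixel offset:
print(subpixel_peak(np.array([5.0, 2.0, 1.0, 2.0, 5.0]), d_min=0))  # 2.0
```

Asymmetric costs shift the estimate between integer disparities, which is what lets dense matching land between pixels.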
1,071 citations
03 Sep 2013
TL;DR: The paper describes how to embody specific spatial relations in a representation called Spatial Pattern Templates (SPT), which allows us to capture regularity constraints of alignment and equal spacing in pairwise and ternary potentials.
Abstract: We propose a method for semantic parsing of images with regular structure. The structured objects are modeled in a densely connected CRF. The paper describes how to embody specific spatial relations in a representation called Spatial Pattern Templates (SPT), which allows us to capture regularity constraints of alignment and equal spacing in pairwise and ternary potentials.
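An equal-spacing constraint of the kind described can be pictured as a ternary energy term. The sketch below is an illustrative potential, not the paper's actual SPT formulation; the function name and the Gaussian form are assumptions:

```python
def equal_spacing_potential(x_i, x_j, x_k, sigma=1.0):
    """Illustrative ternary potential encouraging equal spacing of three
    aligned elements (e.g. facade windows along a row): low energy when
    x_j lies midway between x_i and x_k. Not the paper's exact SPT term."""
    residual = (x_j - x_i) - (x_k - x_j)
    return residual ** 2 / (2.0 * sigma ** 2)

print(equal_spacing_potential(0.0, 1.0, 2.0))  # 0.0 - perfectly equal spacing
print(equal_spacing_potential(0.0, 1.5, 2.0))  # 0.5 - middle element off-center
```

In a CRF, such a term would be summed over triples of neighbouring elements and minimized jointly with unary and pairwise terms.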
321 citations
02 Sep 2014
TL;DR: This paper follows a two-step approach where it first learns to predict a semantic representation from video and then generates natural language descriptions from it, modeling across-sentence consistency at the level of the SR by enforcing a consistent topic.
Abstract: Humans can easily describe what they see in a coherent way and at varying levels of detail. However, existing approaches for automatic video description focus on generating only single sentences and are not able to vary the descriptions’ level of detail. In this paper, we address both of these limitations: for a variable level of detail we produce coherent multi-sentence descriptions of complex videos. To understand the difference between detailed and short descriptions, we collect and analyze a video description corpus of three levels of detail. We follow a two-step approach where we first learn to predict a semantic representation (SR) from video and then generate natural language descriptions from it. For our multi-sentence descriptions we model across-sentence consistency at the level of the SR by enforcing a consistent topic. Human judges rate our descriptions as more readable, correct, and relevant than related work.
244 citations
03 Sep 2013
TL;DR: A comparative study of recursive Bayesian filters for pedestrian path prediction at short time horizons (< 2 s), covering Extended Kalman Filters based on single dynamical models and Interacting Multiple Models combining several such basic models (constant velocity/acceleration/turn).
Abstract: In the context of intelligent vehicles, we perform a comparative study on recursive Bayesian filters for pedestrian path prediction at short time horizons (< 2s). We consider Extended Kalman Filters (EKF) based on single dynamical models and Interacting Multiple Models (IMM) combining several such basic models (constant velocity/acceleration/turn). These are applied to four typical pedestrian motion types (crossing, stopping, bending in, starting). Position measurements are provided by an external state-of-the-art stereo vision-based pedestrian detector. We investigate the accuracy of position estimation and path prediction, and the benefit of the IMMs vs. the simpler single dynamical models. Special care is given to the proper sensor modeling and parameter optimization. The dataset and evaluation framework are made public to facilitate benchmarking.
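The simplest of the dynamical models compared above is constant velocity. A minimal linear Kalman filter sketch of that model (parameters, rate, and noise levels here are illustrative, not the paper's tuned values) predicts position at a short horizon by rolling the motion model forward:

```python
import numpy as np

dt = 0.1  # assumed 10 Hz position measurements
F = np.array([[1, 0, dt, 0],
              [0, 1, 0, dt],
              [0, 0, 1,  0],
              [0, 0, 0,  1]], dtype=float)  # state: [x, y, vx, vy]
H = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0]], dtype=float)   # only position is measured
Q = 0.05 * np.eye(4)                        # process noise (tuning knob)
R = 0.1 * np.eye(2)                         # measurement noise

def kf_step(x, P, z):
    # predict with the constant-velocity model
    x = F @ x
    P = F @ P @ F.T + Q
    # update with position measurement z
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (z - H @ x)
    P = (np.eye(4) - K @ H) @ P
    return x, P

def predict_path(x, horizon_s):
    """Roll the motion model forward to predict position at a short horizon."""
    for _ in range(int(horizon_s / dt)):
        x = F @ x
    return x[:2]

x = np.array([0.0, 0.0, 1.2, 0.0])  # pedestrian walking 1.2 m/s along x
P = np.eye(4)
for t in range(1, 11):  # feed ten consistent measurements
    x, P = kf_step(x, P, np.array([1.2 * t * dt, 0.0]))
print(predict_path(x, 2.0))  # about [3.6, 0] after 2 s at constant velocity
```

An IMM, as studied in the paper, would run several such filters (constant velocity, acceleration, turn) in parallel and mix their estimates by model likelihood.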
234 citations
12 Sep 2016
TL;DR: This work presents an approach that transfers the style from one image (for example, a painting) to a whole video sequence, and makes use of recent advances in style transfer in still images and proposes new initializations and loss functions applicable to videos.
Abstract: In the past, manually re-drawing an image in a certain artistic style required a professional artist and a long time. Doing this for a video sequence single-handed was beyond imagination. Nowadays computers provide new possibilities. We present an approach that transfers the style from one image (for example, a painting) to a whole video sequence. We make use of recent advances in style transfer in still images and propose new initializations and loss functions applicable to videos. This allows us to generate consistent and stable stylized video sequences, even in cases with large motion and strong occlusion. We show that the proposed method clearly outperforms simpler baselines both qualitatively and quantitatively.
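The stability across frames comes from temporal loss terms. The sketch below captures the general idea, penalizing the current stylized frame for deviating from the flow-warped previous stylized frame outside occluded regions; names, shapes, and normalization are illustrative, not the paper's exact formulation:

```python
import numpy as np

def temporal_loss(stylized_t, warped_prev, occ_mask):
    """Illustrative short-term temporal consistency penalty: the current
    stylized frame should match the previous stylized frame warped by
    optical flow, except where occ_mask is 0 (occlusion/unreliable flow)."""
    diff = (stylized_t - warped_prev) ** 2
    weighted = occ_mask[..., None] * diff  # mask is 1 where flow is reliable
    return weighted.sum() / max(occ_mask.sum() * stylized_t.shape[-1], 1)

h, w = 4, 4
frame = np.ones((h, w, 3))
warped = np.zeros((h, w, 3))
mask = np.zeros((h, w)); mask[:2] = 1.0  # only top half reliable
print(temporal_loss(frame, warped, mask))  # 1.0 over the penalized region
```

During optimization this term would be added to the usual content and style losses, so flicker is suppressed only where the flow warp is trustworthy.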
229 citations