
Showing papers by "Thomas Brox" published in 2006


Proceedings Article
01 Jan 2006
TL;DR: In this article, a variational model for optic flow computation based on non-linearised and higher order constancy assumptions is proposed, which is also capable of dealing with large displacements.
Abstract: In this paper, we suggest a variational model for optic flow computation based on non-linearised and higher order constancy assumptions. Besides the common grey value constancy assumption, we also propose gradient constancy as well as constancy of the Hessian and the Laplacian. Since the model strictly refrains from a linearisation of these assumptions, it is also capable of dealing with large displacements. For the minimisation of the rather complex energy functional, we present an efficient numerical scheme employing two nested fixed point iterations. Following a coarse-to-fine strategy, this yields a theoretical foundation for so-called warping techniques, which had hitherto been justified only on an experimental basis. Since our algorithm integrates various concepts, ranging from different constancy assumptions to numerical implementation issues, a detailed account of the effect of each of these concepts is included in the experimental section. The superior performance of the proposed method shows up in significantly smaller estimation errors compared to previous techniques. Further experiments also confirm excellent robustness under noise and insensitivity to parameter variations.

426 citations
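As a rough illustration of the kind of energy described in the abstract above (a sketch with assumed notation, not the authors' exact formulation), non-linearised grey value and gradient constancy can be combined with a robust smoothness term as

$$ E(u,v) = \int_\Omega \Psi\big( |I(x+w) - I(x)|^2 + \gamma\, |\nabla I(x+w) - \nabla I(x)|^2 \big)\, dx \;+\; \alpha \int_\Omega \Psi\big( |\nabla u|^2 + |\nabla v|^2 \big)\, dx, $$

where $w = (u,v,1)^\top$ is the flow, $\Psi(s^2) = \sqrt{s^2 + \epsilon^2}$ is a robust penaliser, and $\gamma, \alpha > 0$ are weights. Constancy of the Hessian or the Laplacian would enter the data term in the same non-linearised way; keeping $I(x+w)$ unexpanded is what allows large displacements and leads to the nested fixed point iterations mentioned above.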


Journal Article
TL;DR: A variational model for optic flow computation based on non-linearised and higher order constancy assumptions is proposed, including the common grey value constancy assumption as well as the constancy of the Hessian and the Laplacian.
Abstract: In this paper, we suggest a variational model for optic flow computation based on non-linearised and higher order constancy assumptions. Besides the common grey value constancy assumption, we also propose gradient constancy as well as constancy of the Hessian and the Laplacian. Since the model strictly refrains from a linearisation of these assumptions, it is also capable of dealing with large displacements. For the minimisation of the rather complex energy functional, we present an efficient numerical scheme employing two nested fixed point iterations. Following a coarse-to-fine strategy, this yields a theoretical foundation for so-called warping techniques, which had hitherto been justified only on an experimental basis. Since our algorithm integrates various concepts, ranging from different constancy assumptions to numerical implementation issues, a detailed account of the effect of each of these concepts is included in the experimental section. The superior performance of the proposed method shows up in significantly smaller estimation errors compared to previous techniques. Further experiments also confirm excellent robustness under noise and insensitivity to parameter variations.

388 citations


Journal Article
TL;DR: Nonlinear versions of the popular structure tensor, also known as the second moment matrix, are introduced, and it is shown that for corner detection based on nonlinear structure tensors, anisotropic nonlinear tensors give the most precise localisation.

260 citations
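For context, a sketch of the classical (linear) structure tensor that these nonlinear versions generalise, with notation assumed here: for an image presmoothed at noise scale $\sigma$, $I_\sigma = K_\sigma * I$,

$$ J_\rho = K_\rho * \big( \nabla I_\sigma \, \nabla I_\sigma^\top \big), $$

where $K_\rho$ is a Gaussian of integration scale $\rho$ applied componentwise. The nonlinear variants referred to in the TL;DR replace this fixed Gaussian averaging by a data-adaptive process such as nonlinear diffusion, so that orientation information is not blurred across discontinuities.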


Journal Article
TL;DR: This communication introduces a comparatively simple way to extend active contours to multiple regions while keeping the familiar quality of the two-phase case, and suggests a strategy to determine the optimum number of regions.
Abstract: The popularity of level sets for segmentation is mainly based on the sound and convenient treatment of regions and their boundaries. Unfortunately, this convenience has so far not carried over to level set methods applied to images with more than two regions. This communication introduces a comparatively simple way to extend active contours to multiple regions while keeping the familiar quality of the two-phase case. We further suggest a strategy to determine the optimum number of regions as well as initializations for the contours.

215 citations
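A generic multi-region energy of the kind such level set formulations minimise may serve as orientation (a hedged sketch, not necessarily the exact model of the paper):

$$ E(\Omega_1, \dots, \Omega_N) = \sum_{i=1}^{N} \Big( -\int_{\Omega_i} \log p_i(I(x))\, dx \;+\; \frac{\nu}{2}\, |\partial \Omega_i| \Big), $$

where the regions $\Omega_i$ partition the image domain, $p_i$ are region statistics (e.g. grey value or colour densities), and $|\partial \Omega_i|$ penalises boundary length. Representing each region by its own level set function and enforcing the partition during the evolution is one way to keep the convenience of the two-phase case.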


Journal Article
TL;DR: A variational method for the joint estimation of optic flow and the segmentation of the image into regions of similar motion is proposed; it makes use of the level set framework following the idea of motion competition, which is extended to non-parametric motion.
Abstract: We suggest a variational method for the joint estimation of optic flow and the segmentation of the image into regions of similar motion. It makes use of the level set framework following the idea of motion competition, which is extended to non-parametric motion. Moreover, we automatically determine an appropriate initialization and the number of regions by means of recursive two-phase splits with higher order region models. The method is further extended to the spatiotemporal setting and to the use of additional cues such as the gray value or color for the segmentation. The method stands up to a quantitative comparison with pure optic flow estimation techniques: for the popular Yosemite sequence with clouds we obtain the most accurate result reported so far. We further uncover a mistake in the ground truth; after coarsely correcting it, we obtain an average angular error below 1 degree.

114 citations
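In the spirit of motion competition with non-parametric motion, the joint energy roughly couples one dense flow field per region with boundaries of minimal length (a sketch under assumed notation, not the paper's exact functional):

$$ E(\{w_i\}, \{\Omega_i\}) = \sum_i \int_{\Omega_i} \Psi\big( |I(x + w_i) - I(x)|^2 \big)\, dx \;+\; \alpha \sum_i \int_{\Omega_i} \Psi\big( |\nabla u_i|^2 + |\nabla v_i|^2 \big)\, dx \;+\; \frac{\nu}{2} \sum_i |\partial \Omega_i|, $$

where $w_i = (u_i, v_i, 1)^\top$ is the flow supported on region $\Omega_i$ and the regions are represented by level set functions. Minimisation alternates between updating the flow fields inside fixed regions and moving the region boundaries according to which flow model explains the data best.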


Book Chapter
07 May 2006
TL;DR: In this paper, a variational method for the joint estimation of optic flow and segmentation of the image into regions of similar motion is proposed, which makes use of the level set framework following the idea of motion competition.
Abstract: We suggest a variational method for the joint estimation of optic flow and the segmentation of the image into regions of similar motion. It makes use of the level set framework following the idea of motion competition, which is extended to non-parametric motion. Moreover, we automatically determine an appropriate initialization and the number of regions by means of recursive two-phase splits with higher order region models. The method is further extended to the spatiotemporal setting and to the use of additional cues such as the gray value or color for the segmentation. The method stands up to a quantitative comparison with pure optic flow estimation techniques: for the popular Yosemite sequence with clouds we obtain the most accurate result reported so far. We further uncover a mistake in the ground truth; after coarsely correcting it, we obtain an average angular error below 1 degree.

111 citations


Book Chapter
01 Jan 2006
TL;DR: The authors present a survey of different model assumptions for the data and smoothness terms of variational optic flow functionals and illustrate their impact by experiments, restricting themselves to rotationally invariant convex functionals with a linearised data term.
Abstract: Optic flow describes the displacement field in an image sequence. Its reliable computation constitutes one of the main challenges in computer vision, and variational methods belong to the most successful techniques for achieving this goal. Variational methods recover the optic flow field as a minimiser of a suitable energy functional that involves data and smoothness terms. In this paper we present a survey of different model assumptions for each of these terms and illustrate their impact by experiments. We restrict ourselves to rotationally invariant convex functionals with a linearised data term. Such models are appropriate for small displacements. Regarding the data term, constancy assumptions on the brightness, the gradient, the Hessian, the gradient magnitude, the Laplacian, and the Hessian determinant are investigated. Local integration and nonquadratic penalisation are considered in order to improve robustness under noise. With respect to the smoothness term, we review a recent taxonomy that links regularisers to diffusion processes. It allows us to distinguish five types of regularisation strategies: homogeneous, isotropic image-driven, anisotropic image-driven, isotropic flow-driven, and anisotropic flow-driven. All these regularisations can be performed either in the spatial or the spatiotemporal domain. After discussing well-posedness results for convex optic flow functionals, we sketch some numerical ideas for achieving real-time performance on a standard PC by means of multigrid methods, and we survey a simple and intuitive confidence measure.

106 citations
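As a concrete, minimal instance of the surveyed framework, the following sketch implements the classical Horn-Schunck model (quadratic, linearised brightness constancy with homogeneous regularisation); it is meant as an illustration of the energy structure, not as any specific method from the chapter, and the function name, derivative approximations, and default parameters are assumptions.

```python
import numpy as np
from scipy.ndimage import convolve

def horn_schunck(I1, I2, alpha=15.0, n_iter=200):
    """Minimal Horn-Schunck optic flow: linearised brightness constancy
    combined with homogeneous (quadratic) regularisation."""
    I1 = I1.astype(np.float64)
    I2 = I2.astype(np.float64)

    # Spatial derivatives of the average frame, temporal derivative as a difference.
    Iy, Ix = np.gradient(0.5 * (I1 + I2))
    It = I2 - I1

    # Averaging kernel used in the fixed point iteration for the smoothness term.
    avg = np.array([[1/12, 1/6, 1/12],
                    [1/6,  0.0, 1/6 ],
                    [1/12, 1/6, 1/12]])

    u = np.zeros_like(I1)
    v = np.zeros_like(I1)
    den = alpha**2 + Ix**2 + Iy**2
    for _ in range(n_iter):
        u_bar = convolve(u, avg)
        v_bar = convolve(v, avg)
        # Update derived from the Euler-Lagrange equations of the energy.
        num = Ix * u_bar + Iy * v_bar + It
        u = u_bar - Ix * num / den
        v = v_bar - Iy * num / den
    return u, v
```

Usage would be `u, v = horn_schunck(frame1, frame2)` on two greyscale arrays; the robust data terms and image- or flow-driven regularisers surveyed in the chapter replace the quadratic terms assumed here.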


Book Chapter
01 Jan 2006
TL;DR: This chapter gives an overview of the different approaches to the computation of the structure tensor, with a focus on methods based on robust statistics and nonlinear diffusion.
Abstract: The structure tensor, also known as the second moment matrix or Förstner interest operator, is a very popular tool in image processing. Its purpose is the estimation of orientation and the local analysis of structure in general. It is based on the integration of data from a local neighborhood. Normally, this neighborhood is defined by a Gaussian window function, and the structure tensor is computed as the weighted sum within this window. Some recently proposed methods, however, adapt the computation of the structure tensor to the image data. There are several ways to do this. This chapter gives an overview of the different approaches, with a focus on methods based on robust statistics and nonlinear diffusion. Furthermore, the data-adaptive structure tensors are evaluated in several applications. The main focus here is on optic flow estimation, but texture analysis and corner detection are also considered.

102 citations
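A small numpy/scipy sketch of the classical Gaussian-windowed structure tensor that the chapter takes as its starting point (the data-adaptive variants replace the fixed Gaussian averaging; function names and parameters here are illustrative):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def structure_tensor(image, sigma=1.0, rho=3.0):
    """Classical linear structure tensor: presmooth with K_sigma,
    form the outer product of the gradient, average with K_rho."""
    I = gaussian_filter(image.astype(np.float64), sigma)  # noise scale
    Iy, Ix = np.gradient(I)                               # image gradient

    # Componentwise Gaussian averaging over the integration scale rho.
    Jxx = gaussian_filter(Ix * Ix, rho)
    Jxy = gaussian_filter(Ix * Iy, rho)
    Jyy = gaussian_filter(Iy * Iy, rho)
    return Jxx, Jxy, Jyy

def coherence(Jxx, Jxy, Jyy):
    """Difference of the two eigenvalues: high for oriented structure."""
    return np.sqrt((Jxx - Jyy) ** 2 + 4.0 * Jxy ** 2)
```

The returned components can feed orientation estimation, corner detection, or optic flow methods; in the data-adaptive versions discussed above, the `gaussian_filter` averaging step is replaced by robust or diffusion-based smoothing.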


Journal Article
TL;DR: This paper proposes to use two complementary types of features for pose tracking, such that one type makes up for the shortcomings of the other, and to employ the optic flow in order to compute additional point correspondences.
Abstract: Tracking the 3-D pose of an object requires correspondences between 2-D features in the image and their 3-D counterparts in the object model. A large variety of such features has been suggested in the literature. All of them have drawbacks in one situation or another, since their extraction in the image and/or the matching is prone to errors. In this paper, we propose to use two complementary types of features for pose tracking, such that one type makes up for the shortcomings of the other. Aside from the object contour, which is matched to a free-form object surface, we suggest employing the optic flow in order to compute additional point correspondences. Optic flow estimation is a mature research field with sophisticated algorithms available, and using a high-quality method ensures reliable matching. In our experiments we demonstrate the performance of our method and, in particular, the improvements due to the optic flow.

67 citations


Book Chapter
Andreas Wedel, Uwe Franke, Jens Klappstein, Thomas Brox, Daniel Cremers
12 Sep 2006
TL;DR: This paper deals with the detection of arbitrary static objects in traffic scenes from monocular video using structure from motion, and estimates the scene depth from the scaling of supervised image regions.
Abstract: This paper deals with the detection of arbitrary static objects in traffic scenes from monocular video using structure from motion. A camera in a moving vehicle observes the road course ahead. The camera translation in depth is known. Many structure from motion algorithms have been proposed for detecting moving or nearby objects. However, detecting stationary distant obstacles in the focus of expansion remains quite challenging due to very small subpixel motion between frames. In this work the scene depth is estimated from the scaling of supervised image regions. We generate obstacle hypotheses from these depth estimates in image space. A second step then tests these hypotheses by comparing them with the counter-hypothesis of a free driveway. The approach can detect obstacles already at distances of 50 m and more with a standard focal length. This early detection allows for driver warning and safety precautions in good time.

60 citations
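The basic pinhole geometry behind depth from region scaling (a sketch of the underlying relation, not the paper's full estimation procedure): if the camera translates by a known distance $\Delta z$ towards a static object and the image region covering the object scales by a factor $s$ between the two frames, then

$$ s = \frac{Z}{Z - \Delta z} \quad\Longrightarrow\quad Z = \frac{s\,\Delta z}{s - 1}, $$

where $Z$ is the object distance at the first frame. For distant obstacles near the focus of expansion, $s$ is close to 1, which is why the depth estimate is sensitive to small errors and why testing against the free-driveway counter-hypothesis is useful.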


01 Jan 2006
TL;DR: In this article, the authors proposed to use two complementary types of features for pose tracking, such that one type makes up for the shortcomings of the other, and they employed the optic flow to compute additional point correspondences.
Abstract: Tracking the 3-D pose of an object requires correspondences between 2-D features in the image and their 3-D counterparts in the object model. A large variety of such features has been suggested in the literature. All of them have drawbacks in one situation or another, since their extraction in the image and/or the matching is prone to errors. In this paper, we propose to use two complementary types of features for pose tracking, such that one type makes up for the shortcomings of the other. Aside from the object contour, which is matched to a free-form object surface, we suggest employing the optic flow in order to compute additional point correspondences. Optic flow estimation is a mature research field with sophisticated algorithms available, and using a high-quality method ensures reliable matching. In our experiments we demonstrate the performance of our method and, in particular, the improvements due to the optic flow.

Book Chapter
19 Jun 2006
TL;DR: This paper analyzes two conceptually different approaches for shape matching, the well-known iterated closest point (ICP) algorithm and variational shape registration via level sets, and shows that a combination of these methods improves the stability and convergence behavior of the pose estimation algorithm.
Abstract: In this paper, we analyze two conceptually different approaches for shape matching: the well-known iterated closest point (ICP) algorithm and variational shape registration via level sets. For the latter, we suggest using a numerical scheme which was introduced in the context of optic flow estimation. For the comparison, we focus on the application of shape matching in the context of pose estimation of 3-D objects by means of their silhouettes in stereo camera views. It turns out that both methods have their specific shortcomings. Since the pose estimation framework allows correspondences from two different methods to be combined, we show that such a combination improves the stability and convergence behavior of the pose estimation algorithm.
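A minimal sketch of the ICP half of the comparison, i.e. a generic point-to-point ICP for 3-D point sets with an SVD-based rigid alignment step (not the exact variant analysed in the paper; names and tolerances are assumptions):

```python
import numpy as np
from scipy.spatial import cKDTree

def best_fit_rigid(P, Q):
    """Least-squares rigid transform (R, t) mapping points P onto Q (Kabsch)."""
    cP, cQ = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cP).T @ (Q - cQ)
    U, _, Vt = np.linalg.svd(H)
    D = np.eye(3)
    D[2, 2] = np.sign(np.linalg.det(Vt.T @ U.T))  # avoid reflections
    R = Vt.T @ D @ U.T
    t = cQ - R @ cP
    return R, t

def icp(source, target, n_iter=50, tol=1e-6):
    """Iterated closest points: alternate nearest-neighbour matching
    and rigid alignment until the mean residual stops improving."""
    src = source.copy()
    tree = cKDTree(target)
    R_total, t_total = np.eye(3), np.zeros(3)
    prev_err = np.inf
    for _ in range(n_iter):
        dist, idx = tree.query(src)            # closest point correspondences
        R, t = best_fit_rigid(src, target[idx])
        src = src @ R.T + t                    # apply incremental transform
        R_total, t_total = R @ R_total, R @ t_total + t
        err = dist.mean()
        if abs(prev_err - err) < tol:
            break
        prev_err = err
    return R_total, t_total
```

In the pose estimation setting described above, the correspondences found this way can be mixed with those obtained from the level set based registration before solving for the pose.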

Journal Article
TL;DR: A local region based scale measure is proposed which exploits properties of a certain type of nonlinear diffusion, the so-called total variation (TV) flow; it turns out that one can gain a total speedup by a factor of 2 without losing any quality concerning the discrimination of textures.

Book Chapter
01 Jan 2006
TL;DR: A survey of relations between nonlinear diffusion filtering and wavelet shrinkage when space-discrete or fully discrete versions of nonlinear diffusion filters are considered, showing equivalence between soft Haar wavelet shrinkage and total variation (TV) diffusion for 2-pixel signals.
Abstract: Nonlinear diffusion filtering and wavelet shrinkage are two methods that serve the same purpose, namely discontinuity-preserving denoising. In this chapter we give a survey of relations between both paradigms when space-discrete or fully discrete versions of nonlinear diffusion filters are considered. For the case of space-discrete diffusion, we show equivalence between soft Haar wavelet shrinkage and total variation (TV) diffusion for 2-pixel signals. For the general case of N-pixel signals, this leads us to a numerical scheme for TV diffusion with many favourable properties. Both considerations are then extended to 2-D images, where an analytical solution for 2 × 2 pixel images serves as a building block for a wavelet-inspired numerical scheme for TV diffusion. When replacing space-discrete diffusion by a fully discrete one with an explicit time discretisation, we obtain a general relation between the shrinkage function of a shift-invariant Haar wavelet shrinkage on a single scale and the diffusivity of a nonlinear diffusion filter. This allows us to study novel, diffusion-inspired shrinkage functions with competitive performance, to suggest new shrinkage rules for 2-D images with better rotation invariance, and to propose coupled shrinkage rules for colour images where a desynchronisation of the colour channels is avoided. Finally we present a new result which shows that one is not restricted to shrinkage with Haar wavelets: by using wavelets with a higher number of vanishing moments, equivalences to higher-order diffusion-like PDEs are discovered.
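Two of the ingredients related in the chapter, written out as a sketch with assumed notation: soft wavelet shrinkage of a coefficient $d$ with threshold $\theta$ uses

$$ S_\theta(d) = \operatorname{sign}(d)\,\max(|d| - \theta,\, 0), $$

while TV diffusion evolves the signal according to

$$ \partial_t u = \operatorname{div}\!\left( \frac{\nabla u}{|\nabla u|} \right). $$

The chapter relates single-scale, shift-invariant soft Haar shrinkage of this type to explicit time steps of TV diffusion; the precise correspondence between the threshold $\theta$ and the diffusion step size follows the analysis referenced above.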

01 Jan 2006
TL;DR: This paper presents a silhouette-based human motion capture system comprising level set based silhouette extraction, a correspondence module that relates image data to model data, and a pose estimation module; the motion estimation system is compared with a marker-based tracking system.
Abstract: In this contribution we present a silhouette-based human motion capture system. The system components comprise silhouette extraction based on level sets, a correspondence module, which relates image data to model data, and a pose estimation module. Experiments are done in different camera setups, and we estimate a model with 21 degrees of freedom at up to two frames per second. To evaluate the stability of the proposed algorithm, we compare the motion estimation system with a marker-based tracking system. The results show the applicability of the system for marker-less sports movement analysis. We finally present extensions for motion capture in complex environments with changing lighting conditions and cluttered background. This paper is an extended version of [21], which was awarded the DAGM Main Prize at the annual symposium of the German pattern recognition society (DAGM) in Vienna, 2005.

Book Chapter
07 May 2006
TL;DR: In this article, the authors proposed to use two complementary types of features for pose tracking, such that one type makes up for the shortcomings of the other, and they employed the optic flow to compute additional point correspondences.
Abstract: Tracking the 3-D pose of an object requires correspondences between 2-D features in the image and their 3-D counterparts in the object model. A large variety of such features has been suggested in the literature. All of them have drawbacks in one situation or another, since their extraction in the image and/or the matching is prone to errors. In this paper, we propose to use two complementary types of features for pose tracking, such that one type makes up for the shortcomings of the other. Aside from the object contour, which is matched to a free-form object surface, we suggest employing the optic flow in order to compute additional point correspondences. Optic flow estimation is a mature research field with sophisticated algorithms available, and using a high-quality method ensures reliable matching. In our experiments we demonstrate the performance of our method and, in particular, the improvements due to the optic flow.

Book Chapter
12 Sep 2006
TL;DR: A probabilistic formulation of 3D segmentation given a series of images from calibrated cameras is proposed, which can reconstruct the mean intensity and variance of the extracted object and background and cope with noisy data.
Abstract: We propose a probabilistic formulation of 3D segmentation given a series of images from calibrated cameras. Instead of segmenting each image separately in order to build a 3D surface consistent with these segmentations, we compute the most probable surface that gives rise to the images. Additionally, our method can reconstruct the mean intensity and variance of the extracted object and background. Although it is designed for scenes where the objects can be distinguished visually from the background (i.e. images of piecewise homogeneous regions), the proposed algorithm can also cope with noisy data. We carry out the numerical implementation in the level set framework. Our experiments on synthetic data sets reveal favorable results compared to state-of-the-art methods, in particular in terms of robustness to noise and initialization.

01 Jan 2006
TL;DR: In this paper, prior knowledge about joint angle configurations is added to 3-D human pose tracking: training samples are used for a nonparametric Parzen density estimation in the 12-dimensional joint configuration space.
Abstract: The present paper considers supplementing 3-D human pose tracking with prior knowledge about joint angle configurations. Training samples obtained from an industrial marker-based tracking system are used for a nonparametric Parzen density estimation in the 12-dimensional joint configuration space. These learned probability densities constrain the image-driven joint angle estimates by drawing solutions towards familiar configurations. This prevents the method from producing unrealistic pose estimates due to unreliable image cues. Experiments on sequences with a human leg model reveal a considerably increased robustness, particularly in the presence of disturbed images and occlusions.
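A hedged numpy sketch of a nonparametric Parzen density of the kind described above, with an isotropic Gaussian kernel over the joint-angle vectors (the class name, kernel choice, and bandwidth are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

class ParzenPrior:
    """Kernel density estimate over training joint-angle vectors.
    The negative log density can serve as a prior energy that pulls
    pose estimates towards familiar configurations."""

    def __init__(self, samples, bandwidth=0.1):
        self.samples = np.asarray(samples, dtype=np.float64)  # shape (N, d), e.g. d = 12
        self.h = bandwidth

    def density(self, x):
        diff = self.samples - np.asarray(x, dtype=np.float64)
        sq = np.sum(diff ** 2, axis=1)
        d = self.samples.shape[1]
        norm = (2.0 * np.pi * self.h ** 2) ** (d / 2.0)
        return np.mean(np.exp(-0.5 * sq / self.h ** 2)) / norm

    def energy(self, x, eps=1e-300):
        # Negative log-likelihood, usable as an additive prior term.
        return -np.log(self.density(x) + eps)
```

In a tracking energy, `prior.energy(candidate_pose)` would be added, suitably weighted, to the image-driven terms so that solutions are drawn towards familiar configurations.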

Book Chapter
01 Jan 2006
TL;DR: This survey chapter reviews the most important PDEs for discontinuity-preserving denoising of tensor fields and considers isotropic and anisotropic diffusion filters and their corresponding variational methods, mean curvature motion, and self-snakes.
Abstract: Methods based on partial differential equations (PDEs) belong to those image processing techniques that can be extended in a particularly elegant way to tensor fields. In this survey chapter the most important PDEs for discontinuity-preserving denoising of tensor fields are reviewed such that the underlying design principles become evident. We consider isotropic and anisotropic diffusion filters and their corresponding variational methods, mean curvature motion, and self-snakes. These filters preserve positive semidefiniteness of any positive semidefinite initial tensor field. Finally we discuss geodesic active contours for segmenting tensor fields. Experiments are presented that illustrate the behaviour of all these methods.
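As a sketch of the design principle for the isotropic nonlinear case (notation assumed here): a tensor field $U = (u_{ij})$ is evolved channel-wise with a joint diffusivity,

$$ \partial_t u_{ij} = \operatorname{div}\!\left( g\!\left( \sum_{k,l} |\nabla u_{kl}|^2 \right) \nabla u_{ij} \right), $$

so that all channels share the same scalar diffusivity $g$ and therefore the same smoothing behaviour. This channel coupling is central to the preservation of positive semidefiniteness mentioned in the abstract; the anisotropic filters replace the scalar diffusivity by a diffusion tensor built from the joint structure of all channels.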

Book Chapter
12 Sep 2006
TL;DR: Training samples obtained from an industrial marker-based tracking system are used for a nonparametric Parzen density estimation in the 12-dimensional joint configuration space; the learned densities constrain the image-driven joint angle estimates by drawing solutions towards familiar configurations.
Abstract: The present paper considers supplementing 3-D human pose tracking with prior knowledge about joint angle configurations. Training samples obtained from an industrial marker-based tracking system are used for a nonparametric Parzen density estimation in the 12-dimensional joint configuration space. These learned probability densities constrain the image-driven joint angle estimates by drawing solutions towards familiar configurations. This prevents the method from producing unrealistic pose estimates due to unreliable image cues. Experiments on sequences with a human leg model reveal a considerably increased robustness, particularly in the presence of disturbed images and occlusions.

Book Chapter
01 Jan 2006
TL;DR: A new approach to automated muscle fiber analysis based on segmenting myofibers with combined region and edge-based active contours provides reliable and fully automated processing, thus enabling time-saving batch processing of entire biopsy samples stemming from routinely HE-stained cryostat sections.
Abstract: This paper presents a new approach to automated muscle fiber analysis based on segmenting myofibers with combined region and edge-based active contours. It provides reliable and fully automated processing, thus enabling time-saving batch processing of entire biopsy samples stemming from routinely HE-stained cryostat sections. The method combines color, texture, and edge cues in a level set based active contour model, followed by a refinement with morphological filters. False-positive segmentations are minimized compared to former methods. A quantitative comparison between manual and automated analysis of muscle fiber images did not reveal any significant differences.

Book Chapter
06 Nov 2006
TL;DR: In this paper, an approach to using prior knowledge in the particle filter framework for 3D tracking, i.e. estimating state parameters such as the joint angles of a 3D object, is presented.
Abstract: In this paper we present an approach to using prior knowledge in the particle filter framework for 3D tracking, i.e. estimating state parameters such as the joint angles of a 3D object. The probability of the object's states, including correlations between the state parameters, is learned a priori from training samples. We introduce a framework that integrates this knowledge into the family of particle filters and particularly into the annealed particle filter scheme. Furthermore, we show that the annealed particle filter also works with a variational model for level set based image segmentation that does not rely on background subtraction and, hence, does not depend on a static background. In our experiments, we use a four-camera set-up for tracking the lower part of a human body by a kinematic model with 18 degrees of freedom. We demonstrate the increased accuracy due to the prior knowledge and the robustness of our approach to image distortions. Finally, we compare the results of our multi-view tracking system quantitatively to the outcome of an industrial marker-based tracking system.
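A generic sketch of how a learned prior density can enter the weighting step of a plain bootstrap particle filter (for illustration only; this is not the annealed scheme of the paper, and `likelihood`, `prior_density`, and `diffuse` are hypothetical callables):

```python
import numpy as np

def particle_filter_step(particles, weights, likelihood, prior_density,
                         diffuse, rng=None):
    """One predict-weight-resample step with an additional learned prior.

    particles     : (N, d) array of state hypotheses (e.g. joint angles)
    likelihood    : callable, image-based likelihood of a state
    prior_density : callable, learned density over plausible states
    diffuse       : callable, adds process noise to propagated particles
    """
    if rng is None:
        rng = np.random.default_rng()

    # Predict: propagate hypotheses through the (noisy) motion model.
    particles = diffuse(particles)

    # Weight: combine the image cue with the learned configuration prior.
    w = np.array([likelihood(p) * prior_density(p) for p in particles])
    w *= weights
    w /= w.sum()

    # Resample proportionally to the weights, then reset to uniform weights.
    idx = rng.choice(len(particles), size=len(particles), p=w)
    return particles[idx], np.full(len(particles), 1.0 / len(particles))
```

The annealed variant described above repeats weighting, resampling, and diffusion over several annealing layers with progressively sharper weighting functions.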

Journal Article
TL;DR: It is shown that TV flow may act in a self-stabilising way: even if the total variation increases through the filtering process, the resulting oscillations remain bounded by a constant that is proportional to the ratio of mesh widths.
Abstract: The singular diffusion equation called total variation (TV) flow plays an important role in image processing and appears to be suitable for reducing oscillations in other types of data. Due to its singularity for zero gradients, numerical discretizations have to be chosen with care. We discuss different ways to implement TV flow numerically, and we show that a number of discrete versions of this equation may introduce oscillations such that the scheme is in general not TV-decreasing. On the other hand, we show that TV flow may act in a self-stabilising way: even if the total variation increases through the filtering process, the resulting oscillations remain bounded by a constant that is proportional to the ratio of mesh widths. For our analysis we restrict ourselves to the one-dimensional setting.
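For reference, the equation under discussion in its standard form (the regularisation shown is one common way to handle the singularity, not necessarily one of the discretisations analysed in the paper):

$$ \partial_t u = \operatorname{div}\!\left( \frac{\nabla u}{|\nabla u|} \right), \qquad u(\cdot, 0) = f, $$

which is singular where $\nabla u = 0$. Practical schemes often replace the diffusivity by $1 / \sqrt{|\nabla u|^2 + \epsilon^2}$ with a small $\epsilon > 0$; the paper's point is that the choice of discretisation decides whether the resulting scheme is actually TV-decreasing.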

01 Jan 2006
TL;DR: In this paper, the authors compare the iterated closest point (ICP) algorithm and variational shape registration via level sets in the context of pose estimation of 3-D objects by means of their silhouettes in stereo camera views, and show that a combination of the two improves the stability and convergence behavior of the pose estimation algorithm.
Abstract: In this paper, we analyze two conceptually different approaches for shape matching: the well-known iterated closest point (ICP) algorithm and variational shape registration via level sets. For the latter, we suggest using a numerical scheme which was introduced in the context of optic flow estimation. For the comparison, we focus on the application of shape matching in the context of pose estimation of 3-D objects by means of their silhouettes in stereo camera views. It turns out that both methods have their specific shortcomings. Since the pose estimation framework allows correspondences from two different methods to be combined, we show that such a combination improves the stability and convergence behavior of the pose estimation algorithm.

01 Jan 2006
TL;DR: An approach to using prior knowledge in the particle filter framework for 3D tracking, i.e. estimating state parameters such as the joint angles of a 3D object, is presented, and a framework is introduced that integrates this knowledge into the family of particle filters, particularly the annealed particle filter scheme.
Abstract: In this paper we present an approach to using prior knowledge in the particle filter framework for 3D tracking, i.e. estimating state parameters such as the joint angles of a 3D object. The probability of the object's states, including correlations between the state parameters, is learned a priori from training samples. We introduce a framework that integrates this knowledge into the family of particle filters and particularly into the annealed particle filter scheme. Furthermore, we show that the annealed particle filter also works with a variational model for level set based image segmentation that does not rely on background subtraction and, hence, does not depend on a static background. In our experiments, we use a four-camera set-up for tracking the lower part of a human body by a kinematic model with 18 degrees of freedom. We demonstrate the increased accuracy due to the prior knowledge and the robustness of our approach to image distortions. Finally, we compare the results of our multi-view tracking system quantitatively to the outcome of an industrial marker-based tracking system.