Proceedings ArticleDOI

Whitened Expectation Propagation: Non-Lambertian Shape from Shading and Shadow

23 Jun 2013, pp. 1674-1681
TL;DR: This work proposes a variation of EP that exploits regularities in natural scene statistics to achieve run times that are linear in both number of pixels and clique size, and uses large, non-local cliques to exploit cast shadow, which is traditionally ignored in shape from shading.
Abstract: For problems over continuous random variables, MRFs with large cliques pose a challenge in probabilistic inference. Difficulties in performing optimization efficiently have limited the probabilistic models explored in computer vision and other fields. One inference technique that handles large cliques well is Expectation Propagation. EP offers run times independent of clique size, which instead depend only on the rank, or intrinsic dimensionality, of potentials. This property would be highly advantageous in computer vision. Unfortunately, for grid-shaped models common in vision, traditional Gaussian EP requires quadratic space and cubic time in the number of pixels. Here, we propose a variation of EP that exploits regularities in natural scene statistics to achieve run times that are linear in both number of pixels and clique size. We test these methods on shape from shading, and we demonstrate strong performance not only for Lambertian surfaces, but also on arbitrary surface reflectance and lighting arrangements, which requires highly non-Gaussian potentials. Finally, we use large, non-local cliques to exploit cast shadow, which is traditionally ignored in shape from shading.

Summary (2 min read)

1. Introduction

  • Probabilistic inference for large loopy graphical models has become an important subfield with a growing body of applications, including many in computer vision.
  • These methods have resulted in significant progress for several applications.
  • The principal difference between BP and Gaussian EP can thus be summarized by a trade-off in their respective approximating families: BP favors flexible non-Gaussian marginals, while Gaussian EP favors a flexible covariance structure.
  • Another possible explanation is that for a grid-based graphical model with D pixels, Gaussian EP requires O(D^2) space and a run time of O(D^3).
  • Finally, the authors use the method to efficiently perform inference over large cliques produced by cast shadows and by global spatial priors.
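
To make the quadratic space cost above concrete, here is a small back-of-the-envelope sketch. It assumes a dense D×D covariance stored as 8-byte floats; the function name and the storage format are illustrative assumptions, not details taken from the paper.

```python
def dense_covariance_bytes(height, width, bytes_per_entry=8):
    """Bytes needed to store a dense D x D covariance for a height x width image.

    Assumes one variable per pixel (D = height * width) and a fixed-size
    floating-point entry; both are illustrative assumptions.
    """
    d = height * width
    return d * d * bytes_per_entry


if __name__ == "__main__":
    for side in (64, 128, 256):
        gib = dense_covariance_bytes(side, side) / 2**30
        print(f"{side}x{side} image: {gib:.3f} GiB for the covariance alone")
```

Under these assumptions a 256×256 image has D = 65,536 variables, so the covariance alone takes 34,359,738,368 bytes (32 GiB), before any cubic-time matrix operations are performed.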

2. Expectation Propagation

  • The family P̃ is chosen so that E_P̃[τ_j(x)] can be estimated easily.
  • EP achieves this goal by approximating each potential function φ_i(x) with an exponential family distribution P̃_i(x_i | θ(i)).
  • Regardless of the rank of each potential, the covariance matrix of the posterior S remains full-rank, and must be stored as a D×D matrix.
  • For large problems with tens of thousands of variables or more, this becomes limiting.
  • When the underlying graphical model is highly sparse, such as a nearest-neighbor pairwise-connected MRF, each iteration can be performed in time O(D^1.5) [2].
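
The core step of Gaussian EP, replacing a potential with a Gaussian whose moments match, can be sketched in one dimension. This is a minimal illustration under stated assumptions, not the paper's implementation: the function name `match_moments`, the grid-based numerical integration, and the grid parameters are all hypothetical choices made for the example.

```python
import math


def match_moments(potential, cavity_mean, cavity_var, half_width=8.0, n=4001):
    """Project potential(x) * N(x; cavity_mean, cavity_var) onto a Gaussian.

    Returns the mean and variance of the (unnormalized) product, estimated
    with a Riemann sum on a uniform grid spanning +/- half_width cavity
    standard deviations. Illustrative only; the grid is an assumption.
    """
    sd = math.sqrt(cavity_var)
    lo = cavity_mean - half_width * sd
    step = 2.0 * half_width * sd / (n - 1)
    z = m1 = m2 = 0.0
    for k in range(n):
        x = lo + k * step
        # Unnormalized product of the potential and the Gaussian cavity.
        w = potential(x) * math.exp(-0.5 * (x - cavity_mean) ** 2 / cavity_var)
        z += w
        m1 += w * x
        m2 += w * x * x
    mean = m1 / z
    var = m2 / z - mean * mean
    return mean, var


if __name__ == "__main__":
    # A highly non-Gaussian potential: a hard constraint x > 0.
    step_potential = lambda x: 1.0 if x > 0 else 0.0
    print(match_moments(step_potential, 0.0, 1.0))
```

In a full EP loop this projection would be repeated for every potential, with the cavity obtained by dividing the current posterior by that potential's previous Gaussian approximation. For a hard constraint like the one above, the matched Gaussian has roughly the mean and variance of a half-normal distribution.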

3. Whitened EP

  • For many problems of computer vision, both the number of variables D and the number of potentials N grow linearly with the number of pixels.
  • Low-rank potentials of large clique size have a wide array of promising applications in computer vision [17, 10].
  • Expectation propagation can be made more efficient by limiting the forms of covariance structure expressible by S. Let S denote the covariance matrix for natural scenes.

4. Shape from Shading

  • Whitened EP permits inference over images in linear time with respect to both pixels and clique size.
  • In particular, the authors are interested in whether Gaussian message approximation will be effective when the potentials φ_i are highly non-Gaussian.
  • In recent years, several methods have been developed that solve the classical SfS problem well as long as surface reflectance R is assumed to be Lambertian [19, 17, 6, 3, 7].
  • For each pixel, one potential φ_R(p, q | i) enforces the surface normal to be consistent with the known pixel intensity i(x, y).
  • Whitened EP provides two benefits for spatial priors.
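
As an illustration of a per-pixel reflectance potential of the kind described above, the sketch below uses the standard Lambertian model with surface gradient (p, q) and normal proportional to (-p, -q, 1). The Gaussian penalty on the intensity mismatch, the noise level `sigma`, and both function names are assumptions made for this example; the summary does not state the exact form the authors use.

```python
import math


def lambertian_reflectance(p, q, light):
    """Lambertian shading for surface gradient (p, q) = (dz/dx, dz/dy).

    The unnormalized surface normal is (-p, -q, 1); `light` is a 3-vector
    pointing toward the light source. Returns the clamped cosine of the
    angle between the normal and the light direction.
    """
    lx, ly, lz = light
    norm_l = math.sqrt(lx * lx + ly * ly + lz * lz)
    norm_n = math.sqrt(p * p + q * q + 1.0)
    cos_angle = (-p * lx - q * ly + lz) / (norm_n * norm_l)
    return max(0.0, cos_angle)


def intensity_potential(p, q, intensity, light, sigma=0.05):
    """Hypothetical potential: Gaussian penalty on |R(p, q) - intensity|."""
    r = lambertian_reflectance(p, q, light)
    return math.exp(-0.5 * ((r - intensity) / sigma) ** 2)


if __name__ == "__main__":
    overhead = (0.0, 0.0, 1.0)
    # A flat patch lit from directly overhead is fully bright.
    print(lambertian_reflectance(0.0, 0.0, overhead))
    # A tilted patch is a poor explanation for a fully bright pixel.
    print(intensity_potential(1.0, 0.0, 1.0, overhead))
```

Even in this simple Lambertian case the potential is far from Gaussian in (p, q): every gradient on a closed curve of constant reflectance explains the observed intensity equally well.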

5. Conclusions

  • The methods in this paper reduce the run time of EP from cubic to linear in the number of pixels for visual inference, while retaining a run time that is linear in clique size.
  • The computational expense of inference for large cliques has prohibited the investigation of complex probabilistic models for vision.
  • The authors' hope is that whitened EP will facilitate further research in these directions.
  • Results for whitened EP on SfS show that the sacrifice in performance for this approach is small, even in problems with highly non-Gaussian potentials.
  • Performance remained strong for surfaces with arbitrary reflectance and arbitrary lighting, which is a novel finding in SfS.


Citations
Dissertation
31 May 2014
TL;DR: This thesis focuses on studying the statistical properties of single objects and their range images which can benefit shape inference techniques, including laser-acquired depth, binocular stereo, photometric stereo and High Dynamic Range (HDR) photography.
Abstract: Depth inference is a fundamental problem of computer vision with a broad range of potential applications. Monocular depth inference techniques, particularly shape from shading, date back to as early as the 40's when it was first used to study the shape of the lunar surface. Since then there has been ample research to develop depth inference algorithms using monocular cues. Most of these are based on physical models of image formation and rely on a number of simplifying assumptions that do not hold for real world and natural imagery. Very few make use of the rich statistical information contained in real world images and their 3D information. There have been a few notable exceptions though. The study of statistics of natural scenes has been concentrated on outdoor scenes which are cluttered. Statistics of scenes of single objects has been less studied, but is an essential part of daily human interaction with the environment. Inferring shape of single objects is a very important computer vision problem which has captured the interest of many researchers over the past few decades and has applications in object recognition, robotic grasping, fault detection and Content Based Image Retrieval (CBIR). This thesis focuses on studying the statistical properties of single objects and their range images which can benefit shape inference techniques. I acquired two databases: Single Object Range and HDR (SORH) and the Eton Myers Database of single objects, including laser-acquired depth, binocular stereo, photometric stereo and High Dynamic Range (HDR) photography. I took a data driven approach and studied the statistics of color and range images of real scenes of single objects along with whole 3D objects and uncovered

2 citations

Posted Content
TL;DR: In this article, an Expectation Propagation framework for image restoration with patch-based prior distributions approximates the posterior distributions using products of multivariate Gaussian densities; imposing structural constraints on the covariance matrices of these densities allows for greater scalability and distributed computation.
Abstract: This paper presents a new Expectation Propagation (EP) framework for image restoration using patch-based prior distributions. While Monte Carlo techniques are classically used to sample from intractable posterior distributions, they can suffer from scalability issues in high-dimensional inference problems such as image restoration. To address this issue, EP is used here to approximate the posterior distributions using products of multivariate Gaussian densities. Moreover, imposing structural constraints on the covariance matrices of these densities allows for greater scalability and distributed computation. While the method is naturally suited to handle additive Gaussian observation noise, it can also be extended to non-Gaussian noise. Experiments conducted for denoising, inpainting and deconvolution problems with Gaussian and Poisson noise illustrate the potential benefits of such flexible approximate Bayesian method for uncertainty quantification in imaging problems, at a reduced computational cost compared to sampling techniques.
Dissertation
31 Aug 2013
TL;DR: This research builds an intelligent system based on brachiopod fossil images and their descriptions published in Treatise on Invertebrate Paleontology to compare fossil images directly, without referring to textual information.
Abstract: Science advances not only because of new discoveries, but also due to revolutionary ideas drawn from accumulated data. The quality of studies in paleontology, in particular, depends on accessibility of fossil data. This research builds an intelligent system based on brachiopod fossil images and their descriptions published in Treatise on Invertebrate Paleontology. The project is still ongoing and some significant developments will be discussed here. This thesis has two major parts. The first part describes the digitization, organization and integration of information extracted from the Treatise. The Treatise is in PDF format and it is non-trivial to convert large volumes into a structured, easily accessible digital library. Three important topics will be discussed: (1) how to extract data entries from the text, and save them in a structured manner; (2) how to crop individual specimen images from figures automatically, and associate each image with text entries; (3) how to build a search engine to perform both keyword search and natural language search. The search engine already has a web interface and many useful tasks can be done with ease. Verbal descriptions are second-hand information of fossil images and thus have limitations. The second part of the thesis develops an algorithm to compare fossil images directly, without referring to textual information. After similarities between fossil images are calculated, we can use the results in image search, fossil classification, and so on. The algorithm is based on deformable templates, and utilizes expectation propagation to find the optimal deformation. Specifically, I superimpose a “warp” on each image. Each node of the warp encapsulates a vector of local texture features, and comparing two images involves two steps: (1) deform the warp to the optimal configuration, so the energy function is minimized; and (2) based on the optimal configuration, compute the distance of two images. Experimental results confirmed that the method is reasonable and robust.
References
Proceedings ArticleDOI
13 Oct 2003
TL;DR: This article proposes a solution of the Lambertian shape from shading (SFS) problem in the case of a pinhole camera model (performing a perspective projection) based upon the notion of viscosity solutions of Hamilton-Jacobi equations.
Abstract: This article proposes a solution of the Lambertian shape from shading (SFS) problem in the case of a pinhole camera model (performing a perspective projection). Our approach is based upon the notion of viscosity solutions of Hamilton-Jacobi equations. This approach allows us to naturally deal with nonsmooth solutions and provides a mathematical framework for proving correctness of our algorithms. Our work extends previous work in the area in three aspects. First, it models the camera as a pinhole whereas most authors assume an orthographic projection, thereby extending the applicability of shape from shading methods to more realistic images. In particular it extends the work of E. Prados et al. (2002) and E. Rouy et al. (1992). Second, by adapting the brightness equation to the perspective problem, we obtain a new partial differential equation (PDE). Results about the existence and uniqueness of its solution are also obtained. Third, it allows us to come up with a new approximation scheme and a new algorithm for computing numerical approximations of the "continuous" solution as well as a proof of their convergence toward that solution.

135 citations


"Whitened Expectation Propagation: N..." refers methods in this paper

  • ...While there has been some success in applying methods such as Lax-Friedrichs and fast marching to non-Lambertian reflectance [1, 23], these generalizations must proceed on a case-by-case basis for each class of reflectance functions....

    [...]

  • ...Other Lambertian SfS algorithms have reported image errors for the penny image of 0.0071 [9] and 0.0517 [13]....

    [...]

  • ...Lambertian SfS We first test our approach on Lambertian SfS, where it can be compared to past Lambertian SfS algorithms....

    [...]

  • ...We then test this approach on a problem with highly non-Gaussian potentials: non-Lambertian shape from shading (SfS)....

    [...]

  • ...SfS is one example, and non-Lambertian SfS produces especially non-Gaussian potentials....

    [...]

Proceedings ArticleDOI
17 Jun 2006
TL;DR: This work increases modelling power in several ways, describes a closed-form method to reconstruct a smooth surface from its image apparent contour, including multilocal singularities ("kidney-bean" self-occlusions), and shows how the modelling process can be automated for simple object shapes and views, using a-priori object class information.
Abstract: Recent advances in single-view reconstruction (SVR) have been in modelling power (curved 2.5D surfaces) and automation (automatic photo pop-up). We extend SVR along both of these directions. We increase modelling power in several ways: (i) We represent general 3D surfaces, rather than 2.5D Monge patches; (ii) We describe a closed-form method to reconstruct a smooth surface from its image apparent contour, including multilocal singularities ("kidney-bean" self-occlusions); (iii) We show how to incorporate user-specified data such as surface normals, interpolation and approximation constraints; (iv) We show how this algorithm can be adapted to deal with surfaces of arbitrary genus. We also show how the modelling process can be automated for simple object shapes and views, using a-priori object class information. We demonstrate these advances on natural images drawn from a number of object classes.

133 citations


"Whitened Expectation Propagation: N..." refers background in this paper

  • ...Additionally, interior lines within the image can provide evidence of self-occlusion, cusps, or corners on the surface of the object [7, 9, 11]....

    [...]

Journal ArticleDOI
TL;DR: The authors propose to combine a triangular element surface model with a linearized reflectance map to formulate the shape-from-shading problem and express the approximating surface as a linear combination of a set of nodal basis functions.
Abstract: The authors propose to combine a triangular element surface model with a linearized reflectance map to formulate the shape-from-shading problem. The main idea is to approximate a smooth surface by the union of triangular surface patches called triangular elements and express the approximating surface as a linear combination of a set of nodal basis functions. Since the surface normal of a triangular element is uniquely determined by the heights of its three vertices (or nodes), image brightness can be directly related to nodal heights using the linearized reflectance map. The surface height can then be determined by minimizing a quadratic cost functional corresponding to the squares of brightness errors and solved effectively with the multigrid computational technique. The proposed method does not require any integrability constraint or artificial assumptions on boundary conditions. Simulation results for synthetic and real images are presented to illustrate the performance and efficiency of the method.

118 citations

Journal ArticleDOI
TL;DR: By rewriting the multivariate Laplace distribution as a scale mixture, it is shown that the Bayesian logistic regression method can incorporate spatio-temporal constraints which lead to smooth importance maps that facilitate subsequent interpretation.

101 citations


"Whitened Expectation Propagation: N..." refers background or methods in this paper

  • ...Variations of EP have been proposed to reduce the run time and apply EP to problems in computer vision [16, 22]....

    [...]

  • ...The run time when storing the inverse covariance matrix [22] is at least O(D^2....

    [...]

  • ...It can be shown that S^(-1) contains non-zero entries only between variable nodes that share a potential [22]....

    [...]

Proceedings Article
09 Dec 2003
TL;DR: Belief propagation is extended to represent factors with tree approximations, by way of the expectation propagation framework, which results in more accurate inferences and more frequent convergence than ordinary belief propagation, at a lower cost than variational trees or double-loop algorithms.
Abstract: Approximation structure plays an important role in inference on loopy graphs. As a tractable structure, tree approximations have been utilized in the variational method of Ghahramani & Jordan (1997) and the sequential projection method of Frey et al. (2000). However, belief propagation represents each factor of the graph with a product of single-node messages. In this paper, belief propagation is extended to represent factors with tree approximations, by way of the expectation propagation framework. That is, each factor sends a "message" to all pairs of nodes in a tree structure. The result is more accurate inferences and more frequent convergence than ordinary belief propagation, at a lower cost than variational trees or double-loop algorithms.

96 citations


"Whitened Expectation Propagation: N..." refers methods in this paper

  • ...Variations of EP have been proposed to reduce the run time and apply EP to problems in computer vision [16, 22]....

    [...]