Journal ArticleDOI

The Laplacian Pyramid as a Compact Image Code

TL;DR: A technique for image encoding in which local operators of many scales but identical shape serve as the basis functions; the resulting code tends to enhance salient image features and is well suited to many image analysis tasks as well as to image compression.
Abstract: We describe a technique for image encoding in which local operators of many scales but identical shape serve as the basis functions. The representation differs from established techniques in that the code elements are localized in spatial frequency as well as in space. Pixel-to-pixel correlations are first removed by subtracting a low-pass filtered copy of the image from the image itself. The result is a net data compression since the difference, or error, image has low variance and entropy, and the low-pass filtered image may be represented at reduced sample density. Further data compression is achieved by quantizing the difference image. These steps are then repeated to compress the low-pass image. Iteration of the process at appropriately expanded scales generates a pyramid data structure. The encoding process is equivalent to sampling the image with Laplacian operators of many scales. Thus, the code tends to enhance salient image features. A further advantage of the present code is that it is well suited for many image analysis tasks as well as for image compression. Fast algorithms are described for coding and decoding.
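The encode/decode loop described in the abstract (low-pass filter, downsample, difference, iterate) can be sketched in one dimension. This is an illustrative reduction, not the paper's own code: the function names are invented, and clamping at the borders is an arbitrary choice.

```python
# Minimal 1-D Laplacian pyramid sketch (pure Python).
KERNEL = [1/16, 4/16, 6/16, 4/16, 1/16]  # Burt & Adelson's generating kernel

def smooth(signal):
    """Convolve with the 5-tap kernel, clamping indices at the borders."""
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(KERNEL):
            j = min(max(i + k - 2, 0), n - 1)  # clamp at edges
            acc += w * signal[j]
        out.append(acc)
    return out

def reduce_(signal):
    """Low-pass filter, then keep every other sample."""
    return smooth(signal)[::2]

def expand(signal, n):
    """Upsample to length n by zero insertion, then smooth (gain 2)."""
    up = [0.0] * n
    for i, v in enumerate(signal):
        if 2 * i < n:
            up[2 * i] = v
    return [2 * s for s in smooth(up)]

def laplacian_pyramid(signal, levels):
    """Each level stores the difference between a signal and its expanded reduction."""
    pyramid = []
    g = list(signal)
    for _ in range(levels):
        low = reduce_(g)
        diff = [a - b for a, b in zip(g, expand(low, len(g)))]
        pyramid.append(diff)
        g = low
    pyramid.append(g)  # coarsest low-pass residue
    return pyramid

def reconstruct(pyramid):
    """Invert the pyramid: expand and add, coarse to fine."""
    g = pyramid[-1]
    for diff in reversed(pyramid[:-1]):
        g = [d + e for d, e in zip(diff, expand(g, len(diff)))]
    return g

x = [float(i % 7) for i in range(32)]
pyr = laplacian_pyramid(x, 3)
x_hat = reconstruct(pyr)
print(max(abs(a - b) for a, b in zip(x, x_hat)))  # ~0: reconstruction is exact
```

Note that reconstruction is exact by construction, regardless of the kernel: each difference image records precisely what the expanded low-pass copy lost, which is why only the difference images need aggressive quantization.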


Citations
Journal ArticleDOI
TL;DR: In this paper, it is shown that the difference of information between the approximation of a signal at the resolutions 2^(j+1) and 2^j (where j is an integer) can be extracted by decomposing this signal on a wavelet orthonormal basis of L^2(R^n), the vector space of measurable, square-integrable n-dimensional functions.
Abstract: Multiresolution representations are effective for analyzing the information content of images. The properties of the operator which approximates a signal at a given resolution were studied. It is shown that the difference of information between the approximation of a signal at the resolutions 2^(j+1) and 2^j (where j is an integer) can be extracted by decomposing this signal on a wavelet orthonormal basis of L^2(R^n), the vector space of measurable, square-integrable n-dimensional functions. In L^2(R), a wavelet orthonormal basis is a family of functions which is built by dilating and translating a unique function psi(x). This decomposition defines an orthogonal multiresolution representation called a wavelet representation. It is computed with a pyramidal algorithm based on convolutions with quadrature mirror filters. The wavelet representation lies between the spatial and Fourier domains. For images, the wavelet representation differentiates several spatial orientations. The application of this representation to data compression in image coding, texture discrimination and fractal analysis is discussed.
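The pyramidal algorithm with quadrature mirror filters can be sketched with the simplest orthonormal filter pair, the Haar filters. This is an illustrative special case, not Mallat's general construction; the function names are invented for the sketch.

```python
import math

def haar_step(signal):
    """Split a signal into a half-resolution approximation and its detail."""
    s = 1 / math.sqrt(2)
    approx = [(a + b) * s for a, b in zip(signal[::2], signal[1::2])]
    detail = [(a - b) * s for a, b in zip(signal[::2], signal[1::2])]
    return approx, detail

def haar_decompose(signal, levels):
    """The detail at each level is the information lost between resolutions."""
    details = []
    approx = list(signal)
    for _ in range(levels):
        approx, d = haar_step(approx)
        details.append(d)
    return approx, details

def haar_reconstruct(approx, details):
    """Invert the decomposition, coarse to fine."""
    s = 1 / math.sqrt(2)
    for d in reversed(details):
        merged = []
        for a, b in zip(approx, d):
            merged += [(a + b) * s, (a - b) * s]
        approx = merged
    return approx

x = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
a, ds = haar_decompose(x, 3)
print(haar_reconstruct(a, ds))  # recovers x exactly
```

Because the transform is orthonormal, energy is preserved across the representation (sum of squares of the approximation and all details equals that of the input), which is the property the abstract's "orthogonal multiresolution representation" refers to.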

20,028 citations

Book
01 Jan 1998
TL;DR: A textbook introduction to wavelet signal processing, from Fourier analysis and time-frequency methods to wavelet bases, wavelet packet and local cosine bases, approximation, estimation, and transform coding.
Abstract: Introduction to a Transient World. Fourier Kingdom. Discrete Revolution. Time Meets Frequency. Frames. Wavelet Zoom. Wavelet Bases. Wavelet Packet and Local Cosine Bases. An Approximation Tour. Estimations are Approximations. Transform Coding. Appendix A: Mathematical Complements. Appendix B: Software Toolboxes.

17,693 citations

Posted Content
TL;DR: It is shown that convolutional networks by themselves, trained end- to-end, pixels-to-pixels, improve on the previous best result in semantic segmentation.
Abstract: Convolutional networks are powerful visual models that yield hierarchies of features. We show that convolutional networks by themselves, trained end-to-end, pixels-to-pixels, exceed the state-of-the-art in semantic segmentation. Our key insight is to build "fully convolutional" networks that take input of arbitrary size and produce correspondingly-sized output with efficient inference and learning. We define and detail the space of fully convolutional networks, explain their application to spatially dense prediction tasks, and draw connections to prior models. We adapt contemporary classification networks (AlexNet, the VGG net, and GoogLeNet) into fully convolutional networks and transfer their learned representations by fine-tuning to the segmentation task. We then define a novel architecture that combines semantic information from a deep, coarse layer with appearance information from a shallow, fine layer to produce accurate and detailed segmentations. Our fully convolutional network achieves state-of-the-art segmentation of PASCAL VOC (20% relative improvement to 62.2% mean IU on 2012), NYUDv2, and SIFT Flow, while inference takes one third of a second for a typical image.

9,803 citations


Cites background from "The Laplacian Pyramid as a Compact ..."

  • ...antic segmentation results in Section 5. Combining feature hierarchies We fuse features across layers to define a nonlinear local-to-global representation that we tune end-to-end. The Laplacian pyramid [26] is a classic multi-scale representation made of fixed smoothing and differencing. The jet of Koenderink and van Doorn [27] is a rich, local feature defined by compositions of partial derivatives. In th...


  • ... the model make local predictions that respect global structure. This crossing of layers and resolutions is a learned, nonlinear counterpart to the multi-scale representation of the Laplacian pyramid [26]. By analogy to the jet of Koenderink and van Doorn [27], we call our feature hierarchy the deep jet. Layer fusion is essentially an elementwise operation. However, the correspondence of elements acro...


Journal ArticleDOI
Ingrid Daubechies1
TL;DR: This work constructs orthonormal bases of compactly supported wavelets, with arbitrarily high regularity, after reviewing the concept of multiresolution analysis as well as several algorithms in vision decomposition and reconstruction.
Abstract: We construct orthonormal bases of compactly supported wavelets, with arbitrarily high regularity. The order of regularity increases linearly with the support width. We start by reviewing the concept of multiresolution analysis as well as several algorithms in vision decomposition and reconstruction. The construction then follows from a synthesis of these different approaches.

8,588 citations

Journal ArticleDOI
TL;DR: The authors present a taxonomy of dense two-frame stereo methods together with a stand-alone, flexible C++ implementation that enables the evaluation of individual components and can easily be extended to include new algorithms.
Abstract: Stereo matching is one of the most active research areas in computer vision. While a large number of algorithms for stereo correspondence have been developed, relatively little work has been done on characterizing their performance. In this paper, we present a taxonomy of dense, two-frame stereo methods designed to assess the different components and design decisions made in individual stereo algorithms. Using this taxonomy, we compare existing stereo methods and present experiments evaluating the performance of many different variants. In order to establish a common software platform and a collection of data sets for easy evaluation, we have designed a stand-alone, flexible C++ implementation that enables the evaluation of individual components and that can be easily extended to include new algorithms. We have also produced several new multiframe stereo data sets with ground truth, and are making both the code and data sets available on the Web.
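The simplest configuration in the taxonomy this abstract describes (squared-difference matching cost, fixed square-window aggregation, winner-take-all disparity selection) can be sketched directly. The function name, window size, and synthetic images below are arbitrary choices for the illustration, not the paper's benchmark code.

```python
# Toy dense two-frame stereo: windowed SSD cost + winner-take-all.
def ssd_disparity(left, right, max_disp, win=1):
    """For each pixel, pick the disparity minimizing the windowed SSD cost."""
    h, w = len(left), len(left[0])
    disp = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            best, best_d = float("inf"), 0
            for d in range(min(max_disp + 1, x + 1)):
                cost = 0.0
                for dy in range(-win, win + 1):
                    for dx in range(-win, win + 1):
                        yy = min(max(y + dy, 0), h - 1)       # clamp at borders
                        xl = min(max(x + dx, 0), w - 1)
                        xr = min(max(xl - d, 0), w - 1)       # shifted match in right image
                        cost += (left[yy][xl] - right[yy][xr]) ** 2
                if cost < best:
                    best, best_d = cost, d
            disp[y][x] = best_d
    return disp

# Synthetic pair: the right image is the left shifted by a disparity of 2.
h, w = 8, 10
left = [[float((x * 7 + y * 3) % 13) for x in range(w)] for y in range(h)]
right = [[left[y][min(x + 2, w - 1)] for x in range(w)] for y in range(h)]
disp = ssd_disparity(left, right, max_disp=4)
print(disp[3])  # interior entries recover the true shift of 2
```

Real evaluations of such matchers, as the paper argues, hinge on exactly these separable design decisions: the cost function, the aggregation window, and the optimization step can each be swapped independently.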

7,458 citations


Cites methods from "The Laplacian Pyramid as a Compact ..."

  • ...We use the coefficients 1/16{1, 4, 6, 4, 1}, the same ones used in Burt and Adelson’s [26] Laplacian pyramid....

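The quoted 5-tap kernel is separable in two dimensions: filtering rows and then columns with 1/16·[1, 4, 6, 4, 1] is equivalent to a single pass with the 5×5 outer-product kernel, at far lower cost. A minimal sketch (clamped borders are an assumption, not specified by the quote):

```python
# Separable smoothing with the binomial kernel 1/16 * [1, 4, 6, 4, 1].
W = [1, 4, 6, 4, 1]  # weights summing to 16

def smooth_rows(img):
    """Filter each row with the 5-tap kernel, clamping at image borders."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for k, wt in enumerate(W):
                xx = min(max(x + k - 2, 0), w - 1)
                acc += wt * img[y][xx]
            out[y][x] = acc / 16
    return out

def transpose(img):
    return [list(r) for r in zip(*img)]

def smooth2d(img):
    """Rows, then columns: equivalent to the 5x5 outer-product kernel."""
    return transpose(smooth_rows(transpose(smooth_rows(img))))

flat = [[5.0] * 8 for _ in range(8)]
print(smooth2d(flat)[4][4])  # kernel is normalized, so a constant image is unchanged: 5.0
```

The separable form needs 10 multiplies per pixel instead of 25, which is part of why this kernel is a common choice for pyramid construction.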

References
Book
01 Jan 1976
TL;DR: The rapid rate at which the field of digital picture processing has grown in the past five years has necessitated extensive revisions and the introduction of topics not found in the original edition.
Abstract: The rapid rate at which the field of digital picture processing has grown in the past five years has necessitated extensive revisions and the introduction of topics not found in the original edition.

4,231 citations

Journal ArticleDOI
01 Jan 1985

927 citations


"The Laplacian Pyramid as a Compact ..." refers background in this paper

  • ...The difference between these two functions is similar to the "Laplacian" operators commonly used in image enhancement [13]....


Journal ArticleDOI
Arun N. Netravali1, J.O. Limb1
01 Mar 1980
TL;DR: This paper presents a review of techniques used for digital encoding of picture material, covering statistical models of picture signals and elements of psychophysics relevant to picture coding, followed by a description of the coding techniques.
Abstract: This paper presents a review of techniques used for digital encoding of picture material. Statistical models of picture signals and elements of psychophysics relevant to picture coding are covered first, followed by a description of the coding techniques. Detailed examples of three typical systems, which combine some of the coding principles, are given. A bright future for new systems is forecasted based on emerging new concepts, technology of integrated circuits and the need to digitize in a variety of contexts.
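One classic family of coding techniques such reviews cover is predictive (DPCM) coding: transmit quantized differences from a predictor rather than raw samples, exploiting pixel-to-pixel correlation. The sketch below uses a previous-sample predictor and an arbitrary quantizer step; it is a generic illustration, not an example taken from this paper.

```python
# Minimal DPCM: previous-sample prediction + uniform quantization.
def dpcm_encode(samples, step=4):
    codes, pred = [], 0
    for s in samples:
        q = round((s - pred) / step)   # quantized prediction error
        codes.append(q)
        pred += q * step               # track the decoder's reconstruction
    return codes

def dpcm_decode(codes, step=4):
    out, pred = [], 0
    for q in codes:
        pred += q * step
        out.append(pred)
    return out

row = [10, 12, 15, 20, 26, 30, 31, 29]
codes = dpcm_encode(row)
print(codes)               # error codes cluster near zero: low entropy
print(dpcm_decode(codes))  # reconstruction within half a quantizer step
```

Because the encoder predicts from its own reconstructed values rather than the originals, quantization error cannot accumulate along the scan line.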

551 citations


"The Laplacian Pyramid as a Compact ..." refers background in this paper

  • ...These steps are then repeated to compress the low-pass image....


Journal ArticleDOI
TL;DR: A highly efficient recursive algorithm is defined for simultaneously convolving an image (or other two-dimensional function) with a set of kernels which differ in width but not in shape, so that the algorithm generates a set of low-pass or band-pass versions of the image.
Abstract: A highly efficient recursive algorithm is defined for simultaneously convolving an image (or other two-dimensional function) with a set of kernels which differ in width but not in shape. These kernels may closely resemble the Gaussian probability distribution, so that the algorithm generates a set of low-pass or band-pass versions of the image. Image correlation with spot, edge, and bar operators of many sizes can be obtained with negligible additional computation.
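One way to see the "many widths, one shape" property is that repeated self-convolution of a small generating kernel yields progressively wider, increasingly Gaussian-like kernels whose variance grows linearly. This is a simplified stand-in for the paper's hierarchical scheme, shown here with Burt & Adelson's 5-tap kernel:

```python
# Repeated self-convolution of the generating kernel: widths grow, shape stays.
def convolve(a, b):
    """Full 1-D discrete convolution of two kernels."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def variance(kernel):
    """Second central moment; adds under convolution."""
    mean = sum(i * w for i, w in enumerate(kernel))
    return sum(w * (i - mean) ** 2 for i, w in enumerate(kernel))

base = [w / 16 for w in [1, 4, 6, 4, 1]]
k = list(base)
for level in range(3):
    print(len(k), round(variance(k), 4))  # 5 1.0 / 9 2.0 / 13 3.0
    k = convolve(k, base)
```

The point of the recursive algorithm is that this whole family of filtered images comes almost for free: each coarser level reuses the previous one, so a small kernel applied a few times stands in for a set of large convolutions.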

442 citations


"The Laplacian Pyramid as a Compact ..." refers background or methods in this paper

  • ...This is a very fast algorithm, requiring fewer computational steps to compute a set of filtered images than are required by the fast Fourier transform to compute a single filtered image [2]....


  • ...A suitable fast algorithm has recently been developed [2] and will be described in the next section....


  • ...This weighting pattern, called the generating kernel, is chosen subject to certain constraints [2]....


  • ...A graphical representation of this process in one dimension is given in Fig. 1. The shape of the equivalent weighting function is not critical [2]....
