
The contourlet transform: an efficient directional multiresolution image representation

TL;DR: A "true" two-dimensional transform that can capture the intrinsic geometrical structure that is key in visual information is pursued and it is shown that with parabolic scaling and sufficient directional vanishing moments, contourlets achieve the optimal approximation rate for piecewise smooth functions with discontinuities along twice continuously differentiable curves.
Abstract: The limitations of commonly used separable extensions of one-dimensional transforms, such as the Fourier and wavelet transforms, in capturing the geometry of image edges are well known. In this paper, we pursue a "true" two-dimensional transform that can capture the intrinsic geometrical structure that is key in visual information. The main challenge in exploring geometry in images comes from the discrete nature of the data. Thus, unlike other approaches, such as curvelets, that first develop a transform in the continuous domain and then discretize for sampled data, our approach starts with a discrete-domain construction and then studies its convergence to an expansion in the continuous domain. Specifically, we construct a discrete-domain multiresolution and multidirection expansion using nonseparable filter banks, in much the same way that wavelets were derived from filter banks. This construction results in a flexible multiresolution, local, and directional image expansion using contour segments, and, thus, it is named the contourlet transform. The discrete contourlet transform has a fast iterated filter bank algorithm that requires an order N operations for N-pixel images. Furthermore, we establish a precise link between the developed filter bank and the associated continuous-domain contourlet expansion via a directional multiresolution analysis framework. We show that with parabolic scaling and sufficient directional vanishing moments, contourlets achieve the optimal approximation rate for piecewise smooth functions with discontinuities along twice continuously differentiable curves. Finally, we show some numerical experiments demonstrating the potential of contourlets in several image processing applications.

Summary

Introduction

  • This work was supported in part by the US National Science Foundation under Grant CCR-0237633 and the Swiss National Science Foundation under Grant 20-63664.00.
  • As a result of a separable extension from 1-D bases, wavelets in 2-D are good at isolating the discontinuities at edge points, but will not “see” the smoothness along the contours.
  • The new style painter, on the other hand, exploits effectively the smoothness of the contour by making brush strokes with different elongated shapes and in a variety of directions following the contour.
  • More importantly, this result suggests that for a computational image representation to be efficient, it should be based on a local, directional, and multiresolution expansion.

A. Concept

  • Comparing the wavelet scheme with the new scheme shown in Figure 1, the authors see that the improvement of the new scheme can be attributed to the grouping of nearby wavelet coefficients, since they are locally correlated due to the smoothness of the contours.
  • In essence, the authors first use a wavelet-like transform for edge detection, and then a local directional transform for contour segment detection.
  • The authors proposed a double filter bank structure (see Figure 7) [22] for obtaining sparse expansions for typical images having smooth contours.
  • The overall result is an image expansion using basic elements resembling contour segments, which are thus named contourlets.
  • In the frequency domain, the contourlet transform provides a multiscale and directional decomposition.

B. Pyramid frames

  • One way to obtain a multiscale decomposition is to use the Laplacian pyramid (LP) introduced by Burt and Adelson [23].
  • The LP decomposition at each level generates a downsampled lowpass version of the original and the difference between the original and the prediction, resulting in a bandpass image.
  • Thus, the key in the DFB is to use an appropriate combination of shearing operators together with two-direction partition of quincunx filter banks at each node in a binary tree-structured filter bank, to obtain the desired 2-D spectrum division as shown in Figure 3(a).
  • These basis functions have quasi-linear supports in space and span all directions.
  • Furthermore, it can be shown [29] that if the building block filter bank in Figure 4 uses orthogonal filters, then the resulting DFB is orthogonal and (4) becomes an orthogonal basis.
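The one-level LP decomposition described above can be sketched in a few lines of NumPy. The binomial lowpass filter h, the dyadic sampling matrix M = 2I, and the reuse of the same lowpass filter for the prediction step are illustrative assumptions, not the paper's actual filter design.

```python
import numpy as np

def lp_analysis(x, h):
    """One level of a Laplacian pyramid (sketch).

    x : 2-D array; h : 1-D lowpass filter applied separably.
    Returns (a, b): coarse approximation and bandpass difference image.
    """
    def lowpass(img):
        # separable zero-padded 'same' convolution, rows then columns
        tmp = np.apply_along_axis(lambda r: np.convolve(r, h, mode="same"), 1, img)
        return np.apply_along_axis(lambda c: np.convolve(c, h, mode="same"), 0, tmp)

    a = lowpass(x)[::2, ::2]       # filter then downsample by M = 2I
    pred = np.zeros_like(x)
    pred[::2, ::2] = a             # upsample the coarse image
    pred = 4.0 * lowpass(pred)     # predict (gain 4 compensates upsampling)
    b = x - pred                   # bandpass difference image
    return a, b
```

Iterating `lp_analysis` on `a` yields the full pyramid; the DFB is then applied to each bandpass image b.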

D. Multiscale and directional decomposition: the discrete contourlet transform

  • The authors combine the Laplacian pyramid and the directional filter bank into the double filter bank structure motivated in Section III-A.
  • That is, the j-th level of the LP decomposes the image a_{j−1}[n] into a coarser image a_j[n] and a detail image b_j[n].
  • The main properties of the discrete contourlet transform are stated in the following theorem.
  • For the DFB, each building-block two-channel filter bank requires L_d operations per input sample.
  • Since the multiscale and directional decomposition stages are decoupled in the discrete contourlet transform, the authors can have a different number of directions at different scales, thus offering a flexible multiscale and directional expansion.
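The O(N) cost of the iterated filter bank can be illustrated with a rough operation count; the per-sample costs Lp and Ld and the list of DFB levels per scale below are illustrative placeholders, not values from the paper.

```python
def contourlet_cost(n_pixels, levels_per_scale, Lp=12, Ld=10):
    """Rough operation count for the discrete contourlet transform (sketch).

    n_pixels: image size N; levels_per_scale: DFB levels l_j, finest scale first.
    Lp, Ld: assumed taps-per-sample costs of the pyramid and DFB building blocks.
    """
    ops, n = 0.0, float(n_pixels)
    for lj in levels_per_scale:
        ops += Lp * n          # one Laplacian pyramid level over n samples
        ops += Ld * lj * n     # l_j two-channel DFB stages, each O(Ld) per sample
        n /= 4.0               # the coarse image has a quarter of the samples
    return ops
```

Because the sample count shrinks geometrically across pyramid levels, the total is a geometric series in N, hence O(N) regardless of the number of levels.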

IV. CONTOURLETS AND DIRECTIONAL MULTIRESOLUTION ANALYSIS

  • As with the wavelet filter bank, the contourlet filter bank has an associated continuous-domain expansion in L²(R²) using the contourlet functions.
  • The connection between the discrete contourlet transform and the continuous-domain contourlet expansion is made precise via a new multiresolution analysis framework, similar to the link between wavelets and filter banks [2].
  • The new elements in this framework are multidirection and its combination with multiscale.
  • For simplicity, the authors will only consider the case with orthogonal filters, which leads to tight frames.
  • The more general case with biorthogonal filters can be treated similarly.

C. Multiscale and multidirection: the contourlet expansion

  • The directional decomposition by the family (4) is then applied onto the detail subspaces.
  • Note that the number of DFB decomposition levels l can be different at different scales j, and in that case will be denoted by l_j.
  • As a result, the subspace W^(l)_{j,k} is defined on a rectangular grid with intervals 2^{j+l−2} × 2^j or 2^j × 2^{j+l−2}, depending on whether it is mostly horizontal or vertical (see Figure 9(b)).
  • The reason that {λ^(l)_{j,k,n}}_{n∈Z²} is an overcomplete frame for W^(l)_{j,k} is that it uses the same sampling grid as the bigger subspace V^(l)_{j−1,k}. The discrete filter w^(l)_k in (22) is roughly equal to the sum of convolutions between the directional filter d^(l)_k and the bandpass filters f_i, and is thus a bandpass directional filter.
  • 1) The contourlet expansions are defined on rectangular grids, and thus offer a seamless translation (as demonstrated in Theorem 3) to the discrete world, where image pixels are sampled on a rectangular grid.

V. CONTOURLET APPROXIMATION AND COMPRESSION

  • The proposed contourlet filter bank and its associated continuous-domain frames in previous sections provide a framework for constructing general directional multiresolution image representations.
  • Since their goal is to develop efficient or sparse expansions for images having smooth contours, the next important issues are: (1) what conditions should the authors impose on contourlets to obtain a sparse expansion for that class of images; and (2) how can they design filter banks that can lead to contourlet expansions satisfying those conditions.
  • The authors consider the first issue in this paper; the second one is addressed in another paper [31].

A. Parabolic scaling

  • In the curvelet construction, Candès and Donoho [4] point out that a key to achieving the correct nonlinear approximation behavior by curvelets is to select support sizes obeying the parabolic scaling relation for curves: width ∝ length2.
  • The same scaling relation has been used in the study of Fourier integral operators and wave equations; for example, see [32].
  • More precisely, with the local coordinate setup as in Figure 10(a), the authors can readily verify that the parametric representation of the discontinuity curve obeys u(v) ≈ (κ/2) v² when v ≈ 0 (27), where κ is the local curvature of the curve.
  • As can be seen in the two pyramidal levels shown, as the support size of the basis element of the LP is reduced by four in each dimension, the number of directions of the DFB is doubled.
  • Combining these two stages, the support sizes of the contourlet functions evolve in accordance to the desired parabolic scaling.
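The scaling behavior in these bullets can be checked numerically: every other finer pyramid level, the support width shrinks by 4 while the doubled direction count makes the length shrink only by 2, which preserves width ∝ length². The unit starting sizes below are an illustrative normalization, not the paper's actual support sizes.

```python
def parabolic_supports(width0, length0, steps):
    """Support-size evolution under contourlet scaling (illustrative sketch).

    Each step models two finer pyramid levels: the LP support width shrinks
    by 4, and the doubled DFB direction count leaves the length shrinking
    only by 2, so the parabolic relation width ~ length**2 is preserved.
    """
    sizes = [(width0, length0)]
    w, L = float(width0), float(length0)
    for _ in range(steps):
        w, L = w / 4.0, L / 2.0
        sizes.append((w, L))
    return sizes
```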

B. Directional vanishing moment

  • For the wavelet case in 1-D, wavelet approximation theory brought a novel condition into filter bank design, which had earlier focused only on designing filters with good frequency selectivity.
  • Intuitively, wavelets with vanishing moments are orthogonal to polynomial signals, and thus only a few wavelet basis functions around the discontinuity points would “feel” these discontinuities and lead to significant coefficients [33].
  • The key feature of these images is that image edges are localized in both location and direction.
  • Thus, it is desirable that only the few contourlet functions whose supports intersect a contour and align with its local direction “feel” this discontinuity.
  • The authors refer to this requirement as the directional vanishing moment (DVM) condition.
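The DVM idea can be illustrated with the simplest possible directional filter. The 2-tap filter below, with one vanishing moment along the 45-degree direction, is a toy stand-in for the paper's DFB filters: it annihilates any image that is constant along that direction, so a diagonal step edge produces no coefficients at all.

```python
import numpy as np

def has_dvm_along_diagonal(x):
    """Check that the toy filter d[0,0]=1, d[1,1]=-1 annihilates x (sketch).

    This filter has one directional vanishing moment along the 45-degree
    direction: any image constant along that direction maps to zero.
    """
    y = x[1:, 1:] - x[:-1, :-1]   # valid-region convolution with d
    return np.allclose(y, 0.0)

# a step edge aligned with the diagonal: x[m, n] depends only on m - n
m, n = np.meshgrid(np.arange(8), np.arange(8), indexing="ij")
aligned = np.where(m - n >= 0, 1.0, 0.0)
```

A horizontal edge, by contrast, is not annihilated, which is the sense in which only aligned contourlets “feel” a discontinuity.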

C. Contourlet approximation

  • In this subsection the authors will show that a contourlet expansion that satisfies the parabolic scaling and has sufficient DVMs (this will be defined precisely in Lemma 1) achieves the optimal nonlinear approximation rate for 2-D piecewise C2 smooth functions with discontinuities along C2 smooth curves.
  • Therefore, the authors need to bound the contribution of ⟨f, λ_{j,k̃,n}⟩ outside region A to the same order.
  • In addition, since the discontinuity curve S has finite length, the number of type 1 coefficients with these indexes is m_{j,k̃} ∼ 1/d_{j,k̃,n} ∼ 2^{−j/2} k̃ (39). From (38), for a type 1 coefficient to have magnitude above a threshold ε, an upper bound on k̃ is necessary.
  • Theorem 4: Suppose that a compactly supported contourlet frame (24) satisfies the parabolic scaling condition (29), the contourlet functions λ_{j,k} satisfy the condition in Lemma 1, and the scaling function φ ∈ C^p has accuracy of order 2. Then, for a function f that is C² away from a C² discontinuity curve, the M-term approximation by this frame achieves ‖f − f̂_M‖²₂ ≤ C (log M)³ M^{−2} (45).
  • Remark 2: The approximation rate in (45) is the same as the approximation rate for curvelets, which was derived in [5] and [35].

D. Contourlet compression

  • So far, the authors have considered the approximation problem for contourlets, keeping the M largest coefficients.
  • Specifically, from coarse to fine scales, significant contourlet coefficients are successively localized in both location (contourlets intersecting the discontinuity curve) and direction (intersecting contourlets with direction close to the local direction of the discontinuity curve).
  • Thus, using embedded tree structures for contourlet coefficients, similar to the embedded zero-trees for wavelets [36], the authors can efficiently index the retained coefficients using 1 bit per coefficient.
  • Instead of using fixed-length coding for the quantized coefficients, a slight gain (in the log factor, but not the exponent, of the rate-distortion function) can be obtained by variable-length coding.
  • In particular, the authors use the bit-plane coding scheme [8], where coefficients with magnitude in the range (2^{l−1−L}, 2^{l−L}] are encoded with l bits.
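The bit-plane rule in the last bullet maps a coefficient magnitude to a code length. A minimal sketch of that mapping follows; the parameter L (the finest retained bit plane) is taken directly from the bullet, while the boundary handling via the ceiling is an assumption consistent with the stated half-open ranges.

```python
import math

def bitplane_code_length(mag, L):
    """Bits assigned to a coefficient magnitude under the bit-plane scheme:
    magnitudes in (2**(l-1-L), 2**(l-L)] are encoded with l bits (sketch).
    """
    if mag <= 0:
        return 0
    # mag in (2**(l-1-L), 2**(l-L)]  <=>  l = ceil(log2(mag)) + L
    l = math.ceil(math.log2(mag)) + L
    return max(l, 0)
```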

VI. NUMERICAL EXPERIMENTS

  • All experiments in this section use a wavelet transform with “9-7” biorthogonal filters [37], [38] and 6 decomposition levels.
  • Apart from also being linear phase and nearly orthogonal, these fan filters are close to having the ideal frequency response and thus can approximate the directional vanishing moment condition.
  • The number of DFB directions is doubled at every other finer scale, reaching 5 DFB decomposition levels (32 directions) at the finest scale.
  • Note that in this case, both the wavelet and the contourlet transforms share the same detail subspaces.
  • The difference is that in the contourlet transform, each detail subspace is further decomposed into directional subspaces.

B. Nonlinear approximation

  • Next the authors compare the nonlinear approximation (NLA) performances of the wavelet and contourlet transforms.
  • In these NLA experiments, for a given value M, the authors select the M most significant coefficients in each transform domain, and then compare the reconstructed images from these sets of M coefficients.
  • The authors expect that most of the refinement happens around the image edges.
  • The wavelet scheme is seen to slowly capture contours by isolated “dots”.
  • In addition, there is a significant gain of 1.46 dB in peak signal-to-noise ratio (PSNR) for contourlets.
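The PSNR figure quoted above uses the standard definition. For reference, a minimal implementation assuming 8-bit images (peak value 255):

```python
import numpy as np

def psnr(ref, rec, peak=255.0):
    """Peak signal-to-noise ratio in dB between a reference image and a
    reconstruction, e.g. one built from M retained transform coefficients."""
    mse = np.mean((ref.astype(float) - rec.astype(float)) ** 2)
    if mse == 0:
        return float("inf")   # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```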

VII. CONCLUSION

  • The authors constructed a discrete transform that provides a sparse expansion for typical images having smooth contours.
  • Based on this observation, the authors developed a new filter bank structure, the contourlet filter bank, that can provide a flexible multiscale and directional decomposition for images.
  • This connection is defined via a directional multiresolution analysis that provides successive refinements at both spatial and directional resolution.
  • The authors make a change to a new coordinate (x, y) as shown in Figure 18, where λj,k̃ has vanishing moments along the x direction.
  • Also, to the same order, the authors can parameterize the discontinuity line as y = αx.


IEEE TRANSACTIONS ON IMAGE PROCESSING 1
The Contourlet Transform: An Efficient
Directional Multiresolution Image Representation
Minh N. Do, Member, IEEE, and Martin Vetterli, Fellow, IEEE
Abstract: The limitations of commonly used separable extensions of one-dimensional transforms, such as the Fourier and wavelet transforms, in capturing the geometry of image edges are well known. In this paper, we pursue a “true” two-dimensional transform that can capture the intrinsic geometrical structure that is key in visual information. The main challenge in exploring geometry in images comes from the discrete nature of the data. Thus, unlike other approaches, such as curvelets, that first develop a transform in the continuous domain and then discretize for sampled data, our approach starts with a discrete-domain construction and then studies its convergence to an expansion in the continuous domain. Specifically, we construct a discrete-domain multiresolution and multidirection expansion using non-separable filter banks, in much the same way that wavelets were derived from filter banks. This construction results in a flexible multiresolution, local, and directional image expansion using contour segments, and thus it is named the contourlet transform. The discrete contourlet transform has a fast iterated filter bank algorithm that requires an order N operations for N-pixel images. Furthermore, we establish a precise link between the developed filter bank and the associated continuous-domain contourlet expansion via a directional multiresolution analysis framework. We show that with parabolic scaling and sufficient directional vanishing moments, contourlets achieve the optimal approximation rate for piecewise smooth functions with discontinuities along twice continuously differentiable curves. Finally, we show some numerical experiments demonstrating the potential of contourlets in several image processing applications.

Index Terms: sparse representation, wavelets, contourlets, filter banks, multiresolution, multidirection, contours, geometric image processing.
I. INTRODUCTION
Efficient representation of visual information lies at the heart of many image processing tasks, including compression, denoising, feature extraction, and inverse problems. Efficiency of a representation refers to the ability to capture significant information about an object of interest using a small description. For image compression or content-based image retrieval, the use of an efficient representation implies the compactness of the compressed file or the index entry for each image in the database. For practical applications, such an efficient representation has to be obtained by structured transforms and fast algorithms.

M. N. Do is with the Department of Electrical and Computer Engineering, the Coordinated Science Laboratory, and the Beckman Institute, University of Illinois at Urbana-Champaign, Urbana IL 61801 (email: minhdo@uiuc.edu). M. Vetterli is with the Audiovisual Communications Laboratory, École Polytechnique Fédérale de Lausanne (EPFL), CH-1015 Lausanne, Switzerland, and with the Department of Electrical Engineering and Computer Science, University of California at Berkeley, Berkeley CA 94720 (email: martin.vetterli@epfl.ch). This work was supported in part by the US National Science Foundation under Grant CCR-0237633 (CAREER) and the Swiss National Science Foundation under Grant 20-63664.00.
For one-dimensional piecewise smooth signals, like scan-lines of an image, wavelets have been established as the right tool, because they provide an optimal representation for these signals in a certain sense [1], [2]. In addition, the wavelet representation is amenable to efficient algorithms; in particular, it leads to fast transforms and convenient tree data structures. These are the key reasons for the success of wavelets in many signal processing and communication applications; for example, the wavelet transform was adopted as the transform for the new image-compression standard, JPEG-2000 [3].

However, natural images are not simply stacks of 1-D piecewise smooth scan-lines; discontinuity points (i.e., edges) are typically located along smooth curves (i.e., contours) owing to smooth boundaries of physical objects. Thus, natural images contain intrinsic geometrical structures that are key features in visual information. As a result of a separable extension from 1-D bases, wavelets in 2-D are good at isolating the discontinuities at edge points, but will not “see” the smoothness along the contours. In addition, separable wavelets can capture only limited directional information, an important and unique feature of multidimensional signals. These disappointing behaviors indicate that more powerful representations are needed in higher dimensions.
To see how one can improve the 2-D separable wavelet
transform for representing images with smooth contours,
consider the following scenario. Imagine that there are two
painters, one with a “wavelet”-style and the other with a new
style, both wishing to paint a natural scene. Both painters apply
a refinement technique to increase resolution from coarse to
fine. Here, efficiency is measured by how quickly, that is with
how few brush strokes, one can faithfully reproduce the scene.
[Figure 1 shows two panels, “Wavelet” and “New scheme”.]
Fig. 1. Wavelet versus new scheme: illustrating the successive refinement by the two systems near a smooth contour, which is shown as a thick curve separating two smooth regions.
Consider the situation when a smooth contour is being painted, as shown in Figure 1. Because 2-D wavelets are constructed from tensor products of 1-D wavelets, the “wavelet”-style painter is limited to using square-shaped brush strokes
along the contour, using different sizes corresponding to
the multiresolution structure of wavelets. As the resolution
becomes finer, we can clearly see the limitation of the wavelet-
style painter who needs to use many fine “dots” to capture the
contour.
(Or we could consider the wavelet-style painter as a pointillist!)
The new style painter, on the other hand, exploits
effectively the smoothness of the contour by making brush
strokes with different elongated shapes and in a variety of
directions following the contour. This intuition was formalized
by Candès and Donoho in the curvelet construction [4], [5],
reviewed below in Section II.
For the human visual system, it is well known [6] that the receptive fields in the visual cortex are characterized as being localized, oriented, and bandpass. Furthermore, experiments in searching for the sparse components of natural images produced basis images that closely resemble the aforementioned characteristics of the visual cortex [7]. This result supports the hypothesis that the human visual system has been tuned so as to capture the essential information of a natural scene using the least number of active visual cells. More importantly, this result suggests that for a computational image representation to be efficient, it should be based on a local, directional, and multiresolution expansion.
Inspired by the painting scenario and studies related to the
human visual system and natural image statistics, we identify
a “wish list” for new image representations:
1) Multiresolution. The representation should allow images to be successively approximated, from coarse to fine resolutions.
2) Localization. The basis elements in the representation should be localized in both the spatial and the frequency domains.
3) Critical sampling. For some applications (e.g., compression), the representation should form a basis, or a frame with small redundancy.
4) Directionality. The representation should contain basis elements oriented at a variety of directions, many more than the few directions offered by separable wavelets.
5) Anisotropy. To capture smooth contours in images, the representation should contain basis elements using a variety of elongated shapes with different aspect ratios.
Among these desiderata, the first three are successfully provided by separable wavelets, while the last two require new constructions. Moreover, a major challenge in capturing geometry and directionality in images comes from the discrete nature of the data: the input is typically sampled images defined on rectangular grids. For example, directions other than horizontal and vertical look very different on a rectangular grid. Because of pixelization, the notion of smooth contours on sampled images is not obvious. For these reasons, unlike other transforms that were initially developed in the continuous domain and then discretized for sampled data, our approach starts with a discrete-domain construction and then studies its convergence to an expansion in the continuous domain.
The outline of the rest of the paper is as follows. After reviewing related work in Section II, we propose in Section III a multiresolution and multidirection image expansion using non-separable filter banks. This construction results in a flexible multiresolution, local, and directional image expansion using contour segments, and thus it is named the contourlet transform. It is of interest to study the limit behavior when such schemes are iterated over scale and/or direction, which has been analyzed in the connection between filter banks, their iteration, and the associated wavelet construction [8], [2]. Such a connection is studied in Section IV, where we establish a precise link between the proposed filter bank and the associated continuous-domain contourlet expansion in a newly defined directional multiresolution analysis framework. The approximation power of the contourlet expansion is studied in Section V. We show that with parabolic scaling and sufficient directional vanishing moments, contourlets achieve the optimal approximation rate for 2-D piecewise smooth functions with C² (twice continuously differentiable) contours. Numerical experiments are presented and discussed in Section VI.
II. BACKGROUND AND RELATED WORK
Consider a general series expansion by {φ_n}_{n=1}^∞ (e.g., a Fourier or wavelet basis) for a given signal f as:

f = Σ_{n=1}^∞ c_n φ_n.  (1)
The error decay of the best M-term approximation provides a measurement of the efficiency of an expansion. The best M-term approximation (also commonly referred to as nonlinear approximation [1]) using this expansion is defined as

f̂_M = Σ_{n∈I_M} c_n φ_n,  (2)

where I_M is the set of indexes of the M largest |c_n|. The quality of the approximated function f̂_M relates to how sparse the expansion by {φ_n}_{n=1}^∞ is, or how well the expansion compacts the energy of f into a few coefficients.
Recently, Candès and Donoho [4], [5] pioneered a new expansion in the continuous two-dimensional space R² using curvelets. This expansion achieves essentially optimal approximation behavior for 2-D piecewise smooth functions that are C² except for discontinuities along C² curves. For this class of functions, the best M-term approximation error (in L²-norm square) ‖f − f̂_M‖²₂ using curvelets has a decay rate of O((log M)³ M^{−2}) [5], while for wavelets this rate is O(M^{−1}) and for the Fourier basis it is O(M^{−1/2}) [1], [2]. Therefore, for typical images with smooth contours, we expect a significant improvement of a curvelet-like method over wavelets, which is comparable to the improvement of wavelets over the Fourier basis for one-dimensional piecewise smooth signals. Perhaps equally important, the curvelet construction demonstrates that it is possible to develop an optimal representation for images with smooth contours via a fixed transform.
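The best M-term approximation of eq. (2) is simple to realize once the expansion coefficients are available; the following sketch keeps the M largest-magnitude coefficients of a flattened coefficient array and zeroes the rest.

```python
import numpy as np

def best_m_term(coeffs, M):
    """Best M-term (nonlinear) approximation: keep the M largest
    coefficients in magnitude (the index set I_M) and zero out the rest."""
    c = np.asarray(coeffs, dtype=float)
    keep = np.argsort(np.abs(c))[-M:]   # indexes of the M largest |c_n|
    approx = np.zeros_like(c)
    approx[keep] = c[keep]
    return approx
```

Reconstructing f̂_M from `approx` then measures how well the expansion compacts the energy of f.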

The curvelet transform was developed initially in the continuous domain [4] via multiscale filtering followed by a block ridgelet transform [9] on each bandpass image. Later, the authors proposed the second-generation curvelet transform [5], which was defined directly via frequency partitioning without using the ridgelet transform. Both curvelet constructions require a rotation operation and correspond to a 2-D frequency partition based on polar coordinates. This makes the curvelet construction simple in the continuous domain but causes the implementation for discrete images sampled on a rectangular grid to be very challenging. In particular, approaching critical sampling seems difficult in such discretized constructions. The reason for this difficulty, we believe, is that the typical rectangular sampling grid imposes a prior geometry on discrete images, e.g., a strong bias toward horizontal and vertical directions. This fact motivates our development of a directional multiresolution transform like curvelets, but directly in the discrete domain, which results in the contourlet construction described in this paper. We would like to emphasize that although the curvelet and contourlet transforms have some similar properties and goals, the latter is not a discretized version of the former. More comparisons between these two transforms are provided at the end of Section IV.
Apart from curvelets and contourlets, there have recently been several approaches to developing efficient representations of geometrical regularity. Notable examples are bandelets [10], the edge-adapted multiscale transform [11], wedgelets [12], [13], and quadtree coding [14]. These approaches typically require an edge-detection stage, followed by an adaptive representation. By contrast, curvelet and contourlet representations are fixed transforms. This feature allows them to be easily applied in a wide range of image processing tasks, similar to wavelets. For example, we do not have to rely on edge detection, which is unreliable and noise sensitive. Furthermore, we can benefit from the well-established knowledge in transform coding when applying contourlets to image compression (e.g., for bit allocation).
Several other well-known systems that provide multiscale and directional image representations include: 2-D Gabor wavelets [15], the cortex transform [16], the steerable pyramid [17], 2-D directional wavelets [18], brushlets [19], and complex wavelets [20]. The main difference between these systems and our contourlet construction is that the previous methods do not allow for a different number of directions at each scale while achieving nearly critical sampling. In addition, our construction employs iterated filter banks, which makes it computationally efficient, and there is a precise connection with continuous-domain expansions.
III. DISCRETE-DOMAIN CONSTRUCTION USING FILTER BANKS
A. Concept
Comparing the wavelet scheme with the new scheme shown in Figure 1, we see that the improvement of the new scheme can be attributed to the grouping of nearby wavelet coefficients, since they are locally correlated due to the smoothness of the contours. Therefore, we can obtain a sparse expansion for natural images by first applying a multiscale transform, followed by a local directional transform to gather the nearby basis functions at the same scale into linear structures. In essence, we first use a wavelet-like transform for edge detection, and then a local directional transform for contour segment detection. Interestingly, the latter step is similar to the popular Hough transform [21] for line detection in computer vision. With this insight, we proposed a double filter bank structure (see Figure 7) [22] for obtaining sparse expansions for typical images having smooth contours. In this double filter bank, the Laplacian pyramid [23] is first used to capture the point discontinuities, and is then followed by a directional filter bank [24] to link point discontinuities into linear structures. The overall result is an image expansion using basic elements resembling contour segments, which are thus named contourlets. In particular, contourlets have elongated supports at various scales, directions, and aspect ratios. This allows contourlets to efficiently approximate a smooth contour at multiple resolutions in much the same way as the new scheme shown in Figure 1. In the frequency domain, the contourlet transform provides a multiscale and directional decomposition.

We would like to point out that the decoupling of the multiscale and directional decomposition stages offers a simple and flexible transform, but at the cost of a small redundancy (up to 33%, which comes from the Laplacian pyramid). In more recent work [25], we developed a critically sampled contourlet transform, which we call CRISP-contourlets, using a combined iterated nonseparable filter bank for both multiscale and directional decomposition.
B. Pyramid frames
One way to obtain a multiscale decomposition is to use the
Laplacian pyramid (LP) introduced by Burt and Adelson [23].
The LP decomposition at each level generates a downsampled
lowpass version of the original and the difference between
the original and the prediction, resulting in a bandpass image.
Figure 2(a) depicts this decomposition process, where H
and G are called (lowpass) analysis and synthesis filters,
respectively, and M is the sampling matrix. The process can
be iterated on the coarse (downsampled lowpass) signal. Note
that in multidimensional filter banks, sampling is represented
by sampling matrices; for example, downsampling x[n] by M
yields x_d[n] = x[Mn], where M is an integer matrix [8].
A drawback of the LP is the implicit oversampling. How-
ever, in contrast to the critically sampled wavelet scheme,
the LP has the distinguishing feature that each pyramid level
generates only one bandpass image (even for multidimensional
cases), and this image does not have “scrambled” frequencies.
This frequency scrambling happens in the wavelet filter bank
when a highpass channel, after downsampling, is folded back
into the low frequency band, and thus its spectrum is reflected.
In the LP, this effect is avoided by downsampling the lowpass
channel only.
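As a concrete illustration, one level of LP analysis can be sketched in one dimension with M = 2 and an orthonormal Haar lowpass filter (an assumed choice for illustration, not the filters used in the paper). The coarse signal a[n] collects inner products with the translates of g, which is equivalent to filtering by the time-reversed analysis filter and downsampling; b[n] is the prediction residual:

```python
import numpy as np

def lp_analyze(x, g, M=2):
    """One level of Laplacian pyramid analysis (1-D sketch).
    a[n]: coarse (downsampled lowpass) signal; b[n]: bandpass residual."""
    # Coarse coefficients: inner products of x with the translates of g by M
    # (same as filtering by h[n] = g[-n], then downsampling by M).
    a = np.array([x[M*n : M*n + len(g)] @ g for n in range(len(x) // M)])
    # Prediction of x from the coarse signal: upsample by M, filter by g.
    p = np.zeros(len(x))
    for n, an in enumerate(a):
        p[M*n : M*n + len(g)] += an * g
    return a, x - p

g = np.array([1.0, 1.0]) / np.sqrt(2)   # orthonormal Haar lowpass
x = np.array([4.0, 2.0, 1.0, 3.0, 0.0, 2.0, 5.0, 1.0])
a, b = lp_analyze(x, g)
# Implicit oversampling: this level stores len(a) + len(b) = 12 samples
# for 8 input samples, but yields a single, unscrambled bandpass signal b.
```
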
In [26], we studied the LP using the theory of frames
and oversampled filter banks. We showed that the LP with
orthogonal filters (that is, the analysis and synthesis filters
are time reversals of each other, h[n] = g[-n], and g[n] is orthogonal

4 IEEE TRANSACTIONS ON IMAGE PROCESSING
Fig. 2. Laplacian pyramid. (a) One level of decomposition. The outputs are a coarse approximation a[n] and a difference b[n] between the original signal
and the prediction. (b) The new reconstruction scheme for the Laplacian pyramid [26].
to its translates with respect to the sampling lattice by M)
provides a tight frame with frame bounds equal to 1. In this
case, we proposed the use of the optimal linear reconstruction
using the dual frame operator (or pseudo-inverse) as shown
in Figure 2(b). The new reconstruction differs from the usual
method, where the signal is obtained by simply adding back
the difference to the prediction from the coarse signal, and
was shown [26] to achieve significant improvement over the
usual reconstruction in the presence of noise.
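Continuing the 1-D Haar sketch (again an assumption for illustration; the paper works with 2-D filters), the two reconstructions can be compared directly. With orthogonal filters the residual b is orthogonal to the coarse subspace, so on noiseless coefficients both schemes are exact; they differ once noise perturbs (a, b), because the pseudo-inverse of Figure 2(b) first removes from a the component that b already explains:

```python
import numpy as np

M = 2
g = np.array([1.0, 1.0]) / np.sqrt(2)            # orthonormal Haar lowpass

def coarse(x):                                   # a[n] = <x, g[. - Mn]>
    return np.array([x[M*n : M*n + len(g)] @ g for n in range(len(x) // M)])

def expand(a, N):                                # upsample by M, filter by g
    p = np.zeros(N)
    for n, an in enumerate(a):
        p[M*n : M*n + len(g)] += an * g
    return p

x = np.array([4.0, 2.0, 1.0, 3.0, 0.0, 2.0, 5.0, 1.0])
a = coarse(x)
b = x - expand(a, len(x))

x_usual = b + expand(a, len(x))                  # invert Fig. 2(a) directly
x_dual  = b + expand(a - coarse(b), len(x))      # pseudo-inverse, Fig. 2(b)
# Noiseless case: coarse(b) = 0, so both reconstructions recover x exactly.
```
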
C. Iterated directional filter banks
Bamberger and Smith [24] constructed a 2-D directional
filter bank (DFB) that can be maximally decimated while
achieving perfect reconstruction. The DFB is efficiently im-
plemented via an l-level binary tree decomposition that leads
to 2^l subbands with wedge-shaped frequency partitioning as
shown in Figure 3(a). The original construction of the DFB in
[24] involves modulating the input image and using quincunx
filter banks with diamond-shaped filters [27]. To obtain the
desired frequency partition, a complicated tree expanding rule
has to be followed for finer directional subbands (e.g., see [28]
for details).
In [29], we proposed a new construction for the DFB that
avoids modulating the input image and has a simpler rule
for expanding the decomposition tree. Our simplified DFB
is intuitively constructed from two building blocks. The first
building block is a two-channel quincunx filter bank [27] with
fan filters (see Figure 4) that divides a 2-D spectrum into two
directions: horizontal and vertical. The second building block
of the DFB is a shearing operator, which amounts to just
reordering of image samples. Figure 5 shows an application
of a shearing operator where a 45° edge becomes a vertical
edge. By adding a shearing operator and its inverse ("unshearing")
before and after, respectively, the two-channel filter bank in
Figure 4, we obtain a different directional frequency partition
while maintaining perfect reconstruction.
Thus, the key in the DFB is to use an appropriate combination
of shearing operators together with two-direction partition of
quincunx filter banks at each node in a binary tree-structured
filter bank, to obtain the desired 2-D spectrum division as
shown in Figure 3(a). For details, see [29] (Chapter 3).
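The shearing operator itself is just a reordering of samples. A minimal sketch (assuming periodic boundary handling, an illustrative simplification) rolls row i by i positions, mapping a 45° diagonal onto a vertical line:

```python
import numpy as np

def shear(img):
    """Shear by sample reordering: y[i, j] = x[i, (j + i) mod N]."""
    return np.array([np.roll(row, -i) for i, row in enumerate(img)])

def unshear(img):
    """Inverse reordering, so unshear(shear(x)) == x (perfect reconstruction)."""
    return np.array([np.roll(row, i) for i, row in enumerate(img)])

# A 45-degree edge (the main diagonal) becomes a vertical edge (column 0).
x = np.eye(8)
y = shear(x)
```
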
Using multirate identities [8], it is instructive to view an
l-level tree-structured DFB equivalently as a 2^l parallel-channel
filter bank with equivalent filters and overall sampling matrices
as shown in Figure 3(b). Denote these equivalent (directional)
synthesis filters as D_k^(l), 0 ≤ k < 2^l, which correspond to the
subbands indexed as in Figure 3(a). The corresponding overall
Fig. 4. Two-dimensional spectrum partition using quincunx filter banks with
fan filters. The black regions represent the ideal frequency supports of each
filter. Q is a quincunx sampling matrix.
Fig. 5. Example of shearing operation that is used like a rotation operation
for DFB decomposition. (a) The “cameraman” image. (b) The “cameraman”
image after a shearing operation.
sampling matrices were shown [29] to have the following
diagonal forms

    S_k^(l) = diag(2^(l-1), 2)  for 0 ≤ k < 2^(l-1),
    S_k^(l) = diag(2, 2^(l-1))  for 2^(l-1) ≤ k < 2^l,    (3)

which means sampling is separable. The two sets correspond
to the mostly horizontal and mostly vertical sets of directions,
respectively.
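The diagonal form (3) is easy to tabulate. A small sketch (a hypothetical helper, not from the paper) also confirms that every channel is decimated by |det S_k^(l)| = 2^l, so the 2^l-channel DFB is maximally decimated:

```python
import numpy as np

def dfb_sampling_matrix(l, k):
    """Overall sampling matrix S_k^(l) of channel k in an l-level DFB, eq. (3)."""
    assert 0 <= k < 2**l
    if k < 2**(l - 1):                  # mostly horizontal directions
        return np.diag([2**(l - 1), 2])
    return np.diag([2, 2**(l - 1)])     # mostly vertical directions

l = 3
dets = [abs(round(np.linalg.det(dfb_sampling_matrix(l, k))))
        for k in range(2**l)]
# Each of the 2^l = 8 channels is decimated by 8, matching Fig. 3(a).
```
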
From the equivalent parallel view of the DFB, we see that
the family

    { d_k^(l)[n - S_k^(l) m] : 0 ≤ k < 2^l, m ∈ Z^2 },    (4)

obtained by translating the impulse responses of the equivalent
synthesis filters D_k^(l) over the sampling lattices by S_k^(l),
provides a basis for discrete signals in l^2(Z^2). This basis
exhibits both directional and localization properties. Figure 6
demonstrates this fact by showing the impulse responses of
equivalent filters from an example DFB. These basis functions
have quasi-linear supports in space and span all directions. In
other words, the basis (4) resembles a local Radon transform
and its elements are called Radonlets. Furthermore, it can be shown [29]
that if the building block filter bank in Figure 4 uses orthogonal
filters, then the resulting DFB is orthogonal and (4) becomes
an orthogonal basis.

DO AND VETTERLI: THE CONTOURLET TRANSFORM 5
Fig. 3. Directional filter bank. (a) Frequency partitioning where l = 3 and there are 2^3 = 8 real wedge-shaped frequency bands. Subbands 0–3 correspond
to the mostly horizontal directions, while subbands 4–7 correspond to the mostly vertical directions. (b) The multichannel view of an l-level tree-structured
directional filter bank.
Fig. 6. Impulse responses of 32 equivalent filters for the first half of the
channels, corresponding to the mostly horizontal directions, of a 6-level DFB
that uses the Haar filters. Black and gray squares correspond to +1 and -1,
respectively. Because the basis functions resemble "local lines", we call them
Radonlets.
D. Multiscale and directional decomposition: the discrete
contourlet transform
We are now ready to combine the Laplacian pyramid and the
directional filter bank into the double filter bank structure that
was motivated in Section III-A. Since the directional filter bank
(DFB) was designed to capture the high-frequency content
(representing directionality) of the input image, the low-frequency
content is poorly handled.
In fact, with the frequency partition shown in Figure 3(a),
low frequency would “leak” into several directional subbands,
hence the DFB alone does not provide a sparse representation
for images. This fact provides another reason to combine the
DFB with a multiscale decomposition, where low frequencies
of the input image are removed before applying the DFB.
Figure 7 shows a multiscale and directional decomposition
using a combination of a Laplacian pyramid (LP) and a
directional filter bank (DFB). Bandpass images from the LP
are fed into a DFB so that directional information can be
captured. The scheme can be iterated on the coarse image.
The combined result is a double iterated filter bank structure,
named contourlet filter bank, which decomposes images into
directional subbands at multiple scales.
Fig. 7. The contourlet filter bank: first, a multiscale decomposition into
octave bands by the Laplacian pyramid is computed, and then a directional
filter bank is applied to each bandpass channel.

Specifically, let a_0[n] be the input image. The output after
the LP stage is J bandpass images b_j[n], j = 1, 2, ..., J
(in fine-to-coarse order) and a lowpass image a_J[n]. That
means, the j-th level of the LP decomposes the image a_{j-1}[n]
into a coarser image a_j[n] and a detail image b_j[n]. Each
bandpass image b_j[n] is further decomposed by an l_j-level
DFB into 2^{l_j} bandpass directional images c_{j,k}^{(l_j)}[n],
k = 0, 1, ..., 2^{l_j} - 1. The main properties of the discrete
contourlet transform are stated in the following theorem.
Theorem 1: In a contourlet filter bank, the following hold:
1) If both the LP and the DFB use perfect-reconstruction
filters, then the discrete contourlet transform achieves
perfect reconstruction, which means it provides a frame
operator.
2) If both the LP and the DFB use orthogonal filters, then
the discrete contourlet transform provides a tight frame
with frame bounds equal to 1.
3) The discrete contourlet transform has a redundancy ratio
that is less than 4/3.
4) Suppose an l_j-level DFB is applied at the pyramidal
level j of the LP; then the basis images of the discrete
contourlet transform (i.e., the equivalent filters of the
contourlet filter bank) have an essential support size of
width ≈ C·2^j and length ≈ C·2^{j+l_j-2}.
5) Using FIR filters, the computational complexity of the
discrete contourlet transform is O(N) for N-pixel im-
ages.
Proof:
1) This is obvious as the discrete contourlet transform is a
composition of perfect-reconstruction blocks.
2) With orthogonal filters, the LP is a tight frame with
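Item 3 of Theorem 1 can be verified by counting coefficients: the DFB is critically sampled, so all redundancy comes from the LP, whose bandpass images live at sizes N², N²/4, N²/16, ... of the input. A sketch (assuming an N×N input with N divisible by 2^J; the helper name is ours, not the paper's):

```python
def contourlet_coeff_count(N, dfb_levels):
    """Total number of contourlet coefficients for an N-by-N image.
    dfb_levels[j] gives the DFB levels at pyramid scale j; it does not
    affect the count, since the DFB splits each bandpass image losslessly."""
    total, size = 0, N
    for _ in dfb_levels:
        total += size * size          # bandpass image at this scale
        size //= 2                    # LP iterates on the quarter-size lowpass
    return total + size * size        # final lowpass image

N = 256
count = contourlet_coeff_count(N, [3, 4, 5])
ratio = count / (N * N)
# ratio = 1 + 1/4 + 1/16 + 1/64 = 1.328125 < 4/3, consistent with item 3.
```
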


References

P. J. Burt and E. H. Adelson, "The Laplacian pyramid as a compact image code," IEEE Trans. Commun., vol. COM-31, no. 4, pp. 532-540, Apr. 1983.
D. L. Donoho and I. M. Johnstone, "Ideal spatial adaptation by wavelet shrinkage," Biometrika, vol. 81, no. 3, pp. 425-455, 1994.
D. H. Hubel and T. N. Wiesel, "Receptive fields, binocular interaction and functional architecture in the cat's visual cortex," J. Physiol., vol. 160, pp. 106-154, 1962.
S. Mallat, A Wavelet Tour of Signal Processing. San Diego, CA: Academic Press, 1998.
B. A. Olshausen and D. J. Field, "Emergence of simple-cell receptive field properties by learning a sparse code for natural images," Nature, vol. 381, pp. 607-609, 1996.
