
The contourlet transform: an efficient directional multiresolution image representation

TL;DR: A "true" two-dimensional transform that can capture the intrinsic geometrical structure that is key in visual information is pursued and it is shown that with parabolic scaling and sufficient directional vanishing moments, contourlets achieve the optimal approximation rate for piecewise smooth functions with discontinuities along twice continuously differentiable curves.
Abstract: The limitations of commonly used separable extensions of one-dimensional transforms, such as the Fourier and wavelet transforms, in capturing the geometry of image edges are well known. In this paper, we pursue a "true" two-dimensional transform that can capture the intrinsic geometrical structure that is key in visual information. The main challenge in exploring geometry in images comes from the discrete nature of the data. Thus, unlike other approaches, such as curvelets, that first develop a transform in the continuous domain and then discretize for sampled data, our approach starts with a discrete-domain construction and then studies its convergence to an expansion in the continuous domain. Specifically, we construct a discrete-domain multiresolution and multidirection expansion using nonseparable filter banks, in much the same way that wavelets were derived from filter banks. This construction results in a flexible multiresolution, local, and directional image expansion using contour segments, and, thus, it is named the contourlet transform. The discrete contourlet transform has a fast iterated filter bank algorithm that requires an order N operations for N-pixel images. Furthermore, we establish a precise link between the developed filter bank and the associated continuous-domain contourlet expansion via a directional multiresolution analysis framework. We show that with parabolic scaling and sufficient directional vanishing moments, contourlets achieve the optimal approximation rate for piecewise smooth functions with discontinuities along twice continuously differentiable curves. Finally, we show some numerical experiments demonstrating the potential of contourlets in several image processing applications.

Summary

Introduction

  • This work was supported in part by the US National Science Foundation under Grant CCR-0237633 and the Swiss National Science Foundation under Grant 20-63664.00.
  • As a result of a separable extension from 1-D bases, wavelets in 2-D are good at isolating the discontinuities at edge points, but will not “see” the smoothness along the contours.
  • The new style painter, on the other hand, exploits effectively the smoothness of the contour by making brush strokes with different elongated shapes and in a variety of directions following the contour.
  • More importantly, this result suggests that for a computational image representation to be efficient, it should be based on a local, directional, and multiresolution expansion.

A. Concept

  • Comparing the wavelet scheme with the new scheme shown in Figure 1, the authors see that the improvement of the new scheme can be attributed to the grouping of nearby wavelet coefficients, since they are locally correlated due to the smoothness of the contours.
  • In essence, the authors first use a wavelet-like transform for edge detection, and then a local directional transform for contour segment detection.
  • The authors proposed a double filter bank structure (see Figure 7) [22] for obtaining sparse expansions for typical images having smooth contours.
  • The overall result is an image expansion using basic elements resembling contour segments, which are thus named contourlets.
  • In the frequency domain, the contourlet transform provides a multiscale and directional decomposition.

B. Pyramid frames

  • One way to obtain a multiscale decomposition is to use the Laplacian pyramid (LP) introduced by Burt and Adelson [23].
  • The LP decomposition at each level generates a downsampled lowpass version of the original and the difference between the original and the prediction, resulting in a bandpass image.
  • Thus, the key in the DFB is to use an appropriate combination of shearing operators together with two-direction partition of quincunx filter banks at each node in a binary tree-structured filter bank, to obtain the desired 2-D spectrum division as shown in Figure 3(a).
  • These basis functions have quasi-linear supports in space and span all directions.
  • Furthermore, it can be shown [29] that if the building block filter bank in Figure 4 uses orthogonal filters, then the resulting DFB is orthogonal and (4) becomes an orthogonal basis.
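The one-level LP decomposition described above can be sketched in a few lines of NumPy. The binomial lowpass filter h, the dyadic sampling matrix M = 2I, and the reuse of the same lowpass filter for the prediction step are illustrative assumptions, not the paper's actual filter design.

```python
import numpy as np

def lp_analysis(x, h):
    """One level of a Laplacian pyramid (sketch).

    x : 2-D array; h : 1-D lowpass filter applied separably.
    Returns (a, b): coarse approximation and bandpass difference image.
    """
    def lowpass(img):
        # separable zero-padded 'same' convolution, rows then columns
        tmp = np.apply_along_axis(lambda r: np.convolve(r, h, mode="same"), 1, img)
        return np.apply_along_axis(lambda c: np.convolve(c, h, mode="same"), 0, tmp)

    a = lowpass(x)[::2, ::2]       # filter then downsample by M = 2I
    pred = np.zeros_like(x)
    pred[::2, ::2] = a             # upsample the coarse image
    pred = 4.0 * lowpass(pred)     # predict (gain 4 compensates upsampling)
    b = x - pred                   # bandpass difference image
    return a, b
```

Iterating `lp_analysis` on `a` yields the full pyramid; the DFB is then applied to each bandpass image b.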

D. Multiscale and directional decomposition: the discrete contourlet transform

  • The authors combine the Laplacian pyramid and the directional filter bank into the double filter bank structure motivated in Section III-A.
  • That is, the j-th level of the LP decomposes the image a_{j−1}[n] into a coarser image a_j[n] and a detail image b_j[n].
  • The main properties of the discrete contourlet transform are stated in the following theorem.
  • For the DFB, each building-block two-channel filter bank requires L_d operations per input sample.
  • Since the multiscale and directional decomposition stages are decoupled in the discrete contourlet transform, the authors can have a different number of directions at different scales, thus offering a flexible multiscale and directional expansion.
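The O(N) cost of the iterated filter bank can be illustrated with a rough operation count; the per-sample costs Lp and Ld and the list of DFB levels per scale below are illustrative placeholders, not values from the paper.

```python
def contourlet_cost(n_pixels, levels_per_scale, Lp=12, Ld=10):
    """Rough operation count for the discrete contourlet transform (sketch).

    n_pixels: image size N; levels_per_scale: DFB levels l_j, finest scale first.
    Lp, Ld: assumed taps-per-sample costs of the pyramid and DFB building blocks.
    """
    ops, n = 0.0, float(n_pixels)
    for lj in levels_per_scale:
        ops += Lp * n          # one Laplacian pyramid level over n samples
        ops += Ld * lj * n     # l_j two-channel DFB stages, each O(Ld) per sample
        n /= 4.0               # the coarse image has a quarter of the samples
    return ops
```

Because the sample count shrinks geometrically across pyramid levels, the total is a geometric series in N, hence O(N) regardless of the number of levels.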

IV. CONTOURLETS AND DIRECTIONAL MULTIRESOLUTION ANALYSIS

  • As with the wavelet filter bank, the contourlet filter bank has an associated continuous-domain expansion in L²(R²) using the contourlet functions.
  • The connection between the discrete contourlet transform and the continuous-domain contourlet expansion is made precise via a new multiresolution analysis framework, similar to the link between wavelets and filter banks [2].
  • The new elements in this framework are multidirection and its combination with multiscale.
  • For simplicity, the authors will only consider the case with orthogonal filters, which leads to tight frames.
  • The more general case with biorthogonal filters can be treated similarly.

C. Multiscale and multidirection: the contourlet expansion

  • The directional decomposition by the family (4) is then applied onto the detail subspaces.
  • Note that the number of DFB decomposition levels l can be different at different scales j, and in that case will be denoted by l_j.
  • As a result, the subspace W^(l)_{j,k} is defined on a rectangular grid with intervals 2^{j+l−2} × 2^j or 2^j × 2^{j+l−2}, depending on whether it is mostly horizontal or vertical (see Figure 9(b)).
  • The reason that {λ^(l)_{j,k,n}}_{n∈Z²} is an overcomplete frame for W^(l)_{j,k} is that it uses the same sampling grid as the bigger subspace V^(l)_{j−1,k}. The discrete filter w^(l)_k in (22) is roughly equal to the sum of convolutions between the directional filter d^(l)_k and the bandpass filters f_i, and is thus a bandpass directional filter.
  • 1) The contourlet expansions are defined on rectangular grids, and thus offer a seamless translation (as demonstrated in Theorem 3) to the discrete world, where image pixels are sampled on a rectangular grid.

V. CONTOURLET APPROXIMATION AND COMPRESSION

  • The proposed contourlet filter bank and its associated continuous-domain frames in previous sections provide a framework for constructing general directional multiresolution image representations.
  • Since their goal is to develop efficient or sparse expansions for images having smooth contours, the next important issues are: (1) what conditions should the authors impose on contourlets to obtain a sparse expansion for that class of images; and (2) how can they design filter banks that can lead to contourlet expansions satisfying those conditions.
  • The authors consider the first issue in this paper; the second one is addressed in another paper [31].

A. Parabolic scaling

  • In the curvelet construction, Candès and Donoho [4] point out that a key to achieving the correct nonlinear approximation behavior by curvelets is to select support sizes obeying the parabolic scaling relation for curves: width ∝ length2.
  • The same scaling relation has been used in the study of Fourier integral operators and wave equations; for example, see [32].
  • More precisely, with the local coordinate setup as in Figure 10(a), the authors can readily verify that the parametric representation of the discontinuity curve obeys u(v) ≈ (κ/2) v² when v ≈ 0 (27), where κ is the local curvature of the curve.
  • As can be seen in the two pyramidal levels shown, as the support size of the basis element of the LP is reduced by four in each dimension, the number of directions of the DFB is doubled.
  • Combining these two stages, the support sizes of the contourlet functions evolve in accordance to the desired parabolic scaling.
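The scaling behavior in these bullets can be checked numerically: every other finer pyramid level, the support width shrinks by 4 while the doubled direction count makes the length shrink only by 2, which preserves width ∝ length². The unit starting sizes below are an illustrative normalization, not the paper's actual support sizes.

```python
def parabolic_supports(width0, length0, steps):
    """Support-size evolution under contourlet scaling (illustrative sketch).

    Each step models two finer pyramid levels: the LP support width shrinks
    by 4, and the doubled DFB direction count leaves the length shrinking
    only by 2, so the parabolic relation width ~ length**2 is preserved.
    """
    sizes = [(width0, length0)]
    w, L = float(width0), float(length0)
    for _ in range(steps):
        w, L = w / 4.0, L / 2.0
        sizes.append((w, L))
    return sizes
```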

B. Directional vanishing moment

  • For the wavelet case in 1-D, wavelet approximation theory brought a novel condition into filter bank design, which had earlier focused only on designing filters with good frequency selectivity.
  • Intuitively, wavelets with vanishing moments are orthogonal to polynomial signals, and thus only a few wavelet basis functions around the discontinuity points would “feel” these discontinuities and lead to significant coefficients [33].
  • The key feature of these images is that image edges are localized in both location and direction.
  • Thus, it is desirable that only the few contourlet functions whose supports intersect a contour and align with its local direction “feel” this discontinuity.
  • The authors refer to this requirement as the directional vanishing moment (DVM) condition.
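The DVM idea can be illustrated with the simplest possible directional filter. The 2-tap filter below, with one vanishing moment along the 45-degree direction, is a toy stand-in for the paper's DFB filters: it annihilates any image that is constant along that direction, so a diagonal step edge produces no coefficients at all.

```python
import numpy as np

def has_dvm_along_diagonal(x):
    """Check that the toy filter d[0,0]=1, d[1,1]=-1 annihilates x (sketch).

    This filter has one directional vanishing moment along the 45-degree
    direction: any image constant along that direction maps to zero.
    """
    y = x[1:, 1:] - x[:-1, :-1]   # valid-region convolution with d
    return np.allclose(y, 0.0)

# a step edge aligned with the diagonal: x[m, n] depends only on m - n
m, n = np.meshgrid(np.arange(8), np.arange(8), indexing="ij")
aligned = np.where(m - n >= 0, 1.0, 0.0)
```

A horizontal edge, by contrast, is not annihilated, which is the sense in which only aligned contourlets “feel” a discontinuity.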

C. Contourlet approximation

  • In this subsection the authors will show that a contourlet expansion that satisfies the parabolic scaling and has sufficient DVMs (this will be defined precisely in Lemma 1) achieves the optimal nonlinear approximation rate for 2-D piecewise C2 smooth functions with discontinuities along C2 smooth curves.
  • Therefore, the authors need to bound the contribution of ⟨f, λ_{j,k̃,n}⟩ outside region A to the same order.
  • In addition, since the discontinuity curve S has finite length, the number of type 1 coefficients with these indexes is m_{j,k̃} ∼ 1/d_{j,k̃,n} ∼ 2^{−j/2} k̃ (39). From (38), for a type 1 coefficient to have magnitude above a threshold ε, an upper bound on k̃ is necessary.
  • Theorem 4: Suppose that a compactly supported contourlet frame (24) satisfies the parabolic scaling condition (29), the contourlet functions λ_{j,k} satisfy the condition in Lemma 1, and the scaling function φ ∈ C^p has accuracy of order 2. Then, for a function f that is C² away from a C² discontinuity curve, the M-term approximation by this frame achieves ‖f − f̂_M‖²₂ ≤ C (log M)³ M^{−2} (45).
  • Remark 2: The approximation rate in (45) is the same as the approximation rate for curvelets, which was derived in [5] and [35].

D. Contourlet compression

  • So far, the authors have considered the approximation problem for contourlets, keeping the M largest coefficients.
  • Specifically, from coarse to fine scales, significant contourlet coefficients are successively localized in both location (contourlets intersecting the discontinuity curve) and direction (intersecting contourlets with direction close to the local direction of the discontinuity curve).
  • Thus, using embedded tree structures for contourlet coefficients, similar to the embedded zero-trees for wavelets [36], the authors can efficiently index the retained coefficients using 1 bit per coefficient.
  • Instead of using fixed-length coding for the quantized coefficients, a slight gain (in the log factor, but not the exponent, of the rate-distortion function) can be obtained by variable-length coding.
  • In particular, the authors use the bit-plane coding scheme [8], where coefficients with magnitude in the range (2^{l−1−L}, 2^{l−L}] are encoded with l bits.
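The bit-plane rule in the last bullet maps a coefficient magnitude to a code length. A minimal sketch of that mapping follows; the parameter L (the finest retained bit plane) is taken directly from the bullet, while the boundary handling via the ceiling is an assumption consistent with the stated half-open ranges.

```python
import math

def bitplane_code_length(mag, L):
    """Bits assigned to a coefficient magnitude under the bit-plane scheme:
    magnitudes in (2**(l-1-L), 2**(l-L)] are encoded with l bits (sketch).
    """
    if mag <= 0:
        return 0
    # mag in (2**(l-1-L), 2**(l-L)]  <=>  l = ceil(log2(mag)) + L
    l = math.ceil(math.log2(mag)) + L
    return max(l, 0)
```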

VI. NUMERICAL EXPERIMENTS

  • All experiments in this section use a wavelet transform with “9-7” biorthogonal filters [37], [38] and 6 decomposition levels.
  • Apart from also being linear phase and nearly orthogonal, these fan filters are close to having the ideal frequency response and thus can approximate the directional vanishing moment condition.
  • The number of DFB directions is doubled at every other finer scale, reaching 5 DFB decomposition levels (32 directions) at the finest scale.
  • Note that in this case, both the wavelet and the contourlet transforms share the same detail subspaces.
  • The difference is that in the contourlet transform, each detail subspace is further decomposed into directional subspaces.

B. Nonlinear approximation

  • Next the authors compare the nonlinear approximation (NLA) performances of the wavelet and contourlet transforms.
  • In these NLA experiments, for a given value M, the authors select the M most significant coefficients in each transform domain, and then compare the reconstructed images from these sets of M coefficients.
  • The authors expect that most of the refinement happens around the image edges.
  • The wavelet scheme is seen to slowly capture contours by isolated “dots”.
  • In addition, there is a significant gain of 1.46 dB in peak signal-to-noise ratio (PSNR) for contourlets.
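The PSNR figure quoted above uses the standard definition. For reference, a minimal implementation assuming 8-bit images (peak value 255):

```python
import numpy as np

def psnr(ref, rec, peak=255.0):
    """Peak signal-to-noise ratio in dB between a reference image and a
    reconstruction, e.g. one built from M retained transform coefficients."""
    mse = np.mean((ref.astype(float) - rec.astype(float)) ** 2)
    if mse == 0:
        return float("inf")   # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```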

VII. CONCLUSION

  • The authors constructed a discrete transform that provides a sparse expansion for typical images having smooth contours.
  • Based on this observation, the authors developed a new filter bank structure, the contourlet filter bank, that can provide a flexible multiscale and directional decomposition for images.
  • This connection is defined via a directional multiresolution analysis that provides successive refinements at both spatial and directional resolution.
  • The authors make a change to a new coordinate (x, y) as shown in Figure 18, where λj,k̃ has vanishing moments along the x direction.
  • Also, to the same order, the authors can parameterize the discontinuity line as y = αx.


IEEE TRANSACTIONS ON IMAGE PROCESSING 1
The Contourlet Transform: An Efficient
Directional Multiresolution Image Representation
Minh N. Do, Member, IEEE, and Martin Vetterli, Fellow, IEEE
Abstract: The limitations of commonly used separable extensions of one-dimensional transforms, such as the Fourier and wavelet transforms, in capturing the geometry of image edges are well known. In this paper, we pursue a “true” two-dimensional transform that can capture the intrinsic geometrical structure that is key in visual information. The main challenge in exploring geometry in images comes from the discrete nature of the data. Thus, unlike other approaches, such as curvelets, that first develop a transform in the continuous domain and then discretize for sampled data, our approach starts with a discrete-domain construction and then studies its convergence to an expansion in the continuous domain. Specifically, we construct a discrete-domain multiresolution and multidirection expansion using non-separable filter banks, in much the same way that wavelets were derived from filter banks. This construction results in a flexible multiresolution, local, and directional image expansion using contour segments, and thus it is named the contourlet transform. The discrete contourlet transform has a fast iterated filter bank algorithm that requires an order N operations for N-pixel images. Furthermore, we establish a precise link between the developed filter bank and the associated continuous-domain contourlet expansion via a directional multiresolution analysis framework. We show that with parabolic scaling and sufficient directional vanishing moments, contourlets achieve the optimal approximation rate for piecewise smooth functions with discontinuities along twice continuously differentiable curves. Finally, we show some numerical experiments demonstrating the potential of contourlets in several image processing applications.

Index Terms: sparse representation, wavelets, contourlets, filter banks, multiresolution, multidirection, contours, geometric image processing.
I. INTRODUCTION
Efficient representation of visual information lies at the heart of many image processing tasks, including compression, denoising, feature extraction, and inverse problems. Efficiency of a representation refers to the ability to capture significant information about an object of interest using a small description. For image compression or content-based image retrieval, the use of an efficient representation implies the compactness of the compressed file or the index entry for each image in the database. For practical applications, such an efficient representation has to be obtained by structured transforms and fast algorithms.

M. N. Do is with the Department of Electrical and Computer Engineering, the Coordinated Science Laboratory, and the Beckman Institute, University of Illinois at Urbana-Champaign, Urbana IL 61801 (email: minhdo@uiuc.edu). M. Vetterli is with the Audiovisual Communications Laboratory, École Polytechnique Fédérale de Lausanne (EPFL), CH-1015 Lausanne, Switzerland, and with the Department of Electrical Engineering and Computer Science, University of California at Berkeley, Berkeley CA 94720 (email: martin.vetterli@epfl.ch). This work was supported in part by the US National Science Foundation under Grant CCR-0237633 (CAREER) and the Swiss National Science Foundation under Grant 20-63664.00.
For one-dimensional piecewise smooth signals, like scan-lines of an image, wavelets have been established as the right tool, because they provide an optimal representation for these signals in a certain sense [1], [2]. In addition, the wavelet representation is amenable to efficient algorithms; in particular, it leads to fast transforms and convenient tree data structures. These are the key reasons for the success of wavelets in many signal processing and communication applications; for example, the wavelet transform was adopted as the transform for the new image-compression standard, JPEG-2000 [3].

However, natural images are not simply stacks of 1-D piecewise smooth scan-lines; discontinuity points (i.e., edges) are typically located along smooth curves (i.e., contours) owing to smooth boundaries of physical objects. Thus, natural images contain intrinsic geometrical structures that are key features in visual information. As a result of a separable extension from 1-D bases, wavelets in 2-D are good at isolating the discontinuities at edge points, but will not “see” the smoothness along the contours. In addition, separable wavelets can capture only limited directional information, an important and unique feature of multidimensional signals. These disappointing behaviors indicate that more powerful representations are needed in higher dimensions.
To see how one can improve the 2-D separable wavelet
transform for representing images with smooth contours,
consider the following scenario. Imagine that there are two
painters, one with a “wavelet”-style and the other with a new
style, both wishing to paint a natural scene. Both painters apply
a refinement technique to increase resolution from coarse to
fine. Here, efficiency is measured by how quickly, that is with
how few brush strokes, one can faithfully reproduce the scene.
[Figure 1 shows two panels, “Wavelet” and “New scheme”.]
Fig. 1. Wavelet versus new scheme: illustrating the successive refinement by the two systems near a smooth contour, which is shown as a thick curve separating two smooth regions.
Consider the situation when a smooth contour is being painted, as shown in Figure 1. Because 2-D wavelets are constructed from tensor products of 1-D wavelets, the “wavelet”-style painter is limited to using square-shaped brush strokes
along the contour, using different sizes corresponding to
the multiresolution structure of wavelets. As the resolution
becomes finer, we can clearly see the limitation of the wavelet-
style painter who needs to use many fine “dots” to capture the
contour.
(Or we could consider the wavelet-style painter as a pointillist!)
The new style painter, on the other hand, exploits
effectively the smoothness of the contour by making brush
strokes with different elongated shapes and in a variety of
directions following the contour. This intuition was formalized
by Candès and Donoho in the curvelet construction [4], [5],
reviewed below in Section II.
For the human visual system, it is well known [6] that the receptive fields in the visual cortex are characterized as being localized, oriented, and bandpass. Furthermore, experiments in searching for the sparse components of natural images produced basis images that closely resemble the aforementioned characteristics of the visual cortex [7]. This result supports the hypothesis that the human visual system has been tuned so as to capture the essential information of a natural scene using the least number of active visual cells. More importantly, this result suggests that for a computational image representation to be efficient, it should be based on a local, directional, and multiresolution expansion.
Inspired by the painting scenario and studies related to the
human visual system and natural image statistics, we identify
a “wish list” for new image representations:
1) Multiresolution. The representation should allow images to be successively approximated, from coarse to fine resolutions.
2) Localization. The basis elements in the representation should be localized in both the spatial and the frequency domains.
3) Critical sampling. For some applications (e.g., compression), the representation should form a basis, or a frame with small redundancy.
4) Directionality. The representation should contain basis elements oriented at a variety of directions, many more than the few directions offered by separable wavelets.
5) Anisotropy. To capture smooth contours in images, the representation should contain basis elements using a variety of elongated shapes with different aspect ratios.
Among these desiderata, the first three are successfully provided by separable wavelets, while the last two require new constructions. Moreover, a major challenge in capturing geometry and directionality in images comes from the discrete nature of the data: the input is typically sampled images defined on rectangular grids. For example, directions other than horizontal and vertical look very different on a rectangular grid. Because of pixelization, the notion of smooth contours on sampled images is not obvious. For these reasons, unlike other transforms that were initially developed in the continuous domain and then discretized for sampled data, our approach starts with a discrete-domain construction and then studies its convergence to an expansion in the continuous domain.
The outline of the rest of the paper is as follows. After reviewing related work in Section II, we propose in Section III a multiresolution and multidirection image expansion using non-separable filter banks. This construction results in a flexible multiresolution, local, and directional image expansion using contour segments, and thus it is named the contourlet transform. It is of interest to study the limit behavior when such schemes are iterated over scale and/or direction, which has been analyzed in the connection between filter banks, their iteration, and the associated wavelet construction [8], [2]. Such a connection is studied in Section IV, where we establish a precise link between the proposed filter bank and the associated continuous-domain contourlet expansion in a newly defined directional multiresolution analysis framework. The approximation power of the contourlet expansion is studied in Section V. We show that with parabolic scaling and sufficient directional vanishing moments, contourlets achieve the optimal approximation rate for 2-D piecewise smooth functions with C² (twice continuously differentiable) contours. Numerical experiments are presented and discussed in Section VI.
II. BACKGROUND AND RELATED WORK
Consider a general series expansion by {φ_n}_{n=1}^∞ (e.g., a Fourier or wavelet basis) for a given signal f as:

f = Σ_{n=1}^∞ c_n φ_n.  (1)
The error decay of the best M-term approximation provides a measurement of the efficiency of an expansion. The best M-term approximation (also commonly referred to as nonlinear approximation [1]) using this expansion is defined as

f̂_M = Σ_{n∈I_M} c_n φ_n,  (2)

where I_M is the set of indexes of the M largest |c_n|. The quality of the approximated function f̂_M relates to how sparse the expansion by {φ_n}_{n=1}^∞ is, or how well the expansion compacts the energy of f into a few coefficients.
Recently, Candès and Donoho [4], [5] pioneered a new expansion in the continuous two-dimensional space R² using curvelets. This expansion achieves essentially optimal approximation behavior for 2-D piecewise smooth functions that are C² except for discontinuities along C² curves. For this class of functions, the best M-term approximation error (in L²-norm square) ‖f − f̂_M‖²₂ using curvelets has a decay rate of O((log M)³ M^{−2}) [5], while for wavelets this rate is O(M^{−1}) and for the Fourier basis it is O(M^{−1/2}) [1], [2]. Therefore, for typical images with smooth contours, we expect a significant improvement of a curvelet-like method over wavelets, which is comparable to the improvement of wavelets over the Fourier basis for one-dimensional piecewise smooth signals. Perhaps equally important, the curvelet construction demonstrates that it is possible to develop an optimal representation for images with smooth contours via a fixed transform.
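The best M-term approximation of eq. (2) is simple to realize once the expansion coefficients are available; the following sketch keeps the M largest-magnitude coefficients of a flattened coefficient array and zeroes the rest.

```python
import numpy as np

def best_m_term(coeffs, M):
    """Best M-term (nonlinear) approximation: keep the M largest
    coefficients in magnitude (the index set I_M) and zero out the rest."""
    c = np.asarray(coeffs, dtype=float)
    keep = np.argsort(np.abs(c))[-M:]   # indexes of the M largest |c_n|
    approx = np.zeros_like(c)
    approx[keep] = c[keep]
    return approx
```

Reconstructing f̂_M from `approx` then measures how well the expansion compacts the energy of f.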

The curvelet transform was developed initially in the continuous domain [4] via multiscale filtering followed by a block ridgelet transform [9] on each bandpass image. Later, the authors proposed the second-generation curvelet transform [5], which was defined directly via frequency partitioning without using the ridgelet transform. Both curvelet constructions require a rotation operation and correspond to a 2-D frequency partition based on polar coordinates. This makes the curvelet construction simple in the continuous domain but causes the implementation for discrete images sampled on a rectangular grid to be very challenging. In particular, approaching critical sampling seems difficult in such discretized constructions. The reason for this difficulty, we believe, is that the typical rectangular sampling grid imposes a prior geometry on discrete images, e.g., a strong bias toward horizontal and vertical directions. This fact motivates our development of a directional multiresolution transform like curvelets, but directly in the discrete domain, which results in the contourlet construction described in this paper. We would like to emphasize that although the curvelet and contourlet transforms have some similar properties and goals, the latter is not a discretized version of the former. More comparisons between these two transforms are provided at the end of Section IV.
Apart from curvelets and contourlets, there have recently been several approaches to developing efficient representations of geometrical regularity. Notable examples are bandelets [10], the edge-adapted multiscale transform [11], wedgelets [12], [13], and quadtree coding [14]. These approaches typically require an edge-detection stage, followed by an adaptive representation. By contrast, curvelet and contourlet representations are fixed transforms. This feature allows them to be easily applied in a wide range of image processing tasks, similar to wavelets. For example, we do not have to rely on edge detection, which is unreliable and noise sensitive. Furthermore, we can benefit from the well-established knowledge in transform coding when applying contourlets to image compression (e.g., for bit allocation).
Several other well-known systems that provide multiscale and directional image representations include: 2-D Gabor wavelets [15], the cortex transform [16], the steerable pyramid [17], 2-D directional wavelets [18], brushlets [19], and complex wavelets [20]. The main difference between these systems and our contourlet construction is that the previous methods do not allow for a different number of directions at each scale while achieving nearly critical sampling. In addition, our construction employs iterated filter banks, which makes it computationally efficient, and there is a precise connection with continuous-domain expansions.
III. DISCRETE-DOMAIN CONSTRUCTION USING FILTER BANKS
A. Concept
Comparing the wavelet scheme with the new scheme shown in Figure 1, we see that the improvement of the new scheme can be attributed to the grouping of nearby wavelet coefficients, since they are locally correlated due to the smoothness of the contours. Therefore, we can obtain a sparse expansion for natural images by first applying a multiscale transform, followed by a local directional transform to gather the nearby basis functions at the same scale into linear structures. In essence, we first use a wavelet-like transform for edge detection, and then a local directional transform for contour segment detection. Interestingly, the latter step is similar to the popular Hough transform [21] for line detection in computer vision. With this insight, we proposed a double filter bank structure (see Figure 7) [22] for obtaining sparse expansions for typical images having smooth contours. In this double filter bank, the Laplacian pyramid [23] is first used to capture the point discontinuities, and is then followed by a directional filter bank [24] to link point discontinuities into linear structures. The overall result is an image expansion using basic elements resembling contour segments, which are thus named contourlets. In particular, contourlets have elongated supports at various scales, directions, and aspect ratios. This allows contourlets to efficiently approximate a smooth contour at multiple resolutions in much the same way as the new scheme shown in Figure 1. In the frequency domain, the contourlet transform provides a multiscale and directional decomposition.

We would like to point out that the decoupling of the multiscale and directional decomposition stages offers a simple and flexible transform, but at the cost of a small redundancy (up to 33%, which comes from the Laplacian pyramid). In more recent work [25], we developed a critically sampled contourlet transform, which we call CRISP-contourlets, using a combined iterated nonseparable filter bank for both multiscale and directional decomposition.
B. Pyramid frames
One way to obtain a multiscale decomposition is to use the
Laplacian pyramid (LP) introduced by Burt and Adelson [23].
The LP decomposition at each level generates a downsampled
lowpass version of the original and the difference between
the original and the prediction, resulting in a bandpass image.
Figure 2(a) depicts this decomposition process, where H
and G are called (lowpass) analysis and synthesis filters,
respectively, and M is the sampling matrix. The process can
be iterated on the coarse (downsampled lowpass) signal. Note
that in multidimensional filter banks, sampling is represented
by sampling matrices; for example, downsampling x[n] by M
yields x_d[n] = x[Mn], where M is an integer matrix [8].
A drawback of the LP is the implicit oversampling. How-
ever, in contrast to the critically sampled wavelet scheme,
the LP has the distinguishing feature that each pyramid level
generates only one bandpass image (even for multidimensional
cases), and this image does not have “scrambled” frequencies.
This frequency scrambling happens in the wavelet filter bank
when a highpass channel, after downsampling, is folded back
into the low frequency band, and thus its spectrum is reflected.
In the LP, this effect is avoided by downsampling the lowpass
channel only.
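As a concrete illustration, one level of LP analysis can be sketched in one dimension with M = 2 and an orthonormal Haar lowpass filter (an assumed choice for illustration, not the filters used in the paper). The coarse signal a[n] collects inner products with the translates of g, which is equivalent to filtering by the time-reversed analysis filter and downsampling; b[n] is the prediction residual:

```python
import numpy as np

def lp_analyze(x, g, M=2):
    """One level of Laplacian pyramid analysis (1-D sketch).
    a[n]: coarse (downsampled lowpass) signal; b[n]: bandpass residual."""
    # Coarse coefficients: inner products of x with the translates of g by M
    # (same as filtering by h[n] = g[-n], then downsampling by M).
    a = np.array([x[M*n : M*n + len(g)] @ g for n in range(len(x) // M)])
    # Prediction of x from the coarse signal: upsample by M, filter by g.
    p = np.zeros(len(x))
    for n, an in enumerate(a):
        p[M*n : M*n + len(g)] += an * g
    return a, x - p

g = np.array([1.0, 1.0]) / np.sqrt(2)   # orthonormal Haar lowpass
x = np.array([4.0, 2.0, 1.0, 3.0, 0.0, 2.0, 5.0, 1.0])
a, b = lp_analyze(x, g)
# Implicit oversampling: this level stores len(a) + len(b) = 12 samples
# for 8 input samples, but yields a single, unscrambled bandpass signal b.
```
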
In [26], we studied the LP using the theory of frames
and oversampled filter banks. We showed that the LP with
orthogonal filters (that is, the analysis and synthesis filters
are time reversals of each other, h[n] = g[-n], and g[n] is orthogonal

4 IEEE TRANSACTIONS ON IMAGE PROCESSING
Fig. 2. Laplacian pyramid. (a) One level of decomposition. The outputs are a coarse approximation a[n] and a difference b[n] between the original signal
and the prediction. (b) The new reconstruction scheme for the Laplacian pyramid [26].
to its translates with respect to the sampling lattice by M)
provides a tight frame with frame bounds equal to 1. In this
case, we proposed the use of the optimal linear reconstruction
using the dual frame operator (or pseudo-inverse) as shown
in Figure 2(b). The new reconstruction differs from the usual
method, where the signal is obtained by simply adding back
the difference to the prediction from the coarse signal, and
was shown [26] to achieve significant improvement over the
usual reconstruction in the presence of noise.
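Continuing the 1-D Haar sketch (again an assumption for illustration; the paper works with 2-D filters), the two reconstructions can be compared directly. With orthogonal filters the residual b is orthogonal to the coarse subspace, so on noiseless coefficients both schemes are exact; they differ once noise perturbs (a, b), because the pseudo-inverse of Figure 2(b) first removes from a the component that b already explains:

```python
import numpy as np

M = 2
g = np.array([1.0, 1.0]) / np.sqrt(2)            # orthonormal Haar lowpass

def coarse(x):                                   # a[n] = <x, g[. - Mn]>
    return np.array([x[M*n : M*n + len(g)] @ g for n in range(len(x) // M)])

def expand(a, N):                                # upsample by M, filter by g
    p = np.zeros(N)
    for n, an in enumerate(a):
        p[M*n : M*n + len(g)] += an * g
    return p

x = np.array([4.0, 2.0, 1.0, 3.0, 0.0, 2.0, 5.0, 1.0])
a = coarse(x)
b = x - expand(a, len(x))

x_usual = b + expand(a, len(x))                  # invert Fig. 2(a) directly
x_dual  = b + expand(a - coarse(b), len(x))      # pseudo-inverse, Fig. 2(b)
# Noiseless case: coarse(b) = 0, so both reconstructions recover x exactly.
```
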
C. Iterated directional filter banks
Bamberger and Smith [24] constructed a 2-D directional
filter bank (DFB) that can be maximally decimated while
achieving perfect reconstruction. The DFB is efficiently im-
plemented via an l-level binary tree decomposition that leads
to 2^l subbands with wedge-shaped frequency partitioning as
shown in Figure 3(a). The original construction of the DFB in
[24] involves modulating the input image and using quincunx
filter banks with diamond-shaped filters [27]. To obtain the
desired frequency partition, a complicated tree expanding rule
has to be followed for finer directional subbands (e.g., see [28]
for details).
In [29], we proposed a new construction for the DFB that
avoids modulating the input image and has a simpler rule
for expanding the decomposition tree. Our simplified DFB
is intuitively constructed from two building blocks. The first
building block is a two-channel quincunx filter bank [27] with
fan filters (see Figure 4) that divides a 2-D spectrum into two
directions: horizontal and vertical. The second building block
of the DFB is a shearing operator, which amounts to just
reordering of image samples. Figure 5 shows an application
of a shearing operator where a 45° edge becomes a vertical
edge. By adding a shearing operator and its inverse ("unshearing")
before and after, respectively, the two-channel filter bank in
Figure 4, we obtain a different directional frequency partition
while maintaining perfect reconstruction.
Thus, the key in the DFB is to use an appropriate combination
of shearing operators together with two-direction partition of
quincunx filter banks at each node in a binary tree-structured
filter bank, to obtain the desired 2-D spectrum division as
shown in Figure 3(a). For details, see [29] (Chapter 3).
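The shearing operator itself is just a reordering of samples. A minimal sketch (assuming periodic boundary handling, an illustrative simplification) rolls row i by i positions, mapping a 45° diagonal onto a vertical line:

```python
import numpy as np

def shear(img):
    """Shear by sample reordering: y[i, j] = x[i, (j + i) mod N]."""
    return np.array([np.roll(row, -i) for i, row in enumerate(img)])

def unshear(img):
    """Inverse reordering, so unshear(shear(x)) == x (perfect reconstruction)."""
    return np.array([np.roll(row, i) for i, row in enumerate(img)])

# A 45-degree edge (the main diagonal) becomes a vertical edge (column 0).
x = np.eye(8)
y = shear(x)
```
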
Using multirate identities [8], it is instructive to view an
l-level tree-structured DFB equivalently as a 2^l parallel-channel
filter bank with equivalent filters and overall sampling matrices
as shown in Figure 3(b). Denote these equivalent (directional)
synthesis filters as D_k^(l), 0 ≤ k < 2^l, which correspond to the
subbands indexed as in Figure 3(a). The corresponding overall
Fig. 4. Two-dimensional spectrum partition using quincunx filter banks with
fan filters. The black regions represent the ideal frequency supports of each
filter. Q is a quincunx sampling matrix.
Fig. 5. Example of shearing operation that is used like a rotation operation
for DFB decomposition. (a) The “cameraman” image. (b) The “cameraman”
image after a shearing operation.
sampling matrices were shown [29] to have the following
diagonal forms

    S_k^(l) = diag(2^(l-1), 2)  for 0 ≤ k < 2^(l-1),
    S_k^(l) = diag(2, 2^(l-1))  for 2^(l-1) ≤ k < 2^l,    (3)

which means sampling is separable. The two sets correspond
to the mostly horizontal and mostly vertical sets of directions,
respectively.
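The diagonal form (3) is easy to tabulate. A small sketch (a hypothetical helper, not from the paper) also confirms that every channel is decimated by |det S_k^(l)| = 2^l, so the 2^l-channel DFB is maximally decimated:

```python
import numpy as np

def dfb_sampling_matrix(l, k):
    """Overall sampling matrix S_k^(l) of channel k in an l-level DFB, eq. (3)."""
    assert 0 <= k < 2**l
    if k < 2**(l - 1):                  # mostly horizontal directions
        return np.diag([2**(l - 1), 2])
    return np.diag([2, 2**(l - 1)])     # mostly vertical directions

l = 3
dets = [abs(round(np.linalg.det(dfb_sampling_matrix(l, k))))
        for k in range(2**l)]
# Each of the 2^l = 8 channels is decimated by 8, matching Fig. 3(a).
```
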
From the equivalent parallel view of the DFB, we see that
the family

    { d_k^(l)[n - S_k^(l) m] : 0 ≤ k < 2^l, m ∈ Z^2 },    (4)

obtained by translating the impulse responses of the equivalent
synthesis filters D_k^(l) over the sampling lattices by S_k^(l),
provides a basis for discrete signals in l^2(Z^2). This basis
exhibits both directional and localization properties. Figure 6
demonstrates this fact by showing the impulse responses of
equivalent filters from an example DFB. These basis functions
have quasi-linear supports in space and span all directions. In
other words, the basis (4) resembles a local Radon transform
and its elements are called Radonlets. Furthermore, it can be shown [29]
that if the building block filter bank in Figure 4 uses orthogonal
filters, then the resulting DFB is orthogonal and (4) becomes
an orthogonal basis.

DO AND VETTERLI: THE CONTOURLET TRANSFORM 5
Fig. 3. Directional filter bank. (a) Frequency partitioning where l = 3 and there are 2^3 = 8 real wedge-shaped frequency bands. Subbands 0–3 correspond
to the mostly horizontal directions, while subbands 4–7 correspond to the mostly vertical directions. (b) The multichannel view of an l-level tree-structured
directional filter bank.
Fig. 6. Impulse responses of 32 equivalent filters for the first half of the
channels, corresponding to the mostly horizontal directions, of a 6-level DFB
that uses the Haar filters. Black and gray squares correspond to +1 and -1,
respectively. Because the basis functions resemble "local lines", we call them
Radonlets.
D. Multiscale and directional decomposition: the discrete
contourlet transform
We are now ready to combine the Laplacian pyramid and the
directional filter bank into the double filter bank structure that
was motivated in Section III-A. Since the directional filter bank
(DFB) was designed to capture the high-frequency content
(representing directionality) of the input image, the low-frequency
content is poorly handled.
In fact, with the frequency partition shown in Figure 3(a),
low frequency would “leak” into several directional subbands,
hence the DFB alone does not provide a sparse representation
for images. This fact provides another reason to combine the
DFB with a multiscale decomposition, where low frequencies
of the input image are removed before applying the DFB.
Figure 7 shows a multiscale and directional decomposition
using a combination of a Laplacian pyramid (LP) and a
directional filter bank (DFB). Bandpass images from the LP
are fed into a DFB so that directional information can be
captured. The scheme can be iterated on the coarse image.
The combined result is a double iterated filter bank structure,
named contourlet filter bank, which decomposes images into
directional subbands at multiple scales.
Fig. 7. The contourlet filter bank: first, a multiscale decomposition into
octave bands by the Laplacian pyramid is computed, and then a directional
filter bank is applied to each bandpass channel.

Specifically, let a_0[n] be the input image. The output after
the LP stage is J bandpass images b_j[n], j = 1, 2, ..., J
(in fine-to-coarse order) and a lowpass image a_J[n]. That
means, the j-th level of the LP decomposes the image a_{j-1}[n]
into a coarser image a_j[n] and a detail image b_j[n]. Each
bandpass image b_j[n] is further decomposed by an l_j-level
DFB into 2^{l_j} bandpass directional images c_{j,k}^{(l_j)}[n],
k = 0, 1, ..., 2^{l_j} - 1. The main properties of the discrete
contourlet transform are stated in the following theorem.
Theorem 1: In a contourlet filter bank, the following hold:
1) If both the LP and the DFB use perfect-reconstruction
filters, then the discrete contourlet transform achieves
perfect reconstruction, which means it provides a frame
operator.
2) If both the LP and the DFB use orthogonal filters, then
the discrete contourlet transform provides a tight frame
with frame bounds equal to 1.
3) The discrete contourlet transform has a redundancy ratio
that is less than 4/3.
4) Suppose an l_j-level DFB is applied at the pyramidal
level j of the LP; then the basis images of the discrete
contourlet transform (i.e., the equivalent filters of the
contourlet filter bank) have an essential support size of
width ≈ C·2^j and length ≈ C·2^{j+l_j-2}.
5) Using FIR filters, the computational complexity of the
discrete contourlet transform is O(N) for N-pixel im-
ages.
Proof:
1) This is obvious as the discrete contourlet transform is a
composition of perfect-reconstruction blocks.
2) With orthogonal filters, the LP is a tight frame with
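Item 3 of Theorem 1 can be verified by counting coefficients: the DFB is critically sampled, so all redundancy comes from the LP, whose bandpass images live at sizes N², N²/4, N²/16, ... of the input. A sketch (assuming an N×N input with N divisible by 2^J; the helper name is ours, not the paper's):

```python
def contourlet_coeff_count(N, dfb_levels):
    """Total number of contourlet coefficients for an N-by-N image.
    dfb_levels[j] gives the DFB levels at pyramid scale j; it does not
    affect the count, since the DFB splits each bandpass image losslessly."""
    total, size = 0, N
    for _ in dfb_levels:
        total += size * size          # bandpass image at this scale
        size //= 2                    # LP iterates on the quarter-size lowpass
    return total + size * size        # final lowpass image

N = 256
count = contourlet_coeff_count(N, [3, 4, 5])
ratio = count / (N * N)
# ratio = 1 + 1/4 + 1/16 + 1/64 = 1.328125 < 4/3, consistent with item 3.
```
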


References

P. J. Burt and E. H. Adelson, "The Laplacian pyramid as a compact image code," IEEE Trans. Commun., vol. COM-31, no. 4, pp. 532-540, Apr. 1983.
D. L. Donoho and I. M. Johnstone, "Ideal spatial adaptation by wavelet shrinkage," Biometrika, vol. 81, no. 3, pp. 425-455, 1994.
D. H. Hubel and T. N. Wiesel, "Receptive fields, binocular interaction and functional architecture in the cat's visual cortex," J. Physiol., vol. 160, pp. 106-154, 1962.
S. Mallat, A Wavelet Tour of Signal Processing. San Diego, CA: Academic Press, 1998.
B. A. Olshausen and D. J. Field, "Emergence of simple-cell receptive field properties by learning a sparse code for natural images," Nature, vol. 381, pp. 607-609, 1996.
