Journal ArticleDOI

# Ideal spatial adaptation by wavelet shrinkage

01 Sep 1994-Biometrika (Oxford University Press)-Vol. 81, Iss: 3, pp 425-455
TL;DR: In this article, the authors develop a spatially adaptive method, RiskShrink, which works by shrinkage of empirical wavelet coefficients, and achieves performance within a factor $\log^2 n$ of the ideal performance of piecewise polynomial and variable-knot spline methods.

• The authors are particularly interested in a variety of spatially adaptive methods which have been proposed in the statistical literature, such as CART (Breiman, Friedman, Olshen and Stone, 1983), Turbo (Friedman and Silverman, 1989), MARS (Friedman, 1991), and variable-bandwidth kernel methods (Müller and Stadtmüller, 1987).
• Informal conversations with Leo Breiman and Jerome Friedman have confirmed this assumption.
• The authors now describe a simple framework which encompasses the most important spatially adaptive methods, and allows us to develop their main theme efficiently.
• The reconstruction formula is $T_{PC}(y; \lambda)(t) = \sum_{\ell=1}^{L} \mathrm{Ave}(y_i : t_i \in I_\ell)\,1_{I_\ell}(t)$: piecewise constant reconstruction using the mean of the data within each piece to estimate the pieces.
• The kernel method $T_{K,2}$ equipped with the variable bandwidth selector described in Brockmann, Gasser and Herrmann (1992) results in the "Heidelberg" variable bandwidth smoothing method.
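The piecewise-constant reconstruction $T_{PC}$ above is easy to sketch in code. The following numpy implementation (function and variable names are ours, not from the paper) averages the data within each piece of a given partition of $[0, 1]$:

```python
import numpy as np

def t_pc(y, t, cuts):
    # T_PC(y; lambda): piecewise-constant fit, using the mean of the data
    # whose design points fall in each piece of the partition of [0, 1]
    # defined by the interior cut points `cuts`.
    edges = np.concatenate(([0.0], np.asarray(cuts, dtype=float), [1.0]))
    fhat = np.empty(len(y), dtype=float)
    for a, b in zip(edges[:-1], edges[1:]):
        # half-open pieces [a, b), with the last piece closed at 1
        mask = (t >= a) & ((t < b) | (b == 1.0))
        fhat[mask] = y[mask].mean()
    return fhat

t = np.arange(1, 9) / 8.0
y = np.where(t < 0.5, 1.0, 3.0)
print(t_pc(y, t, [0.5]))   # each piece is replaced by its own mean
```

A noiseless step function whose jump sits on a cut point is reproduced exactly, which is the sense in which the partition is the "right" smoothing parameter for such an $f$.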

### 1.2 Ideal Adaptation with Oracles

• To avoid messy questions, the authors abandon the study of specific $\lambda$-selectors and instead study ideal adaptation.
• For us, ideal adaptation is the performance which can be achieved from smoothing with the aid of an oracle.
• The risk of ideally adaptive piecewise polynomial fits is essentially $\sigma^2 L(D+1)/n$.
• Indeed, an oracle could supply the information that one should use $I_1, \ldots, I_L$ rather than some other partition.
• No better performance than this can be expected, since $n^{-1}$ is the usual "parametric rate" for estimating finite-dimensional parameters.
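The $\sigma^2 L(D+1)/n$ figure follows from classical least-squares theory and can be checked by simulation. The sketch below is our construction, not from the paper: it fits degree-$D$ polynomials on the two true pieces of a broken-line $f$, the partition playing the role of the oracle's answer:

```python
import numpy as np

rng = np.random.default_rng(1)
n, sigma, D = 512, 1.0, 1
t = np.arange(1, n + 1) / n
f = np.where(t < 0.5, 2.0 + t, -1.0 + 3.0 * t)   # piecewise linear, L = 2 pieces

def oracle_pp_fit(y):
    # Least-squares polynomial of degree D on each piece of the TRUE
    # partition -- exactly the information an oracle would supply.
    fhat = np.empty_like(y)
    for mask in (t < 0.5, t >= 0.5):
        coef = np.polyfit(t[mask], y[mask], D)
        fhat[mask] = np.polyval(coef, t[mask])
    return fhat

risks = [np.mean((oracle_pp_fit(f + sigma * rng.standard_normal(n)) - f) ** 2)
         for _ in range(400)]
print(np.mean(risks), sigma**2 * 2 * (D + 1) / n)   # both close to 4/512
```

The Monte Carlo average matches $\sigma^2 L(D+1)/n$ because, with the partition known, the fit is an ordinary linear least-squares problem with $L(D+1)$ parameters.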

### 1.3 Selective Wavelet Reconstruction as a Spatially Adaptive Method

• A new principle for spatially adaptive estimation can be based on recently developed "wavelets" ideas.
• This version yields an exactly orthogonal transformation between data and wavelet coefficient domains.
• This approximation improves with increasing $n$ and increasing $j_1$.
• For their purposes, the only details the authors need are [W1].
• Figure 1 displays four functions (Bumps, Blocks, HeaviSine and Doppler) which have been chosen because they caricature spatially variable functions arising in imaging, spectroscopy and other scientific signal processing.

### 1.4 Near-Ideal Spatial Adaptation by Wavelets

• Of course, calculations of ideal risk which point to the benefit of ideal spatial adaptation prompt the question: can such performance be approached using the data alone?
• The benefit of the wavelet framework is that the authors can answer such questions precisely.
• The result, while slightly noisier than the ideal estimate, is still of good quality, and requires no oracle.

### 1.5 Universality of Wavelets as a Spatially Adaptive Procedure

• This last calculation is not essentially limited to piecewise polynomials; something like it holds for all $f$.
• We interpret this result as saying that selective wavelet reconstruction is essentially as powerful as variable-partition piecewise constant fits, variable-knot least-squares splines, or piecewise polynomial fits.
• The authors know of no proof that existing procedures for fitting piecewise polynomials and variable-knot splines, such as those current in the statistical literature, can attain anything like the performance of ideal methods.
• And wavelet selection with an oracle offers the advantages of other spatially-variable methods.
• The main assertion of this paper is therefore that, from this perspective, it is cleaner and more elegant to abandon the ideal of fitting piecewise polynomials with optimal partitions, and turn instead to RiskShrink, about which the authors have results, and an order-$O(n)$ algorithm.

### 1.6 Contents

• Section 2 discusses the problem of mimicking ideal wavelet selection; Section 3 shows why wavelet selection offers the same advantages as piecewise polynomial fits; Section 4 discusses variations and relations to other work.
• Related manuscripts by the authors, currently under publication review, are available as LaTeX files by anonymous ftp from playfair.

### 2.1 Oracles for Diagonal Linear Projection

• Consider the following problem from multivariate normal decision theory.
• Suppose the authors had available an oracle which would supply the coefficients optimal for use in the diagonal projection scheme.
• Motivated by the idea that only very few wavelet coefficients contribute signal, the authors consider threshold rules, which retain only observed data that exceed a multiple of the noise level in magnitude.
• The authors give the result here and outline the approach in Section 2.4.
• However it is worth mentioning that a more traditional hard threshold estimator (11) exhibits the same asymptotic performance.
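Both threshold rules mentioned here have simple closed forms. A minimal numpy sketch (function names are ours):

```python
import numpy as np

def soft_threshold(w, lam):
    # eta_S(w) = sign(w) (|w| - lam)_+ : kill small coefficients, shrink the rest
    return np.sign(w) * np.maximum(np.abs(w) - lam, 0.0)

def hard_threshold(w, lam):
    # eta_H(w) = w 1{|w| > lam} : keep-or-kill, survivors pass through untouched
    return w * (np.abs(w) > lam)

w = np.array([-3.0, -0.5, 0.2, 1.8])
print(soft_threshold(w, 1.0))   # survivors are pulled toward zero by lam
print(hard_threshold(w, 1.0))   # survivors are kept at their observed values
```

Soft thresholding biases the retained coefficients toward zero by $\lambda$; hard thresholding does not, which is why the two rules need slightly different threshold sequences for the same asymptotic performance.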

• The authors now apply the preceding results to function estimation.
• Let $n = 2^{J+1}$, and let $W$ denote the wavelet transform mentioned in Section 1.3.
• Now let $(y_i)$ be data as in model (1) and let $w = W y$ be the discrete wavelet transform.
• Hence, the authors have achieved, by very simple means, essentially the best spatial adaptation possible via wavelets.
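The pipeline just described (transform, threshold, invert) can be sketched end to end. The snippet below is an illustrative stand-in, not the authors' RiskShrink implementation: it uses an orthonormal Haar matrix in place of the paper's Symmlet transform, and the threshold $\sigma\sqrt{2\log n}$:

```python
import numpy as np

def haar_matrix(n):
    # Orthonormal Haar transform matrix (n a power of 2); an illustrative
    # stand-in for the paper's Symmlet wavelet transform matrix W.
    if n == 1:
        return np.array([[1.0]])
    h = haar_matrix(n // 2)
    return np.vstack([np.kron(h, [1.0, 1.0]),
                      np.kron(np.eye(n // 2), [1.0, -1.0])]) / np.sqrt(2.0)

def wavelet_shrink(y, sigma, lam):
    W = haar_matrix(len(y))
    w = W @ y                                                    # w = W y
    eta = np.sign(w) * np.maximum(np.abs(w) - lam * sigma, 0.0)  # soft threshold
    return W.T @ eta                                             # invert: W^T eta(w)

rng = np.random.default_rng(0)
n, sigma = 256, 1.0
t = np.arange(1, n + 1) / n
f = np.where(t < 0.5, 0.0, 8.0)                 # a Blocks-style step function
y = f + sigma * rng.standard_normal(n)
fhat = wavelet_shrink(y, sigma, np.sqrt(2 * np.log(n)))
print(np.mean((y - f) ** 2), np.mean((fhat - f) ** 2))   # shrinkage cuts the error
```

Because a step function has only a handful of nonzero Haar coefficients, nearly all coordinates are pure noise and are zeroed by the threshold, which is the mechanism behind the near-ideal risk.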

### 2.3 Implementation

• The authors have developed a computer software package which runs in the numerical computing environment Matlab.
• The name RiskShrink for the estimator emphasises that shrinkage of wavelet coefficients is performed by soft thresholding, and that a mean-squared-error, or "risk", approach has been taken to specify the threshold.
• The rationale behind this rule is as follows.
• Hence, those coefficients (a fixed number, independent of $n$) should not be shrunk towards zero.
• Let $\widetilde{SW}$ denote the selective wavelet reconstruction in which the levels below $j_0$ are never shrunk.
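Leaving the coarse levels below $j_0$ unshrunk is a small indexing exercise once a coefficient layout is fixed. A sketch under the assumption that coefficients are stored as $[w_{-1,0}, w_{0,0}, w_{1,0}, w_{1,1}, \ldots]$, so that level $j$ occupies indices $2^j$ through $2^{j+1}-1$ (the layout and function name are ours):

```python
import numpy as np

def shrink_above_j0(w, lam_sigma, j0):
    # Levels below j0 (the first 2**j0 coefficients in this layout) pass
    # through unshrunk; all finer-level coefficients are soft-thresholded.
    out = w.astype(float).copy()
    k = 2 ** j0
    out[k:] = np.sign(w[k:]) * np.maximum(np.abs(w[k:]) - lam_sigma, 0.0)
    return out

w = np.array([10.0, 5.0, 1.0, -1.0, 0.5, -2.0, 0.3, 0.1])
print(shrink_above_j0(w, 0.4, 1))   # first 2 coefficients are kept exactly
```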

### 3 Piecewise Polynomials are not more powerful than Wavelets

• The authors now show that wavelet selection using an oracle can closely mimic piecewise polynomial fitting using an oracle.
• Hence for every function, wavelets supplied with an oracle have an ideal risk that differs by at most a logarithmic factor from the ideal risk of the piecewise polynomial estimate.
• Since variable-knot splines of order $D$ are piecewise polynomials of order $D$, the authors also have $R_{n,\lambda}(SW, f) \le (C_1 + C_2 J)\,R_{n,\lambda}(\mathrm{Spl}(D), f)$ (25). Note that the constants are not necessarily the same at each appearance: see the proof below.
• Suppose that this optimal partition contains L elements.

### 4.1 Variations on Choice of Oracle

• There is an oracle inequality for diagonal shrinkage also.
• (ii) More generally, the asymptotic inequality (28) continues to hold for soft threshold sequences $(\lambda_n)$ and hard threshold estimators with threshold sequences satisfying, respectively, conditions (29) and (30), which require $\lambda_n^2 - 2\log n$ to be bounded below on the order of $\log\log n$ and above by $o(\log n)$. (iii) Theorem 3 continues to hold, a fortiori, if the denominator $\sigma^2 + \sum_i \min(\theta_i^2, \sigma^2)$ is replaced by the smaller ideal risk for diagonal shrinkage.
• So oracles for diagonal shrinkage can be mimicked to within a factor $2\log n$ and not more closely.
• These results are carried over to adaptive wavelet shrinkage just as in Section 2.2, by defining wavelet shrinkage in this case by the analog of (18), $T_{WS} = W^T \circ T_{DS} \circ W$. Corollary 1 extends immediately to this case.
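The diagonal-shrinkage oracle is easy to tabulate: the ideal coordinatewise multiplier $c_i = \theta_i^2/(\theta_i^2+\sigma^2)$ gives per-coordinate risk $\theta_i^2\sigma^2/(\theta_i^2+\sigma^2)$, which never exceeds the projection oracle's $\min(\theta_i^2, \sigma^2)$, so the shrinkage oracle is the more powerful of the two. A small check (names ours):

```python
import numpy as np

def ideal_shrink_risk(theta, sigma):
    # Ideal diagonal shrinkage uses c_i = theta_i^2 / (theta_i^2 + sigma^2),
    # attaining per-coordinate risk theta_i^2 sigma^2 / (theta_i^2 + sigma^2).
    return np.sum(theta**2 * sigma**2 / (theta**2 + sigma**2))

def ideal_projection_risk(theta, sigma):
    # Ideal diagonal projection keeps coordinate i iff theta_i^2 > sigma^2,
    # attaining per-coordinate risk min(theta_i^2, sigma^2).
    return np.sum(np.minimum(theta**2, sigma**2))

theta = np.array([0.0, 1.0, 3.0])
print(ideal_shrink_risk(theta, 1.0), ideal_projection_risk(theta, 1.0))
```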

### 4.2 Variations on Choice of Threshold

• In Theorem 1 the authors have studied $\lambda_n$, the minimax threshold for the soft threshold nonlinearity, with comparison to a projection oracle.
• A drawback of using optimal thresholds is that the threshold which is precisely optimal for one of the four combinations (of threshold nonlinearity and oracle) may not be even asymptotically optimal for another of the four combinations.
• If a sample that in the noiseless case ought to be zero is in the noisy case nonzero, and that character is preserved in the reconstruction, the reconstruction will have an annoying visual appearance: it will contain small blips against an otherwise clean background.
• Not only is the method better in visual quality than RiskShrink; the asymptotic risk bounds are no worse: $R(\tilde f_{\nu_n}, f) \le (2\log n + 1)\{\sigma^2/n + R_{n,\lambda}(\widetilde{SW}, f)\}$. This estimator is discussed further in their report [asymp.tex].
• In their experience, the empirical wavelet coefficients at the finest scale are, with a small fraction of exceptions, essentially pure noise.

### 4.4 Numerical measures of fit

• Table 2 contains the average (over location) squared error of the various estimates for their four test functions, for the noise realisation and the reconstructions shown in Figures 2-10.
• It is apparent that the ideal wavelet reconstruction dominates ideal Fourier, and that the genuine estimate using soft thresholding at $\lambda_n$ comes well within the factor 6.824 of the ideal error predicted for n = 2048 by Table 1.
• It has uniformly worse squared error than thresholding at $\lambda_n$, which reflects the well-known divergence between the usual numerical and visual assessments of quality of fit.
• Table 3 shows the results of a very small simulation comparison of the same four techniques as the sample size is varied dyadically from n = 256 through 8192, using 10 replications in each case.
• The same features noted in Table 2 extend to the other sample sizes.

• The estimator proposed here has a number of optimality properties in minimax decision theory.
• RiskShrink is adaptive in the sense that it achieves, within a logarithmic factor, the best risk bounds that could be had if the class were known; and the logarithmic factor is necessary when the class is unknown, by work of Brown and Low (1993) and Lepskii (1990).
• Other near-minimax properties are described in detail in their report [asymp.tex].

### 4.6 Boundary correction

• As described in the Introduction, Cohen, Daubechies, Jawerth and Vial (1993) have introduced separate "boundary filters" to correct the non-orthogonality on $[0,1]$ of the restriction to $[0,1]$ of basis functions that intersect $[0,1]^c$.
• Thus, the transform may be represented as $W = U \cdot P$, where $U$ is the orthogonal transformation built from the quadrature mirror filters and their boundary versions via the cascade algorithm.
• Thus all the ideal risk inequalities in the paper remain valid, with only an additional dependence of the constants on the boundary correction.
• In particular, the conclusions concerning logarithmic mimicking of oracles are unchanged.

### 4.7 Relation to Model Selection

• The authors' results show that the method gives almost the same performance in mean-squared error as one could attain if one knew in advance which model provided the minimum mean-squared error.
• The authors' results apply equally well in orthogonal regression.
• George and Foster (1990) have proved two results about model selection which it is interesting to compare with their Theorem 4.
• The authors' results here differ because the authors attempt to mimic more powerful oracles, which attain optimal mean-squared errors.
• The authors are also most grateful to Carl Taswell, who carried out the simulations reported in Table 3.

### 5.5 Theorem 3

• The main idea is to make the estimand a random variable, with prior distribution chosen so that a randomly selected subset of about $\log n$ coordinates are each of size roughly $(2\log n)^{1/2}$, and to derive information from the Bayes risk of such a prior.
• Let $\tilde\delta_n$ denote the Bayes rule for this prior with respect to the loss $\tilde L_n$.

### 5.6 Theorems 4 and 6

• The authors give a proof that covers both soft and hard thresholding, and both DP and DS oracles.
• The expansion (23) shows that this range includes $\lambda_n$ and hence $\hat\lambda$.


##### Figures (3)


David L. Donoho
Iain M. Johnstone
Department of Statistics, Stanford University, Stanford, CA, 94305-4065, U.S.A.
June 1992; revised April 1993.

#### Abstract

With ideal spatial adaptation, an oracle furnishes information about how best to adapt a spatially variable estimator, whether piecewise constant, piecewise polynomial, variable knot spline, or variable bandwidth kernel, to the unknown function. Estimation with the aid of an oracle offers dramatic advantages over traditional linear estimation; it is a priori unclear whether such performance can be obtained by a procedure relying on the data alone. We describe a new principle for spatially-adaptive estimation: selective wavelet reconstruction. We show that variable-knot spline fits and piecewise-polynomial fits, when equipped with an oracle to select the knots, are not dramatically more powerful than selective wavelet reconstruction with an oracle. We develop a practical spatially adaptive method, RiskShrink, which works by shrinkage of empirical wavelet coefficients. RiskShrink mimics the performance of an oracle for selective wavelet reconstruction as well as it is possible to do so. A new inequality in multivariate normal decision theory, which we call the oracle inequality, shows that attained performance differs from ideal performance by at most a factor $\approx 2\log n$, where $n$ is the sample size. Moreover, no estimator can give a better guarantee than this. Within the class of spatially adaptive procedures, RiskShrink is essentially optimal. Relying only on the data, it comes within a factor $\log^2 n$ of the performance of piecewise polynomial and variable-knot spline methods equipped with an oracle. In contrast, it is unknown how or if piecewise polynomial methods could be made to function this well when denied access to an oracle and forced to rely on data alone.

*Keywords:* Minimax estimation subject to doing well at a point; Orthogonal Wavelet Bases of Compact Support; Piecewise-Polynomial Fitting; Variable-Knot Spline.

### 1 Introduction

Suppose we are given data

$$y_i = f(t_i) + e_i, \qquad i = 1, \ldots, n, \qquad (1)$$

$t_i = i/n$, where the $e_i$ are independently distributed as $N(0, \sigma^2)$, and $f(\cdot)$ is an unknown function which we would like to recover. We measure performance of an estimate $\hat f(\cdot)$ in terms of quadratic loss at the sample points. In detail, let $f = (f(t_i))_{i=1}^n$ and $\hat f = (\hat f(t_i))_{i=1}^n$ denote the vectors of true and estimated sample values, respectively. Let $\|v\|_{2,n}^2 = \sum_{i=1}^n v_i^2$ denote the usual squared $\ell_n^2$ norm; we measure performance by the risk

$$R(\hat f, f) = n^{-1}\, E\, \|\hat f - f\|_{2,n}^2,$$

which we would like to make as small as possible. Although the notation $f$ suggests a function of a real variable $t$, in this paper we work only with the equally spaced sample points $t_i$.

We are particularly interested in a variety of spatially adaptive methods which have been proposed in the statistical literature, such as CART (Breiman, Friedman, Olshen and Stone, 1983), Turbo (Friedman and Silverman, 1989), MARS (Friedman, 1991), and variable-bandwidth kernel methods (Müller and Stadtmüller, 1987).

Such methods have presumably been introduced because they were expected to do a better job in recovery of the functions actually occurring with real data than do traditional methods based on a fixed spatial scale, such as Fourier series methods, fixed-bandwidth kernel methods, and linear spline smoothers. Informal conversations with Leo Breiman and Jerome Friedman have confirmed this assumption.

We now describe a simple framework which encompasses the most important spatially adaptive methods, and allows us to develop our main theme efficiently. We consider estimates $\hat f$ defined as

$$\hat f(\cdot) = T(y; \hat\lambda(y))(\cdot) \qquad (2)$$

where $T(y; \lambda)$ is a *reconstruction formula* with "spatial smoothing" parameter $\lambda$, and $\hat\lambda(y)$ is a data-adaptive choice of the spatial smoothing parameter $\lambda$. A clearer picture of what we intend emerges from five examples.

[1]. Piecewise Constant Reconstruction $T_{PC}(y; \lambda)$. Here $\lambda$ is a finite list of, say, $L$ real numbers defining a partition $(I_1, \ldots, I_L)$ of $[0,1]$ via $I_1 = [0, \lambda_1),\ I_2 = [\lambda_1, \lambda_1 + \lambda_2),\ \ldots,\ I_L = [\lambda_1 + \cdots + \lambda_{L-1},\ \lambda_1 + \cdots + \lambda_L]$, so that $\sum_1^L \lambda_i = 1$. Note that $L$ is a variable. The reconstruction formula is

$$T_{PC}(y; \lambda)(t) = \sum_{\ell=1}^{L} \mathrm{Ave}(y_i : t_i \in I_\ell)\, 1_{I_\ell}(t),$$

piecewise constant reconstruction using the mean of the data within each piece to estimate the pieces.

[2]. Piecewise Polynomials $T_{PP(D)}(y; \lambda)$. Here the interpretation of $\lambda$ is the same as in [1], only the reconstruction uses polynomials of degree $D$:

$$T_{PP(D)}(y; \lambda)(t) = \sum_{\ell=1}^{L} \hat p_\ell(t)\, 1_{I_\ell}(t),$$

where $\hat p_\ell(t) = \sum_{k=0}^{D} a_k t^k$ is determined by applying the least squares principle to the data arising for interval $I_\ell$:

$$\sum_{t_i \in I_\ell} (\hat p_\ell(t_i) - y_i)^2 = \min!$$

[3]. Variable-Knot Splines $T_{\mathrm{Spl},D}(y; \lambda)$. Here $\lambda$ defines a partition as above, and on each interval of the partition the reconstruction formula is a polynomial of degree $D$, but now the reconstruction must be continuous and have continuous derivatives through order $D-1$. In detail, let $\tau_\ell$ be the left endpoint of $I_\ell$, $\ell = 1, \ldots, L$. The reconstruction is chosen from among those piecewise polynomials $s(t)$ satisfying

$$\left(\frac{d}{dt}\right)^{k} s(\tau_\ell-) = \left(\frac{d}{dt}\right)^{k} s(\tau_\ell+), \qquad k = 0, \ldots, D-1, \quad \ell = 2, \ldots, L;$$

subject to this constraint, one solves

$$\sum_{i=1}^{n} (s(t_i) - y_i)^2 = \min!$$

[4]. Variable Bandwidth Kernel Methods $T_{VK,2}(y; \lambda)$. Now $\lambda$ is a *function* $h(t)$ on $[0,1]$; $h(t)$ represents the "bandwidth of the kernel at $t$"; the smoothing kernel $K$ is a $C^2$ function of compact support which is also a probability density, and if $\hat f = T_{VK,2}(y; h)$ then

$$\hat f(t) = \frac{1}{n} \sum_{i=1}^{n} y_i\, K\!\left(\frac{t - t_i}{h(t)}\right) \Big/ h(t). \qquad (3)$$

More refined versions of this formula would adjust $K$ for boundary effects near $t = 0$ and $t = 1$.

[5]. Variable-Bandwidth High-Order Kernels $T_{VK,D}(y; h)$, $D > 2$. Here $h$ is again the local bandwidth, and the reconstruction formula is as in (3), only $K(\cdot)$ is a $C^D$ function integrating to 1, with vanishing intermediate moments

$$\int t^j K(t)\, dt = 0, \qquad j = 1, \ldots, D-1.$$

As $D > 2$, $K(\cdot)$ cannot be nonnegative.

These reconstruction techniques, when equipped with appropriate selectors of the spatial smoothing parameter $\lambda$, duplicate essential features of certain well-known methods.

[1] The piecewise constant reconstruction formula $T_{PC}$, equipped with choice of partition by recursive partitioning and cross-validatory choice of "pruning constant" as described by Breiman, Friedman, Olshen and Stone (1983), results in the method CART applied to 1-dimensional data.

[2] The spline reconstruction formula $T_{\mathrm{Spl},D}$, equipped with a backwards deletion scheme, models the methods of Friedman and Silverman (1989) and Friedman (1991) applied to 1-dimensional data.

[3] The kernel method $T_{K,2}$ equipped with the variable bandwidth selector described in Brockmann, Gasser and Herrmann (1992) results in the "Heidelberg" variable bandwidth smoothing method. Compare also Terrell and Scott (1992).

These schemes are computationally feasible and intuitively appealing. However, very little is known about the theoretical performance of these adaptive schemes, at the level of uniformity in $f$ and $n$ that we would like.

### 1.2 Ideal Adaptation with Oracles

To avoid messy questions, we abandon the study of specific $\lambda$-selectors and instead study *ideal* adaptation. For us, ideal adaptation is the performance which can be achieved from smoothing with the aid of an *oracle*. Such an oracle will not tell us $f$, but will tell us, for our method $T(y; \lambda)$, the "best" choice of $\lambda$ for the true underlying $f$. The oracle's response is conceptually a selection $\Lambda(f)$ which satisfies

$$R(T(y; \Lambda(f)), f) = R_{n,\lambda}(T, f)$$

where $R_{n,\lambda}$ denotes the *ideal risk*

$$R_{n,\lambda}(T, f) = \inf_{\lambda} R(T(y; \lambda), f).$$

As $R_{n,\lambda}$ measures performance with a selection $\Lambda(f)$ based on full knowledge of $f$ rather than a data-dependent selection $\hat\lambda(y)$, it represents an ideal we cannot expect to attain. Nevertheless it is the target we shall consider.

Ideal adaptation can offer dramatic advantages over traditional nonadaptive linear smoothers. Consider the case of a function $f$ which is a piecewise polynomial of degree $D$, with a finite number of pieces $I_1, \ldots, I_L$, say:

$$f = \sum_{\ell=1}^{L} p_\ell(t)\, 1_{I_\ell}(t). \qquad (4)$$

Assume that $f$ has discontinuities at some of the break-points $\tau_2, \ldots, \tau_L$.

The risk of ideally adaptive piecewise polynomial fits is essentially $\sigma^2 L(D+1)/n$. Indeed, an oracle could supply the information that one should use $I_1, \ldots, I_L$ rather than some other partition. Traditional least-squares theory says that, for data from the traditional linear model $Y = X\beta + E$, with noise $E_i$ independently distributed as $N(0, \sigma^2)$, the least-squares estimator $\hat\beta$ satisfies

$$E\, \|X\beta - X\hat\beta\|_2^2 = (\text{number of parameters in } \beta) \cdot (\text{variance of noise}).$$

Applying this to our setting, fitting a function of the form (4) requires fitting $(\#\text{ pieces}) \cdot (\text{degree} + 1)$ parameters, so for the risk $R(\hat f, f) = n^{-1} E\,\|\hat f - f\|_{2,n}^2$ we get $L(D+1)\sigma^2/n$.

On the other hand, the risk of a spatially non-adaptive procedure is far worse. Consider kernel smoothing. Because $f$ has discontinuities, no kernel smoother with fixed, non-spatially-varying bandwidth attains a risk $R(\hat f, f)$ tending to zero faster than $C n^{-1/2}$, $C = C(f, \text{kernel})$. The same result holds for estimates in orthogonal series of polynomials or sinusoids, for smoothing splines with knots at the sample points, and for least squares smoothing splines with knots equispaced.

Most strikingly, even for piecewise polynomial fits with equal-width pieces, we have that $R(\hat f, f)$ is of size $n^{-1/2}$ unless the breakpoints of $f$ form a subset of the breakpoints of $\hat f$. But this can happen only for very special $n$, so in any event

$$\limsup_{n \to \infty} R(\hat f, f)\, n^{1/2} \ge C > 0.$$

In short, oracles offer an improvement, ideally, from risk of order $n^{-1/2}$ to order $n^{-1}$. No better performance than this can be expected, since $n^{-1}$ is the usual "parametric rate" for estimating finite-dimensional parameters.

Can we approach this ideal performance with estimators using the data alone?

### 1.3 Selective Wavelet Reconstruction as a Spatially Adaptive Method

A new principle for spatially adaptive estimation can be based on recently developed "wavelets" ideas. Introductions, historical accounts and references to much recent work may be found in the books by Daubechies (1992), Meyer (1990), Chui (1992) and Frazier, Jawerth and Weiss (1991). Orthonormal bases of compactly supported wavelets provide a powerful complement to traditional Fourier methods: they permit an analysis of a signal or image into localised oscillating components. In a statistical regression context, this spatially varying decomposition can be used to build algorithms that adapt their effective "window width" to the amount of local oscillation in the data. Since the decomposition is in terms of an orthogonal basis, analytic study in closed form is possible.

For the purposes of this paper, we discuss a *finite, discrete, wavelet transform*. This transform, along with a careful treatment of boundary correction, has been described by Cohen, Daubechies, Jawerth, and Vial (1993), with related work in Meyer (1991) and Malgouyres (1991). To focus attention on our main themes, we employ a simpler *periodised* version of the finite discrete wavelet transform in the main exposition. This version yields an *exactly* orthogonal transformation between data and wavelet coefficient domains. Brief comments on the minor changes needed for the boundary corrected version are made in Section 4.6.

Suppose we have data $y = (y_i)_{i=1}^n$, with $n = 2^{J+1}$. For various combinations of parameters $M$ (number of vanishing moments), $S$ (support width), and $j_0$ (low-resolution cutoff), one may construct an $n$-by-$n$ orthogonal matrix $W$, the finite wavelet transform matrix. Actually there are many such matrices, depending on special filters: in addition to the original Daubechies wavelets there are the Coiflets and Symmlets of Daubechies (1993). For the figures in this paper we use the Symmlet with parameter $N = 8$. This has $M = 7$ vanishing moments and support length $S = 15$.

This matrix yields a vector $w$ of the *wavelet coefficients* of $y$ via

$$w = W y,$$

and because the matrix is orthogonal we have the inversion formula $y = W^T w$. The vector $w$ has $n = 2^{J+1}$ elements; we index $n - 1 = 2^{J+1} - 1$ of the elements following the scheme

$$w_{j,k}: \quad j = 0, \ldots, J; \quad k = 0, \ldots, 2^j - 1,$$

and the remaining element we label $w_{-1,0}$. To interpret these coefficients let $W_{jk}$ denote the $(j,k)$-th row of $W$. The inversion formula $y = W^T w$ becomes

$$y_i = \sum_{j,k} w_{j,k}\, W_{jk}(i),$$

expressing $y$ as a sum of basis elements $W_{jk}$ with coefficients $w_{j,k}$. We call the $W_{jk}$ *wavelets*.

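The exact orthogonality and the inversion formula above can be checked numerically with a small orthonormal wavelet matrix; for brevity we use the Haar filter here rather than the Symmlet used for the paper's figures:

```python
import numpy as np

def haar_matrix(n):
    # Orthonormal periodised Haar analogue of the transform matrix W
    # (n a power of 2); built recursively level by level.
    if n == 1:
        return np.array([[1.0]])
    h = haar_matrix(n // 2)
    return np.vstack([np.kron(h, [1.0, 1.0]),
                      np.kron(np.eye(n // 2), [1.0, -1.0])]) / np.sqrt(2.0)

n = 16                                    # n = 2^(J+1) with J = 3
W = haar_matrix(n)
y = np.sin(2 * np.pi * np.arange(1, n + 1) / n)
w = W @ y                                 # wavelet coefficients  w = W y
assert np.allclose(W @ W.T, np.eye(n))    # exact orthogonality
assert np.allclose(W.T @ w, y)            # inversion formula  y = W^T w
```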
##### Citations
More filters
Journal ArticleDOI
TL;DR: A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant, is proposed.
Abstract: SUMMARY We propose a new method for estimation in linear models. The 'lasso' minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant. Because of the nature of this constraint it tends to produce some coefficients that are exactly 0 and hence gives interpretable models. Our simulation studies suggest that the lasso enjoys some of the favourable properties of both subset selection and ridge regression. It produces interpretable models like subset selection and exhibits the stability of ridge regression. There is also an interesting relationship with recent work in adaptive function estimation by Donoho and Johnstone. The lasso idea is quite general and can be applied in a variety of statistical models: extensions to generalized regression models and tree-based models are briefly described.

40,785 citations

### Cites background or methods from "Ideal spatial adaptation by wavelet..."

• ...from equation (11) we may derive the formula $\hat R\{\tilde\beta(\gamma)\} \approx \tau^2\{p - 2\,\#(j: |\hat\beta_j/\tau| < \gamma) + \sum \max(|\hat\beta_j/\tau|, \gamma)^2\}$ as an approximately unbiased estimate of the risk or mean-square error $E\{\tilde\beta(\gamma) - \beta\}^2$, where $\tilde\beta(\gamma) = \mathrm{sign}(\hat\beta)(|\hat\beta| - \gamma)_+$. Donoho and Johnstone (1994) gave a similar formula in the function estimation setting....

[...]

• ...Donoho and Johnstone (1994) proved that the hard threshold (subset selection) estimator...

[...]

• ...This is called a 'soft threshold' estimator by Donoho and Johnstone (1994); they applied this estimator to the coefficients of a wavelet transform of a function measured with noise....

[...]

• ...Donoho and Johnstone (1994) gave a similar formula in the function gstimation setting....

[...]

• ...Interestingly, this has exactly the same form as the soft shrinkage proposals of Donoho and Johnstone (1994) and Donoho et al....

[...]

Journal ArticleDOI
TL;DR: In comparative timings, the new algorithms are considerably faster than competing methods and can handle large problems and can also deal efficiently with sparse features.
Abstract: We develop fast algorithms for estimation of generalized linear models with convex penalties. The models include linear regression, two-class logistic regression, and multinomial regression problems while the penalties include l(1) (the lasso), l(2) (ridge regression) and mixtures of the two (the elastic net). The algorithms use cyclical coordinate descent, computed along a regularization path. The methods can handle large problems and can also deal efficiently with sparse features. In comparative timings we find that the new algorithms are considerably faster than competing methods.

13,656 citations

### Cites background or methods from "Ideal spatial adaptation by wavelet..."

• ...Simple calculus shows (Donoho and Johnstone 1994) that the coordinate-wise update has the form $\tilde\beta_j \leftarrow S\big(\tfrac{1}{N}\sum_{i=1}^N x_{ij}(y_i - \tilde y_i^{(j)}),\ \lambda\alpha\big) / \big(1 + \lambda(1-\alpha)\big)$ (5), where $\tilde y_i^{(j)} = \tilde\beta_0 + \sum_{\ell \ne j} x_{i\ell}\tilde\beta_\ell$ is the fitted value excluding the contribution from $x_{ij}$, and hence $y_i - \tilde y_i^{(j)}$ the partial residual for...

[...]

• ...We would like to compute the gradient at $\beta_j = \tilde\beta_j$, which only exists if $\tilde\beta_j \ne 0$....

[...]
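The coordinate-wise update this paper describes reduces, for the pure lasso ($\alpha = 1$) with columns scaled so that $\|x_j\|^2/n = 1$, to a single soft-threshold step per coordinate. A minimal sketch (function names are ours, not the glmnet API):

```python
import numpy as np

def soft(z, gam):
    # soft-threshold operator S(z, gam) = sign(z)(|z| - gam)_+
    return np.sign(z) * np.maximum(np.abs(z) - gam, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    # Cyclical coordinate descent for (1/2n)||y - Xb||^2 + lam ||b||_1,
    # assuming each column of X satisfies ||x_j||^2 / n = 1.
    n, p = X.shape
    b = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            r = y - X @ b + X[:, j] * b[j]     # partial residual excluding j
            b[j] = soft(X[:, j] @ r / n, lam)  # soft-thresholded univariate fit
    return b

X = 2.0 * np.eye(4)                     # columns satisfy ||x_j||^2 / n = 1
b = lasso_cd(X, np.array([4.0, 0.4, -4.0, 0.0]), lam=0.5)
print(b)                                # small coefficients are zeroed out
```

With an orthogonal design the coordinates decouple and the algorithm converges in one sweep, recovering exactly the soft-threshold solution that connects the lasso to wavelet shrinkage.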

Journal ArticleDOI
TL;DR: In this article, a step-by-step guide to wavelet analysis is given, with examples taken from time series of the El Nino-Southern Oscillation (ENSO).
Abstract: A practical step-by-step guide to wavelet analysis is given, with examples taken from time series of the El Nino–Southern Oscillation (ENSO). The guide includes a comparison to the windowed Fourier transform, the choice of an appropriate wavelet basis function, edge effects due to finite-length time series, and the relationship between wavelet scale and Fourier frequency. New statistical significance tests for wavelet power spectra are developed by deriving theoretical wavelet spectra for white and red noise processes and using these to establish significance levels and confidence intervals. It is shown that smoothing in time or scale can be used to increase the confidence of the wavelet spectrum. Empirical formulas are given for the effect of smoothing on significance levels and confidence intervals. Extensions to wavelet analysis such as filtering, the power Hovmoller, cross-wavelet spectra, and coherence are described. The statistical significance tests are used to give a quantitative measure of change...

12,803 citations

### Cites background from "Ideal spatial adaptation by wavelet..."

• ...A more complete description including examples is given in Donoho and Johnstone (1994)....

[...]

Journal ArticleDOI
Jianqing Fan, Runze Li
TL;DR: In this article, penalized likelihood approaches are proposed to handle variable selection problems, and it is shown that the newly proposed estimators perform as well as the oracle procedure in variable selection; namely, they work as well if the correct submodel were known.
Abstract: Variable selection is fundamental to high-dimensional statistical modeling, including nonparametric regression. Many approaches in use are stepwise selection procedures, which can be computationally expensive and ignore stochastic errors in the variable selection process. In this article, penalized likelihood approaches are proposed to handle these kinds of problems. The proposed methods select variables and estimate coefficients simultaneously. Hence they enable us to construct confidence intervals for estimated parameters. The proposed approaches are distinguished from others in that the penalty functions are symmetric, nonconcave on (0, ∞), and have singularities at the origin to produce sparse solutions. Furthermore, the penalty functions should be bounded by a constant to reduce bias and satisfy certain conditions to yield continuous solutions. A new algorithm is proposed for optimizing penalized likelihood functions. The proposed ideas are widely applicable. They are readily applied to a variety of ...

8,314 citations

### Cites background or methods from "Ideal spatial adaptation by wavelet..."

• ...In language similar to Donoho and Johnstone (1994a), the resulting estimator performs as well as the oracle estimator, which knows in advance that $\beta_{20} = 0$....

[...]

• ...Figure 5(a) depicts the Bayes risk as a function of $a$ under the squared loss, for the universal thresholding $\lambda = \sqrt{2\log(d)}$ (see Donoho and Johnstone, 1994a) with $d = 20, 40, 60$, and 100; and Figure 5(b) is for $d = 512$, 1024, 2048, and 4096....

[...]

• ...Figure 5(a) depicts the Bayes risk as a function of $a$ under the squared loss, for the universal thresholding $\lambda = \sqrt{2\log(d)}$ (see Donoho and Johnstone, 1994a) with $d = 20, 40, 60$, and 100; and Figure 5(b) is for $d = 512$, 1024, 2048, and 4096. From Figures 5(a) and 5(b), it can be seen that the Bayes risks are not very sensitive to the values of $a$. It can be seen from Figure 5(a) that the Bayes risks achieve their minimums at $a \approx 3.7$ when the value of $d$ is less than 100. This choice gives pretty good practical performance for various variable selection problems. Indeed, based on the simulations in Section 4.3, the choice of $a = 3.7$ works similarly to that chosen by the generalized cross-validation (GCV) method. We now compare the performance of the four previously stated thresholding rules. Marron, Adak, Johnstone, Neumann, and Patil (1998) applied the tool of risk analysis to understand the small sample behavior of the hard and soft thresholding rules. The closed forms for the $L_2$ risk functions $R(\hat\theta, \theta) = E(\hat\theta - \theta)^2$ were derived under the Gaussian model for hard and soft thresholding rules by Donoho and Johnstone (1994b). The risk function of the SCAD thresholding rule can be found in Li (2000)....

[...]

• ...In wavelet approximations, Donoho and Johnstone (1994a) selected significant subbases (terms in the wavelet expansion) via thresholding....

[...]

• ...Marron, Adak, Johnstone, Neumann, and Patil (1998) applied the tool of risk analysis to understand the small-sample behavior of the hard and soft thresholding rules. The closed forms for the L2 risk functions R(θ̂, θ) = E(θ̂ − θ)² were derived under the Gaussian model Z ∼ N(θ, σ²) for hard and soft thresholding rules by Donoho and Johnstone (1994b). The risk function of the SCAD thresholding rule can be found in Li (2000). To gauge the performance of the four thresholding rules, Figure 5(c) depicts their L2 risk functions under the Gaussian model Z ∼ N(θ, 1)....

[...]
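The universal threshold λ = √(2 log d) quoted in the excerpts above is easy to exercise numerically. The sketch below applies soft and hard thresholding to a sparse mean vector observed in unit-variance Gaussian noise; the signal, sparsity level, and seed are illustrative assumptions, not values from either paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 1024
theta = np.zeros(d)
theta[:20] = 5.0                       # sparse mean vector (illustrative)
z = theta + rng.standard_normal(d)     # observations Z_i ~ N(theta_i, 1)

lam = np.sqrt(2 * np.log(d))           # universal threshold

# Soft thresholding shrinks every coordinate toward zero by lam;
# hard thresholding keeps a coordinate unchanged or kills it outright.
soft = np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)
hard = z * (np.abs(z) > lam)

# Almost all pure-noise coordinates are set exactly to zero,
# and both losses stay far below the unthresholded loss (about d).
print((soft[20:] != 0).mean())
print(np.sum((soft - theta) ** 2), np.sum((hard - theta) ** 2))
```

With d = 1024 the threshold is about 3.72 standard deviations, so a pure-noise coordinate survives with probability on the order of 2 × 10⁻⁴, which is what makes the reconstruction sparse.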

Journal ArticleDOI

TL;DR: A publicly available algorithm that requires only the same order of magnitude of computational effort as ordinary least squares applied to the full set of covariates is described.
Abstract: The purpose of model selection algorithms such as All Subsets, Forward Selection and Backward Elimination is to choose a linear model on the basis of the same set of data to which the model will be applied. Typically we have available a large collection of possible covariates from which we hope to select a parsimonious set for the efficient prediction of a response variable. Least Angle Regression (LARS), a new model selection algorithm, is a useful and less greedy version of traditional forward selection methods. Three main properties are derived: (1) A simple modification of the LARS algorithm implements the Lasso, an attractive version of ordinary least squares that constrains the sum of the absolute regression coefficients; the LARS modification calculates all possible Lasso estimates for a given problem, using an order of magnitude less computer time than previous methods. (2) A different LARS modification efficiently implements Forward Stagewise linear regression, another promising new model selection method; this connection explains the similar numerical results previously observed for the Lasso and Stagewise, and helps us understand the properties of both methods, which are seen as constrained versions of the simpler LARS algorithm. (3) A simple approximation for the degrees of freedom of a LARS estimate is available, from which we derive a Cp estimate of prediction error; this allows a principled choice among the range of possible LARS estimates. LARS and its variants are computationally efficient: the paper describes a publicly available algorithm that requires only the same order of magnitude of computational effort as ordinary least squares applied to the full set of covariates.

7,828 citations
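The Lasso estimates that the LARS modification computes can be illustrated without the LARS machinery itself. The sketch below is a minimal coordinate-descent Lasso solver (an assumed stand-in for LARS, numpy only, with illustrative data); with orthonormal design columns the Lasso has a known closed form, soft thresholding of Xᵀy, which the solver should reproduce.

```python
import numpy as np

def lasso_cd(X, y, alpha, n_iter=100):
    """Coordinate descent for (1/2)||y - Xb||^2 + alpha * ||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = np.sum(X ** 2, axis=0)
    for _ in range(n_iter):
        for j in range(p):
            # partial residual with coordinate j removed
            r = y - X @ b + X[:, j] * b[j]
            z = X[:, j] @ r
            # soft-threshold update for coordinate j
            b[j] = np.sign(z) * max(abs(z) - alpha, 0.0) / col_sq[j]
    return b

rng = np.random.default_rng(1)
n, p = 50, 10
X, _ = np.linalg.qr(rng.standard_normal((n, p)))   # orthonormal columns
y = X[:, 0] * 3.0 + 0.1 * rng.standard_normal(n)
alpha = 0.5

b_cd = lasso_cd(X, y, alpha)
z = X.T @ y
b_closed = np.sign(z) * np.maximum(np.abs(z) - alpha, 0.0)  # soft threshold
print(np.allclose(b_cd, b_closed))
```

LARS itself traces the whole solution path in α at roughly the cost of one least-squares fit; the point here is only that the ℓ₁ penalty produces soft-thresholded, sparse coefficients.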

##### References
More filters
Book
01 May 1992
TL;DR: This book introduces the continuous and discrete wavelet transforms, multiresolution analysis, and the construction of orthonormal bases of compactly supported wavelets.
Abstract: Introduction. Preliminaries and notation. The what, why, and how of wavelets. The continuous wavelet transform. Discrete wavelet transforms: frames. Time-frequency density and orthonormal bases. Orthonormal bases of wavelets and multiresolution analysis. Orthonormal bases of compactly supported wavelets. More about the regularity of compactly supported wavelets. Symmetry for compactly supported wavelet bases. Characterization of functional spaces by means of wavelets. Generalizations and tricks for orthonormal wavelet bases. References. Indexes.

16,073 citations

### "Ideal spatial adaptation by wavelet..." refers background in this paper

• ...Introductions, historical accounts and references to much recent work may be found in the books by Daubechies (1992), Meyer (1990), Chui (1992) and Frazier, Jawerth and Weiss (1991)....

[...]

Journal ArticleDOI
TL;DR: The regularity of compactly supported wavelets and the symmetry of wavelet bases are discussed, with a focus on orthonormal bases of wavelets and multiresolution analysis rather than the continuous wavelet transform.
Abstract: Introduction. Preliminaries and notation. The what, why, and how of wavelets. The continuous wavelet transform. Discrete wavelet transforms: frames. Time-frequency density and orthonormal bases. Orthonormal bases of wavelets and multiresolution analysis. Orthonormal bases of compactly supported wavelets. More about the regularity of compactly supported wavelets. Symmetry for compactly supported wavelet bases. Characterization of functional spaces by means of wavelets. Generalizations and tricks for orthonormal wavelet bases. References. Indexes.

14,157 citations

Journal ArticleDOI
TL;DR: This work construct orthonormal bases of compactly supported wavelets, with arbitrarily high regularity, by reviewing the concept of multiresolution analysis as well as several algorithms in vision decomposition and reconstruction.
Abstract: We construct orthonormal bases of compactly supported wavelets, with arbitrarily high regularity. The order of regularity increases linearly with the support width. We start by reviewing the concept of multiresolution analysis as well as several algorithms in vision decomposition and reconstruction. The construction then follows from a synthesis of these different approaches.

8,588 citations

### "Ideal spatial adaptation by wavelet..." refers background in this paper

• ...Daubechies (1988) described a particular construction with S = 2M + 1 for which the smoothness (number of derivatives) of ψ is proportional to M....

[...]

• ...For j and k bounded away from extreme cases by the conditions j₀ ≤ j < J − j₁, S < k < 2^j − S, we have the approximation n^{1/2} W_{jk}(i) ≈ 2^{j/2} ψ(2^j t − k), t = i/n, where ψ is a fixed "wavelet" in the sense of the usual wavelet transform on ℝ (Meyer, 1990; Daubechies, 1988)....

[...]

Journal ArticleDOI
TL;DR: In this article, a new method is presented for flexible regression modeling of high dimensional data, which takes the form of an expansion in product spline basis functions, where the number of basis functions as well as the parameters associated with each one (product degree and knot locations) are automatically determined by the data.
Abstract: A new method is presented for flexible regression modeling of high dimensional data. The model takes the form of an expansion in product spline basis functions, where the number of basis functions as well as the parameters associated with each one (product degree and knot locations) are automatically determined by the data. This procedure is motivated by the recursive partitioning approach to regression and shares its attractive properties. Unlike recursive partitioning, however, this method produces continuous models with continuous derivatives. It has more power and flexibility to model relationships that are nearly additive or involve interactions in at most a few variables. In addition, the model can be represented in a form that separately identifies the additive contributions and those associated with the different multivariable interactions.

6,651 citations

Book
01 Jan 1992
TL;DR: An Overview: From Fourier Analysis to Wavelet Analysis, Multiresolution Analysis, Splines, and Wavelets.
Abstract: An Overview: From Fourier Analysis to Wavelet Analysis. The Integral Wavelet Transform and Time-Frequency Analysis. Inversion Formulas and Duals. Classification of Wavelets. Multiresolution Analysis, Splines, and Wavelets. Wavelet Decompositions and Reconstructions. Fourier Analysis: Fourier and Inverse Fourier Transforms. Continuous-Time Convolution and the Delta Function. Fourier Transform of Square-Integrable Functions. Fourier Series. Basic Convergence Theory and Poisson's Summation Formula. Wavelet Transforms and Time-Frequency Analysis: The Gabor Transform. Short-Time Fourier Transforms and the Uncertainty Principle. The Integral Wavelet Transform. Dyadic Wavelets and Inversions. Frames. Wavelet Series. Cardinal Spline Analysis: Cardinal Spline Spaces. B-Splines and Their Basic Properties. The Two-Scale Relation and an Interpolatory Graphical Display Algorithm. B-Net Representations and Computation of Cardinal Splines. Construction of Spline Approximation Formulas. Construction of Spline Interpolation Formulas. Scaling Functions and Wavelets: Multiresolution Analysis. Scaling Functions with Finite Two-Scale Relations. Direct-Sum Decompositions of L2(R). Wavelets and Their Duals. Linear-Phase Filtering. Compactly Supported Wavelets. Cardinal Spline-Wavelets: Interpolaratory Spline-Wavelets. Compactly Supported Spline-Wavelets. Computation of Cardinal Spline-Wavelets. Euler-Frobenius Polynomials. Error Analysis in Spline-Wavelet Decomposition. Total Positivity, Complete Oscillation, Zero-Crossings. Orthogonal Wavelets and Wavelet Packets: Examples of Orthogonal Wavelets. Identification of Orthogonal Two-Scale Symbols. Construction of Compactly Supported Orthogonal Wavelets. Orthogonal Wavelet Packets. Orthogonal Decomposition of Wavelet Series. Notes. References. Subject Index. Appendix.

3,992 citations

### "Ideal spatial adaptation by wavelet..." refers background in this paper

• ...Introductions, historical accounts and references to much recent work may be found in the books by Daubechies (1992), Meyer (1990), Chui (1992) and Frazier, Jawerth and Weiss (1991)....

[...]

###### Q1. What are the contributions mentioned in the paper "Ideal spatial adaptation by wavelet shrinkage" ?

The authors describe a new principle for spatially-adaptive estimation: selective wavelet reconstruction. The authors show that variable-knot spline fits and piecewise-polynomial fits, when equipped with an oracle to select the knots, are not dramatically more powerful than selective wavelet reconstruction with an oracle. A new inequality in multivariate normal decision theory, which the authors call the oracle inequality, shows that attained performance differs from ideal performance by at most a factor 2 log n, where n is the sample size.

Because f has discontinuities, no kernel smoother with fixed, non-spatially-varying bandwidth attains a risk R(f̂, f) tending to zero faster than Cn^{−1/2}, C = C(f, kernel).

To preserve the important property [W1] of orthogonality to polynomials of degree M, a further 'preconditioning' transformation P of the data y is necessary.

The preconditioning transformation affects only the N = M + 1 left-most and the N right-most elements of y: it has the block-diagonal structure P = diag(P_L | I | P_R).

For various combinations of the parameters M (number of vanishing moments), S (support width), and j₀ (low-resolution cutoff), one may construct an n-by-n orthogonal matrix W, the finite wavelet transform matrix.

A total of four minimax quantities may be defined, by considering the combinations of threshold type (soft, hard) and oracle type (projection, shrinkage).

Figure 1 displays four functions (Bumps, Blocks, HeaviSine and Doppler) which have been chosen because they caricature spatially variable functions arising in imaging, spectroscopy and other scientific signal processing.
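Two of these test functions are easy to reproduce. The sketch below uses the formulas commonly associated with the Donoho–Johnstone test suite for HeaviSine and Doppler; the constants follow the usual convention and are an assumption to be checked against the paper itself.

```python
import numpy as np

n = 2048
t = np.arange(1, n + 1) / n

# HeaviSine: a sinusoid with jumps at t = 0.3 and t = 0.72
heavisine = 4 * np.sin(4 * np.pi * t) - np.sign(t - 0.3) - np.sign(0.72 - t)

# Doppler: an oscillation whose frequency blows up as t -> 0,
# modulated by an envelope sqrt(t(1 - t)) that vanishes at both ends
eps = 0.05
doppler = np.sqrt(t * (1 - t)) * np.sin(2 * np.pi * (1 + eps) / (t + eps))

print(heavisine.shape, doppler.shape)
```

Both signals are spatially inhomogeneous in exactly the way the paper targets: HeaviSine is smooth except at two jump points, while Doppler needs fine resolution near t = 0 and almost none near t = 1.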

However, it is natural and more revealing to look for 'optimal' thresholds λ*_n which yield the smallest possible constant Λ*_n in place of 2 log n + 1 among soft threshold estimators.

This matrix yields a vector w of the wavelet coefficients of y via w = Wy, and because the matrix is orthogonal the authors have the inversion formula y = Wᵀw.
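The orthogonality behind w = Wy and y = Wᵀw can be checked concretely in the simplest case. The sketch below builds a plain Haar transform matrix (an assumption for illustration; the paper's W uses boundary-corrected Daubechies wavelets, not Haar) and verifies perfect reconstruction.

```python
import numpy as np

def haar_matrix(n):
    """Orthonormal Haar wavelet transform matrix for n a power of 2."""
    if n == 1:
        return np.array([[1.0]])
    h = haar_matrix(n // 2)
    # coarse-scale rows: local averages, recursively transformed
    top = np.kron(h, [1.0, 1.0]) / np.sqrt(2)
    # fine-scale rows: local differences
    bot = np.kron(np.eye(n // 2), [1.0, -1.0]) / np.sqrt(2)
    return np.vstack([top, bot])

n = 8
W = haar_matrix(n)
y = np.arange(1.0, n + 1)
w = W @ y                  # wavelet coefficients: w = Wy
y_back = W.T @ w           # inversion via orthogonality: y = W^T w

print(np.allclose(W @ W.T, np.eye(n)), np.allclose(y_back, y))
```

Orthogonality is what makes the whole program work: white noise in y stays white noise in w, so coordinatewise thresholding of w is statistically well founded.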

In their language, they show that one can mimic the "nonzeroness" oracle ζ(δ, θ) = θ² 1{θ ≠ 0} to within Λₙ = 1 + 2 log(n + 1) by hard thresholding with λₙ = (2 log(n + 1))^{1/2}.
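This oracle inequality can be checked by simulation. The Monte Carlo sketch below (numpy only; the sparse mean, noise level, and seed are illustrative assumptions) compares the risk of hard thresholding at λₙ = (2 log(n + 1))^{1/2} with the bound Λₙ(σ² + Σᵢ min(θᵢ², σ²)).

```python
import numpy as np

rng = np.random.default_rng(2)
n = 255
theta = np.zeros(n)
theta[:10] = 3.0                       # sparse mean (illustrative)
sigma = 1.0

lam = np.sqrt(2 * np.log(n + 1))       # threshold lambda_n
Lam = 1 + 2 * np.log(n + 1)            # oracle-inequality constant Lambda_n

# ideal risk: an oracle keeps coordinate i only when theta_i^2 > sigma^2
ideal = np.sum(np.minimum(theta ** 2, sigma ** 2))

reps = 2000
losses = np.empty(reps)
for r in range(reps):
    z = theta + sigma * rng.standard_normal(n)
    est = z * (np.abs(z) > lam * sigma)          # hard thresholding
    losses[r] = np.sum((est - theta) ** 2)

risk = losses.mean()
bound = Lam * (sigma ** 2 + ideal)
print(risk <= bound)
```

In this configuration the empirical risk sits well inside the bound, illustrating that the log n factor in the inequality is a worst-case price, with slack at typical signals.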

In addition, an implementation by G. P. Nason in the S language is available by anonymous ftp from StatLib at lib.stat.cmu.edu; other implementations are also in development.