Support Recovery for Sparse Super-Resolution of Positive Measures

doi:10.1007/S00041-016-9502-X

Support Recovery for Sparse

Deconvolution of Positive Measures

Quentin Denoyelle

∗

, Vincent Duval

†

and Gabriel Peyr´e

‡

September 1, 2016

Abstract

We study sparse spikes deconvolution over the space of Radon mea-

sures on R or T when the input measure is a ﬁnite sum of positive Dirac

masses using the BLASSO convex program. We fo c u s o n th e recovery

properties of the support and the a mp l it u d es of the initial m ea su re in the

presence of noise as a function of the minimum separation t of t h e input

measure (the minimum distance between two spikes). We show that when

w/ λ , w/t

2N −1

and λ/t

2N −1

are small enough (where λ is the regula riz a -

tion parameter, w the noise an d N the number of spikes), which corre-

sponds roughly to a suﬃcient signal-to-noise ratio and a noise level small

enough with respect to the minimum separation, there exists a unique

solution to the BLASSO program with exa c tl y the same number of spikes

as the original measure. We show that the amplitudes and positions of

the spikes of the solution both converge toward those of the input measure

when the noise and the regulariza t io n parameter drops to zero faster than

t

2N −1

.

1 Introduction

1.1 Super-Resolution and Spar s e Spikes Deconvolution

Super-resolution consists i n retrieving ﬁne scale details of a possibly noisy

signal from coarse scale information. The importance of recovering the high

frequencies of a signal comes from the fact that there is often a physical blur in

the acquisition pr ocess, such as diﬀraction in optical systems, wave reﬂection in

seismic imaging or spikes recording from neuronal activity.

In resolution theory, the two-point resolution criterion deﬁnes the ability of

a system to resolve two points of equal intensities. As a point source produces

a diﬀraction pattern which is centered about the geometrical image point of the

∗

CEREMADE, Univ. Paris-Dauphine, denoyelle@ceremade.dauphine.fr

†

INRIA Rocquencourt, MOKAPLAN, vincent.duval@inria.fr

‡

CNRS & CEREMADE, Univ. Paris-Dauphine, gabriel.peyre@ceremade.dauphine.fr

1

arXiv:1506.08264v2 [cs.IT] 31 Aug 2016

point source, it is often admitted that two poi nts are re sol ved by a system if the

central maximum of the intensity diﬀraction of one poi nt source coinci de s with

the ﬁrst zero of the intensity diﬀraction pattern of the other point. This deﬁnes

a distance that only depends on the system and which is called the Rayleigh

Length. In the case of the ideal low-pass ﬁ lt e r (meaning that the input signal is

convolved with the Dirichlet kernel, see (9) for the exact deﬁnition) with cu t oﬀ

frequency f

c

, the Rayleigh Length is 1/f

c

. We refer to [9] for more details

about resolution the or y. Super-resol u ti on in signal processi ng thus consi st s in

developing techniques t hat enable t o retrieve information below the Rayleigh

Length.

Let us introduce in a more formal way the problem which will be the core

of this article. Let X be the real line R or the 1-D torus T = R/Z and M(X)

the Banach space of bounded Radon measur es on X, which can be seen as the

topological dual of the space C

X

where C

X

is either the space of continuous

functions on R that vanish at inﬁnity when X = R or the space of continuous

functions on T when X = T. We consider a given integral operator Φ : M(X) →

H, where H is a separable Hilbert space, whose kernel ϕ is s up posed to be a

smooth funct i on (see Deﬁnition 1 for the technical assumptions made on ϕ), i.e.

∀m ∈ M(X), Φm =

Z

X

ϕ(x)dm(x). (1)

Φ represe nts the acquisition operator and can for instance account for a blu r in

the measurements. In the sp e ci al case of ϕ(x) = ˜ϕ(· − x), Φ is a convolution

operator. We denote by m

a

0

,tz

0

=

P

N

i=1

a

0,i

δ

tz

0,i

our input sparse spikes train

where the a

0,i

∈ R

∗

+

are the amplitudes of the Dirac masses at positions tz

0,i

∈

X. Let y

t

= Φm

a

0

,tz

0

be the noiseless observation. The parameter t > 0 controls

the minimum separation distance between the spikes, and this paper aims at

studying the recovery of m

a

0

,tz

0

from y

t

+ w (where w ∈ H is some noise) when

t is small.

1.2 From the LASSO to the BLASSO

LASSO. ℓ

1

regularization techniques were ﬁrst introduced in geophysics (see

[6, 17, 20]) for seismic prospecting. Indeed, the density changes in the un-

derground can be modeled as a sparse spikes train. ℓ

1

reconstruction property

provides solutions with few non ze ro coeﬃcients and can be solved eﬃciently with

convex optimization methods. Donoho theoretically studied and justiﬁed these

techniques in [11]. In statistics, the ℓ

1

norm is used in the Lasso method [23]

which consists in minimizing a quad rat i c error subject to a ℓ

1

penalization. As

the au th or s remarked it, it retains both the features of subset selection (by set-

ting to zero some co eﬃc i ents, thanks to the property of the ℓ

1

norm to favor

sparse solutions) and ridge regression (by shrinking the other coeﬃcients). In

signal processing, the basis pursuit metho d [22] uses t h e ℓ

1

norm to decompose

signals into overcomplete dictionaries.

2

BLASSO. Following recent works (see for instance [1, 3, 5, 8, 12]), the sparse

deconvolution method that we consider in this article operates over a continuous

domain, i.e. wit h out resorting to some sort of discretization on a grid. The in-

verse prob l em is solved over the space of Radon measu r es which is a non-reﬂexive

Banach space. This c ontinuous “grid free” setting makes the mathematical anal-

ysis easier and all ows us to make precise statement about the location of the

recovered spikes locations.

The te chnique that we study in this paper consists in solving a convex opti-

mization problem that uses the t ot al variation norm which is the equivalent of

the ℓ

1

norm for measures. The ℓ

1

norm is known to be particularly well ﬁtted

for the recovery of sparse signals. Thus the use of the total variation norm

favors the emergence of spikes in the solution.

The total variation norm is deﬁned by

∀m ∈ M(X), |m|(X)

def.

= sup

ψ∈C

X



Z

X

ψdm ; kψk

L

∞

(X)

6 1



.

In particular,

|m

a

0

,tz

0

|(X) = ka

0

k

1

,

which shows in a way that the total variation norm generalizes the ℓ

1

norm to

the continuous set t i ng of measures (i.e. no discretization grid i s required).

The ﬁrst method that we are interested in is the classical basis pursuit,

deﬁned originally in [22] in a ﬁnite dimensional setting, and written her e over

the space of Radon measures

min

m∈M(X)

{|m|(X) ; Φm = y

t

}. (P

0

(y

t

))

This is the problem studied in [5], in the case where Φ is an ideal low-pass ﬁlter

on the torus (i.e. X = T).

When th e signal is noisy, i.e. when we observe y

t

+ w instead, with w ∈ H,

we may rather consider the problem

min

m∈M(X)

1

2

kΦm − (y

t

+ w)k

2

H

+ λ|m|(X). (P

λ

(y

t

+ w) )

Here λ > 0 is a parameter that should adapted to the noise level kwk

H

. This

problem is coined “BLASSO” in [8].

While this is not the focus of this article, we note that there exist algori t h ms

to solve the inﬁnite dimensional convex pr ob le ms (P

0

(y

t

)) and (P

λ

(y

t

+ w) ), see

for instance [3, 5].

1.3 Previous Works

LASSO/BLASSO performance anal y si s. In order to quantify the recov-

ery performan ce of the methods P

0

(y

t

) and P

λ

(y

t

+ w), the following two ques-

tions arise:

3

1. Does the solutions of P

0

(y

t

) recover the input measure m

a

0

,tz

0

?

2. How close is the solution of P

λ

(y

t

+ w) to the solution of P

0

(y

t

) ?

When the amplitudes of the spikes are arbitrary compl ex numbers, the an-

swers to the above questions require a large enough minimum separation dis-

tance ∆(tz

0

) between t he spikes where

∆(tz

0

)

def.

= min

i6=j

d

X

(tz

0,i

, tz

0,j

). (2)

where d

X

is either the canonical distance on R i.e.

∀x, y ∈ R, d

X

(x, y) = |x − y|, (3)

when X = R, or the canonical induced distance on T i.e.

∀x, y ∈ R, d

X

(x + Z, y + Z) = min

k∈Z

|x − y + k|, (4)

when X = T. The ﬁ r st question is addressed in [5] where the authors showed,

in the case of Φ being the ideal low-pass ﬁlter on the torus (see (9)), i.e. when

H = T, that m

a

0

,tz

0

is the unique solution of P

0

(y

t

) provided that ∆(tz

0

) >

C

f

c

where C > 0 is a universal constant and f

c

the cutoﬀ frequency of the ideal

low-pass ﬁlter. In the same paper, it is shown that C 6 2 when a

0

∈ C

N

and

C 6 1.87 when a

0

∈ R

N

. In [12], the authors show that ne ce ssar i l y C >

1

2

.

The second question receives partial answers in [14, 3, 4, 13]. In [3], it is

shown that if the solution of P

0

(y

t

) is unique then the measures recovered by

P

λ

(y

t

+ w) converge in the weak-* sense to the solution of P

0

(y

t

) when λ → 0

and kwk

H

/λ → 0. In [4], the authors measure the reconstructi on error usin g

the L

2

norm of an ideal low-pass ﬁltered version of the recovered measures.

In [14, 13], error bounds are given on the locat i ons of the recovered spi kes with

respect to those of the input measure m

a

0

,tz

0

. However, those works provide

little information about the structure of the measures recovered by P

λ

(y

t

+ w).

That point is addressed in [12] where the authors show that under the Non

Degenerate Source Condition (see Section 2 for more details), the r e exist s a

unique solution to P

λ

(y

t

+ w) with the exact same numb e r of spikes as the

original measure provided t h at λ and kwk

H

/λ are small enough. Moreover

in that re gi me, this solution converges to the original measure when the noise

drops to zero.

LASSO/BLASSO for positive spikes. For positive spikes (i.e. a

0,i

> 0),

the picture is radically diﬀe re nt, since exact recovery of m

a

0

,tz

0

without noise

(i.e. (w, λ) = (0, 0)) holds for all t > 0, see for instanc e [8]. Stability con-

stants however explode as t → 0. A recent work [19] shows however that stable

recovery is obtained if the signal -t o-n oi se ratio grows f ast er than O(1/t

2N

),

closely matching optimal lower bounds of O(1/t

2N− 1

) obtained by combinato-

rial methods, as also proved recently [10] . Our main contribution is to show

that the same O(1/t

2N− 1

) signal-to-noise scaling in fact guarantees a perfect

4

support recovery of the spikes under a certain non-degeneracy condi t ion on the

ﬁlter. This extends, for positive measures, the initial results of [12] by providing

an asymptotic analysis wh en t → 0.

MUSIC and related methods. There is a large body of literature in signal

processing on spectral methods to perform spikes location from low frequency

measurements. One of the most popular methods is MUSIC (for MUltiple Signal

Classiﬁcation) [21] and we refer to [15] for an overview of its use in sign al pro c es s-

ing for line spect ral estimation. In the noiseless case, exact reconstructi on of the

initial si gn al is guaranteed as long as there are enough observations compared

to the numbe r of distinct frequencies [18]. Stabi l i ty to noise is known to h old

under a minimum separation di s t anc e similar to the one of the BLASSO [18].

However, on sharp contrast with the behavior of the BLASSO, numerical obser-

vations (see for instance [7]), as well as a recent work of Demanet and Nguyen,

show that this stability continues to hold regardless of the sign of the ampli-

tudes a

0,i

, as soon as the signal-to-noise ratio scales l ike O(1/t

2N− 1

). Note that

this m at ches (when w is a Gaussian white noise) the Cramer-Rao lower bound

achievable by any unbiased es t im at or [2]. This class of methods are thus more

eﬃcient than BLASSO for arbitrary measures, but they are restricted to oper-

ators Φ that are convolution with a low-pass ﬁlter, which is not the case of our

analysis for the BLASSO.

1.4 Contributions

Main results. From these p re v ious works, one can ask whet h er exact support

estimation by BLASSO for positive spikes is achievable when t tends to 0. Our

main result, Theorem 2, shows t hat this is indeed the case. It states, under some

non-degeneracy condition on Φ, that there exists a unique solution t o P

λ

(y

t

+w)

with the exact same number of spikes as the original measure provided that

kwk

H

/λ, kwk

H

/t

2N− 1

and λ/t

2N− 1

are small enough. Moreover we give an

upper bound, in that regime, on the error of the recovered measure with respect

to the initial measure. As a by-product, we show that the amplitudes and

positions of the spikes of the solution both converge towards those of the input

measure when the noise and the regular iz at i on parameter tend to zero faster

than t

2N− 1

.

Extensions. We consider in this article t he case where all the spikes locations

tz

0

cluster near zero. Following for instance [19], it is possible to consider a

more general model w it h several cluster points, where the sign of the Diracs

is the same around each of these points. Our analysis, although more diﬃcult

to perform, could be e x t en de d to this set ti n g, at the pr ic e of modifying the

deﬁnition of η

W

(see Deﬁnition 3) to account for several cluster poi nts.

Lastly, if the kernel Φ under consideration has some speciﬁc scale σ (such

as the standard deviation of a Gaussian kernel, or σ = 1/f

c

for the ideal low-

pass ﬁlter in the case of the deconvolution on the torus), then it is possible to

5

Support Recovery for Sparse Super-Resolution of Positive Measures

Citations

Modern Regularization Methods for Inverse Problems

Harnessing Sparsity Over the Continuum: Atomic norm minimization for superresolution

The Sliding Frank-Wolfe Algorithm and its Application to Super-Resolution Microscopy

Continuous-Domain Solutions of Linear Inverse Problems With Tikhonov Versus Generalized TV Regularization

Stable super-resolution limit and smallest singular value of restricted Fourier matrices

References

Regression Shrinkage and Selection via the Lasso

Multiple emitter location and signal parameter estimation

Atomic Decomposition by Basis Pursuit

Graduate Texts in Mathematics

Two decades of array signal processing research: the parametric approach

Related Papers (5)

Exact Support Recovery for Sparse Spikes Deconvolution

Towards a Mathematical Theory of Super-resolution

Super-Resolution from Noisy Data

Compressed sensing

Compressed Sensing Off the Grid