A Generalized Iterated Shrinkage Algorithm for Non-convex Sparse Coding
Wangmeng Zuo^{1,3}, Deyu Meng^{2}, Lei Zhang^{3}, Xiangchu Feng^{4}, David Zhang^{3}
^{1}Harbin Institute of Technology, ^{2}Xi'an Jiaotong University, ^{3}Hong Kong Polytechnic University, ^{4}Xidian University
cswmzuo@gmail.com, dymeng@mail.xjtu.edu.cn, {cslzhang,csdzhang}@comp.polyu.edu.hk, xcfeng@mail.xidian.edu.cn
Abstract
In many sparse coding based image restoration and image classification problems, using non-convex $\ell_p$-norm minimization ($0 \le p < 1$) can often obtain better results than the convex $\ell_1$-norm minimization. A number of algorithms, e.g., iteratively reweighted least squares (IRLS), iteratively thresholding method (ITM-$\ell_p$), and look-up table (LUT), have been proposed for non-convex $\ell_p$-norm sparse coding, while some analytic solutions have been suggested for some specific values of $p$. In this paper, by extending the popular soft-thresholding operator, we propose a generalized iterated shrinkage algorithm (GISA) for $\ell_p$-norm non-convex sparse coding. Unlike the analytic solutions, the proposed GISA algorithm is easy to implement, and can be adopted for solving non-convex sparse coding problems with arbitrary $p$ values. Compared with LUT, GISA is more general and does not need to compute and store the look-up tables. Compared with IRLS and ITM-$\ell_p$, GISA is theoretically more solid and can achieve more accurate solutions. Experiments on image restoration and sparse coding based face recognition are conducted to validate the performance of GISA.
1. Introduction
Sparse coding [7, 18, 31] is an effective tool in a myriad of applications such as compressed sensing [11], image restoration [24, 25], face recognition [38], etc. Originally, it aims to solve the following minimization problem:
$$ \min_x \frac{1}{2}\|y - Ax\|_2^2 + \lambda\|x\|_0, \tag{1} $$
where $y$ is an $n \times 1$ vector, $A$ is an $n \times m$ redundant matrix with $m > n$, and the $\ell_0$-norm $\|x\|_0$ simply counts the number of non-zero entries in $x$. Unfortunately, solving the minimization in Eq. (1) is NP-hard [32] and is computationally infeasible for large scale problems.
Rather than solving the above $\ell_0$-minimization problem, one can replace the $\ell_0$-norm with the $\ell_1$-norm $\|x\|_1 = \sum_i |x_i|$, and seek the desired $x$ by solving the following convex optimization problem:
$$ \min_x \frac{1}{2}\|y - Ax\|_2^2 + \lambda\|x\|_1. \tag{2} $$
It has been proved that, under certain conditions on $A$ [6, 17], the $\ell_1$-minimization in (2) is equivalent to the $\ell_0$-minimization in (1) with high probability.
However, when the conditions on $A$ are not satisfied, the solution by $\ell_1$-minimization becomes suboptimal. Actually, both theoretical analysis and numerical experiments [9, 10, 11] have shown that the solution of $\ell_p$-norm sparse coding ($0 \le p < 1$),
$$ \min_x \frac{1}{2}\|y - Ax\|_2^2 + \lambda\|x\|_p^p, \tag{3} $$
is close to that of the $\ell_1$-minimization and is sparser. In image restoration, it has been shown that the gradients of natural images can be better modeled with a hyper-Laplacian distribution with $0.5 \le p \le 0.8$ [25, 28]. In feature selection and compressed sensing, $\ell_p$ can bridge $\ell_0$ and $\ell_1$, and can achieve better solutions [12, 31].
So far, a number of algorithms have been proposed for solving $\ell_p$-norm non-convex sparse coding problems, and they have been applied to various vision and learning tasks, e.g., compressed sensing [10], image restoration [25], face recognition [29], and variable selection [33]. Several typical algorithms include iteratively reweighted least squares (IRLS) [12, 14, 23, 24, 28], iteratively reweighted $\ell_1$-minimization (IRL1) [8], iteratively thresholding method (ITM-$\ell_p$) [33, 34], and look-up table (LUT) [25]. These algorithms, however, suffer from several limitations. Even for the simplest $\ell_p$-minimization problem
$$ \min_x \frac{1}{2}(y - x)^2 + \lambda|x|^p, \tag{4} $$
IRLS, IRL1, and ITM-$\ell_p$ would not converge to the global optimal solution. LUT uses look-up tables to store the solutions w.r.t. different values of the variable $x$ and the regularization parameter $\lambda$. If the values of $x$ and $\lambda$ are unconstrained and $p$ changes dynamically (e.g., in multi-stage relaxation), more computational and memory costs are required to construct and store the look-up tables. Other algorithms, such as the analytic solutions in [25, 39], can only be used for some specific values of $p$.

Inspired by the great success of soft thresholding [16] and iterative shrinkage/thresholding (IST) [15] methods, in this paper we propose a generalized iterated shrinkage algorithm (GISA) for $\ell_p$-norm non-convex sparse coding. The proposed GISA is simple and efficient, and can be adopted for solving $\ell_p$-norm sparse coding problems with arbitrary $p$, $\lambda$ and $y$ values. Compared with IRLS, IRL1, and ITM-$\ell_p$, GISA converges to more accurate solutions. It is easy to implement and can be readily used to solve the many $\ell_p$-norm minimization problems in various vision and learning applications.
2. Related work
To date, various algorithms have been proposed for $\ell_p$-norm non-convex sparse coding. Based on the problems in Eq. (3) and Eq. (4), we provide a brief survey and discussion on IRLS, IRL1, ITM-$\ell_p$, and LUT.
To use IRLS for $\ell_p$-norm non-convex sparse coding, the problem in Eq. (3) is approximated by [26]
$$ \min_x \frac{1}{2}\|y - Ax\|_2^2 + \lambda\sum_i \left(x_i^2 + \varepsilon\right)^{p/2-1} x_i^2, \tag{5} $$
where $\varepsilon \to 0$ is a small positive number to avoid division by zero. Given the current estimate $x^{(k)}$, IRLS iteratively solves the following problem
$$ \min_x \frac{1}{2}\|y - Ax\|_2^2 + \sum_i w_i x_i^2, \tag{6} $$
and updates $x$ by
$$ x^{(k+1)} = \left(A^T A + \operatorname{diag}(w)\right)^{-1} A^T y, \tag{7} $$
where the $i$-th component of the weight vector $w$ is defined as
$$ w_i = p\lambda \Big/ \left((x_i^{(k)})^2 + \varepsilon\right)^{1-p/2}. \tag{8} $$
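To make the reweighting loop concrete, here is a minimal numpy sketch of the IRLS iteration in Eqs. (6)-(8); the function name, initialization, and fixed iteration count are our own illustrative choices, not from the paper.

```python
import numpy as np

def irls_lp(A, y, lam, p, eps=1e-8, n_iter=100):
    """IRLS sketch for Eq. (3): min_x 0.5*||y - Ax||_2^2 + lam*||x||_p^p."""
    x = A.T @ y                                   # simple initialization (our choice)
    for _ in range(n_iter):
        # Eq. (8): w_i = p*lam / ((x_i^2 + eps)^(1 - p/2))
        w = p * lam / (x**2 + eps)**(1.0 - p / 2)
        # Eq. (7): x = (A^T A + diag(w))^{-1} A^T y
        x = np.linalg.solve(A.T @ A + np.diag(w), A.T @ y)
    return x
```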
Similarly, to use IRL1 for $\ell_p$-norm minimization, the problem in Eq. (3) is approximated by
$$ \min_x \frac{1}{2}\|y - Ax\|_2^2 + \sum_i \lambda p \left(|x_i| + \varepsilon\right)^{p-1} |x_i|. \tag{9} $$
Given $x^{(k)}$, IRL1 [8, 21] updates $x$ by solving the following problem
$$ x^{(k+1)} = \arg\min_x \frac{1}{2}\|y - Ax\|_2^2 + \sum_i \lambda p \left(|x_i^{(k)}| + \varepsilon\right)^{p-1} |x_i| \tag{10} $$
using existing $\ell_1$-minimization algorithms [1, 3, 41]. Based on the theoretical analysis in [20, 26], both IRLS and IRL1 are guaranteed to converge, while Chartrand and Yin [12] showed that IRLS is theoretically better than IRL1.
However, even for the simplest $\ell_p$-minimization problem in Eq. (4), IRLS and IRL1 sometimes cannot converge to the desired solutions. As shown in Fig. 1, given $p = 0.5$, $\lambda = 1$, and $y = 1.3$, by initializing $x^{(0)} = y$, IRLS and IRL1 converge to the same local minimum. Since the problem in Eq. (4) is a 1D optimization, one can define a proper thresholding function [33] or construct look-up tables (LUTs) [25] in advance.
[Figure 1. The solutions of GISA, IRL1, IRLS, and ITM-$\ell_p$ for solving the problem in Eq. (4) with $p = 0.5$, $\lambda = 1$, and $y = 1.3$. IRL1, IRLS, and ITM-$\ell_p$ converge to the same local minimum, but GISA converges to a better solution.]
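The failure case in Fig. 1 is easy to reproduce numerically: for $p = 0.5$, $\lambda = 1$, $y = 1.3$, a dense grid evaluation of the objective in Eq. (4) shows that $x = 0$ beats the non-zero stationary point that IRLS, IRL1, and ITM-$\ell_p$ settle into. A small sketch (the grid resolution and iteration count are arbitrary choices):

```python
import numpy as np

p, lam, y = 0.5, 1.0, 1.3
f = lambda x: 0.5 * (y - x)**2 + lam * np.abs(x)**p   # objective of Eq. (4)

xs = np.linspace(0.0, y, 100001)      # for y > 0 the solution lies in [0, y]
print(xs[np.argmin(f(xs))], f(0.0))   # global minimizer 0.0, f(0) = 0.845

x = y                                 # fixed-point iteration on f'(x) = 0
for _ in range(100):
    x = y - lam * p * x**(p - 1)
print(x, f(x))                        # local minimum x* ~ 0.704, f(x*) ~ 1.017 > f(0)
```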
For several special values of $p$, e.g., $1/2$ or $2/3$, analytic solutions can be derived [25, 39]. She [33] defined the following $\ell_p$-norm thresholding function
$$ T_p^{ITM}(y; \lambda) = \begin{cases} 0, & \text{if } |y| \le \tau_p^{ITM}(\lambda) \\ \operatorname{sgn}(y)\, S_p^{ITM}(y; \lambda), & \text{if } |y| > \tau_p^{ITM}(\lambda) \end{cases}, \tag{11} $$
where $\operatorname{sgn}(y)$ denotes the sign of $y$, $\tau_p^{ITM}(\lambda) = \lambda^{1/(2-p)}(2-p)\left[p/(1-p)^{1-p}\right]^{1/(2-p)}$, $g_p(\theta; \lambda) = \theta + \lambda p \theta^{p-1}$, $\theta_0 = \left[\lambda p(1-p)\right]^{1/(2-p)}$, and $S_p^{ITM}(y; \lambda)$ is the root of the equation $g_p(\theta; \lambda) = |y|$. Since $g_p(\theta; \lambda)$ is monotonically increasing on $[\theta_0, +\infty)$, for any $|y| \ge \tau_p^{ITM}(\lambda)$ the equation $g_p(\theta; \lambda) = |y|$ has a unique root in $[\theta_0, +\infty)$, which can be obtained using numerical methods.
However, as shown in Fig. 1, the thresholding function in Eq. (11) cannot always guarantee convergence to the global solution. Krishnan and Fergus [25] proposed an LUT method to correctly solve the problem in Eq. (4). In image restoration, the $p$ value can be fixed and $|y|$ falls into the range $[0, 1]$, so LUT is very efficient. However, for general $\ell_p$-norm non-convex sparse coding problems where the values of $x$, $\lambda$ and $p$ are unconstrained, LUT is not an effective and efficient solution.
In addition, Marjanovic and Solo [30] proposed a method very similar to ours for solving the one-scalar $\ell_p$-minimization problem (4). However, our proposed GISA differs from this method throughout. On one hand, we use a direct and very intuitive way to accurately present the global solution of the non-convex problem (4) (see Section 3.2 and Fig. 2 for details), while [30] makes the problem somewhat more complicated through purely mathematical deductions. In particular, our method uses two simple equations, (21) and (22), to obtain the two most important numerical values of the problem: $\tau_p^{GST}(\lambda)$ (the threshold value) and $x_p^*$ (the minimum at the threshold). The method proposed in [30], however, uses complex mathematics to accomplish a similar task. Our work is thus much easier to understand, and it clearly reveals the physical meaning underlying this kind of non-convex optimization problem, which was previously believed hard to solve precisely and understand intuitively. Furthermore, the motivations and the main mechanisms of our method and [30] are significantly different. The main goal of our method is to solve non-convex sparse coding problems through an iterative shrinkage mechanism for computer vision tasks such as image deconvolution and face recognition, while [30] aims mainly at matrix completion by a majorization-minimization strategy for DNA microarray analysis.
3. Generalized shrinkage/thresholding function
3.1. Soft-thresholding
To solve the $\ell_1$-minimization problem
$$ \min_x \frac{1}{2}(y - x)^2 + \lambda|x|, \tag{12} $$
Donoho [16] proposed the soft-thresholding operator
$$ T_1(y; \lambda) = \begin{cases} 0, & \text{if } |y| \le \lambda \\ \operatorname{sgn}(y)(|y| - \lambda), & \text{if } |y| > \lambda \end{cases}. \tag{13} $$
Generally, if $|y| \le \lambda$, the soft-thresholding operator uses the thresholding rule to set $T_1(y; \lambda)$ to $0$; otherwise, it uses the shrinkage rule to set $T_1(y; \lambda)$ to $\operatorname{sgn}(y)(|y| - \lambda)$.
3.2. Generalization of soft-thresholding
Inspired by soft-thresholding, we propose a generalized shrinkage/thresholding operator to solve the $\ell_p$-minimization problem in Eq. (4) by modifying the thresholding and shrinkage rules.
If $y > 0$, the solution to Eq. (4) falls into the range $[0, y]$; otherwise, into the range $[y, 0]$. Without loss of generality, in the following we only consider the case $y > 0$. Let
$$ f(x) = \frac{1}{2}(x - y)^2 + \lambda|x|^p. \tag{14} $$
Note that $f(x)$ is differentiable in the range $(0, +\infty)$. Setting $p = 0.5$ and $\lambda = 1$, in Fig. 2 we show plots of $f(x)$ for five typical $y$ values. As shown in Fig. 2, given $p$ and $\lambda$ there exists a specific threshold $\tau_p^{GST}(\lambda)$. If $y < \tau_p^{GST}(\lambda)$, $x = 0$ is the global minimum; otherwise, the non-zero solution is optimal. Thus, to generalize soft-thresholding for solving the problem in Eq. (4), we focus on two issues: (1) the calculation of the threshold $\tau_p^{GST}(\lambda)$, and (2) the fast search for the non-zero solution.
The first- and second-order derivatives of $f(x)$ are:
$$ f'(x) = x - y + \lambda p x^{p-1}, \tag{15} $$
$$ f''(x) = 1 + \lambda p(p-1) x^{p-2}. \tag{16} $$
By solving $f''(x_0^{(\lambda,p)}) = 0$, we have
$$ x_0^{(\lambda,p)} = \left(\lambda p(1-p)\right)^{\frac{1}{2-p}}. \tag{17} $$
One can easily verify that $f(x)$ is concave in the range $(0, x_0^{(\lambda,p)})$, and convex in the range $(x_0^{(\lambda,p)}, +\infty)$. To guarantee that $f(x)$ has a minimum in $(x_0^{(\lambda,p)}, +\infty)$, we should further require $f'(x_0^{(\lambda,p)}) \le 0$. In [33], She let $f'(x_0^{(\lambda,p)}) = 0$ and solved the following equation
$$ f'(x_0^{(\lambda,p)}) = \left(\lambda p(1-p)\right)^{\frac{1}{2-p}} - \tau_p^{ITM}(\lambda) + \lambda p \left(\lambda p(1-p)\right)^{\frac{p-1}{2-p}} = 0. \tag{18} $$
The corresponding threshold on $y$ is
$$ \tau_p^{ITM}(\lambda) = \lambda^{1/(2-p)}(2-p)\left[p/(1-p)^{1-p}\right]^{1/(2-p)}. \tag{19} $$
In ITM, She [33] extended soft-thresholding with the thresholding function in Eq. (11).
However, the thresholding rule in [33] is problematic. Although $y > \tau_p^{ITM}(\lambda)$ can guarantee that the equation
$$ x^* - y + \lambda p \left(x^*\right)^{p-1} = 0 \tag{20} $$
has a unique solution in $(x_0^{(\lambda,p)}, +\infty)$, as shown in Fig. 2(c) this minimum $f(x^*)$ might be higher than $f(0)$. Thus, the thresholding function in Eq. (11) is actually not a good generalization of the soft-thresholding operator for $\ell_p$-norm minimization.
From Fig. 2(d), one can see that there exists a specific $y$ at which $f(x_p^*)$ is exactly $f(0)$. Thus, to generalize soft-thresholding, we should solve the following nonlinear equation system to determine the correct threshold $\tau_p^{GST}(\lambda)$ and its corresponding $x_p^*$:
$$ \frac{1}{2}\left(x_p^* - \tau_p^{GST}(\lambda)\right)^2 + \lambda \left(x_p^*\right)^p = \frac{1}{2}\left(\tau_p^{GST}(\lambda)\right)^2, \tag{21} $$
$$ x_p^* - \tau_p^{GST}(\lambda) + \lambda p \left(x_p^*\right)^{p-1} = 0. \tag{22} $$
Based on Eq. (22), we can substitute $\tau_p^{GST}(\lambda)$ in Eq. (21) with $x_p^* + \lambda p (x_p^*)^{p-1}$, and obtain the following equation
$$ \left(x_p^*\right)^p \left(2\lambda(1-p) - \left(x_p^*\right)^{2-p}\right) = 0. \tag{23} $$
Thus the only solution of $x_p^*$ in the range $(x_0^{(\lambda,p)}, +\infty)$ is
$$ x_p^* = \left(2\lambda(1-p)\right)^{\frac{1}{2-p}}, \tag{24} $$
and the threshold $\tau_p^{GST}(\lambda)$ is
$$ \tau_p^{GST}(\lambda) = \left(2\lambda(1-p)\right)^{\frac{1}{2-p}} + \lambda p \left(2\lambda(1-p)\right)^{\frac{p-1}{2-p}}. \tag{25} $$
We have the following two theorems; their proofs can be found in the supplementary materials.

Theorem 1. For any $y \in (\tau_p^{GST}(\lambda), +\infty)$, $f(x)$ has one unique minimum $S_p^{GST}(y; \lambda)$ in the range $(x_p^*, +\infty)$, which can be obtained by solving the following equation:
$$ S_p^{GST}(y; \lambda) - y + \lambda p \left(S_p^{GST}(y; \lambda)\right)^{p-1} = 0. \tag{26} $$

Theorem 2. For any $y \in (\tau_p^{GST}(\lambda), +\infty)$, let $S_p^{GST}(y; \lambda)$ be the unique minimum of $f(x)$ in the range $(x_p^*, +\infty)$. We have the following inequality:
$$ f(0) > f\left(S_p^{GST}(y; \lambda)\right). \tag{27} $$
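As a quick numerical sanity check of Eqs. (24) and (25): at $y = \tau_p^{GST}(\lambda)$, the objective value at $x_p^*$ equals $f(0)$, i.e., Eqs. (21) and (22) hold. A minimal sketch (the values of $p$ and $\lambda$ are arbitrary):

```python
import numpy as np

p, lam = 0.5, 1.0
x_p = (2 * lam * (1 - p))**(1 / (2 - p))                    # Eq. (24)
tau = x_p + lam * p * x_p**(p - 1)                          # Eq. (25) via Eq. (22)

f = lambda x, y: 0.5 * (x - y)**2 + lam * np.abs(x)**p      # Eq. (14)
print(np.isclose(f(x_p, tau), f(0.0, tau)))                 # Eq. (21) holds: True
print(np.isclose(x_p - tau + lam * p * x_p**(p - 1), 0.0))  # Eq. (22) holds: True
```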

[Figure 2. Plots of the function $f(x)$ in Eq. (14) with different values of $y$: (a) $y = 1$, (b) $y = 1.19$, (c) $y = 1.3$, (d) $y = 1.5$, and (e) $y = 1.6$.]
Algorithm 1 (GST): $T_p^{GST}(y; \lambda) = \mathrm{GST}(y, \lambda, p, J)$
Input: $y$, $\lambda$, $p$, $J$
1. $\tau_p^{GST}(\lambda) = \left(2\lambda(1-p)\right)^{\frac{1}{2-p}} + \lambda p \left(2\lambda(1-p)\right)^{\frac{p-1}{2-p}}$
2. if $|y| \le \tau_p^{GST}(\lambda)$
3.   $T_p^{GST}(y; \lambda) = 0$
4. else
5.   $k = 0$, $x^{(k)} = |y|$
6.   iterate on $k = 0, 1, \ldots, J$
7.     $x^{(k+1)} = |y| - \lambda p \left(x^{(k)}\right)^{p-1}$
8.     $k \leftarrow k + 1$
9.   $T_p^{GST}(y; \lambda) = \operatorname{sgn}(y)\, x^{(k)}$
10. end
Output: $T_p^{GST}(y; \lambda)$
To solve Eq. (26), we propose an iterative algorithm $\mathrm{GST}(y, \lambda, p, J)$, which is summarized in Algorithm 1. The output of Algorithm 1 converges to the correct solution as $J \to \infty$; empirically, we found that satisfactory results can be obtained by choosing $J = 2$ or $3$.
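In code, Algorithm 1 is only a few lines. The following Python sketch is our own transcription (the name `gst` and the element-wise vectorization over numpy arrays are our additions, not part of the paper):

```python
import numpy as np

def gst(y, lam, p, J=3):
    """Generalized soft-thresholding (Algorithm 1), applied element-wise:
    solves min_x 0.5*(y - x)^2 + lam*|x|^p for each entry of y."""
    y = np.atleast_1d(np.asarray(y, dtype=float))
    # step 1: threshold tau_p^GST(lam), Eq. (25); note 0.0**0.0 == 1.0 covers p = 1
    tau = (2 * lam * (1 - p))**(1 / (2 - p)) \
        + lam * p * (2 * lam * (1 - p))**((p - 1) / (2 - p))
    x = np.zeros_like(y)
    keep = np.abs(y) > tau                 # steps 2-4: thresholding rule
    if np.any(keep):
        a = np.abs(y[keep])                # step 5: x^(0) = |y|
        for _ in range(J + 1):             # steps 6-8: fixed point of Eq. (26)
            a = np.abs(y[keep]) - lam * p * a**(p - 1)
        x[keep] = np.sign(y[keep]) * a     # step 9: shrinkage rule
    return x
```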
Finally, we propose a generalized soft-thresholding (GST) function for solving the $\ell_p$-norm minimization in Eq. (4):
$$ T_p^{GST}(y; \lambda) = \begin{cases} 0, & \text{if } |y| \le \tau_p^{GST}(\lambda) \\ \operatorname{sgn}(y)\, S_p^{GST}(|y|; \lambda), & \text{if } |y| > \tau_p^{GST}(\lambda) \end{cases}. \tag{28} $$
Like the soft-thresholding function, the GST function also involves a thresholding rule, $T_p^{GST}(y; \lambda) = 0$ when $|y| \le \tau_p^{GST}(\lambda)$, and a shrinkage rule, $T_p^{GST}(y; \lambda) = \operatorname{sgn}(y)\, S_p^{GST}(|y|; \lambda)$ when $|y| > \tau_p^{GST}(\lambda)$. Compared with the thresholding function in [33], GST adopts a different threshold $\tau_p^{GST}(\lambda)$, and we propose an algorithm, i.e., Algorithm 1, to solve the equation in Eq. (26). Based on Theorem 1 and Theorem 2, GST can always find the correct solution to the simple $\ell_p$-minimization problem in Eq. (4). Thus, GST can be regarded as a better generalization of soft-thresholding for $\ell_p$-minimization.
3.3. Discussions
Let us further discuss two important cases of GST, i.e., $p = 1$ and $p = 0$, and their relationships with soft-thresholding [16] and hard-thresholding [2, 19].
When $p = 1$, GST converges after one iteration. Since
$$ \lim_{p \to 1} \tau_p^{GST}(\lambda) = \lambda \lim_{p \to 1} (1-p)^{p-1} = \lambda, \tag{29} $$
the threshold of GST becomes $\lambda$, and the GST function becomes
$$ T_1^{GST}(y; \lambda) = \begin{cases} 0, & \text{if } |y| \le \lambda \\ \operatorname{sgn}(y)(|y| - \lambda), & \text{if } |y| > \lambda \end{cases}. \tag{30} $$
One can see that the soft-thresholding function is a special case of GST with $p = 1$.
When $p = 0$, GST also converges after one iteration. The threshold of GST becomes
$$ \tau_0^{GST}(\lambda) = (2\lambda)^{\frac{1}{2}}, \tag{31} $$
and the GST function becomes
$$ T_0^{GST}(y; \lambda) = \begin{cases} 0, & \text{if } |y| \le (2\lambda)^{\frac{1}{2}} \\ y, & \text{if } |y| > (2\lambda)^{\frac{1}{2}} \end{cases}, \tag{32} $$
which is exactly the hard-thresholding function [2, 19] defined for solving the following problem
$$ \min_x \frac{1}{2}(y - x)^2 + P(x; \lambda), \tag{33} $$
where the penalty function $P$ [2, 19, 33] is defined as
$$ P(x; \lambda) = \begin{cases} 0, & \text{if } x = 0 \\ \lambda, & \text{if } x \ne 0 \end{cases}. \tag{34} $$
Clearly, the hard-thresholding function is a special case of GST with $p = 0$.
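Both limiting cases are easy to confirm numerically with the `gst` sketch above (a usage example; the test values are arbitrary):

```python
import numpy as np

y, lam = np.array([-2.0, -0.5, 0.3, 1.7]), 1.0
soft = np.sign(y) * np.maximum(np.abs(y) - lam, 0.0)     # Eq. (13)
hard = np.where(np.abs(y) > np.sqrt(2 * lam), y, 0.0)    # Eq. (32)
print(np.allclose(gst(y, lam, p=1.0), soft))             # True
print(np.allclose(gst(y, lam, p=0.0), hard))             # True
```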
4. Generalized iterated shrinkage algorithm
With the proposed GST in Eq. (28), we can readily derive a generalized iterated shrinkage algorithm (GISA) for solving the $\ell_p$-norm non-convex sparse coding problem. GST can also be easily applied to image restoration.
4.1. GISA
The proposed GISA is an iterative algorithm; each iteration involves a gradient descent step based on $A$ and $y$, followed by a generalized shrinkage/thresholding step:
$$ x^{(k+1)} = T_p^{GST}\!\left(x^{(k)} - \|A\|^{-2} A^T \left(A x^{(k)} - y\right);\; \|A\|^{-2}\lambda\right), \tag{35} $$
where $\|A\|$ denotes the spectral norm of the matrix $A$. The proposed GISA algorithm is summarized in Algorithm 2.

Algorithm 2 (GISA): $x = \mathrm{GISA}(y, \lambda, p, J)$
Input: $y$, $\lambda$, $p$, $J$
1. Initialize $x^{(0)}$, $t = \|A\|^{-2}$.
2. while not converged do
3.   $x^{(k+0.5)} = x^{(k)} - t A^T (A x^{(k)} - y)$
4.   $x^{(k+1)} = \mathrm{GST}(x^{(k+0.5)}, t\lambda, p, J)$
5. end while
6. $x = x^{(k)}$
Output: $x$
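A direct Python transcription of Algorithm 2, reusing the `gst` sketch above (the zero initialization and the fixed iteration count, standing in for the unspecified convergence test, are our own choices):

```python
import numpy as np

def gisa(A, y, lam, p, J=3, n_iter=500):
    """GISA (Algorithm 2): min_x 0.5*||y - Ax||_2^2 + lam*||x||_p^p."""
    t = 1.0 / np.linalg.norm(A, 2)**2        # step size t = ||A||^{-2}
    x = np.zeros(A.shape[1])                 # x^(0) = 0 (our choice)
    for _ in range(n_iter):
        x_half = x - t * A.T @ (A @ x - y)   # line 3: gradient step
        x = gst(x_half, t * lam, p, J)       # line 4: GST step, Eq. (35)
    return x
```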
Actually, GISA is a generalization of the iterative shrinkage/thresholding (IST) method [15], and an instance of the iterative thresholding method (ITM) [33]. In [33], She proved that, for any thresholding function $\Theta(y; \lambda)$ defined for $-\infty < y < +\infty$ and $0 \le \lambda < +\infty$, if $\Theta(y; \lambda)$ satisfies the following properties:
i) $\Theta(-y; \lambda) = -\Theta(y; \lambda)$;
ii) $\Theta(y; \lambda) \le \Theta(y'; \lambda)$ if $y \le y'$;
iii) $\lim_{y \to \infty} \Theta(y; \lambda) = \infty$;
iv) $0 \le \Theta(y; \lambda) \le y$ for $0 \le y < \infty$;
then the ITM method converges to a stationary point. One can easily see that the GST function in Eq. (28) satisfies all four properties, so the convergence of GISA is guaranteed. From Theorems 1 and 2, one can further see that GISA converges to the global optimum when $A$ is a positive diagonal matrix. When $A$ is unitary, by exploiting the unitary-invariant property of the $\ell_2$-norm, GISA also converges to the optimal solution. Moreover, if $p = 1$, GISA degenerates to IST, and converges to the global minimum.
Besides, several algorithms, e.g., two-step IST (TwIST) [5] and accelerated proximal gradient (APG) [4], have been proposed to speed up IST. By substituting the soft-thresholding function with GST, these algorithms can also be used for $\ell_p$-norm non-convex sparse coding.
4.2. Sparse gradient based deconvolution using GST
One important application of sparse coding is image restoration. As an example, in this subsection we apply the proposed GST to image deconvolution. Let $x$ be the original image. In image deconvolution, the degraded image $y$ is modeled as first convolving $x$ with a blur kernel $k$ and then adding white Gaussian noise:
$$ y = x \otimes k + e, \tag{36} $$
where $\otimes$ denotes the convolution operator, and $e$ is additive white Gaussian noise with variance $\sigma^2$.
A typical image deconvolution model usually includes a fidelity term and a regularization term, where the fidelity term is modeled based on the degradation process, and the regularization term is modeled based on image priors. Recent studies on natural image statistics have shown that the marginal distributions of filtering responses can be modeled as hyper-Laplacian with $0 < p < 1$ [25, 28, 35], which has been adopted in many low level vision problems [13, 36]. Using the sparse gradient based image prior, the image deconvolution model can be formulated as
$$ \min_x \frac{1}{2}\|x \otimes k - y\|_2^2 + \lambda \|Dx\|_p^p, \tag{37} $$
where $\lambda$ is the regularization parameter, $D = [D_h, D_v]$ denotes the gradient operator, and $D_h$ and $D_v$ are the horizontal and vertical gradient operators, respectively.
Based on [25, 37], we introduce a new variable $d = Dx$ and reformulate the problem in (37) as
$$ \min_{x,d} \frac{1}{2}\|x \otimes k - y\|_2^2 + \frac{\eta\lambda}{2}\|Dx - d\|_2^2 + \lambda\|d\|_p^p. \tag{38} $$
When $\eta \to \infty$, the problem in Eq. (38) has the same solution as the problem in Eq. (37).
We adopt an alternating minimization strategy to solve the problem in Eq. (38). In each iteration, given a fixed $d$, $x$ is obtained by solving the following subproblem
$$ \min_x \frac{1}{2}\|x \otimes k - y\|_2^2 + \frac{\eta\lambda}{2}\|Dx - d\|_2^2. \tag{39} $$
Actually, the solution for $x$ can be written in closed form [25, 37]:
$$ x = \mathcal{F}^{-1}\!\left( \frac{ \mathcal{F}\!\left(\eta\lambda D^T d\right) + \overline{\mathcal{F}(k)} \circ \mathcal{F}(y) }{ \eta\lambda\left( \mathcal{F}(D_h^T D_h) + \mathcal{F}(D_v^T D_v) \right) + \overline{\mathcal{F}(k)} \circ \mathcal{F}(k) } \right), \tag{40} $$
where $\mathcal{F}$ denotes the 2D Fourier transform, $\mathcal{F}^{-1}$ the 2D inverse Fourier transform, $\overline{\,\cdot\,}$ the complex conjugate, $\circ$ component-wise multiplication, and the division is also performed component-wise.
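The update in Eq. (40) costs a handful of FFTs. Below is a minimal sketch assuming periodic boundary conditions (so convolutions diagonalize under the FFT), with the blur kernel and difference filters zero-padded to the image size; `solve_x` and this padding convention are our own choices, not the paper's.

```python
import numpy as np

def solve_x(d_h, d_v, y, k_pad, eta, lam):
    """x-subproblem of Eq. (39) solved in closed form via Eq. (40)."""
    # forward-difference filters (periodic), matching d_ref = Dx below
    Dh = np.zeros_like(y); Dh[0, 0], Dh[0, -1] = -1.0, 1.0
    Dv = np.zeros_like(y); Dv[0, 0], Dv[-1, 0] = -1.0, 1.0
    FDh, FDv, Fk = np.fft.fft2(Dh), np.fft.fft2(Dv), np.fft.fft2(k_pad)
    # numerator: F(eta*lam*D^T d) + conj(F(k)) o F(y)
    num = eta * lam * (np.conj(FDh) * np.fft.fft2(d_h)
                       + np.conj(FDv) * np.fft.fft2(d_v)) \
          + np.conj(Fk) * np.fft.fft2(y)
    # denominator: eta*lam*(F(Dh^T Dh) + F(Dv^T Dv)) + conj(F(k)) o F(k)
    den = eta * lam * (np.abs(FDh)**2 + np.abs(FDv)**2) + np.abs(Fk)**2
    return np.real(np.fft.ifft2(num / den))
```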
Given a fixed $x$, let $d^{ref} = Dx$; then $d$ is obtained by solving the following subproblem:
$$ \min_d \frac{\eta}{2}\left\|d - d^{ref}\right\|_2^2 + \|d\|_p^p. \tag{41} $$
Using GST, the solution for each $d_i$ can be written as
$$ d_i = T_p^{GST}\!\left(d_i^{ref};\; 1/\eta\right). \tag{42} $$
Finally, we summarize the GST based image deconvolution algorithm in Algorithm 3. Algorithm 3 is similar to the algorithms in [25, 37], but Wang and Yin [37] only studied the Laplacian prior ($p = 1$), and Krishnan and Fergus [25] used look-up tables (LUT) to solve the subproblem in Eq. (41). Here we empirically choose $J = 1$, making our algorithm very efficient for sparse gradient based image deconvolution.
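Since the Algorithm 3 box itself is not reproduced in this copy, here is a sketch of the alternating scheme it summarizes, built from the `gst` and `solve_x` sketches above; the continuation schedule on $\eta$ and the circular-gradient convention are our own illustrative choices.

```python
import numpy as np

def deconv_lp(y, k_pad, lam, p, J=1, etas=(1.0, 4.0, 16.0, 64.0, 256.0)):
    """Sparse gradient deconvolution, Eq. (38), by alternating minimization."""
    x = y.copy()
    for eta in etas:                             # continuation: eta -> infinity
        # d-step, Eqs. (41)-(42): element-wise GST with parameter 1/eta
        d_h = gst(np.roll(x, -1, axis=1) - x, 1.0 / eta, p, J)
        d_v = gst(np.roll(x, -1, axis=0) - x, 1.0 / eta, p, J)
        # x-step, Eqs. (39)-(40): closed-form FFT update
        x = solve_x(d_h, d_v, y, k_pad, eta, lam)
    return x
```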
5. Experimental results
In this section, we evaluate the proposed GISA on two representative vision applications: image deconvolution and face recognition. In the image deconvolution experiments, we compare GISA with four state-of-the-art algorithms for $\ell_p$-norm non-convex sparse coding: LUT, IRLS, IRL1, and ITM-$\ell_p$. The results show that GISA is as accurate as LUT but more efficient, and that it is more accurate and more efficient than IRLS, IRL1, and ITM-$\ell_p$.
