Inexact Spectral Projected Gradient Methods on Convex Sets
Ernesto G. Birgin
José Mario Martínez
Marcos Raydan
March 26, 2003

Department of Computer Science, Institute of Mathematics and Statistics, University of São Paulo, Rua do Matão 1010, Cidade Universitária, 05508-090 São Paulo, SP - Brazil (egbirgin@ime.usp.br). Sponsored by FAPESP (Grants 01/04597-4 and 02/00094-0), CNPq (Grant 300151/00-4) and Pronex.
Departamento de Matemática Aplicada, IMECC-UNICAMP, CP 6065, 13081-970 Campinas SP, Brazil (martinez@ime.unicamp.br). Sponsored by FAPESP (Grant 01/04597-4), CNPq and FAEP-UNICAMP.
Departamento de Computación, Facultad de Ciencias, Universidad Central de Venezuela, Ap. 47002, Caracas 1041-A, Venezuela (mraydan@reacciun.ve). Sponsored by the Center of Scientific Computing at UCV.
Abstract

A new method is introduced for large-scale convex constrained optimization. The general model algorithm involves, at each iteration, the approximate minimization of a convex quadratic on the feasible set of the original problem and global convergence is obtained by means of nonmonotone line searches. A specific algorithm, the Inexact Spectral Projected Gradient method (ISPG), is implemented using inexact projections computed by Dykstra’s alternating projection method and generates interior iterates. The ISPG method is a generalization of the Spectral Projected Gradient method (SPG), but can be used when projections are difficult to compute. Numerical results for constrained least-squares rectangular matrix problems are presented.

Key words: Convex constrained optimization, projected gradient, nonmonotone line search, spectral gradient, Dykstra’s algorithm.

AMS Subject Classification: 49M07, 49M10, 65K, 90C06, 90C20.
1 Introduction

We consider the problem

    Minimize f(x)  subject to  x ∈ Ω,                                   (1)

where Ω is a closed convex set in IR^n. Throughout this paper we assume that f is defined and has continuous partial derivatives on an open set that contains Ω.

The Spectral Projected Gradient (SPG) method [6, 7] was recently proposed for solving (1), especially for large-scale problems since the storage requirements are minimal. This method has proved to be effective for very large-scale convex programming problems.
In [7] a family of location problems was described with a variable number of variables and constraints. The SPG method was able to solve problems of this family with up to 96254 variables and up to 578648 constraints in very few seconds of computer time. The computer code that implements SPG and produces the mentioned results is published [7] and available. More recently, in [5] an active-set method which uses SPG to leave the faces was introduced, and bound-constrained problems with up to 10^7 variables were solved.

The SPG method is related to the practical version of Bertsekas [3] of the classical gradient projection method of Goldstein, Levitin and Polyak [21, 25]. However, some critical differences make this method much more efficient than its gradient-projection predecessors. The main point is that the first trial step at each iteration is taken using the spectral steplength (also known as the Barzilai-Borwein choice) introduced in [2] and later analyzed in [9, 19, 27], among others. The spectral step is a Rayleigh quotient related to an average Hessian matrix. For a review containing the more recent advances on this special choice of steplength see [20]. The second improvement over traditional gradient projection methods is that a nonmonotone line search must be used [10, 22]. This feature seems to be essential to preserve the nice and nonmonotone behaviour of the iterates produced by single spectral gradient steps.
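
To make the spectral steplength concrete, here is a minimal Python sketch of the safeguarded Barzilai-Borwein choice as used by SPG-type methods; the safeguard bounds lam_min and lam_max are illustrative values, and the fallback to lam_max when s^T y ≤ 0 follows the spectral choice described later in the paper for the ISPG implementation.

    import numpy as np

    def spectral_steplength(x, x_prev, g, g_prev, lam_min=1e-30, lam_max=1e30):
        # Barzilai-Borwein (spectral) steplength: s^T s / s^T y, the reciprocal of a
        # Rayleigh quotient of an average Hessian, safeguarded to [lam_min, lam_max].
        s = x - x_prev                 # difference of consecutive iterates
        y = g - g_prev                 # difference of consecutive gradients
        sty = float(s.dot(y))
        if sty <= 0.0:
            return lam_max             # no positive curvature detected: largest allowed value
        return min(lam_max, max(lam_min, float(s.dot(s)) / sty))

The first trial point of an SPG iteration is then obtained by projecting x_k - λ_k g_k onto Ω, which is precisely the step that this paper allows to be computed inexactly.
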
The reported efficiency of the SPG method in very large problems motivated us to introduce the inexact-projection version of the method. In fact, the main drawback of the SPG method is that it requires the exact projection of an arbitrary point of IR^n onto Ω at every iteration.

Projecting onto Ω is a difficult problem unless Ω is an easy set (i.e. it is easy to project onto it) such as a box, an affine subspace, a ball, etc. However, for many important applications, Ω is not an easy set and the projection can only be achieved inexactly. For example, if Ω is the intersection of a finite collection of closed and convex easy sets, cycles of alternating projection methods could be used. This sequence of cycles could be stopped prematurely, leading to an inexact iterative scheme. In this work we are mainly concerned with extending the machinery developed in [6, 7] to the more general case in which the projection onto Ω can only be achieved inexactly.
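
To make the idea of an inexact projection by truncated alternating projections concrete, the following is a minimal Python sketch of Dykstra's alternating projection method for Ω = Ω_1 ∩ · · · ∩ Ω_p, assuming each easy set Ω_i comes with its own exact Euclidean projection. Stopping after a fixed number of cycles is only an illustrative truncation rule (it is this truncation that makes the projection inexact), and the box and ball in the usage example are hypothetical easy sets, not taken from the paper.

    import numpy as np

    def dykstra(y0, projections, cycles=50):
        # Dykstra's alternating projection method: approximates the projection
        # of y0 onto the intersection of closed convex sets, each represented
        # by its exact projection operator in `projections`.
        x = np.asarray(y0, dtype=float).copy()
        increments = [np.zeros_like(x) for _ in projections]  # one correction per set
        for _ in range(cycles):
            for i, proj in enumerate(projections):
                z = x + increments[i]      # add back the correction for set i
                x = proj(z)                # project onto the i-th easy set
                increments[i] = z - x      # update the correction
        return x

    # Hypothetical easy sets: the box [0, 1]^n and the Euclidean ball of radius 0.8.
    project_box = lambda z: np.clip(z, 0.0, 1.0)
    project_ball = lambda z: z if np.linalg.norm(z) <= 0.8 else 0.8 * z / np.linalg.norm(z)

    x = dykstra(np.array([2.0, -1.0]), [project_box, project_ball], cycles=100)

Stopping the outer loop early (or under any other criterion) yields an approximation of the projection rather than the exact projection, which is the situation the ISPG method is designed to handle.
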
In Section 2 we define a general model algorithm and prove global convergence. In Section 3 we introduce the ISPG method and we describe the use of Dykstra’s alternating projection method for obtaining inexact projections onto closed and convex sets. In Section 4 we present numerical experiments and in Section 5 we draw some conclusions.

2 A general model algorithm and its global convergence

We say that a point x is stationary, for problem (1), if

    g(x)^T d ≥ 0                                                        (2)

for all d ∈ IR^n such that x + d ∈ Ω.

In this work ‖·‖ denotes the 2-norm of vectors and matrices, although in some cases it can be replaced by an arbitrary norm. We also denote g(x) = ∇f(x) and IN = {0, 1, 2, . . .}.

Let B be the set of n × n positive definite matrices such that ‖B‖ ≤ L and ‖B^{-1}‖ ≤ L. Therefore, B is a compact set of IR^{n×n}. In the spectral gradient approach, the matrices will be diagonal. However, the algorithm and theorem that we present below are quite general. The matrices B_k may be thought of as defining a sequence of different metrics in IR^n according to which we perform projections. For this reason, we give the name “Inexact Variable Metric” to the method introduced below.
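
For concreteness, the diagonal choice used later by the ISPG instance of this method (Section 3 of the paper) is the spectral one,

    B_k = (1/λ_k^spg) I,   where   λ_k^spg = min{λ_max, max{λ_min, s_k^T s_k / s_k^T y_k}}  if s_k^T y_k > 0,  and  λ_k^spg = λ_max otherwise,

with s_k = x_k - x_{k-1}, y_k = g_k - g_{k-1} and safeguards 0 < λ_min < λ_max, so that Q_k(d) = ‖d‖^2/(2λ_k^spg) + g_k^T d and minimizing Q_k over the feasible set amounts to projecting x_k - λ_k^spg g_k onto Ω with respect to the Euclidean norm.
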
Algorithm 2.1: Inexact Variable Metric Method

Assume η ∈ (0, 1], γ ∈ (0, 1), 0 < σ_1 < σ_2 < 1, M a positive integer. Let x_0 be an arbitrary initial point. We denote g_k = g(x_k) for all k ∈ IN. Given x_k ∈ Ω, B_k ∈ B, the steps of the kth iteration of the algorithm are:

Step 1. Compute the search direction
Consider the subproblem

    Minimize Q_k(d)  subject to  x_k + d ∈ Ω,                           (3)

where

    Q_k(d) = (1/2) d^T B_k d + g_k^T d.

Let d̄_k be the minimizer of (3). (This minimizer exists and is unique by the strict convexity of the subproblem (3), but we will see later that we do not need to compute it.)
Let d_k be such that x_k + d_k ∈ Ω and

    Q_k(d_k) ≤ η Q_k(d̄_k).                                              (4)

If d_k = 0, stop the execution of the algorithm declaring that x_k is a stationary point.

Step 2. Compute the steplength
Set α ← 1 and f_max = max{f(x_{k-j+1}) | 1 ≤ j ≤ min{k + 1, M}}.
If

    f(x_k + α d_k) ≤ f_max + γ α g_k^T d_k,                              (5)

set α_k = α, x_{k+1} = x_k + α_k d_k and finish the iteration. Otherwise, choose α_new ∈ [σ_1 α, σ_2 α], set α ← α_new and repeat test (5).

Remark. In the definition of Algorithm 2.1 the possibility η = 1 corresponds to the case in which the subproblem (3) is solved exactly.
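
The following Python sketch mirrors the structure of Algorithm 2.1. The routine solve_subproblem is a placeholder that the caller must supply: it is assumed to return a feasible direction d_k satisfying condition (4) (for instance via truncated Dykstra cycles, as in the ISPG instance). The values γ = 10^-4 and M = 10, and the halving rule α ← α/2 (one admissible choice inside [σ_1 α, σ_2 α]), are illustrative and not prescribed by the algorithm.

    import numpy as np

    def inexact_variable_metric(f, grad, solve_subproblem, x0,
                                gamma=1e-4, M=10, max_iter=1000, tol=0.0):
        # Skeleton of Algorithm 2.1 (Inexact Variable Metric Method).
        x = np.asarray(x0, dtype=float)
        f_hist = [f(x)]                          # previous function values (nonmonotone memory)
        for k in range(max_iter):
            g = grad(x)
            d = solve_subproblem(x, g, k)        # Step 1: feasible d_k with Q_k(d_k) <= eta * Q_k(d_bar_k)
            if np.linalg.norm(d) <= tol:         # d_k = 0: x_k is declared stationary
                return x
            f_max = max(f_hist[-M:])             # nonmonotone reference value f_max
            gtd = float(g.dot(d))
            alpha = 1.0
            while f(x + alpha * d) > f_max + gamma * alpha * gtd:   # test (5)
                alpha *= 0.5                     # backtrack inside [sigma_1*alpha, sigma_2*alpha]
            x = x + alpha * d                    # Step 2: accept x_{k+1} = x_k + alpha_k d_k
            f_hist.append(f(x))
        return x
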
Lemma 2.1. The algorithm is well defined.

Proof. Since Q_k is strictly convex and the domain of (3) is convex, the problem (3) has a unique solution d̄_k. If d̄_k = 0 then Q_k(d̄_k) = 0. Since d_k is a feasible point of (3), and, by (4), Q_k(d_k) ≤ 0, it turns out that d_k = d̄_k. Therefore, d_k = 0 and the algorithm stops.
If d̄_k ≠ 0, then, since Q_k(0) = 0 and the solution of (3) is unique, it follows that Q_k(d̄_k) < 0. Then, by (4), Q_k(d_k) < 0. Since Q_k is convex and Q_k(0) = 0, it follows that d_k is a descent direction for Q_k, therefore, g_k^T d_k < 0. So, for α > 0 small enough,

    f(x_k + α d_k) ≤ f(x_k) + γ α g_k^T d_k.

Therefore, the condition (5) must be satisfied if α is small enough. This completes the proof. □

Theorem 2.1. Assume that the level set {x | f(x) ≤ f(x_0)} is bounded. Then, either the algorithm stops at some stationary point x_k, or every limit point of the generated sequence is stationary.

The proof of Theorem 2.1 is based on the following lemmas.
Lemma 2.2. Assume that the sequence generated by Algorithm 2.1 stops at x_k. Then, x_k is stationary.

Proof. If the algorithm stops at some x_k, we have that d_k = 0. Therefore, Q_k(d_k) = 0. Then, by (4), Q_k(d̄_k) = 0. So, d̄_k = 0. Therefore, for all d ∈ IR^n such that x_k + d ∈ Ω we have g_k^T d ≥ 0. Thus, x_k is a stationary point. □

For the remaining results of this section we assume that the algorithm does not stop. So, infinitely many iterates {x_k}_{k∈IN} are generated and, by (5), f(x_k) ≤ f(x_0) for all k ∈ IN. Thus, under the hypothesis of Theorem 2.1, the sequence {x_k}_{k∈IN} is bounded.

Lemma 2.3. Assume that {x_k}_{k∈IN} is a sequence generated by Algorithm 2.1. Define, for all j = 1, 2, 3, . . .,

    V_j = max{f(x_{jM-M+1}), f(x_{jM-M+2}), . . . , f(x_{jM})},

and ν(j) ∈ {jM-M+1, jM-M+2, . . . , jM} such that

    f(x_{ν(j)}) = V_j.

Then,

    V_{j+1} ≤ V_j + γ α_{ν(j+1)-1} g_{ν(j+1)-1}^T d_{ν(j+1)-1}           (6)

for all j = 1, 2, 3, . . ..

Proof. We will prove by induction on ℓ that for all ℓ = 1, 2, . . . , M and for all j = 1, 2, 3, . . .,

    f(x_{jM+ℓ}) ≤ V_j + γ α_{jM+ℓ-1} g_{jM+ℓ-1}^T d_{jM+ℓ-1} < V_j.      (7)

By (5) we have that, for all j ∈ IN,

    f(x_{jM+1}) ≤ V_j + γ α_{jM} g_{jM}^T d_{jM} < V_j,

so (7) holds for ℓ = 1.

Assume, as the inductive hypothesis, that

    f(x_{jM+ℓ'}) ≤ V_j + γ α_{jM+ℓ'-1} g_{jM+ℓ'-1}^T d_{jM+ℓ'-1} < V_j    (8)

for ℓ' = 1, . . . , ℓ.
Now, by (5), and the definition of V_j, we have that

    f(x_{jM+ℓ+1}) ≤ max_{1≤t≤M} {f(x_{jM+ℓ+1-t})} + γ α_{jM+ℓ} g_{jM+ℓ}^T d_{jM+ℓ}
                  = max{f(x_{(j-1)M+ℓ+1}), . . . , f(x_{jM+ℓ})} + γ α_{jM+ℓ} g_{jM+ℓ}^T d_{jM+ℓ}
                  ≤ max{V_j, f(x_{jM+1}), . . . , f(x_{jM+ℓ})} + γ α_{jM+ℓ} g_{jM+ℓ}^T d_{jM+ℓ}.

But, by the inductive hypothesis,

    max{f(x_{jM+1}), . . . , f(x_{jM+ℓ})} < V_j,

so,

    f(x_{jM+ℓ+1}) ≤ V_j + γ α_{jM+ℓ} g_{jM+ℓ}^T d_{jM+ℓ} < V_j.

Therefore, the inductive proof is complete and, so, (7) is proved. Since ν(j + 1) = jM + ℓ for some ℓ ∈ {1, . . . , M}, this implies the desired result. □

From now on, we define

    K = {ν(1) - 1, ν(2) - 1, ν(3) - 1, . . .},

where {ν(j)} is the sequence of indices defined in Lemma 2.3. Clearly,

    ν(j) < ν(j + 1) ≤ ν(j) + 2M                                          (9)

for all j = 1, 2, 3, . . ..

Lemma 2.4.

    lim_{k∈K} α_k Q_k(d̄_k) = 0.

Proof. By (6), since f is continuous and bounded below,

    lim_{k∈K} α_k g_k^T d_k = 0.                                         (10)

But, by (4),

    0 > Q_k(d_k) = (1/2) d_k^T B_k d_k + g_k^T d_k ≥ g_k^T d_k   for all k ∈ IN.