THE TORTOISE AND THE HARE RESTART GMRES

MARK EMBREE

Abstract. When solving large nonsymmetric systems of linear equations with the restarted GMRES algorithm, one is inclined to select a relatively large restart parameter in the hope of mimicking the full GMRES process. Surprisingly, cases exist where small values of the restart parameter yield convergence in fewer iterations than larger values. Here, two simple examples are presented where GMRES(1) converges exactly in three iterations, while GMRES(2) stagnates. One of these examples reveals that GMRES(1) convergence can be extremely sensitive to small changes in the initial residual.

Key words. Restarted GMRES, Krylov subspace methods.

AMS subject classifications. 65F10, 37N30
1. Introduction. GMRES is an iterative method for solving large nonsymmetric systems of linear equations, $Ax = b$ [8]. Throughout science and engineering, this algorithm and its variants routinely solve problems with millions of degrees of freedom. Its popularity is rooted in an optimality condition: At the $k$th iteration, GMRES computes the solution estimate $x_k$ that minimizes the Euclidean norm of the residual $r_k = Ax_k - b$ over a subspace of dimension $k$,
\[
\|r_k\| = \min_{\substack{p \in P_k \\ p(0) = 1}} \|p(A)\, r_0\|, \tag{1.1}
\]
where $P_k$ denotes those polynomials with degree not exceeding $k$, and $r_0 = b - Ax_0$ is the initial residual. As each iteration enlarges the minimizing subspace, the residual norm decreases monotonically.
GMRES optimality comes at a cost, however, since each iteration demands both more arithmetic and memory than the one before it. A standard work-around is to restart the process after some fixed number of iterations, $m$. The resulting algorithm, GMRES($m$), uses the approximate solution $x_m$ as the initial guess for a new run of GMRES, continuing this process until convergence. The global optimality of the original algorithm is lost, so although the residual norms remain monotonic, the restarted process can stagnate with a non-zero residual, failing to ever converge [8]. Since GMRES($m$) enforces local optimality on $m$-dimensional spaces, one anticipates that increasing $m$ will yield convergence in fewer iterations. Many practical examples confirm this intuition.
We denote the $k$th residual of GMRES($m$) by $r^{(m)}_k$. To be precise, one cycle between restarts of GMRES($m$) is counted as $m$ individual iterations. Conventionally, then, one expects $\|r^{(m)}_k\| \le \|r^{(\ell)}_k\|$ for $\ell < m$. Indeed, this must be true when $k \le m$.
Surprisingly, increasing the restart parameter sometimes leads to slower convergence: $\|r^{(m)}_k\| > \|r^{(\ell)}_k\|$ for $\ell < m < k$. The author encountered this phenomenon while solving a discretized convection-diffusion equation described in [4]. In unpublished experiments, de Sturler [1] and Walker and Watson [11] observed similar behavior arising in practical applications. One wonders, how much smaller than $\|r^{(m)}_k\|$ might $\|r^{(\ell)}_k\|$ be? The smallest possible cases compare GMRES(1) to GMRES(2) for 3-by-3 matrices. Eiermann, Ernst, and Schneider present such an example for which $\|r^{(1)}_4\| = \|r^{(2)}_4\| = 0.2154\ldots$ [2, pp. 284-285]. Otherwise, the phenomenon we describe has apparently received little attention in the literature.

*Oxford University Computing Laboratory, Wolfson Building, Parks Road, Oxford OX1 3QD, United Kingdom (mark.embree@comlab.ox.ac.uk). Supported by UK Engineering and Physical Sciences Research Council Grant GR/M12414.
The purpose of this article is twofold. First, we describe a pair of extreme examples where GMRES(1) converges exactly at the third iteration, while GMRES(2) seems to never converge. The second example leads to our second point: Small perturbations to the initial residual can dramatically alter the convergence behavior of GMRES(1).
2. First Example. Consider using restarted GMRES to solve $Ax = b$ for
\[
A = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 3 \\ 0 & 0 & 1 \end{pmatrix},
\qquad
b = \begin{pmatrix} 2 \\ -4 \\ 1 \end{pmatrix}. \tag{2.1}
\]
Taking $x_0 = 0$ yields the initial residual $r_0 = b$. Using the fact that $A$ and $r_0$ are real, we can derive explicit formulas for GMRES(1) and GMRES(2) directly from the GMRES optimality condition (1.1). The recurrence for GMRES(1),
\[
r^{(1)}_{k+1} = r^{(1)}_k - \frac{r^{(1)T}_k A\, r^{(1)}_k}{r^{(1)T}_k A^T A\, r^{(1)}_k}\, A\, r^{(1)}_k, \tag{2.2}
\]
was studied as early as the 1950s [3, §71], [7]. For the $A$ and $r_0 = b$ defined in (2.1), this iteration converges exactly at the third step:
\[
r^{(1)}_1 = \begin{pmatrix} 3 \\ -3 \\ 0 \end{pmatrix},
\qquad
r^{(1)}_2 = \begin{pmatrix} 3 \\ 0 \\ 0 \end{pmatrix},
\qquad
r^{(1)}_3 = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.
\]
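As a concrete check, the recurrence (2.2) is easy to run numerically. The following NumPy script (a sketch added here, not part of the original paper) applies the optimal one-dimensional step to the example (2.1), taking $r_0 = b = (2,-4,1)^T$, and reproduces the three residuals above.

```python
import numpy as np

# Example (2.1): GMRES(1) as the locally optimal residual-minimizing step (2.2).
A = np.array([[1.0, 1.0, 1.0],
              [0.0, 1.0, 3.0],
              [0.0, 0.0, 1.0]])
r = np.array([2.0, -4.0, 1.0])   # r_0 = b, with x_0 = 0

history = [r]
for k in range(3):
    Ar = A @ r
    # Optimal step length from (2.2): minimizes ||r - alpha * A r|| over alpha.
    alpha = (r @ Ar) / (Ar @ Ar)
    r = r - alpha * Ar
    history.append(r)
    print(k + 1, r)
```

Each step uses only one matrix-vector product with $A$ (plus one with $A^T$ folded into the inner products), which is what makes GMRES(1) so cheap per iteration.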
Expressions for one GMRES(2) cycle can likewise be derived using elementary calculus. The updated residual takes the form $r^{(2)}_{k+2} = p(A)\, r^{(2)}_k$, where $p(z) = 1 + \alpha z + \beta z^2$ is a quadratic whose coefficients $\alpha = \alpha(A, r^{(2)}_k)$ and $\beta = \beta(A, r^{(2)}_k)$ are given by
\[
\alpha = \frac{(r^{(2)T}_k AAr^{(2)}_k)(r^{(2)T}_k A^TAAr^{(2)}_k) - (r^{(2)T}_k Ar^{(2)}_k)(r^{(2)T}_k A^TA^TAAr^{(2)}_k)}
{(r^{(2)T}_k A^TAr^{(2)}_k)(r^{(2)T}_k A^TA^TAAr^{(2)}_k) - (r^{(2)T}_k A^TAAr^{(2)}_k)(r^{(2)T}_k A^TAAr^{(2)}_k)},
\]
\[
\beta = \frac{(r^{(2)T}_k Ar^{(2)}_k)(r^{(2)T}_k A^TAAr^{(2)}_k) - (r^{(2)T}_k AAr^{(2)}_k)(r^{(2)T}_k A^TAr^{(2)}_k)}
{(r^{(2)T}_k A^TAr^{(2)}_k)(r^{(2)T}_k A^TA^TAAr^{(2)}_k) - (r^{(2)T}_k A^TAAr^{(2)}_k)(r^{(2)T}_k A^TAAr^{(2)}_k)}.
\]
Executing GMRES(2) on the matrix and right-hand side (2.1) reveals
\[
r^{(2)}_1 = \begin{pmatrix} 3 \\ -3 \\ 0 \end{pmatrix},
\quad
r^{(2)}_2 = \frac{1}{2}\begin{pmatrix} 3 \\ 0 \\ 3 \end{pmatrix},
\quad
r^{(2)}_3 = \frac{1}{28}\begin{pmatrix} 24 \\ -27 \\ 33 \end{pmatrix},
\quad
r^{(2)}_4 = \frac{1}{122}\begin{pmatrix} 81 \\ -108 \\ 162 \end{pmatrix}.
\]
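These cycle residuals can be reproduced without the explicit $\alpha$, $\beta$ formulas: one GMRES(2) cycle is equivalent to a least-squares projection of the current residual against $\mathrm{span}\{Ar, A^2r\}$. The sketch below (added here for verification; not from the original paper) runs two cycles on example (2.1).

```python
import numpy as np

A = np.array([[1.0, 1.0, 1.0],
              [0.0, 1.0, 3.0],
              [0.0, 0.0, 1.0]])

def gmres2_cycle(A, r):
    """One GMRES(2) cycle: minimize ||r + a*A@r + b*A@A@r|| over scalars a, b.

    This least-squares solve yields the same a, b as the alpha/beta
    quotients above (Cramer's rule applied to the normal equations).
    """
    K = np.column_stack([A @ r, A @ A @ r])
    coeffs, *_ = np.linalg.lstsq(K, -r, rcond=None)
    return r + K @ coeffs

r2 = gmres2_cycle(A, np.array([2.0, -4.0, 1.0]))   # end of cycle 1: r_2^(2)
r4 = gmres2_cycle(A, r2)                           # end of cycle 2: r_4^(2)
print(r2)
print(122 * r4)
```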
The inferiority of GMRES(2) continues well beyond the fourth iteration. For example:
\[
\begin{array}{cl}
k & \|r^{(2)}_k\|/\|r_0\| \\ \hline
5 & 0.376888290025532\ldots \\
10 & 0.376502488858910\ldots \\
15 & 0.376496927936533\ldots \\
20 & 0.376496055944867\ldots \\
25 & 0.376495995285626\ldots \\
30 & 0.376495984909087\ldots
\end{array}
\]
Fig. 1. Convergence curves for GMRES(1) and GMRES(2) applied to (2.1) with $x_0 = 0$. [Figure: $\|r^{(m)}_k\|/\|r_0\|$ versus iteration $k = 0, \ldots, 30$, logarithmic vertical scale from $10^0$ down to $10^{-15}$.]
The entire convergence curve for the first thirty iterations is shown in Figure 1, based on performing GMRES(2) in exact arithmetic using Mathematica.

The particular value of $b$ (and thus $r_0$) studied above is exceptional, as it is unusual for GMRES(1) to converge exactly in three iterations. Remarkably, though, GMRES(1) maintains superiority over GMRES(2) for a wide range of initial residuals. For this matrix $A$, GMRES(2) converges exactly in one cycle for any initial residual with zero in the third component, so we restrict attention to residuals normalized to the form $r_0 = (\alpha, \beta, 1)^T$. Figure 2 indicates that GMRES(2) makes little progress for most such residuals, while GMRES(1) converges to high accuracy for the vast majority of these $r_0$ values. The color in each plot reflects the magnitude of $\|r^{(m)}_{100}\|/\|r_0\|$: Blue indicates satisfactory convergence, while red signals little progress in one hundred iterations. (To ensure this data's fidelity, we performed these computations in both double and quadruple precision arithmetic; differences between the two were negligible.)
To gain an appreciation for the dynamics behind Figure 2, we first examine the action of a single GMRES(1) step. From (2.2) it is clear that GMRES(1) will completely stagnate only when $r_0^T A r_0 = 0$. For the matrix $A$ specified in (2.1) and $r_0 = (\alpha, \beta, 1)^T$, this condition reduces to
\[
\alpha^2 + \alpha\beta + \beta^2 + \alpha + 3\beta + 1 = 0, \tag{2.3}
\]
the equation for an oblique ellipse in the $(\alpha, \beta)$ plane.
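To see the stagnation condition in action, note that $(\alpha, \beta) = (1, -1)$ satisfies (2.3). The short script below (an illustrative check added here, not part of the original paper) confirms that for this initial residual the optimal GMRES(1) step length is exactly zero, so the iteration makes no progress at all.

```python
import numpy as np

A = np.array([[1.0, 1.0, 1.0],
              [0.0, 1.0, 3.0],
              [0.0, 0.0, 1.0]])

# (alpha, beta) = (1, -1) lies on the ellipse (2.3): 1 - 1 + 1 + 1 - 3 + 1 = 0.
alpha, beta = 1.0, -1.0
r0 = np.array([alpha, beta, 1.0])

ellipse = alpha**2 + alpha*beta + beta**2 + alpha + 3*beta + 1
Ar = A @ r0
step = (r0 @ Ar) / (Ar @ Ar)    # optimal step length in (2.2)
print(ellipse, step)            # both are exactly zero: complete stagnation
```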
Now writing $r^{(1)}_k = (\alpha, \beta, 1)^T$, consider the map $r^{(1)}_k \mapsto s^{(1)}_{k+1}$ that projects $r^{(1)}_{k+1}$ into the $(\alpha, \beta)$ plane,
\[
s^{(1)}_{k+1} = \bigl((r^{(1)}_{k+1})_3\bigr)^{-1}
\begin{pmatrix} (r^{(1)}_{k+1})_1 \\ (r^{(1)}_{k+1})_2 \end{pmatrix},
\]
Fig. 2. Convergence of GMRES(1) (left) and GMRES(2) (right) for the matrix in (2.1) over a range of initial residuals of the form $r_0 = (\alpha, \beta, 1)^T$. The color indicates $\|r^{(m)}_{100}\|/\|r_0\|$ on a logarithmic scale: blue regions correspond to initial residuals that converge satisfactorily, while the red regions show residuals that stagnate or converge very slowly. [Plots: $\alpha, \beta \in [-10, 10]$; color scale from $10^{-15}$ to $10^0$.]
where $(r^{(1)}_{k+1})_j$ denotes the $j$th entry of $r^{(1)}_{k+1}$, which itself is derived from $r^{(1)}_k$ via (2.2).
For the present example, we have
\[
s^{(1)}_{k+1} =
\begin{pmatrix}
\dfrac{-\beta^3 - 4\beta^2 + 3\alpha\beta + 9\alpha - 4\beta - 1}{\alpha\beta + \beta^2 + \alpha + 5\beta + 10} \\[4mm]
\dfrac{\beta^3 + \alpha\beta^2 + 2\beta^2 - 3\alpha^2 - 2\alpha\beta - 3\alpha + \beta - 3}{\alpha\beta + \beta^2 + \alpha + 5\beta + 10}
\end{pmatrix}. \tag{2.4}
\]
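The closed form (2.4) can be verified against a directly computed GMRES(1) step. The sketch below (our own numerical check, not from the original paper) compares the two at an arbitrary point $(\alpha, \beta) = (0.7, -2.3)$, where the denominator of (2.4) is nonzero.

```python
import numpy as np

A = np.array([[1.0, 1.0, 1.0],
              [0.0, 1.0, 3.0],
              [0.0, 0.0, 1.0]])

def s_map(alpha, beta):
    """The projected GMRES(1) map (2.4)."""
    den = alpha*beta + beta**2 + alpha + 5*beta + 10
    s1 = (-beta**3 - 4*beta**2 + 3*alpha*beta + 9*alpha - 4*beta - 1) / den
    s2 = (beta**3 + alpha*beta**2 + 2*beta**2
          - 3*alpha**2 - 2*alpha*beta - 3*alpha + beta - 3) / den
    return np.array([s1, s2])

def gmres1_step(r):
    """One step of the recurrence (2.2)."""
    Ar = A @ r
    return r - (r @ Ar) / (Ar @ Ar) * Ar

alpha, beta = 0.7, -2.3          # an arbitrary test point
r1 = gmres1_step(np.array([alpha, beta, 1.0]))
print(s_map(alpha, beta), r1[:2] / r1[2])   # the two pairs agree
```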
We can classify the fixed points $(\alpha, \beta)$ satisfying (2.3) by investigating the Jacobian of (2.4). One of its eigenvalues is always one, while the other eigenvalue varies above and below one in magnitude. In the left plot of Figure 2, we show the stable portion of the ellipse (2.3) in black and the unstable part in white.
We can similarly analyze GMRES(2). This iteration will never progress when, in addition to the stagnation condition for GMRES(1), $r_0$ also satisfies $r_0^T AAr_0 = 0$. For the present example, this requirement implies
\[
\alpha^2 + 2\alpha\beta + \beta^2 + 5\alpha + 6\beta + 1 = 0,
\]
the equation for an oblique parabola. This curve intersects the ellipse (2.3) at two points, drawn as dots in the right plot of Figure 2, the only stagnating residuals $(\alpha, \beta, 1)^T$ for GMRES(2). We can analyze their stability as done above for GMRES(1).
The projected map for this iteration, $r^{(2)}_k \mapsto s^{(2)}_{k+2}$, takes the form
\[
s^{(2)}_{k+2} =
\begin{pmatrix}
\dfrac{3}{\beta^2 - 3\alpha + 4\beta + 9} \\[4mm]
\dfrac{-\beta - 4}{\beta^2 - 3\alpha + 4\beta + 9}
\end{pmatrix}. \tag{2.5}
\]
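The simplicity of (2.5) reflects the fact that a GMRES(2) residual for a 3-by-3 system is orthogonal to both $Ar$ and $A^2r$, hence parallel to their cross product. The sketch below (our own check, not from the original paper) confirms (2.5) against one directly computed GMRES(2) cycle at an arbitrary point.

```python
import numpy as np

A = np.array([[1.0, 1.0, 1.0],
              [0.0, 1.0, 3.0],
              [0.0, 0.0, 1.0]])

def s2_map(alpha, beta):
    """The projected GMRES(2) map (2.5)."""
    den = beta**2 - 3*alpha + 4*beta + 9
    return np.array([3.0, -(beta + 4.0)]) / den

def gmres2_cycle(r):
    """One GMRES(2) cycle via least squares over span{A r, A^2 r}."""
    K = np.column_stack([A @ r, A @ A @ r])
    coeffs, *_ = np.linalg.lstsq(K, -r, rcond=None)
    return r + K @ coeffs

alpha, beta = 1.2, 0.4           # an arbitrary test point
r2 = gmres2_cycle(np.array([alpha, beta, 1.0]))
print(s2_map(alpha, beta), r2[:2] / r2[2])   # the two pairs agree
```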
Analyzing the Jacobian for this GMRES(2) map at the pair of fixed points, we find one to be unstable (shown in black in the right plot of Figure 2) while the other is stable (shown in white). This stable fixed point is an attractor for stagnating residuals.
Fig. 3. Convergence curves for GMRES(1) and GMRES(2) applied to (3.1) with $x_0 = 0$. [Figure: $\|r^{(m)}_k\|/\|r_0\|$ versus iteration $k = 0, \ldots, 30$, logarithmic vertical scale from $10^0$ down to $10^{-15}$.]
We return briefly to the initial residual $r_0 = (2, -4, 1)^T$. After the first few iterations, the angle between $r^{(2)}_k$ and the fixed vector steadily converges to zero at the rate $0.6452\ldots$ suggested by the Jacobian's dominant eigenvalue. We conclude with high confidence that GMRES(2) never converges for this initial residual. (If one cycle of GMRES($m$) produces a residual parallel to $r_0$, then either $r^{(m)}_m = r_0$ or $r^{(m)}_m = 0$. Thus a residual can't remain fixed in the finite $(\alpha, \beta)$ plane but still converge to $0$.)
3. Second Example. The matrix $A$ in (2.1) is nondiagonalizable, and one might be tempted to blame its surprising convergence behavior on this fact. To demonstrate that nondiagonalizability is not an essential requirement, we exhibit a diagonalizable matrix with eigenvalues $\{1, 2, 3\}$ for which restarted GMRES also produces extreme behavior. Take
\[
A = \begin{pmatrix} 1 & 2 & -2 \\ 0 & 2 & 4 \\ 0 & 0 & 3 \end{pmatrix},
\qquad
b = \begin{pmatrix} 3 \\ 1 \\ 1 \end{pmatrix}, \tag{3.1}
\]
with $x_0 = 0$. Again, we construct the first few residuals. For GMRES(1),
\[
r^{(1)}_1 = \begin{pmatrix} 2 \\ -1 \\ 0 \end{pmatrix},
\qquad
r^{(1)}_2 = \begin{pmatrix} 2 \\ 0 \\ 0 \end{pmatrix},
\qquad
r^{(1)}_3 = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix},
\]
while GMRES(2) yields
\[
r^{(2)}_1 = \begin{pmatrix} 2 \\ -1 \\ 0 \end{pmatrix},
\quad
r^{(2)}_2 = \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix},
\quad
r^{(2)}_3 = \frac{1}{17}\begin{pmatrix} 8 \\ 12 \\ -8 \end{pmatrix},
\quad
r^{(2)}_4 = \frac{1}{67}\begin{pmatrix} -12 \\ 12 \\ -28 \end{pmatrix}.
\]
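As with the first example, these residuals can be checked numerically. The sketch below (added here for verification; not part of the original paper) runs three GMRES(1) steps and two GMRES(2) cycles on (3.1), reading $b = (3, 1, 1)^T$ and the $(1,3)$ entry of $A$ as $-2$.

```python
import numpy as np

# Example (3.1): diagonalizable A with eigenvalues {1, 2, 3}.
A = np.array([[1.0, 2.0, -2.0],
              [0.0, 2.0,  4.0],
              [0.0, 0.0,  3.0]])
b = np.array([3.0, 1.0, 1.0])

# GMRES(1): three locally optimal steps, recurrence (2.2).
r = b.copy()
for _ in range(3):
    Ar = A @ r
    r = r - (r @ Ar) / (Ar @ Ar) * Ar
r_gmres1 = r
print(r_gmres1)          # converges exactly: (0, 0, 0)

# GMRES(2): two full cycles via least squares over span{A r, A^2 r}.
r = b.copy()
for _ in range(2):
    K = np.column_stack([A @ r, A @ A @ r])
    c, *_ = np.linalg.lstsq(K, -r, rcond=None)
    r = r + K @ c
r_gmres2 = r
print(67 * r_gmres2)     # r_4^(2) scaled by 67: approximately (-12, 12, -28)
```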
Figure 3 illustrates the convergence curve for thirty iterations, again computed using exact arithmetic.