
Numerical Gradient Algorithms for Eigenvalue and Singular Value Calculations

John B. Moore, Robert E. Mahony, Uwe Helmke
01 Jul 1994
SIAM Journal on Matrix Analysis and Applications, Vol. 15, Iss. 3, pp. 881-902

SIAM J. MATRIX ANAL. APPL.
© 1994 Society for Industrial and Applied Mathematics
Vol. 15, No. 3, pp. 881-902, July 1994
NUMERICAL GRADIENT ALGORITHMS FOR EIGENVALUE AND
SINGULAR VALUE CALCULATIONS*
J. B. MOORE†, R. E. MAHONY†, AND U. HELMKE‡
Abstract. Recent work has shown that the algebraic question of determining the eigenvalues,
or singular values, of a matrix can be answered by solving certain continuous-time gradient flows on
matrix manifolds. To obtain computational methods based on this theory, it is reasonable to develop
algorithms that iteratively approximate the continuous-time flows. In this paper the authors propose
two algorithms, based on a double Lie-bracket equation recently studied by Brockett, that appear
to be suitable for implementation in parallel processing environments. The algorithms presented
achieve, respectively, the eigenvalue decomposition of a symmetric matrix and the singular value
decomposition of an arbitrary matrix. The algorithms have the same equilibria as the continuous-
time flows on which they are based and inherit the exponential convergence of the continuous-time
solutions.
Key words. eigenvalue decomposition, singular value decomposition, numerical gradient algo-
rithm
AMS subject classifications. 15A18, 65F10
1. Introduction. A traditional algebraic approach to determining the eigen-
value and eigenvector structure of an arbitrary matrix is the QR-algorithm. In the
early 1980s it was observed that the QR-algorithm is closely related to a continuous-
time differential equation that has become known through study of the Toda lattice.
Symes [13] and Deift, Nanda, and Tomei [6] showed that for tridiagonal real sym-
metric matrices, the QR-algorithm is a discrete-time sampling of the solution to a
continuous-time differential equation. This result was generalised to full complex ma-
trices by Chu [3], and Watkins and Elsner [14] provided further insight in the late
1980s.
Brockett [2] studied dynamic matrix flows generated by the double Lie-bracket
equation
dH/dt = [H, [H, N]], H(0) = H_0
for constant symmetric matrices N and H_0, and where we use the Lie-bracket notation
[X, Y] = XY - YX. We call this differential equation the double-bracket equation,
and we call solutions of this equation double-bracket flows. Similar matrix differential
equations in the area of Physics were known and studied prior to the references given
above. An example, is the Landau–Lifschitz–Gilbert equation of micromagnetics
drn
–*(fix R-wlzx(fix77)) [ril’=1,
-Z–”
as a + w and -y/a + k, a constant.
In this equation m, ~ E R3 and the cross-
product is equivalent to a Lie-bracket operation.
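The last remark can be made concrete: under the standard "hat" map sending a vector a ∈ R³ to the skew-symmetric matrix â with â x = a × x, the matrix commutator of two hatted vectors is the hat of their cross product. The following standalone sketch (a standard identity, not taken from the paper) checks this numerically:

```python
import numpy as np

def hat(a):
    """Map a vector in R^3 to the skew-symmetric matrix of the map x -> a x x."""
    return np.array([[0.0, -a[2], a[1]],
                     [a[2], 0.0, -a[0]],
                     [-a[1], a[0], 0.0]])

rng = np.random.default_rng(0)
a, b = rng.standard_normal(3), rng.standard_normal(3)

# [a^, b^] = (a x b)^ : the cross product is a Lie bracket in disguise.
commutator = hat(a) @ hat(b) - hat(b) @ hat(a)
assert np.allclose(commutator, hat(np.cross(a, b)))
```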
The relevance of such equations
Received by the editors April 13, 1992; accepted for publication (in revised form) September
25, 1992. The authors acknowledge the funding of the activities of the Cooperative Research Centre
for Robust and Adaptive Systems by the Australian Commonwealth Government under the Coop-
erative Research Centres Program. The authors also acknowledge additional support from Boeing
Commercial Aircraft Corporation, Inc.
† Department of Systems Engineering, Research School of Physical Sciences and Systems Engineering, Australian National University, A.C.T. 0200, Australia (robert.mahony@anu.edu.au).
‡ Department of Mathematics, University of Regensburg, 8400 Regensburg, Germany.
to traditional linear algebra problems, however, has only recently been studied and
discretisations of such flows have not been investigated.
The double-bracket equation is not known to be a continuous-time version of any
previously existing linear algebra algorithm; however, it exhibits exponential conver-
gence to an equilibrium point on the manifold of self-equivalent symmetric matrices
[2], [5], [9]. Brockett [2] was able to show that this flow could be used to diagonalise
real symmetric matrices, and thus, to find their eigenvalues, sort lists, and even to
solve linear programming problems. Part of the flexibility and theoretical appeal of
the double-bracket equation follows from its dependence on the arbitrary matrix pa-
rameter N, which can be varied to control the transient behaviour of the differential
equation.
In independent work by Driessel [7], Chu and Driessel [5], Smith [12], and Helmke
and Moore [8], a similar gradient flow approach is developed for the task of computing
the singular values of a general nonsymmetric, nonsquare matrix. The differential
equation obtained in these approaches is almost identical to the double-bracket equa-
tion. In [8], it is shown that these flows can also be derived as special cases of the
double-bracket equation for a nonsymmetric matrix, suitably augmented to be sym-
metric.
With the theoretical aspects of these differential equations becoming known, and
with applications in the area of balanced realizations [10], [11] along with the more
traditional matrix eigenvalue problems, there remains the question of efficiently com-
puting their solutions. No explicit solutions to the differential equations have been
obtained and a direct numerical estimate of their integral solutions seems unlikely to
be an efficient computational algorithm. Iterative algorithms that approximate the
continuous-time flows, however, seem more likely to yield useful numerical methods.
Furthermore, discretisations of such isospectral matrix flows are of general theoretical
interest in the field of numerical linear algebra. For example, the algorithms proposed
in this paper involve adjustable parameters, such as step-size selection schemes and
a matrix parameter N, which are not present in traditional algorithms such as the
QR-algorithm or the Jacobi method.
In this paper, we propose a new algorithm, termed the Lie-bracket algorithm, for
computing the eigenvalues of an arbitrary symmetric matrix H_0:

H_{k+1} = exp(-α_k [H_k, N]) H_k exp(α_k [H_k, N]).

For suitably small α_k, termed time-steps, the algorithm is an approximation of the
solution to the continuous time double-bracket equation. Thus, the algorithm rep-
resents an approach to developing new recursive algorithms based on approximating
suitable continuous-time flows. We show that for suitable choices of time-steps, the
Lie-bracket algorithm inherits the same equilibria as the double-bracket flow. Further-
more, exponential convergence of the algorithm is shown. This paper presents only
theoretical results on the Lie-bracket algorithm and does
not attempt to compare its
performance to that of existing methods for calculating the eigenvalues of a matrix.
Continuous-time gradient flows that compute the singular values of arbitrary
nonsymmetric matrices, such as those covered in [5], [8], [9], [12], have a similar
form to the double-bracket equation on which the Lie-bracket algorithm was based.
We use this similarity to generate a new scheme for computing the singular values
of a general matrix termed the singular value algorithm. The natural equivalence
between the Lie-bracket algorithm and the singular value algorithm is demonstrated
and exponential convergence results follow almost directly.

Associated with the main algorithms presented for the computation of the eigen-
values or singular values of matrices are algorithms that compute the full eigenspace
decompositions of given matrices.
These algorithms are closely related to the Lie-
bracket algorithm and also display exponential convergence.
The paper is divided into eight sections including the Introduction and an Ap-
pendix. In §2 of this paper, we consider the Lie-bracket algorithm and prove a propo-
sition that ensures the algorithm converges to a fixed point. Section 3 deals with
choosing step-size selection schemes and proposes two valid deterministic functions
for defining the time-steps.
Considering the particular step-size selection schemes
presented in §3, we return to the question of stability in §4 and show that the Lie-
bracket algorithm has a unique exponentially attractive fixed point, though several of
the technical proofs are deferred to the Appendix. This completes the discussion for
the symmetric case, and §5 considers the nonsymmetric case and the singular value
decomposition. Section 6 presents associated algorithms that compute the eigenspace
decompositions of given initial conditions. A number of computational issues are
briefly mentioned in §7, while §8 provides a conclusion.
2. The Lie-bracket algorithm. In this section, we begin by introducing the
least squares potential that underpins the recent gradient flow results and then we
describe the double Lie-bracket equation first derived by Brockett [2]. The Lie-bracket
recursion is introduced and conditions are given that guarantee convergence of the
algorithm.
Let N and H be real symmetric matrices and consider the potential function
(1) ψ(H) := ||H - N||² = ||H||² + ||N||² - 2 tr(NH),

where the norm used is the Frobenius norm ||X||² := tr(XᵀX) = Σ_{i,j} x_ij², with x_ij the
elements of X. Note that ψ(H) measures the least squares difference between the
elements of H and the elements of N. Let M(H_0) be the set of orthogonally similar
matrices, generated by some symmetric initial condition H_0 = H_0ᵀ ∈ R^{n×n}. Then

(2) M(H_0) = { UᵀH_0U | U ∈ O(n) },
where O(n) denotes the group of all n x n real orthogonal matrices. It is shown
in [9, p. 48] that M(H_0) is a smooth compact Riemannian manifold with explicit
forms given for its tangent space and Riemannian metric. Furthermore, in [1], [5] the
gradient of ψ(H), with respect to the normal Riemannian metric on M(H_0) [9,
p. 50], is shown to be ∇ψ(H) = -[H, [H, N]]. Consider the gradient flow given by
the solution of

(3) dH/dt = -∇ψ(H) = [H, [H, N]], with H(0) = H_0,
which we call the double-bracket flow [2], [5]. Thus, the double-bracket flow is a
gradient flow that acts to decrease or minimise the least squares potential @ on the
manifold M (Ho). Note that from (1), this is equivalent to increasing or maximizing
tr(NH). We refer to the matrix Ho as the initial condition and the matrix N as the
target matrix.
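As a sanity check on the gradient-flow interpretation, a crude forward-Euler integration of (3) shows ψ decreasing along the flow. This is only a sketch: the matrices, seed, and step size below are arbitrary choices, not taken from the paper, and such a naive discretisation does not exactly preserve the eigenvalues of H, which is one motivation for the exponential updates introduced next.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
H = A + A.T                           # symmetric initial condition H_0
N = np.diag([4.0, 3.0, 2.0, 1.0])     # target matrix with distinct diagonal entries

def bracket(X, Y):
    return X @ Y - Y @ X

def psi(H):
    # least squares potential (1): ||H - N||^2 in the Frobenius norm
    return np.linalg.norm(H - N, 'fro') ** 2

dt = 1e-3                             # hand-picked small step
values = [psi(H)]
for _ in range(5000):
    H = H + dt * bracket(H, bracket(H, N))   # Euler step on dH/dt = [H, [H, N]]
    values.append(psi(H))

# psi is nonincreasing along the discretised flow for this step size
assert all(b <= a + 1e-8 for a, b in zip(values, values[1:]))
```

Note that the Euler iterate drifts off the isospectral manifold M(H_0) by O(dt²) per step; the recursion (4) below avoids this by conjugating with an exactly orthogonal matrix.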
The Lie-bracket algorithm proposed in this paper is

(4) H_{k+1} = exp(-α_k [H_k, N]) H_k exp(α_k [H_k, N])

884
J. B. MOORE, R. E. MAHONEY, AND U. HELMKE
for arbitrary symmetric n × n matrices H_0 and N and some suitably small scalars
α_k termed time-steps. To motivate the Lie-bracket algorithm, consider the curve

H_{k+1}(t) = exp(-t[H_k, N]) H_k exp(t[H_k, N]).

Thus, H_{k+1}(0) = H_k and H_{k+1} = H_{k+1}(α_k), the (k+1)th iteration of (4). Observe that

d/dt ( exp(-t[H_k, N]) H_k exp(t[H_k, N]) ) |_{t=0} = [H_k, [H_k, N]],

and thus exp(-t[H_k, N]) H_k exp(t[H_k, N]) is a first-order approximation of the double-bracket flow at
H_k ∈ M(H_0). It follows that for small α_k, the solution to (3) evaluated at time t = α_k
with H(0) = H_k is approximately H_{k+1} = H_{k+1}(α_k).
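A minimal implementation of the recursion (4) can be sketched as follows, using a constant time-step chosen by hand rather than the step-size selection schemes developed in §3, so the convergence behaviour is illustrative only; the matrix size, eigenvalues, seed, and step size are all arbitrary choices for the demonstration.

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(2)
# Build H_0 = Q^T D Q with known, well separated eigenvalues.
D = np.diag([3.0, 1.0, -1.0, -3.0])
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
H = Q.T @ D @ Q                      # symmetric initial condition H_0
N = np.diag([4.0, 3.0, 2.0, 1.0])    # target matrix (Condition 2.2)

alpha = 0.01                         # constant time-step (hand-picked)
for _ in range(2000):
    B = H @ N - N @ H                # [H_k, N], skew-symmetric
    G = expm(-alpha * B)             # orthogonal, since B is skew-symmetric
    H = G @ H @ G.T                  # H_{k+1} = exp(-a[Hk,N]) H_k exp(a[Hk,N])

# Isospectral: the eigenvalues of H_0 are preserved at every step.
assert np.allclose(np.sort(np.linalg.eigvalsh(H)), [-3.0, -1.0, 1.0, 3.0])
# H converges to the diagonal matrix of eigenvalues, ordered like N's entries.
assert np.allclose(H, np.diag([3.0, 1.0, -1.0, -3.0]), atol=1e-6)
```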
It is easily seen from above that stationary points of (3) are fixed points of (4).
In general, (4) may have more fixed points than just the stationary points of (3);
however, Proposition 2.1 shows that this is not the case for a suitable choice of time-
step α_k. We use the term equilibrium point to mean a fixed point of the algorithm
that is also a stationary point of (3).
To implement (4) it is necessary to specify the time-steps α_k. We do this by
considering functions α_N : M(H_0) → R⁺ and setting α_k := α_N(H_k). We refer to the
function α_N as the step-size selection scheme. We require that the step-size selection
scheme satisfies the following condition.
CONDITION 2.1. Let α_N : M(H_0) → R⁺ be a step-size selection scheme for the
Lie-bracket algorithm on M(H_0). Then α_N is well defined and continuous on all of
M(H_0), except possibly those points H ∈ M(H_0) where HN = NH. Furthermore,
there exist real numbers B, γ > 0 such that B ≥ α_N(H) ≥ γ for all H ∈ M(H_0)
where α_N is well defined.
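For concreteness, the simplest scheme satisfying Condition 2.1 is a constant one: it is continuous on all of M(H_0) and the bounds hold trivially with B = γ = c. This hypothetical example is not from the paper, whose actual schemes are developed in §3:

```python
def constant_scheme(c):
    """Return the step-size selection scheme alpha_N(H) = c.

    Being constant, it is continuous everywhere on M(H_0) and satisfies
    the bounds of Condition 2.1 with B = gamma = c.
    """
    if c <= 0:
        raise ValueError("time-steps must be positive")
    return lambda H: c

alpha_N = constant_scheme(0.01)
assert alpha_N(None) == 0.01   # independent of the iterate H_k
```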
Remark 2.1. We find that the variable step-size selection scheme proposed in this
paper, which provides the best simulation results, is discontinuous at all the points
H ∈ M(H_0) such that [H, N] = 0.
Remark 2.2. Note that the definition of a step-size selection scheme depends
implicitly on the matrix parameter N. Indeed, α_N can be thought of as a function in
two matrix variables N and H.
CONDITION 2.2. Let N be a diagonal n × n matrix with distinct diagonal entries
μ_1 > μ_2 > ... > μ_n.
Remark 2.3. This condition on N, along with Condition 2.1 on the step-size
selection scheme, is chosen to ensure that the Lie-bracket algorithm converges to a
diagonal matrix from which the eigenvalues of Ho can be directly determined.
Let λ_1 > λ_2 > ... > λ_r be the eigenvalues of H_0 with associated algebraic
multiplicities n_1, ..., n_r satisfying Σ_{i=1}^r n_i = n. Note that as H_0 is symmetric, the
eigenvalues of H_0 are all real. Thus, the diagonalisation of H_0 is

(5) Λ = diag(λ_1 I_{n_1}, λ_2 I_{n_2}, ..., λ_r I_{n_r}),

where I_{n_i} is the n_i × n_i identity matrix. For generic initial conditions and a target
matrix N that satisfies Condition 2.2, the continuous-time equation (3) converges
exponentially fast to Λ [2], [9]. Thus, the eigenvalues of H_0 are the diagonal entries
of the limiting value of the infinite time solution to (3). The Lie-bracket algorithm
behaves similarly to (3) for small ~k and, given a suitable step-size selection scheme,
should converge to the same equilibrium as the continuous-time equation.

PROPOSITION 2.1. Let H_0 and N be n × n real symmetric matrices where N
satisfies Condition 2.2. Let ψ(H) be given by (1) and let α_N : M(H_0) → R⁺ be a step-
size selection scheme that satisfies Condition 2.1. For H_k ∈ M(H_0), let α_k = α_N(H_k)
and define

(6) Δψ(H_k, α_k) := ψ(H_{k+1}) - ψ(H_k),

where H_{k+1} is given by (4). Suppose

(7) Δψ(H_k, α_k) < 0 when [H_k, N] ≠ 0.

Then
(a) The iterative equation (4) defines an isospectral (eigenvalue preserving) re-
cursion on the manifold M(H_0).
(b) The fixed points of (4) are characterised by matrices H ∈ M(H_0) satisfying

(8) [H, N] = 0.

(c) Every solution H_k, for k = 1, 2, ..., of (4) converges as k → ∞ to some
H_∞ ∈ M(H_0) where [H_∞, N] = 0.
Proof. To prove part (a), note that the Lie-bracket [H, N] satisfies [H, N]ᵀ = -[H, N]
and is thus skew-symmetric. As the exponential of a skew-symmetric matrix is orthogonal, (4) is an
orthogonal conjugation of H_k and hence is isospectral.
For part (b) note that if [H_k, N] = 0, then by direct substitution into (4) we see
H_{k+1} = H_k and thus H_{k+l} = H_k for l ≥ 1, and H_k is a fixed point of (4). Conversely,
if [H_k, N] ≠ 0, then from (7), Δψ(H_k, α_k) ≠ 0, and thus H_{k+1} ≠ H_k. By inspection,
points satisfying (8) are stationary points of (3), and indeed are known to be the only
stationary points of (3) [9, p. 50]. Thus, the fixed points of (4) are equilibrium
points in the sense that they are all stationary points of (3). To prove part (c) we
need the following lemma.
LEMMA 2.2. Let N satisfy Condition 2.2 and α_N satisfy Condition 2.1 such
that the Lie-bracket algorithm satisfies (7). The Lie-bracket algorithm (4) has exactly
n!/Π_{i=1}^r (n_i!) distinct equilibrium points in M(H_0). These equilibrium points are
characterised by the matrices πᵀΛπ, where π is an n × n permutation matrix, a
rearrangement of the rows of the identity matrix, and Λ is given by (5).
Proof. Note that part (b) of Proposition 2.1 characterises equilibrium points of
(4) as H ∈ M(H_0) such that [H, N] = 0. Evaluating this condition componentwise
for H = {h_ij} gives

h_ij (μ_j - μ_i) = 0,

and hence by Condition 2.2, h_ij = 0 for i ≠ j. Using the fact that (4) is isospectral, it
follows that equilibrium points are diagonal matrices that have the same eigenvalues
as H_0. Such matrices are distinct and can be written in the form πᵀΛπ for π an n × n
permutation matrix. A simple counting argument yields the number of matrices that
satisfy this condition to be n!/Π_{i=1}^r (n_i!). □
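The count in the lemma can be sanity-checked by brute force: the distinct equilibria correspond to the distinct orderings of the eigenvalues (with multiplicities) along the diagonal. A small hypothetical example with eigenvalues {2, 2, 1}, so n_1 = 2 and n_2 = 1, giving 3!/(2!·1!) = 3 equilibria:

```python
from itertools import permutations
from math import factorial

eigs = (2.0, 2.0, 1.0)            # lambda_1 = 2 (n_1 = 2), lambda_2 = 1 (n_2 = 1)
multiplicities = (2, 1)

# Distinct diagonal matrices pi^T Lambda pi <-> distinct orderings of eigs.
distinct = set(permutations(eigs))

predicted = factorial(len(eigs))  # n! / prod(n_i!)
for n_i in multiplicities:
    predicted //= factorial(n_i)

assert len(distinct) == predicted == 3
```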
Consider, for a fixed initial condition H_0, the sequence H_k generated by the Lie-
bracket algorithm. Observe that condition (7) implies that ψ(H_k) is strictly monotonically
decreasing for all k where [H_k, N] ≠ 0. Also, since ψ is a continuous function on the
compact set M(H_0), ψ is bounded from below and ψ(H_k) will converge to some
nonnegative value ψ_∞. As ψ(H_k) → ψ_∞, then Δψ(H_k, α_k) → 0.
For an arbitrary positive number ε, define the open set D_ε ⊂ M(H_0), consisting
of all points of M(H_0) within an ε-neighbourhood of some equilibrium point of (4).

References
- Dynamical systems that sort lists, diagonalize matrices and solve linear programming problems
- A new formulation of the generalized Toda lattice equations and their fixed point analysis via the momentum map
- Balanced realizations via gradient flow techniques
- On Isospectral Gradient Flows — Solving Matrix Eigenproblems using Differential Equations