New matrix transformations for obtaining characteristic vectors

doi:10.1090/QAM/39373


QUARTERLY OF APPLIED MATHEMATICS

Vol. VIII January, 1951 No. 4

NEW MATRIX TRANSFORMATIONS FOR OBTAINING

CHARACTERISTIC VECTORS*

By WILLIAM FELLER (Princeton University)

AND

GEORGE E. FORSYTHE (National Bureau of Standards, Los Angeles)

1. Summary of methods. Let A be a non-defective (see §2) square matrix of order n,

symmetric or not, for which it is desired to determine some of the characteristic values

v and associated column-vectors X and row-vectors Y. In terms of matrix products

these quantities are defined by the relations

AX = vX, YA = vY. (1)

A class of numerical procedures is based on iteration methods to obtain one

characteristic value $\lambda$ and the associated vectors C, R. Then A is transformed into a matrix A'

and a new iteration is used to obtain a characteristic value v' and characteristic vectors

X', Y' of A', which can then be converted into corresponding quantities v, X, Y for A.

If more values are wanted, one can continue by transforming A' to A", and so on.

Vector iteration schemes for getting one characteristic value of a matrix were described

in 1929 in [15]; these methods are explained and extended in [1, 13, 8, 9, 5].

In the present paper we are not interested in the iteration procedure as such, but

wish to discuss a class of transformations whereby A' is obtained from A. The earliest

of these known to us is "deflation," suggested by Hotelling [6, 7] for symmetric matrices

and extended in Aitken's thorough study [1] to non-symmetric matrices, defective or

not. In [3] and [4, p. 143] Duncan and Collar introduced a different transformation

(see §3) for non-defective matrices; this was restated in [10] and [16]. It has the ad-

vantage that it reduces the order of the matrix, but it destroys the symmetry. In a

relatively inaccessible paper [13] Semendiaev gave a careful exposition of Aitken's

techniques, and extended them to cover the case of multiple characteristic values in

full generality. His transformation is very general; in the simplest case it somewhat

resembles that of Duncan and Collar. Semendiaev expressed his transformation in the

form of a matrix relation $A' = UAU^{-1}$. Blanch has devised (unpublished) another

modification of the Duncan-Collar reduction in the form $UAU^{-1}$.

In [14] Tucker published a related transformation yielding a matrix A' of order

n + 1 which is defective with respect to a double characteristic value zero. Although

the coefficients are obtained easily by bordering A, the increased order may be a dis-

advantage. Tucker's method is not directly a special case of our (5).

A distantly related matrix transformation is the "escalator method" of Morris and

Head [12, 11]. It relates the complete set of characteristic values and vectors of A to

*Received May 20, 1950. The preparation of this paper was sponsored (in part) by the Office of

Naval Research.


the complete set for a submatrix of order $n - 1$. For this reason the escalator method

cannot be compared to the transformations considered in this paper, in which at each

stage one deals with only two characteristic values of A.

In §2 we present in formulas (5) a four-parameter family of transformations from

A to A'. This family is general enough to include deflation and the procedures of Duncan

and Collar, Semendiaev, and Blanch as special cases; see §3. In §4 two subclasses of

the transformation are discussed: order-reducing and symmetry-preserving transforma-

tions. Two new methods which are both order-reducing and symmetry-preserving appear

promising for practical work with symmetric matrices A. Even for non-symmetric

matrices we feel that our family of transformations offers a choice of procedures which

may occasionally prove useful.

2. The general reduction method. The reduction formulas will be proved in all cases

by means of the following lemma, which can be easily verified.

Lemma. Let the matrix A have the characteristic value v with corresponding column-vector

X and row-vector Y. Let U be a non-singular matrix. Then the matrix $A' = UAU^{-1}$ has

the characteristic value v with corresponding vectors

$$X' = UX, \qquad Y' = YU^{-1}. \tag{2}$$
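The verification of the lemma is a direct substitution. A short check, using only the defining relations (1) and the definition of A', is:

```latex
\begin{aligned}
A'X' &= (UAU^{-1})(UX) = U(AX) = U(vX) = v\,X', \\
Y'A' &= (YU^{-1})(UAU^{-1}) = (YA)U^{-1} = (vY)U^{-1} = v\,Y'.
\end{aligned}
```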

We assume for simplicity that A is not defective.* Let $\lambda$ be a (known) characteristic value of A, with a corresponding column-vector $C = \{c_1, \dots, c_n\}$ and row-vector $R = (r_1, \dots, r_n)$. Let v be some other (unknown) characteristic value of A, with corresponding column $X = \{x_1, \dots, x_n\}$ and row $Y = (y_1, \dots, y_n)$. Assume C, R, X, Y to be so normalized that

$$RC = YX = 1. \tag{3}$$

The characteristic values $\lambda$ and v are permitted to be equal, provided that X, Y satisfy the orthogonality conditions

$$YC = RX = 0, \tag{4}$$

which are automatic when $\lambda \ne v$.

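Let $\gamma$, $\rho$, $\beta$, $t$ be complex parameters, and put for abbreviation

$$\Gamma = 1 - \gamma c_n, \qquad P = 1 - \rho r_n, \qquad f = \Gamma P - \gamma\rho.$$

We now introduce a new matrix $A' = A'(\gamma, \rho, \beta; t)$, defined for all values of the parameters except when $\beta = 0$ and at the same time $f \ne 0$:

$$
\begin{aligned}
a'_{ij} &= a_{ij} - \gamma c_i a_{nj} - \rho r_j a_{in} + \gamma\rho\, c_i r_j (a_{nn} - \lambda) - f t\, c_i r_j, \\
a'_{in} &= -\beta\,[\,a_{in} - \gamma c_i a_{nn} - t c_i r_n \Gamma + (\lambda - t)\gamma c_i\,], \\
a'_{nj} &= \begin{cases} -f\beta^{-1}[\,a_{nj} - \rho r_j a_{nn} - t c_n r_j P + (\lambda - t)\rho r_j\,] & (\text{if } \beta \ne 0), \\ 0 & (\text{if } f = \beta = 0), \end{cases} \\
a'_{nn} &= f(a_{nn} - t c_n r_n) + (\lambda - t)(1 - f),
\end{aligned}
\tag{5}
$$

where $1 \le i \le n - 1$ and $1 \le j \le n - 1$.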

*In the terminology of [9] a defective matrix A is one for which no transform $PAP^{-1}$ is a diagonal

matrix. An equivalent definition is that A has one or more non-linear elementary divisors; see [2].


For all values of the parameters a matrix $U = U(\gamma, \rho, \beta)$ will be defined below, with the property that

$$A'(\gamma, \rho, \beta; t) = U(A - tCR)U^{-1}. \tag{6}$$

By the lemma A' has the characteristic values of $A - tCR$, i.e., those of A except that the single characteristic value $\lambda$ is changed to $\lambda - t$. In particular, v is a characteristic value of A', and the vectors X, Y are transformed into characteristic vectors $X' = UX$ and $Y' = YU^{-1}$ of A' corresponding to v. By use of the formulas for U given below, it is easily shown in each case that

$$
\begin{aligned}
x'_i &= x_i - \gamma c_i x_n \quad (1 \le i \le n - 1), \\
x'_n &= \begin{cases} -f\beta^{-1} x_n & (\text{if } \beta \ne 0), \\ 0 & (\text{if } f = \beta = 0), \end{cases} \\
y'_j &= y_j - \rho r_j y_n \quad (1 \le j \le n - 1), \qquad y'_n = -\beta y_n.
\end{aligned}
\tag{7}
$$

The vectors C, R are transformed into characteristic vectors $C' = UC$, $R' = RU^{-1}$ of A' corresponding to $\lambda - t$. The formulas for the components of C', R' are given for each case below, as are the definitions of $U$, $U^{-1}$.
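The statement that $A - tCR$ has the characteristic values of A with $\lambda$ replaced by $\lambda - t$ can be checked directly from the normalization (3) and the orthogonality conditions (4). A brief sketch of that check:

```latex
\begin{aligned}
(A - tCR)\,C &= AC - tC\,(RC) = \lambda C - tC = (\lambda - t)\,C, \\
R\,(A - tCR) &= RA - t\,(RC)\,R = (\lambda - t)\,R, \\
(A - tCR)\,X &= AX - tC\,(RX) = vX, \\
Y\,(A - tCR) &= YA - t\,(YC)\,R = vY.
\end{aligned}
```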

An arbitrary choice of the parameters $\gamma$, $\rho$, $\beta$, $t$ may be used to analyze the matrix A. Knowing $\lambda$, C, R, one computes A' from (5). By iteration or otherwise one next determines a new characteristic value v of A'. By the lemma v is also a characteristic value of A, and the vectors of A corresponding to v may be calculated from the relations (7); these give $x_k$ and $y_k$ in terms of $x'_k$ and $y'_k$ except when $f = 0$. (The case $\beta = 0$ is included in the exceptional case $f = 0$, as was stated before (5).) When $f = 0$, one first gets $x_n$, $y_n$ from formulas (8), which are derived from (7), (4), and (3):

$$
x_n = -(\gamma + r_n \Gamma)^{-1} \sum_{i=1}^{n-1} r_i x'_i, \qquad
y_n = -(\rho + c_n P)^{-1} \sum_{i=1}^{n-1} y'_i c_i.
\tag{8}
$$
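The first of formulas (8) can be recovered in a few lines; the second follows in the same way with the roles of rows and columns exchanged. The sketch below multiplies the first relation of (7) by $r_i$, sums over $i \le n - 1$, and then uses $RX = 0$ from (4) and $RC = 1$ from (3):

```latex
\begin{aligned}
\sum_{i=1}^{n-1} r_i x'_i
  &= \sum_{i=1}^{n-1} r_i x_i - \gamma x_n \sum_{i=1}^{n-1} r_i c_i \\
  &= -r_n x_n - \gamma x_n (1 - r_n c_n) \\
  &= -(\gamma + r_n \Gamma)\, x_n .
\end{aligned}
```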

If still more characteristic values of A are desired, one can use v, X', Y' in (5) to trans-

form A' into A", and so on. The procedure is useful to the extent that it is easier to

find v, X', Y' from A' than it is to find v, X, Y from A.
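The step "knowing $\lambda$, C, R, one computes A' from (5)" can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `transform`, its argument order, and the tolerance `tol` are hypothetical, and the data are the five-decimal values quoted in §5. With Hotelling's parameters (method (b) of §3) the result should agree with $A - \lambda CR$.

```python
def transform(A, lam, C, R, gamma, rho, beta, t, tol=1e-12):
    """Return A' built entry by entry from formulas (5).

    Python index -1 plays the role of the paper's index n.
    The name and signature are illustrative, not from the paper.
    """
    n = len(A)
    cn, rn, ann = C[-1], R[-1], A[-1][-1]
    G = 1.0 - gamma * cn           # Gamma
    P = 1.0 - rho * rn             # capital rho
    f = G * P - gamma * rho
    if beta == 0 and abs(f) > tol:
        raise ValueError("(5) is not defined for beta = 0 with f != 0")
    Ap = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        for j in range(n - 1):
            Ap[i][j] = (A[i][j]
                        - gamma * C[i] * A[-1][j]
                        - rho * R[j] * A[i][-1]
                        + gamma * rho * C[i] * R[j] * (ann - lam)
                        - f * t * C[i] * R[j])
        Ap[i][-1] = -beta * (A[i][-1] - gamma * C[i] * ann
                             - t * C[i] * rn * G + (lam - t) * gamma * C[i])
    if beta != 0:
        for j in range(n - 1):
            Ap[-1][j] = -(f / beta) * (A[-1][j] - rho * R[j] * ann
                                       - t * cn * R[j] * P
                                       + (lam - t) * rho * R[j])
    Ap[-1][-1] = f * (ann - t * cn * rn) + (lam - t) * (1.0 - f)
    return Ap


# Data quoted in Section 5 (five-decimal accuracy).
A = [
    [-2.0, -2.0,  0.0,  3.0, -1.0],
    [-2.0,  0.0, -3.0,  5.0,  0.0],
    [ 0.0, -3.0, -5.0,  1.0,  1.0],
    [ 3.0,  5.0,  1.0, -3.0, -1.0],
    [-1.0,  0.0,  1.0, -1.0, -1.0],
]
lam = -9.88649
C = [-0.35616, -0.52348, -0.46374, 0.61437, 0.08124]
R = list(C)  # A is symmetric, so R and C share components

# Hotelling's deflation: gamma = rho = 0, beta = -1, t = lambda.
A1 = transform(A, lam, C, R, 0.0, 0.0, -1.0, lam)
direct = [[A[i][j] - lam * C[i] * R[j] for j in range(5)] for i in range(5)]
max_diff = max(abs(A1[i][j] - direct[i][j]) for i in range(5) for j in range(5))
trace_A1 = sum(A1[i][i] for i in range(5))
print(max_diff, trace_A1)
```

Since the trace is unchanged by the similarity in (6), the trace of A' should be close to the trace of A minus $t$; with rounded input data the agreement is only approximate.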

It remains only to exhibit $U = U(\gamma, \rho, \beta)$, so that the reader may verify equation (5). For completeness C' and R' are also given. There will be three cases. All matrices are exhibited in a partitioned form, with a square matrix of $(n - 1)$-th order at the upper left.

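Case 1. $\Gamma \ne 0$, $\beta \ne 0$. Here we define

$$
U = U(\gamma, \rho, \beta) =
\begin{pmatrix}
\delta_{ij} & -\gamma c_i \\
-\rho\beta^{-1} r_j & -\beta^{-1}(f + \rho r_n)
\end{pmatrix},
$$

where $\delta_{ij} = 0$ ($i \ne j$) and $\delta_{ij} = 1$ ($i = j$). It can be verified that

$$
U^{-1} =
\begin{pmatrix}
\delta_{ij} - \gamma\rho\Gamma^{-1} c_i r_j & -\beta\gamma\Gamma^{-1} c_i \\
-\rho\Gamma^{-1} r_j & -\beta\Gamma^{-1}
\end{pmatrix}.
$$

It is found that

$$
\begin{aligned}
c'_i &= \Gamma c_i \quad (1 \le i \le n - 1), & c'_n &= -\rho\beta^{-1} - f c_n \beta^{-1}; \\
r'_j &= f\Gamma^{-1} r_j \quad (1 \le j \le n - 1), & r'_n &= -\beta\gamma\Gamma^{-1} - \beta r_n.
\end{aligned}
$$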

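Case 2. $\Gamma = 0$, $\beta \ne 0$. Here $\gamma = c_n^{-1}$, and we define

$$
U = U(c_n^{-1}, \rho, \beta) =
\begin{pmatrix}
\delta_{ij} + c_i r_j & c_i c_n^{-1}(r_n c_n - 1) \\
-(2\rho + c_n P)\beta^{-1} r_j & \rho\beta^{-1} c_n^{-1} - (2\rho + c_n P)\beta^{-1} r_n
\end{pmatrix},
$$

$$
U^{-1} =
\begin{pmatrix}
\delta_{ij} - c_i r_j + \rho c_n^{-1}(r_n c_n - 1) c_i r_j & \beta c_i c_n^{-1}(r_n c_n - 1) \\
-(2\rho + c_n P) r_j & -\beta + \beta(r_n c_n - 1)
\end{pmatrix}.
$$

It is found that

$$
\begin{aligned}
c'_i &= c_i \quad (1 \le i \le n - 1), & c'_n &= -(\rho + c_n P)\beta^{-1}; \\
r'_j &= -\rho r_j c_n^{-1} \quad (1 \le j \le n - 1), & r'_n &= -\beta c_n^{-1}.
\end{aligned}
$$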

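Case 3. $f = \beta = 0$. Here $\gamma$, $\rho$ are restricted to such values that $f = (1 - \gamma c_n)(1 - \rho r_n) - \gamma\rho = 0$. We define

$$
U = U(\gamma, \rho, 0) =
\begin{pmatrix}
\delta_{ij} - \Gamma c_i r_j & -\gamma c_i - \Gamma c_i r_n \\
r_j & r_n
\end{pmatrix},
\qquad
U^{-1} =
\begin{pmatrix}
\delta_{ij} - P c_i r_j & c_i \\
-\rho r_j - P c_n r_j & c_n
\end{pmatrix}.
$$

It is found that

$$
\begin{aligned}
c'_i &= 0 \quad (1 \le i \le n - 1), & c'_n &= 1, \\
r'_j &= 0 \quad (1 \le j \le n - 1), & r'_n &= 1.
\end{aligned}
$$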

3. Known special cases of our transformation. For certain values of the parameters $\gamma$, $\rho$, $\beta$, $t$ the matrices A' defined by (5) have previously been used to analyze matrices A. We know of the following special cases:

(a) Duncan and Collar [3] and [4, p. 143]: Case 2, with $\gamma = c_n^{-1}$, $\rho = 0$, $\beta = -1$, $t = \lambda$. (Rows and columns have been interchanged in our presentation.) The matrix A' is the result of subtracting from each of the other rows $c_n^{-1}$ times the matrix product of C by the last row of A.


(b) Hotelling [6] (deflation): Case 1, with $\gamma = \rho = 0$, $\beta = -1$, $t = \lambda$. Here U is the identity matrix.

(c) Semendiaev [13, p. 212]: Case 3, with $\gamma = c_n^{-1}$, $\rho = \beta = t = 0$. This is only a special case of Semendiaev's general reduction.

(d) Blanch (unpublished procedure used at the National Bureau of Standards, Los Angeles): Case 1, with $\gamma = 0$, $\rho = r_n^{-1}$, $\beta = 1$, $t = 0$. To compare method (d) with (a), to which it is closely related, one should interchange rows and columns.

4. New special cases of our transformation. One useful class of matrix transformations consists of those in §2 for which $f = 0$; here $\beta$ and $t$ remain unrestricted. For these it is seen that $a'_{nj} = 0$ ($j = 1, 2, \dots, n - 1$), so that A' is essentially reduced to order $n - 1$. We call these transformations order-reducing; by their use subsequent iterations become shorter. The methods (a), (c), (d) in §3 are order-reducing.

Another special class of matrix transformations consists of those in §2 for which $\gamma = \rho$, $\beta^2 = f$, with $t$ unrestricted. When A is symmetric it is reasonable to pick $c_i = r_i$, and then A' is also symmetric; hence this class of transformations is called symmetry-preserving. In §3 only method (b) is symmetry-preserving.

New transformations which are both symmetry-preserving and order-reducing are those in Case 3 of §2 for which $\rho = \gamma$. Except for the unessential freedom allowed $t$, there are commonly two of these transformations. When $c_i = r_i$ these may be defined by

$$\gamma = \rho = (c_n + 1)^{-1} = (r_n + 1)^{-1}, \tag{9}$$

$$\gamma = \rho = (c_n - 1)^{-1} = (r_n - 1)^{-1}. \tag{10}$$

Symmetric matrices are more convenient to deal with than non-symmetric ones, in that

by their use the storage requirement is approximately halved and the round-off errors

are more easily estimated. For dealing with symmetric matrices A, therefore, the

transformations defined by (9) and (10) look promising. For non-symmetric matrices

A the method (d) of §3 seems quite satisfactory, but it may occasionally be useful to

have other subcases of (5) available.
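The contrast between the choices (9), (10) and method (d) can be illustrated numerically. The sketch below forms only the leading block of order $n - 1$ of A' from (5), which is all that matters once $f = 0$; the helper names `reduced_block`, `asymmetry`, and `trace` are hypothetical, and the data are the five-decimal values quoted in §5, so the checks hold only to that accuracy.

```python
def reduced_block(A, lam, C, R, gamma, rho, tol=1e-12):
    """Leading (n-1) x (n-1) block of A' from (5), assuming f = 0."""
    n = len(A)
    f = (1.0 - gamma * C[-1]) * (1.0 - rho * R[-1]) - gamma * rho
    assert abs(f) < tol, "parameters are not order-reducing"
    ann = A[-1][-1]
    # With f = 0 the term f*t*c_i*r_j of (5) drops out, so t is not needed.
    return [[A[i][j]
             - gamma * C[i] * A[-1][j]
             - rho * R[j] * A[i][-1]
             + gamma * rho * C[i] * R[j] * (ann - lam)
             for j in range(n - 1)] for i in range(n - 1)]


def asymmetry(M):
    """Largest |m_ij - m_ji| over all index pairs."""
    m = len(M)
    return max(abs(M[i][j] - M[j][i]) for i in range(m) for j in range(m))


def trace(M):
    return sum(M[i][i] for i in range(len(M)))


# Data quoted in Section 5 (five-decimal accuracy).
A = [
    [-2.0, -2.0,  0.0,  3.0, -1.0],
    [-2.0,  0.0, -3.0,  5.0,  0.0],
    [ 0.0, -3.0, -5.0,  1.0,  1.0],
    [ 3.0,  5.0,  1.0, -3.0, -1.0],
    [-1.0,  0.0,  1.0, -1.0, -1.0],
]
lam = -9.88649
C = [-0.35616, -0.52348, -0.46374, 0.61437, 0.08124]
R = list(C)  # symmetric A: take c_i = r_i

g9 = 1.0 / (C[-1] + 1.0)    # formula (9)
g10 = 1.0 / (C[-1] - 1.0)   # formula (10)
B9 = reduced_block(A, lam, C, R, g9, g9)
B10 = reduced_block(A, lam, C, R, g10, g10)
Bd = reduced_block(A, lam, C, R, 0.0, 1.0 / R[-1])  # method (d) of Section 3

for name, B in (("(9)", B9), ("(10)", B10), ("(d)", Bd)):
    print(name, "asymmetry:", asymmetry(B), "trace:", trace(B))
```

In each case the reduced block should carry the characteristic values of A other than $\lambda$, so its trace should be near the trace of A minus $\lambda$. Only the blocks from (9) and (10) are expected to be symmetric.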

5. Numerical example. From [8, p. 327] we obtain the symmetric matrix*

$$
A = \begin{pmatrix}
-2 & -2 & 0 & 3 & -1 \\
-2 & 0 & -3 & 5 & 0 \\
0 & -3 & -5 & 1 & 1 \\
3 & 5 & 1 & -3 & -1 \\
-1 & 0 & 1 & -1 & -1
\end{pmatrix}.
$$

By an iteration one can obtain the dominant characteristic value $\lambda = -9.88649$ and

corresponding normalized row-vector

R = (-.35616, -.52348, -.46374, .61437, .08124).

Since A is symmetric, the column-vector C has the same components.
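The quoted values can be checked against the defining relations (1) and the normalization (3). The following Python sketch is a plain-arithmetic check, not part of the original computation; the helper name `matvec` is an assumption, and agreement is expected only to about the five decimals given.

```python
# Matrix A and the eigenpair quoted in the text (five decimals).
A = [
    [-2.0, -2.0,  0.0,  3.0, -1.0],
    [-2.0,  0.0, -3.0,  5.0,  0.0],
    [ 0.0, -3.0, -5.0,  1.0,  1.0],
    [ 3.0,  5.0,  1.0, -3.0, -1.0],
    [-1.0,  0.0,  1.0, -1.0, -1.0],
]
lam = -9.88649
R = [-0.35616, -0.52348, -0.46374, 0.61437, 0.08124]
C = list(R)  # A is symmetric, so the column-vector has the same components


def matvec(M, x):
    """Product of a square matrix (list of rows) with a vector."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


AC = matvec(A, C)
# Largest component of AC - lambda*C; small if (lambda, C) is an eigenpair.
residual = max(abs(ac - lam * c) for ac, c in zip(AC, C))
# Normalization RC of relation (3); should be close to 1.
rc = sum(r * c for r, c in zip(R, C))
print(residual, rc)
```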

*We have corrected a misprint in [8]. Mr. William Paine of the National Bureau of Standards,

Los Angeles, assisted with the calculations.


References

Analysis of a complex of statistical variables into principal components.

An iteration method for the solution of the eigenvalue problem of linear differential and integral operators

Praktische Verfahren der Gleichungsauflösung.

Introduction to higher algebra

Elementary Matrices: And Some Applications to Dynamics and Differential Equations
