Journal ArticleDOI

Numerical methods for solving linear least squares problems

01 Jun 1965-Numerische Mathematik (Springer-Verlag)-Vol. 7, Iss: 3, pp 206-216
TL;DR: This paper considers stable numerical methods for handling linear least squares problems, which frequently involve large quantities of data and are ill-conditioned by their very nature.
Abstract: A common problem in a Computer Laboratory is that of finding linear least squares solutions. These problems arise in a variety of areas and in a variety of contexts. Linear least squares problems are particularly difficult to solve because they frequently involve large quantities of data, and they are ill-conditioned by their very nature. In this paper, we shall consider stable numerical methods for handling these problems. Our basic tool is a matrix decomposition based on orthogonal Householder transformations.


Aplikace matematiky
Gene Howard Golub
Numerical methods for solving linear least squares problems
Aplikace matematiky, Vol. 10 (1965), No. 3, 213–216
Persistent URL: http://dml.cz/dmlcz/102951
Terms of use:
© Institute of Mathematics AS CR, 1965
Institute of Mathematics of the Czech Academy of Sciences provides access to digitized documents
strictly for personal use. Each copy of any part of this document must contain these Terms of use.
This document has been digitized, optimized for electronic delivery and
stamped with digital signature within the project DML-CZ: The Czech Digital
Mathematics Library http://dml.cz

APLIKACE MATEMATIKY, Volume 10 (1965), No. 3
NUMERICAL METHODS FOR SOLVING LINEAR LEAST SQUARES PROBLEMS 1)

GENE H. GOLUB
One of the problems which arises most frequently in a Computer Laboratory is that of finding linear least squares solutions. These problems arise in a variety of contexts, e.g., statistical applications, numerical solution of integral equations of the first kind, etc. Linear least squares problems are particularly difficult to solve because they frequently involve large quantities of data, and they are ill-conditioned by their nature.
Let A be a given m × n real matrix with m ≥ n and of rank r, and b a given vector. We wish to determine a vector x such that

    (1)    ||b − Ax|| = min ,

where ||...|| indicates the euclidean norm. It is well known (cf. [4]) that x satisfies the equation

    (2)    A^T A x = A^T b .
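For comparison with what follows, the normal-equations route of (2) could be sketched as below (our own illustration, assuming NumPy; the example of LÄUCHLI that follows shows why this route can fail numerically):

    import numpy as np

    def solve_normal_equations(A, b):
        # Form and solve A^T A x = A^T b, as in equation (2).
        # Safe only when A^T A is well-conditioned.
        return np.linalg.solve(A.T @ A, A.T @ b)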
Unfortunately, the matrix A^T A is frequently ill-conditioned [4] and influenced greatly by roundoff errors. The following example of LÄUCHLI [2] illustrates this well. Suppose

        [ 1   1   1   1   1 ]
        [ ε   0   0   0   0 ]
    A = [ 0   ε   0   0   0 ]
        [ 0   0   ε   0   0 ]
        [ 0   0   0   ε   0 ]
        [ 0   0   0   0   ε ]

1) This report was supported in part by Office of Naval Research Contract Nonr 225(37) (NR 044-11) at Stanford University.

then

    (3)    A^T A = [ 1+ε²   1      1      1      1    ]
                   [ 1      1+ε²   1      1      1    ]
                   [ 1      1      1+ε²   1      1    ]
                   [ 1      1      1      1+ε²   1    ]
                   [ 1      1      1      1      1+ε² ]

Clearly, for ε ≠ 0, the rank of A^T A is five, since the eigenvalues of A^T A are 5 + ε², ε², ε², ε², ε².
Let us assume that the elements of A^T A are computed using double precision arithmetic, and then rounded to single precision accuracy. Now let η be the largest number on the computer such that fl(1.0 + η) = 1.0, where fl(...) indicates floating point computation. Then if ε < η^(1/2), the rank of the computed representation of (3) will be one. Consequently, no matter how accurate the linear equation solver, it is impossible to solve the normal equations (2).
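A minimal numerical illustration of this effect (our own sketch, assuming NumPy; ε is chosen so that ε² falls below the single precision rounding level):

    import numpy as np

    eps = 1e-4                                          # eps**2 = 1e-8 lies below float32 epsilon
    A = np.vstack([np.ones((1, 5)), eps * np.eye(5)])   # the 6 x 5 Lauchli matrix above

    AtA = (A.T @ A).astype(np.float32)                  # product in double precision, rounded to single
    print(np.linalg.matrix_rank(AtA))                   # 1: the eps**2 terms have been lost

    Q, R = np.linalg.qr(A)                              # orthogonal decomposition of A itself
    print(np.linalg.matrix_rank(R))                     # 5: the full rank is preserved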
In [1], HOUSEHOLDER stressed the use of orthogonal transformations for solving linear least squares problems. In this paper, we shall exploit these transformations. Since the euclidean norm of a vector is unitarily invariant,

    ||b − Ax|| = ||c − QAx||

where c = Qb and Q is an orthogonal matrix. We choose Q so that

    (4)    QA = R = [ R̄ ]
                    [ 0 ]

where R̄ is an n × n upper triangular matrix and 0 is the (m − n) × n zero matrix.
Clearly,

    x = R̄^(-1) c̄ ,

where c̄ denotes the first n components of c, and consequently

    min ||b − Ax||² = Σ_{j=n+1}^{m} c_j² .

Since R̄ is an upper triangular matrix and R̄^T R̄ = A^T A, R̄ is simply the Choleski factor of A^T A.
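A modern sketch of this solution procedure (not part of the paper; it assumes NumPy, whose qr returns A = QR, so the paper's Q corresponds to Q^T here, and the variable names are ours):

    import numpy as np

    rng = np.random.default_rng(0)
    m, n = 8, 3
    A = rng.standard_normal((m, n))
    b = rng.standard_normal(m)

    Q, R = np.linalg.qr(A, mode='complete')   # Q: m x m orthogonal, R: m x n with zero lower block
    c = Q.T @ b                               # c = Qb in the notation of the paper

    x = np.linalg.solve(R[:n, :n], c[:n])     # x = R_bar^(-1) c_bar (back-substitution)
    residual = np.sqrt(np.sum(c[n:] ** 2))    # min ||b - Ax|| = (sum of c_j^2, j = n+1, ..., m)^(1/2)

    print(np.allclose(x, np.linalg.lstsq(A, b, rcond=None)[0]))   # agrees with a library solver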
There are a number of ways to achieve the decomposition (4); e.g., one could apply a sequence of plane rotations to annihilate the elements below the diagonal of A. A very effective method to realize the decomposition (4) is via Householder transformations [1]. Let A = A^(1), and let A^(2), A^(3), ..., A^(n+1) be defined as follows:

    A^(k+1) = P^(k) A^(k)    (k = 1, 2, ..., n) .

P^(k) is a symmetric, orthogonal matrix of the form

    P^(k) = I − 2 w^(k) (w^(k))^T

for suitable w^(k) such that (w^(k))^T w^(k) = 1. A derivation of P^(k) is given in [5]. In order to simplify the calculations, we redefine P^(k) as follows:

    P^(k) = I − β_k u^(k) (u^(k))^T

where σ_k = ( Σ_{i=k}^{m} (a_{ik}^{(k)})² )^(1/2), β_k = [ σ_k ( σ_k + |a_{kk}^{(k)}| ) ]^(-1), and

    u_i^{(k)} = 0                                      for i < k ,
    u_k^{(k)} = sgn(a_{kk}^{(k)}) ( σ_k + |a_{kk}^{(k)}| ) ,
    u_i^{(k)} = a_{ik}^{(k)}                           for i > k .
Thus

    A^(k+1) = A^(k) − u^(k) ( β_k (u^(k))^T A^(k) ) .
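One such step might be transcribed as follows (our own sketch, assuming NumPy and 0-based indexing, so the paper's step k corresponds to index k − 1 here; this is not Golub's published code):

    import numpy as np

    def householder_step(A, k):
        # Apply P^(k) = I - beta_k u^(k) u^(k)^T to A^(k); return A^(k+1) and sigma_k.
        A = A.copy()
        sigma = np.sqrt(np.sum(A[k:, k] ** 2))              # sigma_k
        if sigma == 0.0:                                    # the rank of A is less than n
            return A, 0.0
        sgn = 1.0 if A[k, k] >= 0 else -1.0                 # sgn(a_kk), taking sgn(0) = +1
        beta = 1.0 / (sigma * (sigma + abs(A[k, k])))       # beta_k
        u = np.zeros(A.shape[0])
        u[k] = sgn * (sigma + abs(A[k, k]))                 # u_k^(k)
        u[k + 1:] = A[k + 1:, k]                            # u_i^(k) = a_ik^(k) for i > k
        return A - np.outer(u, beta * (u @ A)), sigma       # A^(k) - u^(k) (beta_k u^(k)^T A^(k))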
After P^(k) has been applied to A^(k), A^(k+1) appears as follows:

    A^(k+1) = [ R^(k+1)   *  ]
              [    0      *  ]

where R^(k+1) is a k × k upper triangular matrix which is unchanged by subsequent transformations (the starred blocks are not yet in triangular form). Now

    a_{kk}^{(k+1)} = − (sgn a_{kk}^{(k)}) σ_k ,

so that the rank of A is less than n if σ_k = 0.
Clearly,

    R = A^(n+1)   and   Q = P^(n) P^(n−1) ... P^(1) ,

although one need not compute Q explicitly.
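Continuing the sketch above (and reusing householder_step from it), the full decomposition can be driven as follows; each P^(k) is applied to the right-hand side as it is generated, so Q itself is never formed:

    import numpy as np

    def householder_decompose(A, b):
        # Produce R = A^(n+1) and c = Qb by applying the successive P^(k) to [A | b].
        n = A.shape[1]
        work = np.column_stack([A, b]).astype(float)
        for k in range(n):
            work, sigma = householder_step(work, k)
            if sigma == 0.0:
                raise ValueError("rank of A is less than n")
        return work[:, :n], work[:, n]        # R (triangular in its first n rows) and c

The least squares solution then follows by back-substitution on the first n rows, exactly as above.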
Let x̄ be the initial solution obtained, and let x = x̄ + e. Then

    ||b − Ax|| = ||r − Ae|| ,

where r = b − Ax̄ is the residual vector. Thus the correction vector e is itself the solution to a linear least squares problem. Once A has been decomposed, it is a fairly simple matter to compute r and solve for e. Since e depends critically upon the residual vector, the components of r should be computed using double precision inner products and then rounded to single precision accuracy. Naturally, one should continue to iterate as long as improved estimates of x are obtained.

The above iteration technique will converge only if the initial approximation to x is sufficiently accurate. Let

    x^(q+1) = x^(q) + e^(q)    (q = 0, 1, ...)

with x^(0) = 0. Then if ||e^(1)|| / ||x^(1)|| > c, where c < 1/2 (i.e., if the condition "at least one bit of the initial solution is correct" fails to hold), one should not iterate, since there is little likelihood that the iterative method will converge. Since convergence tends to be linear, one should terminate the procedure as soon as ||e^(k+1)|| > c ||e^(k)||.
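The refinement loop, including both stopping rules, might be sketched as follows (our own illustration; solve_lls stands for any least squares solver that reuses the decomposition of A, for instance the orthogonal-decomposition solve sketched earlier, and np.longdouble stands in for the double precision residual accumulation):

    import numpy as np

    def refine(A, b, solve_lls, c=0.5, max_iter=20):
        # Iterative refinement of a least squares solution; solve_lls(A, r) returns the
        # least squares solution of min ||r - Ae||, reusing the decomposition of A.
        def residual(x):
            # accumulate the residual in extended precision, then round back
            return np.asarray(b - A.astype(np.longdouble) @ x, dtype=float)

        x = solve_lls(A, b)                                   # x^(1), starting from x^(0) = 0
        e = solve_lls(A, residual(x))                         # first correction e^(1)
        if np.linalg.norm(e) > c * np.linalg.norm(x):         # not even one bit correct: do not iterate
            return x
        for _ in range(max_iter):
            x = x + e
            e_next = solve_lls(A, residual(x))
            if np.linalg.norm(e_next) > c * np.linalg.norm(e):    # convergence is linear: terminate
                return x
            e = e_next
        return x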
References
[1] A. S. Householder: Unitary Triangularization of a Nonsymmetric Matrix, J. Assoc. Comput. Mach., Vol. 5 (1958), pp. 339–342.
[2] P. Läuchli: Jordan-Elimination und Ausgleichung nach kleinsten Quadraten, Numer. Math., Vol. 3 (1961), pp. 226–240.
[3] Y. Linnik: Method of Least Squares and Principles of the Theory of Observations, translated from Russian by R. C. Elandt, Pergamon Press, New York, 1961.
[4] E. E. Osborne: On Least Squares Solutions of Linear Equations, J. Assoc. Comput. Mach., Vol. 8 (1961), pp. 628–636.
[5] J. H. Wilkinson: Householder's Method for the Solution of the Algebraic Eigenproblem, Comput. J., Vol. 3 (1960), pp. 23–27.
Gene H. Golub, Computation Center, Stanford University, Stanford, California, U.S.A.
Citations
Journal ArticleDOI
TL;DR: Numerical tests are described comparing LSQR with several other conjugate-gradient algorithms, indicating that LSQR is the most reliable algorithm when A is ill-conditioned.
Abstract: An iterative method is given for solving Ax = b and min ||Ax − b||₂, where the matrix A is large and sparse. The method is based on the bidiagonalization procedure of Golub and Kahan. It is analytically equivalent to the standard method of conjugate gradients, but possesses more favorable numerical properties. Reliable stopping criteria are derived, along with estimates of standard errors for x and the condition number of A. These are used in the FORTRAN implementation of the method, subroutine LSQR. Numerical tests are described comparing LSQR with several other conjugate-gradient algorithms, indicating that LSQR is the most reliable algorithm when A is ill-conditioned.

4,189 citations

Journal ArticleDOI
TL;DR: The authors propose an alternative learning procedure based on the orthogonal least-squares method, which provides a simple and efficient means for fitting radial basis function networks.
Abstract: The radial basis function network offers a viable alternative to the two-layer neural network in many applications of signal processing. A common learning algorithm for radial basis function networks is based on first choosing randomly some data points as radial basis function centers and then using singular-value decomposition to solve for the weights of the network. Such a procedure has several drawbacks, and, in particular, an arbitrary selection of centers is clearly unsatisfactory. The authors propose an alternative learning procedure based on the orthogonal least-squares method. The procedure chooses radial basis function centers one by one in a rational way until an adequate network has been constructed. In the algorithm, each selected center maximizes the increment to the explained variance or energy of the desired output and does not suffer numerical ill-conditioning problems. The orthogonal least-squares learning strategy provides a simple and efficient means for fitting radial basis function networks. This is illustrated using examples taken from two different signal processing applications. >

3,414 citations

Journal ArticleDOI
Donald W. Marquardt
TL;DR: In this article, the authors discuss a class of biased linear estimators employing generalized inverses and establish a unifying perspective on nonlinear estimation from nonorthogonal data.
Abstract: A principal objective of this paper is to discuss a class of biased linear estimators employing generalized inverses. A second objective is to establish a unifying perspective. The paper exhibits theoretical properties shared by generalized inverse estimators, ridge estimators, and corresponding nonlinear estimation procedures. From this perspective it becomes clear why all these methods work so well in practical estimation from nonorthogonal data.

1,828 citations

Journal ArticleDOI
TL;DR: An algorithm for the analysis of multivariate data is presented and discussed in terms of specific examples; it seeks to find one- and two-dimensional linear projections of multivariate data that are relatively highly revealing.
Abstract: An algorithm for the analysis of multivariate data is presented and is discussed in terms of specific examples. The algorithm seeks to find one- and two-dimensional linear projections of multivariate data that are relatively highly revealing.

1,635 citations

References
Journal ArticleDOI
TL;DR: This note points out that the same result can be obtained with fewer arithmetic operations, and, in particular, for inverting a square matrix of order N, at most 2(N-1) square roots are required.
Abstract: A method for the inversion of a nonsymmetric matrix has been in use at ORNL and has proved to be highly stable numerically but to require a rather large number of arithmetic operations, including a total of N(N-1)/2 square roots. This note points out that the same result can be obtained with fewer arithmetic operations, and, in particular, for inverting a square matrix of order N, at most 2(N-1) square roots are required. For N > 4, this is a savings of (N-4)(N-1)/4 square roots. (T.B.A.)

577 citations


"Numerical methods for solving linea..." refers methods in this paper

  • ...There are a number of ways to achieve the decomposition (2.1); e.g., one could apply a sequence of plane rotations to annihilate the elements below the diagonal of A. A very effective method to realize the decomposition (2.1) is via HOUSEHOLDER transformations [ 2 ]....

    [...]

  • ...In [ 2 ], HOUSEHOLDER stressed the use of orthogonal transformations for solving linear least squares problems....

    [...]

Journal ArticleDOI
TL;DR: This paper determines error bounds for a number of the most effective direct methods of inverting a matrix by analyzing the effect of the rounding errors made in the solution of the equations.
Abstract: 1. In order to assess the relative effectiveness of methods of inverting a matrix it is useful to have a priori bounds for the errors in the computed inverses. In this paper we determine such error bounds for a number of the most effective direct methods. To illustrate fully the techniques we have used, some of the analysis has been done for floating-point computation and some for fixed-point. In all cases it has been assumed that the computation has been performed using a precision of t binary places, though it should be appreciated that on a computer which has both fixed and floating-point facilities the number of permissible digits in a fixed-point number is greater than the number of digits in the mantissa of a floating-point number. The techniques used for analyzing floating-point computation are essentially those of [8], and a familiarity with that paper is assumed. 2. The error bounds are most conveniently expressed in terms of vector and matrix norms, and throughout we have used the Euclidean vector norm and the spectral matrix norm except when explicit reference is made to the contrary. For convenience the main properties of these norms are given in Section 9. In a recent paper [7] we analyzed the effect of the rounding errors made in the solution of the equations

381 citations

Journal ArticleDOI

105 citations


"Numerical methods for solving linea..." refers methods in this paper

  • ...In [ 7 ], RILEY suggested the following algorithm for solving linear least squares problems for r=n....

    [...]