Journal ArticleDOI

Numerical methods for solving linear least squares problems

01 Jun 1965-Numerische Mathematik (Springer-Verlag)-Vol. 7, Iss: 3, pp 206-216
TL;DR: This paper considers stable numerical methods for handling linear least squares problems, which frequently involve large quantities of data and are ill-conditioned by their very nature.
Abstract: A common problem in a Computer Laboratory is that of finding linear least squares solutions. These problems arise in a variety of areas and in a variety of contexts. Linear least squares problems are particularly difficult to solve because they frequently involve large quantities of data, and they are ill-conditioned by their very nature. In this paper, we shall consider stable numerical methods for handling these problems. Our basic tool is a matrix decomposition based on orthogonal Householder transformations.


Aplikace matematiky
Gene Howard Golub
Numerical methods for solving linear least squares problems
Aplikace matematiky, Vol. 10 (1965), No. 3, 213–216
Persistent URL: http://dml.cz/dmlcz/102951
Terms of use:
© Institute of Mathematics AS CR, 1965
Institute of Mathematics of the Czech Academy of Sciences provides access to digitized documents
strictly for personal use. Each copy of any part of this document must contain these Terms of use.
This document has been digitized, optimized for electronic delivery and
stamped with digital signature within the project DML-CZ: The Czech Digital
Mathematics Library http://dml.cz

APLIKACE MATEMATIKY, Volume 10 (1965), No. 3
NUMERICAL METHODS FOR SOLVING LINEAR LEAST SQUARES PROBLEMS 1)

GENE H. GOLUB
One of the problems which arises most frequently in a Computer Laboratory is that of finding linear least squares solutions. These problems arise in a variety of contexts, e.g., statistical applications, numerical solution of integral equations of the first kind, etc. Linear least squares problems are particularly difficult to solve because they frequently involve large quantities of data, and they are ill-conditioned by their nature.
Let A be a given m × n real matrix with m ≥ n and of rank r, and b a given vector. We wish to determine a vector x such that

    (1)    ||b − Ax|| = min ,

where ||...|| indicates the euclidean norm. It is well known (cf. [4]) that x satisfies the equation

    (2)    A^T A x = A^T b .
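For comparison with what follows, the normal-equations route of (2) could be sketched as below (our own illustration, assuming NumPy; the example of LÄUCHLI that follows shows why this route can fail numerically):

    import numpy as np

    def solve_normal_equations(A, b):
        # Form and solve A^T A x = A^T b, as in equation (2).
        # Safe only when A^T A is well-conditioned.
        return np.linalg.solve(A.T @ A, A.T @ b)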
Unfortunately, the matrix A^T A is frequently ill-conditioned [4] and influenced greatly by roundoff errors. The following example of LÄUCHLI [2] illustrates this well. Suppose

        [ 1   1   1   1   1 ]
        [ ε   0   0   0   0 ]
    A = [ 0   ε   0   0   0 ]
        [ 0   0   ε   0   0 ]
        [ 0   0   0   ε   0 ]
        [ 0   0   0   0   ε ]

1) This report was supported in part by Office of Naval Research Contract Nonr 225(37) (NR 044-11) at Stanford University.

then

    (3)    A^T A = [ 1+ε²   1      1      1      1    ]
                   [ 1      1+ε²   1      1      1    ]
                   [ 1      1      1+ε²   1      1    ]
                   [ 1      1      1      1+ε²   1    ]
                   [ 1      1      1      1      1+ε² ]

Clearly, for ε ≠ 0, the rank of A^T A is five, since the eigenvalues of A^T A are 5 + ε², ε², ε², ε², ε².
Let us assume that the elements of A^T A are computed using double precision arithmetic, and then rounded to single precision accuracy. Now let η be the largest number on the computer such that fl(1.0 + η) = 1.0, where fl(...) indicates floating point computation. Then if ε < η^(1/2), the rank of the computed representation of (3) will be one. Consequently, no matter how accurate the linear equation solver, it is impossible to solve the normal equations (2).
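A minimal numerical illustration of this effect (our own sketch, assuming NumPy; ε is chosen so that ε² falls below the single precision rounding level):

    import numpy as np

    eps = 1e-4                                          # eps**2 = 1e-8 lies below float32 epsilon
    A = np.vstack([np.ones((1, 5)), eps * np.eye(5)])   # the 6 x 5 Lauchli matrix above

    AtA = (A.T @ A).astype(np.float32)                  # product in double precision, rounded to single
    print(np.linalg.matrix_rank(AtA))                   # 1: the eps**2 terms have been lost

    Q, R = np.linalg.qr(A)                              # orthogonal decomposition of A itself
    print(np.linalg.matrix_rank(R))                     # 5: the full rank is preserved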
In [1], HOUSEHOLDER stressed the use of orthogonal transformations for solving linear least squares problems. In this paper, we shall exploit these transformations. Since the euclidean norm of a vector is unitarily invariant,

    ||b − Ax|| = ||c − QAx||

where c = Qb and Q is an orthogonal matrix. We choose Q so that

    (4)    QA = R = [ R̄ ]
                    [ 0 ]

where R̄ is an n × n upper triangular matrix and 0 is the (m − n) × n zero matrix.
Clearly,

    x = R̄^(-1) c̄ ,

where c̄ denotes the first n components of c, and consequently

    min ||b − Ax||² = Σ_{j=n+1}^{m} c_j² .

Since R̄ is an upper triangular matrix and R̄^T R̄ = A^T A, R̄ is simply the Choleski factor of A^T A.
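A modern sketch of this solution procedure (not part of the paper; it assumes NumPy, whose qr returns A = QR, so the paper's Q corresponds to Q^T here, and the variable names are ours):

    import numpy as np

    rng = np.random.default_rng(0)
    m, n = 8, 3
    A = rng.standard_normal((m, n))
    b = rng.standard_normal(m)

    Q, R = np.linalg.qr(A, mode='complete')   # Q: m x m orthogonal, R: m x n with zero lower block
    c = Q.T @ b                               # c = Qb in the notation of the paper

    x = np.linalg.solve(R[:n, :n], c[:n])     # x = R_bar^(-1) c_bar (back-substitution)
    residual = np.sqrt(np.sum(c[n:] ** 2))    # min ||b - Ax|| = (sum of c_j^2, j = n+1, ..., m)^(1/2)

    print(np.allclose(x, np.linalg.lstsq(A, b, rcond=None)[0]))   # agrees with a library solver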
There are a number of ways to achieve the decomposition (4); e.g., one could apply a sequence of plane rotations to annihilate the elements below the diagonal of A. A very effective method to realize the decomposition (4) is via Householder transformations [1]. Let A = A^(1), and let A^(2), A^(3), ..., A^(n+1) be defined as follows:

    A^(k+1) = P^(k) A^(k)    (k = 1, 2, ..., n) .

P^(k) is a symmetric, orthogonal matrix of the form

    P^(k) = I − 2 w^(k) (w^(k))^T

for suitable w^(k) such that (w^(k))^T w^(k) = 1. A derivation of P^(k) is given in [5]. In order to simplify the calculations, we redefine P^(k) as follows:

    P^(k) = I − β_k u^(k) (u^(k))^T

where σ_k = ( Σ_{i=k}^{m} (a_{ik}^{(k)})² )^(1/2), β_k = [ σ_k ( σ_k + |a_{kk}^{(k)}| ) ]^(-1), and

    u_i^{(k)} = 0                                      for i < k ,
    u_k^{(k)} = sgn(a_{kk}^{(k)}) ( σ_k + |a_{kk}^{(k)}| ) ,
    u_i^{(k)} = a_{ik}^{(k)}                           for i > k .
Thus

    A^(k+1) = A^(k) − u^(k) ( β_k (u^(k))^T A^(k) ) .
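One such step might be transcribed as follows (our own sketch, assuming NumPy and 0-based indexing, so the paper's step k corresponds to index k − 1 here; this is not Golub's published code):

    import numpy as np

    def householder_step(A, k):
        # Apply P^(k) = I - beta_k u^(k) u^(k)^T to A^(k); return A^(k+1) and sigma_k.
        A = A.copy()
        sigma = np.sqrt(np.sum(A[k:, k] ** 2))              # sigma_k
        if sigma == 0.0:                                    # the rank of A is less than n
            return A, 0.0
        sgn = 1.0 if A[k, k] >= 0 else -1.0                 # sgn(a_kk), taking sgn(0) = +1
        beta = 1.0 / (sigma * (sigma + abs(A[k, k])))       # beta_k
        u = np.zeros(A.shape[0])
        u[k] = sgn * (sigma + abs(A[k, k]))                 # u_k^(k)
        u[k + 1:] = A[k + 1:, k]                            # u_i^(k) = a_ik^(k) for i > k
        return A - np.outer(u, beta * (u @ A)), sigma       # A^(k) - u^(k) (beta_k u^(k)^T A^(k))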
After P^(k) has been applied to A^(k), A^(k+1) appears as follows:

    A^(k+1) = [ R^(k+1)   *  ]
              [    0      *  ]

where R^(k+1) is a k × k upper triangular matrix which is unchanged by subsequent transformations (the starred blocks are not yet in triangular form). Now

    a_{kk}^{(k+1)} = − (sgn a_{kk}^{(k)}) σ_k ,

so that the rank of A is less than n if σ_k = 0.
Clearly,

    R = A^(n+1)   and   Q = P^(n) P^(n−1) ... P^(1) ,

although one need not compute Q explicitly.
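Continuing the sketch above (and reusing householder_step from it), the full decomposition can be driven as follows; each P^(k) is applied to the right-hand side as it is generated, so Q itself is never formed:

    import numpy as np

    def householder_decompose(A, b):
        # Produce R = A^(n+1) and c = Qb by applying the successive P^(k) to [A | b].
        n = A.shape[1]
        work = np.column_stack([A, b]).astype(float)
        for k in range(n):
            work, sigma = householder_step(work, k)
            if sigma == 0.0:
                raise ValueError("rank of A is less than n")
        return work[:, :n], work[:, n]        # R (triangular in its first n rows) and c

The least squares solution then follows by back-substitution on the first n rows, exactly as above.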
Let x̄ be the initial solution obtained, and let x = x̄ + e. Then

    ||b − Ax|| = ||r − Ae|| ,

where r = b − Ax̄ is the residual vector. Thus the correction vector e is itself the solution to a linear least squares problem. Once A has been decomposed, it is a fairly simple matter to compute r and solve for e. Since e depends critically upon the residual vector, the components of r should be computed using double precision inner products and then rounded to single precision accuracy. Naturally, one should continue to iterate as long as improved estimates of x are obtained.

The above iteration technique will converge only if the initial approximation to x is sufficiently accurate. Let

    x^(q+1) = x^(q) + e^(q)    (q = 0, 1, ...)

with x^(0) = 0. Then if ||e^(1)|| / ||x^(1)|| > c, where c < 1/2 (i.e., if the condition "at least one bit of the initial solution is correct" fails to hold), one should not iterate, since there is little likelihood that the iterative method will converge. Since convergence tends to be linear, one should terminate the procedure as soon as ||e^(k+1)|| > c ||e^(k)||.
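The refinement loop, including both stopping rules, might be sketched as follows (our own illustration; solve_lls stands for any least squares solver that reuses the decomposition of A, for instance the orthogonal-decomposition solve sketched earlier, and np.longdouble stands in for the double precision residual accumulation):

    import numpy as np

    def refine(A, b, solve_lls, c=0.5, max_iter=20):
        # Iterative refinement of a least squares solution; solve_lls(A, r) returns the
        # least squares solution of min ||r - Ae||, reusing the decomposition of A.
        def residual(x):
            # accumulate the residual in extended precision, then round back
            return np.asarray(b - A.astype(np.longdouble) @ x, dtype=float)

        x = solve_lls(A, b)                                   # x^(1), starting from x^(0) = 0
        e = solve_lls(A, residual(x))                         # first correction e^(1)
        if np.linalg.norm(e) > c * np.linalg.norm(x):         # not even one bit correct: do not iterate
            return x
        for _ in range(max_iter):
            x = x + e
            e_next = solve_lls(A, residual(x))
            if np.linalg.norm(e_next) > c * np.linalg.norm(e):    # convergence is linear: terminate
                return x
            e = e_next
        return x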
References
[1] A. S. Householder: Unitary Triangularization of a Nonsymmetric Matrix, J. Assoc. Comput. Mach., Vol. 5 (1958), pp. 339–342.
[2] P. Läuchli: Jordan-Elimination und Ausgleichung nach kleinsten Quadraten, Numer. Math., Vol. 3 (1961), pp. 226–240.
[3] Y. Linnik: Method of Least Squares and Principles of the Theory of Observations, translated from Russian by R. C. Elandt, Pergamon Press, New York, 1961.
[4] E. E. Osborne: On Least Squares Solutions of Linear Equations, J. Assoc. Comput. Mach., Vol. 8 (1961), pp. 628–636.
[5] J. H. Wilkinson: Householder's Method for the Solution of the Algebraic Eigenproblem, Comput. J., Vol. 3 (1960), pp. 23–27.
Gene H. Golub, Computation Center, Stanford University, Stanford, California, U.S.A.
Citations
Journal ArticleDOI
TL;DR: Numerical tests are described comparing LSQR with several other conjugate-gradient algorithms, indicating that LSQR is the most reliable algorithm when A is ill-conditioned.
Abstract: An iterative method is given for solving Ax = b and min ||Ax − b||₂, where the matrix A is large and sparse. The method is based on the bidiagonalization procedure of Golub and Kahan. It is analytically equivalent to the standard method of conjugate gradients, but possesses more favorable numerical properties. Reliable stopping criteria are derived, along with estimates of standard errors for x and the condition number of A. These are used in the FORTRAN implementation of the method, subroutine LSQR. Numerical tests are described comparing LSQR with several other conjugate-gradient algorithms, indicating that LSQR is the most reliable algorithm when A is ill-conditioned.

4,189 citations

Journal ArticleDOI
TL;DR: The authors propose an alternative learning procedure based on the orthogonal least-squares method, which provides a simple and efficient means for fitting radial basis function networks.
Abstract: The radial basis function network offers a viable alternative to the two-layer neural network in many applications of signal processing. A common learning algorithm for radial basis function networks is based on first choosing randomly some data points as radial basis function centers and then using singular-value decomposition to solve for the weights of the network. Such a procedure has several drawbacks, and, in particular, an arbitrary selection of centers is clearly unsatisfactory. The authors propose an alternative learning procedure based on the orthogonal least-squares method. The procedure chooses radial basis function centers one by one in a rational way until an adequate network has been constructed. In the algorithm, each selected center maximizes the increment to the explained variance or energy of the desired output and does not suffer numerical ill-conditioning problems. The orthogonal least-squares learning strategy provides a simple and efficient means for fitting radial basis function networks. This is illustrated using examples taken from two different signal processing applications. >

3,414 citations

Journal ArticleDOI
Donald W. Marquardt
TL;DR: In this article, the authors discuss a class of biased linear estimators employing generalized inverses and establish a unifying perspective on nonlinear estimation from nonorthogonal data.
Abstract: A principal objective of this paper is to discuss a class of biased linear estimators employing generalized inverses. A second objective is to establish a unifying perspective. The paper exhibits theoretical properties shared by generalized inverse estimators, ridge estimators, and corresponding nonlinear estimation procedures. From this perspective it becomes clear why all these methods work so well in practical estimation from nonorthogonal data.

1,828 citations

Journal ArticleDOI
TL;DR: An algorithm for the analysis of multivariate data is presented and discussed in terms of specific examples; it seeks to find one- and two-dimensional linear projections of multivariate data that are relatively highly revealing.
Abstract: An algorithm for the analysis of multivariate data is presented and is discussed in terms of specific examples. The algorithm seeks to find one- and two-dimensional linear projections of multivariate data that are relatively highly revealing.

1,635 citations

References
Journal ArticleDOI
TL;DR: This note points out that the same result can be obtained with fewer arithmetic operations, and, in particular, for inverting a square matrix of order N, at most 2(N-1) square roots are required.
Abstract: A method for the inversion of a nonsymmetric matrix has been in use at ORNL and has proved to be highly stable numerically but to require a rather large number of arithmetic operations, including a total of N(N-1)/2 square roots. This note points out that the same result can be obtained with fewer arithmetic operations, and, in particular, for inverting a square matrix of order N, at most 2(N-1) square roots are required. For N > 4, this is a savings of (N-4)(N-1)/4 square roots. (T.B.A.)

577 citations


"Numerical methods for solving linea..." refers methods in this paper

  • ...There are a number of ways to achieve the decomposition (2.1); e.g., one could apply a sequence of plane rotations to annihilate the elements below the diagonal of A. A very effective method to realize the decomposition (2.1) is via HOUSEHOLDER transformations [ 2 ]....

    [...]

  • ...In [ 2 ], HOUSEHOLDER stressed the use of orthogonal transformations for solving linear least squares problems....

    [...]

Journal ArticleDOI
TL;DR: This paper determines error bounds for a number of the most effective direct methods of inverting a matrix by analyzing the effect of the rounding errors made in the solution of the equations.
Abstract: 1. In order to assess the relative effectiveness of methods of inverting a matrix it is useful to have a priori bounds for the errors in the computed inverses. In this paper we determine such error bounds for a number of the most effective direct methods. To illustrate fully the techniques we have used, some of the analysis has been done for floating-point computation and some for fixed-point. In all cases it has been assumed that the computation has been performed using a precision of t binary places, though it should be appreciated that on a computer which has both fixed and floating-point facilities the number of permissible digits in a fixed-point number is greater than the number of digits in the mantissa of a floating-point number. The techniques used for analyzing floating-point computation are essentially those of [8], and a familiarity with that paper is assumed. 2. The error bounds are most conveniently expressed in terms of vector and matrix norms, and throughout we have used the Euclidean vector norm and the spectral matrix norm except when explicit reference is made to the contrary. For convenience the main properties of these norms are given in Section 9. In a recent paper [7] we analyzed the effect of the rounding errors made in the solution of the equations

381 citations

Journal ArticleDOI

105 citations


"Numerical methods for solving linea..." refers methods in this paper

  • ...In [ 7 ], RILEY suggested the following algorithm for solving linear least squares problems for r=n....

    [...]