LSQR: An Algorithm for Sparse Linear Equations and Sparse Least Squares
Summary
Linear least squares: minimize ‖Ax − b‖₂
- A typical application is to the large least-squares problems arising from analysis of variance (e.g., [12]).
- They are characterized by their need for only a few vectors of working storage and by their theoretical convergence within at most n iterations (if exact arithmetic could be performed).
- These properties occur naturally in many applications.
- In other cases it is often possible to divide the solution procedure into a direct and an iterative part, such that the iterative part has a better conditioned matrix for which CG-like methods will converge more quickly.
- The FORTRAN implementation of LSQR [22] is designed for practical application.
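As a quick way to try the method today, SciPy ships a translation of this algorithm as `scipy.sparse.linalg.lsqr` (using SciPy here is our assumption; the implementation described in [22] is FORTRAN):

```python
import numpy as np
from scipy.sparse import random as sparse_random
from scipy.sparse.linalg import lsqr

rng = np.random.default_rng(0)
# A small sparse compatible system: 30 equations, 10 unknowns.
A = sparse_random(30, 10, density=0.5, random_state=rng, format="csr")
x_true = rng.standard_normal(10)
b = A @ x_true

# LSQR needs only the products A @ v and A.T @ u plus a few work vectors.
x, istop, itn, r1norm = lsqr(A, b, atol=1e-12, btol=1e-12)[:4]
print(istop, itn, r1norm)
```

Since b lies in the range of A, the system is compatible and the final residual norm `r1norm` is at the rounding level.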
2. MOTIVATION VIA THE LANCZOS PROCESS
- In any event, eq. (2.2) holds to within machine precision.
- Hence xk may be taken as the exact solution to a perturbed system and will solve the original system whenever ηkβk+1 is negligibly small.
- The above arguments are not complete, but they provide at least some motivation for defining the sequence of vectors {Xk} according to (2.3) .
- In particular, the method of conjugate gradients is known to be equivalent to using the Cholesky factorization Tk = LkDkLkT and is reliable when B (and hence Tk) is positive definite, while algorithm SYMMLQ employs the orthogonal factorization Tk = L̄kQk to retain stability for arbitrary symmetric B. (See [20] for further details of these methods.)
min ‖ [A; λI] x − [b; 0] ‖₂
- When the Lanczos process is applied to this system, a certain structure appears in relations (2.1)-(2.3).
- This observation forms the basis of algorithm LSQR.
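The bidiagonalization that produces this structure can be sketched in NumPy as follows (variable names are ours; no reorthogonalization is performed, which is acceptable for a few steps):

```python
import numpy as np

def bidiag1(A, b, k):
    """k steps of Golub-Kahan bidiagonalization:
    A @ V_k = U_{k+1} @ B_k with B_k lower bidiagonal."""
    m, n = A.shape
    U = np.zeros((m, k + 1))
    V = np.zeros((n, k))
    alphas, betas = [], []
    beta = np.linalg.norm(b)
    U[:, 0] = b / beta                      # beta_1 u_1 = b
    v = A.T @ U[:, 0]
    alpha = np.linalg.norm(v)
    V[:, 0] = v / alpha                     # alpha_1 v_1 = A^T u_1
    alphas.append(alpha)
    for i in range(k):
        u = A @ V[:, i] - alphas[i] * U[:, i]
        beta = np.linalg.norm(u)
        betas.append(beta)
        U[:, i + 1] = u / beta              # beta_{i+1} u_{i+1} = A v_i - alpha_i u_i
        if i + 1 < k:
            v = A.T @ U[:, i + 1] - beta * V[:, i]
            alpha = np.linalg.norm(v)
            alphas.append(alpha)
            V[:, i + 1] = v / alpha         # alpha_{i+1} v_{i+1} = A^T u_{i+1} - beta_{i+1} v_i
    # Assemble the (k+1) x k lower-bidiagonal B_k.
    B = np.zeros((k + 1, k))
    for i in range(k):
        B[i, i] = alphas[i]
        B[i + 1, i] = betas[i]
    return U, B, V

rng = np.random.default_rng(1)
A = rng.standard_normal((8, 5))
b = rng.standard_normal(8)
U, B, V = bidiag1(A, b, 4)
print(np.allclose(A @ V, U @ B))  # the fundamental recurrence holds
```

The recurrence A Vk = Uk+1 Bk holds to machine precision by construction; it is the orthogonality of Uk and Vk that erodes in longer runs.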
3.1 Relationship Between the Bidiagonalizations
- The principal connection between the two bidiagonalization procedures is that the matrices Vk are the same for each, and that the identity BkTBk = RkTRk (3.9) holds.
- The rather surprising conclusion is that Rk must be identical to the matrix that would be obtained from the conventional QR factorization of Bk.
Thus
- In the presence of rounding errors, these identities will, of course, cease to hold.
- They throw light on the advantages of algorithm LSQR over two earlier methods, LSCG and LSLQ, as discussed in Section 7.4.
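The identity BkTBk = RkTRk, and hence the coincidence of Rk with the QR factor of Bk, can be checked numerically on any lower-bidiagonal matrix (the construction below is ours):

```python
import numpy as np

rng = np.random.default_rng(2)
k = 5
# A (k+1) x k lower-bidiagonal matrix, as produced by the bidiagonalization.
B = np.zeros((k + 1, k))
B[np.arange(k), np.arange(k)] = rng.uniform(1, 2, k)         # diagonal (alphas)
B[np.arange(1, k + 1), np.arange(k)] = rng.uniform(1, 2, k)  # subdiagonal (betas)

# QR factorization of B_k; its triangular factor plays the role of R_k.
R = np.linalg.qr(B, mode="r")
print(np.allclose(B.T @ B, R.T @ R))
```

Note that R comes out upper bidiagonal (up to signs), which is what makes the LSQR recurrences so short.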
- The relationship between the orthonormal matrices Uk and Pk can be shown to hold for some vector rk.
- The vectors yk and tk+1 could then be found from the corresponding recurrence, and only the most recent iterates need be saved.
- The broad outline of algorithm LSQR is now complete.
5. ESTIMATION OF NORMS
- For the purposes of estimating norms, the authors often assume that the orthogonality relations UkTUk = I and VkTVk = I hold.
- In actual computations these are rarely true, but the resulting estimates have proved to be remarkably reliable.
- Clearly the product of sines in (5.2) decreases monotonically.
5.4 Standard Errors
- This facility has not been implemented in LSQR and the authors have not investigated the accuracy of such approximations.
- Clearly only a limited number of pairs (i, j) could be dealt with efficiently on large problems.
6. STOPPING CRITERIA
- The user may therefore set ATOL and BTOL according to the accuracy of the data.
- It can be seen that these perturbations are within their allowable bounds when the inequality of S1 holds.
- Hence, criterion S1 is consistent with the ideas of backward rounding error analysis and with knowledge of data accuracy.
- Since this argument does not depend on orthogonality, S1 can be used in any method for solving compatible linear systems.
6.2 Incompatible Systems
- In their particular method it happens that Ekxk = 0, since (5.3) shows that xk = Vkyk is theoretically orthogonal to ATrk.
- This strengthens the case for using rule S2.
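Concretely, the two rules can be sketched as follows; the helper below is ours, and it computes ‖A‖ exactly, whereas LSQR itself uses cheap running estimates:

```python
import numpy as np

def should_stop(A, b, x, atol=1e-8, btol=1e-8):
    """Stopping rules S1 and S2. S1 targets compatible systems,
    S2 targets incompatible (least-squares) systems."""
    r = b - A @ x
    rnorm = np.linalg.norm(r)
    anorm = np.linalg.norm(A, 2)
    # S1: ||r|| <= BTOL ||b|| + ATOL ||A|| ||x||  -> backward-stable solve.
    s1 = rnorm <= btol * np.linalg.norm(b) + atol * anorm * np.linalg.norm(x)
    # S2: ||A^T r|| <= ATOL ||A|| ||r||  -> least-squares optimality.
    s2 = True if rnorm == 0 else np.linalg.norm(A.T @ r) <= atol * anorm * rnorm
    return s1, s2

rng = np.random.default_rng(3)
A = rng.standard_normal((20, 6))
b = rng.standard_normal(20)
x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)
print(should_stop(A, b, x_ls))  # S2 fires at the least-squares solution; S1 cannot
```

On an incompatible system only S2 can be satisfied, since the residual never becomes small; this is exactly why both rules are needed.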
6.4 Singular Systems
- For large enough k, these vectors will lie very nearly in the null space of A.
- The vector xk may reveal to the user where certain unexpected dependencies exist.
7. OTHER METHODS
- Several other conjugate-gradient methods are discussed here.
- All except the first (CGLS) are stated using notation consistent with Sections 3-6 in order to illustrate certain analytic identities.
Algorithm C G L S
- Otherwise the method is clearly simple and economical.
- The vectors vi+1 and di in LSQR are proportional to si and pi, respectively.
- Note that qi and si just given can share the same workspace.
- A FORTRAN implementation of CGLS has been given by Björck and Elfving [3].
- This incorporates an acceleration technique in a way that requires minimal additional storage.
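For reference, the CGLS recurrences can be sketched as follows (a plain transcription of the Hestenes-Stiefel method, not the Björck-Elfving code, and without their acceleration technique):

```python
import numpy as np

def cgls(A, b, iters=100, tol=1e-12):
    """CGLS: conjugate gradients applied to the normal equations
    A^T A x = A^T b, arranged so that A^T A is never formed explicitly."""
    x = np.zeros(A.shape[1])
    r = b.astype(float).copy()     # r = b - A x  (x starts at zero)
    s = A.T @ r                    # normal-equations residual A^T r
    p = s.copy()
    gamma = s @ s
    for _ in range(iters):
        q = A @ p                  # the one product A v of this iteration
        alpha = gamma / (q @ q)
        x += alpha * p
        r -= alpha * q
        s = A.T @ r                # the one product A^T u of this iteration
        gamma_new = s @ s
        if np.sqrt(gamma_new) <= tol:
            break
        p = s + (gamma_new / gamma) * p
        gamma = gamma_new
    return x

rng = np.random.default_rng(4)
A = rng.standard_normal((30, 8))
b = rng.standard_normal(30)
x = cgls(A, b)
x_ref, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(x, x_ref))
```

As the summary notes, qi and si can share workspace in a tight implementation; the sketch above keeps them separate for clarity.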
7.7 Storage and Work
- All methods require one product Av and one product ATv per iteration.
- This could dominate the work requirements in some applications.
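One product of each kind per iteration can be observed directly by wrapping A in a counting `LinearOperator` and running SciPy's `lsqr` (using SciPy here is our assumption; the original implementation is FORTRAN):

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, lsqr

rng = np.random.default_rng(5)
M = rng.standard_normal((40, 10))
counts = {"Av": 0, "ATu": 0}

def matvec(v):
    counts["Av"] += 1       # one forward product per iteration
    return M @ v

def rmatvec(u):
    counts["ATu"] += 1      # one transpose product per iteration (plus setup)
    return M.T @ u

A = LinearOperator(M.shape, matvec=matvec, rmatvec=rmatvec, dtype=float)
b = rng.standard_normal(40)
itn = lsqr(A, b)[2]
print(itn, counts)  # both counts track the iteration number closely
```

This matrix-free interface is precisely why the iteration counts in Sections 7-8 translate directly into work estimates.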
8.1 Generation of Test Problems
- It is clear that results obtained from them could be very misleading.
- This possibility is discussed in Section 8.4.
- In fairness, Chen [4] did not intend RRLS to be applied to compatible systems.
- At first sight it may appear that the residual-reducing methods possess some advantage on least-squares problems.
- This anomalous behavior cannot be guaranteed; for example, it did not occur on P(80, 40, 4, 6), where LSQR obtained more accurate solutions in fewer iterations.
9. SUMMARY
- (1) The symmetric conjugate-gradient algorithm should be applied to the normal equations ATAx = ATb only if there is reason to believe that very few iterations will produce a satisfactory estimate of x.
- (2) The least-squares adaptation of symmetric CG will always be more reliable, at the expense of slightly more storage and work per iteration. (This is the algorithm of Hestenes and Stiefel, described in Section 7.1 of this paper as algorithm CGLS.)
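Recommendation (1) reflects the squared conditioning of the normal equations, cond(ATA) = cond(A)². A minimal numerical illustration (the matrix construction is ours):

```python
import numpy as np

rng = np.random.default_rng(6)
# Build an ill-conditioned A from prescribed singular values.
U, _ = np.linalg.qr(rng.standard_normal((30, 8)))
V, _ = np.linalg.qr(rng.standard_normal((8, 8)))
sigma = np.logspace(0, -4, 8)          # cond(A) = 1e4 by construction
A = U @ np.diag(sigma) @ V.T

print(np.linalg.cond(A))               # about 1e4
print(np.linalg.cond(A.T @ A))         # about 1e8: the square
```

Methods such as CGLS and LSQR work with A itself, so their accuracy is governed by cond(A) rather than cond(ATA).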