
Showing papers on "QR decomposition" published in 1991


Proceedings ArticleDOI
01 May 1991
TL;DR: An algorithm that improves the locality of a loop nest by transforming the code via interchange, reversal, skewing and tiling is proposed, and is successful in optimizing codes such as matrix multiplication, successive over-relaxation, LU decomposition without pivoting, and Givens QR factorization.
Abstract: This paper proposes an algorithm that improves the locality of a loop nest by transforming the code via interchange, reversal, skewing and tiling. The loop transformation algorithm is based on two concepts: a mathematical formulation of reuse and locality, and a loop transformation theory that unifies the various transforms as unimodular matrix transformations. The algorithm has been implemented in the SUIF (Stanford University Intermediate Format) compiler, and is successful in optimizing codes such as matrix multiplication, successive over-relaxation (SOR), LU decomposition without pivoting, and Givens QR factorization. Performance evaluation indicates that locality optimization is especially crucial for scaling up the performance of parallel code.

1,352 citations
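
As a rough illustration of the interchange and tiling transformations named in the abstract above, the sketch below applies them by hand to matrix multiplication. It is not output of the SUIF compiler; the function names and the `tile` parameter are assumptions made for this example, and pure Python will not show the cache effects, only the reordered iteration space.

```python
def matmul_naive(a, b):
    """Reference i-j-k loop nest for C = A B on lists of lists."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            for k in range(m):
                c[i][j] += a[i][k] * b[k][j]
    return c


def matmul_tiled(a, b, tile=2):
    """Same computation after interchange (k outside j) and tiling of k and j.

    Each (kk, jj) tile of B is reused across all rows i before moving on,
    which is the kind of reuse a locality optimizer tries to expose.
    """
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for kk in range(0, m, tile):            # tile loop over k
        for jj in range(0, p, tile):        # tile loop over j
            for i in range(n):
                for k in range(kk, min(kk + tile, m)):
                    aik = a[i][k]           # loop-invariant in the j loop
                    for j in range(jj, min(jj + tile, p)):
                        c[i][j] += aik * b[k][j]
    return c
```

Both functions accumulate each `c[i][j]` over `k` in the same order, so they should agree for any tile size; only the traversal order of the iteration space differs.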


Journal ArticleDOI
TL;DR: In this paper, a method for structural analysis of multivariate data is proposed that combines features of regression analysis and principal component analysis, which is based on the generalized singular value decomposition of a matrix with certain metric matrices.
Abstract: A method for structural analysis of multivariate data is proposed that combines features of regression analysis and principal component analysis. In this method, the original data are first decomposed into several components according to external information. The components are then subjected to principal component analysis to explore structures within the components. It is shown that this requires the generalized singular value decomposition of a matrix with certain metric matrices. The numerical method based on the QR decomposition is described, which simplifies the computation considerably. The proposed method includes a number of interesting special cases, whose relations to existing methods are discussed. Examples are given to demonstrate practical uses of the method.

188 citations


Journal ArticleDOI
TL;DR: The authors show that fast QR methods and lattice methods in least squares adaptive filtering are duals and follow from identical geometric principles, and develop a fast least squares algorithm of minimal complexity that is a hybrid between a QR and a lattice algorithm.
Abstract: The authors show that fast QR methods and lattice methods in least squares adaptive filtering are duals and follow from identical geometric principles. Whereas the lattice methods compute the residuals of a projection operation via the forward and backward prediction errors, the QR methods compute instead the weights used in the projections. Within this framework, the parameter identification problem is solved using fast QR methods by showing that the reflection coefficients and tap parameters of a least squares lattice filter operating in the joint process mode are immediately available as internal variables in the fast QR algorithms. This parameter set can be readily exploited in system identification, signal analysis, and linear predictive coding, for example. The relations derived also lead to a fast least squares algorithm of minimal complexity that is a hybrid between a QR and a lattice algorithm. The algorithm combines the order recursive properties of the lattice approach with the robust numerical behavior of the QR approach.

104 citations


Journal ArticleDOI
TL;DR: Lower bounds and upper bounds on |G|/|L| in terms of |E|/|A| are given, and perturbation bounds are given for the QR factorization of a complex $m \times n$ matrix $A$ of rank $n$.
Abstract: Let $A$ and $A+E$ be Hermitian positive definite matrices. Suppose that $A = LL^H$ and $A+E = (L+G)(L+G)^H$ are the Cholesky factorizations of $A$ and $A+E$, respectively. In this paper lower bounds and upper bounds on |G|/|L| in terms of |E|/|A| are given. Moreover, perturbation bounds are given for the QR factorization of a complex $m \times n$ matrix $A$ of rank $n$.

79 citations
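
One standard link between the two factorizations in the entry above, given here as background rather than as a result taken from the paper: if $A = QR$ is the QR factorization of a full-rank $m \times n$ matrix with $Q^H Q = I$ and the diagonal of $R$ positive, then

$$
A^H A = R^H Q^H Q R = R^H R ,
$$

so $L = R^H$ is the Cholesky factor of the Hermitian positive definite matrix $A^H A$. A perturbation $A \to A + F$ (the symbol $F$ is introduced here only for illustration) changes $A^H A$ by $E = A^H F + F^H A + F^H F$, which suggests how bounds on the Cholesky factor can carry over to bounds on $R$.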


Journal ArticleDOI
TL;DR: This paper presents a square root and division free Givens rotation (SDFG) to be applied to the QR-decomposition (QRD) for solving linear least squares problems on systolic arrays.
Abstract: This paper presents a square root and division free Givens rotation (SDFG) to be applied to the QR-decomposition (QRD) for solving linear least squares problems on systolic arrays. The SDFG is based on a special kind of number description of the matrix elements and can be executed by mere application of multiplications and additions. Therefore, it is highly suited for the VLSI-implementation of the QRD on systolic arrays. Roundoff error and stability analyses indicate that the SDFG is numerically as stable as known Givens rotation methods.

65 citations
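
For orientation, the sketch below shows the conventional Givens-rotation QR factorization that the entry above takes as its baseline. It does use square roots and divisions (through `math.hypot` and the `c`, `s` computation), so it is not the SDFG variant; the function name and the list-of-lists layout are assumptions of this example.

```python
import math


def givens_qr(a):
    """Return (q, r) with a = q r, using standard Givens rotations.

    a is an m x n list of lists; q is m x m orthogonal, r is m x n
    upper triangular. Entries are zeroed column by column, bottom up.
    """
    m, n = len(a), len(a[0])
    r = [row[:] for row in a]
    q = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
    for j in range(n):
        for i in range(m - 1, j, -1):
            x, y = r[i - 1][j], r[i][j]
            h = math.hypot(x, y)
            if h == 0.0:
                continue                     # nothing to eliminate
            c, s = x / h, y / h              # rotation maps (x, y) to (h, 0)
            for k in range(n):               # rotate rows i-1 and i of r
                u, v = r[i - 1][k], r[i][k]
                r[i - 1][k] = c * u + s * v
                r[i][k] = -s * u + c * v
            for k in range(m):               # accumulate q <- q G^T
                u, v = q[k][i - 1], q[k][i]
                q[k][i - 1] = c * u + s * v
                q[k][i] = -s * u + c * v
    return q, r
```

Each rotation costs one square root and two divisions here, which is the per-rotation overhead a square-root- and division-free formulation aims to avoid in hardware.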


Journal ArticleDOI
01 Aug 1991
TL;DR: The least squares lattice algorithm for adaptive filtering based on the technique of QR decomposition (QRD) is derived from first principles and only requires O(p) operations for the solution of a pth order problem.
Abstract: The least squares lattice algorithm for adaptive filtering based on the technique of QR decomposition (QRD) is derived from first principles. In common with other lattice algorithms for adaptive filtering, this algorithm only requires O(p) operations for the solution of a pth order problem. The algorithm has as its root the QRD-based recursive least squares minimisation algorithm and hence is expected to have superior numerical properties when compared with other fast algorithms. This algorithm contains within it the QRD-based lattice algorithm for solving the least squares linear prediction problem. The algorithm is presented in two forms: one that involves taking square-roots and one that does not. The relationship between the QRD-based lattice algorithm and other least squares lattice algorithms is briefly discussed. The results of some computer simulations of a channel equaliser, using finite-precision floating-point arithmetic, are presented.

59 citations


Journal ArticleDOI
TL;DR: A framework is presented for the efficient implementation of RRQR algorithms, in particular for sparse matrices: an initial QR-factorization is computed using a restricted pivoting strategy guarded by incremental condition estimation (ICE), and the algorithm suggested by Chan and Foster is then applied to this QR-factorization.
Abstract: The rank-revealing QR-factorization (RRQR factorization) is a special QR-factorization that is guaranteed to reveal the numerical rank of the matrix under consideration. This makes the RRQR-factorization a useful tool in the numerical treatment of many rank-deficient problems in numerical linear algebra. In this paper, a framework is presented for the efficient implementation of RRQR algorithms, in particular, for sparse matrices. A sparse RRQR-algorithm should seek to preserve the structure and sparsity of the matrix as much as possible while retaining the ability to capture safely the numerical rank. To this end, the paper proposes to compute an initial QR-factorization using a restricted pivoting strategy guarded by incremental condition estimation (ICE), and then applies the algorithm suggested by Chan and Foster to this QR-factorization. The column exchange strategy used in the initial QR factorization will exploit the fact that certain column exchanges do not change the sparsity structure, and compu...

47 citations


Journal ArticleDOI
TL;DR: In this article, the authors compared the Stieltjes procedure and a method in which an inverse eigenvalue problem for a tridiagonal symmetric matrix is solved by an algorithm proposed by Rutishauser, Gragg, and Harrod.
Abstract: Let f and g be functions defined at the real and distinct nodes $x_k $, and consider the inner product $( f,g ): = \sum_{k = 1}^m f ( x_k ) g ( x_k ) w_k^2 $ with positive weights $w_k^2 $. The present paper discusses the computation of orthonormal polynomials $\pi _0 ,\pi _1 , \cdots ,\pi _{n - 1} ,n\leqq m$, with respect to this inner product, and the use of these polynomials in a fast scheme for computing a QR decomposition of the transpose of Vandermonde-like matrices. Two methods are compared for computing the recurrence coefficients for the polynomials $\pi _j $ and their values at the nodes $x_k $: the Stieltjes procedure and a method in which an inverse eigenvalue problem for a tridiagonal symmetric matrix is solved by an algorithm proposed by Rutishauser, Gragg, and Harrod. The latter method is found to generally yield higher accuracy than the Stieltjes procedure if n is close to m, and roughly the same accuracy otherwise. This method for solving an inverse eigenvalue problem is applied in an alg...

39 citations


Journal ArticleDOI
TL;DR: This work extends existing results to show that fixed precision iterative refinement renders an arbitrary linear equations solver backward stable in a strong, componentwise sense, under suitable assumptions.
Abstract: Iterative refinement is a well-known technique for improving the quality of an approximate solution to a linear system. In the traditional usage residuals are computed in extended precision, but more recent work has shown that fixed precision is sufficient to yield benefits for stability. We extend existing results to show that fixed precision iterative refinement renders an arbitrary linear equations solver backward stable in a strong, componentwise sense, under suitable assumptions. Two particular applications involving the QR factorization are discussed in detail: solution of square linear systems and solution of least squares problems. In the former case we show that one step of iterative refinement suffices to produce a small componentwise relative backward error. Our results are weaker for the least squares problem, but again we find that iterative refinement improves a componentwise measure of backward stability. In particular, iterative refinement mitigates the effect of poor row scaling of the coefficient matrix, and so provides an alternative to the use of row interchanges in the Householder QR factorization. A further application of the results is described to fast methods for solving Vandermonde-like systems.

39 citations
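
The square-system case in the entry above can be sketched as follows: factor once by QR, solve, then correct the solution using a residual computed in the same working precision. This is a minimal illustration under stated assumptions, not the paper's analysis; it uses Givens rotations rather than Householder reflections for brevity, assumes a nonsingular matrix, and all function names are hypothetical. The error measure is the usual componentwise relative backward error, max over i of |b - Ax|_i / (|A||x| + |b|)_i.

```python
import math


def qr_solver(a):
    """Factor square, nonsingular a by Givens rotations; return solve(b)."""
    n = len(a)
    r = [row[:] for row in a]
    rots = []                                # rotations, in application order
    for j in range(n):
        for i in range(n - 1, j, -1):
            x, y = r[i - 1][j], r[i][j]
            h = math.hypot(x, y)
            if h == 0.0:
                continue
            c, s = x / h, y / h
            rots.append((i, c, s))
            for k in range(j, n):            # columns < j are already zero
                u, v = r[i - 1][k], r[i][k]
                r[i - 1][k], r[i][k] = c * u + s * v, -s * u + c * v

    def solve(b):
        z = list(b)
        for i, c, s in rots:                 # z = Q^T b
            u, v = z[i - 1], z[i]
            z[i - 1], z[i] = c * u + s * v, -s * u + c * v
        x = [0.0] * n
        for i in range(n - 1, -1, -1):       # back substitution on R x = z
            acc = z[i] - sum(r[i][k] * x[k] for k in range(i + 1, n))
            x[i] = acc / r[i][i]
        return x

    return solve


def refine(a, b, steps=1):
    """Solve a x = b by QR, then apply fixed-precision refinement steps."""
    solve = qr_solver(a)
    x = solve(b)
    for _ in range(steps):
        res = [bi - sum(aij * xj for aij, xj in zip(row, x))
               for row, bi in zip(a, b)]
        d = solve(res)                       # correction from the same factors
        x = [xi + di for xi, di in zip(x, d)]
    return x


def componentwise_backward_error(a, x, b):
    """max_i |b - a x|_i / (|a| |x| + |b|)_i, skipping zero denominators."""
    worst = 0.0
    for row, bi in zip(a, b):
        res = abs(bi - sum(aij * xj for aij, xj in zip(row, x)))
        scale = sum(abs(aij) * abs(xj) for aij, xj in zip(row, x)) + abs(bi)
        if scale > 0.0:
            worst = max(worst, res / scale)
    return worst
```

Reusing the stored rotations for the correction solve keeps each refinement step at the cost of one residual and one triangular solve, with no second factorization.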


Journal ArticleDOI
TL;DR: A new version of the Householder algorithm with column pivoting is presented for computing a QR factorization that identifies the rank and range space of a given matrix; it is well suited for implementation on a parallel machine, in particular a MIMD machine with distributed memory.
Abstract: This paper presents a new version of the Householder algorithm with column pivoting for computing a QR factorization that identifies rank and range space of a given matrix. The standard pivoting technique is not well suited for parallel computation, since it requires synchronization at every step in order to choose the next pivot column. In contrast, a restricted pivoting scheme that restricts the choice of pivot columns and avoids this synchronization constraint is employed. Incremental condition estimation is used to assess the effect that the addition of a candidate pivot column would have on the condition number of the matrix being generated. This safeguard ensures that this local strategy selects pivot columns that make sense in the global context of the computation. The resulting algorithm is well suited for implementation on a parallel machine, in particular, a MIMD machine with distributed memory. Simulations demonstrate that the numerical behavior of the restricted pivoting strategy is comparable to the traditional global pivoting strategy. Implementation results of the QR factorization algorithm without pivoting and with local and traditional pivoting on the Intel iPSC/1 and iPSC/2 hypercubes show that our scheme about halves the extra time required for pivoting.

37 citations


Journal ArticleDOI
TL;DR: The Fast Least Squares algorithms based on the QR triangular decomposition of the input signal matrix and developed for the case of one-dimensional signals can be extended to handle the cases of multi-dimensional (MD) or multichannel signals.

Proceedings ArticleDOI
14 Apr 1991
TL;DR: An algorithm for updating the null space of a matrix is described, based on a decomposition, which can be updated in O(N^2) and serves as an intermediary between the QR decomposition and the singular value decomposition.
Abstract: An algorithm for updating the null space of a matrix is described. The algorithm is based on a decomposition, called the URV decomposition, which can be updated in O(N^2) and serves as an intermediary between the QR decomposition and the singular value decomposition. The URV decomposition is applied to a high-resolution direction-of-arrival problem based on the MUSIC algorithm. A virtue of the updating algorithm is the running estimate of rank.

Journal ArticleDOI
TL;DR: Not only can a finite-precision QRD RLS systolic array be designed with a minimum wordlength that ensures correct operations, but also a fault-tolerant system that can detect a given error size and is false-alarm-free under the quantization effect can be provided.
Abstract: The QR decomposition recursive least-squares (QRD RLS) algorithm for mapping onto a systolic array for signal processing and communication applications is considered. Detailed analysis is presented to show that the rotation parameters of the RLS algorithm based on the Givens rotation method will eventually reach the quasi-steady-state if the forgetting factor lambda is very close to 1. With this model, the dynamic range of each processing cell can be derived, and from this a proper wordlength can be chosen to ensure correct operation of the algorithm. The proposed solutions are simple and effective. Simulations have demonstrated that the wordlengths chosen by the proposed dynamic range work well. The stability of the QRD RLS algorithm is demonstrated under a finite-precision implementation with this observation. Finally, the missing error detection and false alarm problems are considered based on the results obtained from the model. The wordlength is overflow-free without missing error detection and false alarm problems. The results in this study are of practical importance. Not only can a finite-precision QRD RLS systolic array be designed with a minimum wordlength that ensures correct operations, but also a fault-tolerant system that can detect a given error size and is false-alarm-free under the quantization effect can be provided.
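
The recursion analysed in the entry above can be sketched in floating point as follows: the triangular factor is scaled by the square root of the forgetting factor, and each new data row is rotated into it with Givens rotations. This is the conventional QRD RLS update, not the paper's fixed-point wordlength analysis; the function names, the in-place list layout, and the default `lam=1.0` are assumptions of this example.

```python
import math


def qrd_rls_update(r, z, x, d, lam=1.0):
    """Rotate one data row (x, d) into the triangular system R w = z, in place.

    r: n x n upper-triangular list of lists, z: length-n list,
    x: new input row, d: new desired sample, lam: forgetting factor.
    """
    n = len(x)
    root = math.sqrt(lam)
    x = list(x)                              # work on a copy of the input row
    for i in range(n):                       # exponentially down-weight the past
        for k in range(i, n):
            r[i][k] *= root
        z[i] *= root
    for i in range(n):                       # annihilate x[i] against r[i][i]
        h = math.hypot(r[i][i], x[i])
        if h == 0.0:
            continue
        c, s = r[i][i] / h, x[i] / h
        for k in range(i, n):
            u, v = r[i][k], x[k]
            r[i][k], x[k] = c * u + s * v, -s * u + c * v
        z[i], d = c * z[i] + s * d, -s * z[i] + c * d


def solve_upper(r, z):
    """Back substitution for the weight vector w in R w = z."""
    n = len(z)
    w = [0.0] * n
    for i in range(n - 1, -1, -1):
        acc = z[i] - sum(r[i][k] * w[k] for k in range(i + 1, n))
        w[i] = acc / r[i][i]
    return w
```

A small usage example, fitting d = 1 + 2t from four samples with no forgetting:

```python
r = [[0.0, 0.0], [0.0, 0.0]]
z = [0.0, 0.0]
for t in range(4):
    qrd_rls_update(r, z, [1.0, float(t)], 1.0 + 2.0 * t)
w = solve_upper(r, z)    # close to [1.0, 2.0] up to rounding
```

The pairs `(c, s)` computed in the loop are the rotation parameters whose long-run behaviour the abstract describes for a forgetting factor close to 1.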

Journal ArticleDOI
TL;DR: A novel technique for direction-of-arrival estimation based on computing a permutation matrix E and a QR factorization RE=HB of the permuted covariance matrix R, such that a possible rank deficiency of R is revealed in the triangular factor B having a minimum norm lower right block.
Abstract: The authors describe a novel technique for direction-of-arrival estimation based on computing a permutation matrix E and a QR factorization RE=HB of the permuted covariance matrix R, such that a possible rank deficiency of R is revealed in the triangular factor B having a minimum norm lower right block. A subset of the columns of the orthogonal matrix, H, is shown to be orthogonal to the direction vectors of sources and hence can be used to estimate their bearings. The cost of this algorithm is only slightly more than that of one QR factorization, but is much lower than that of an eigen-decomposition. Simulation results are included to show that the proposed method performs nearly as well as MUSIC in terms of signal resolution, bias, and variance of the estimated bearings.

Proceedings ArticleDOI
26 Jun 1991
TL;DR: A novel n-dimensional (n-D) CORDIC algorithm for Euclidean and pseudo-Euclidean rotations is proposed, which is closely related to Householder transformations and shown to converge faster than CORDIC algorithms developed earlier for n=3 and 4.
Abstract: A novel n-dimensional (n-D) CORDIC algorithm for Euclidean and pseudo-Euclidean rotations is proposed. This algorithm is closely related to Householder transformations. It is shown to converge faster than CORDIC algorithms developed earlier for n=3 and 4. Processor architectures for the algorithm are presented. The area and time performance of n-D CORDIC processors are evaluated. For a comparable time performance, the processors require significantly less area than parallel Householder processors. Furthermore, arrays of n-D Euclidean CORDIC processors are shown to speed up the QR decomposition of rectangular matrices by a factor of n-1 in comparison with a 2-D CORDIC processor array.

Journal ArticleDOI
TL;DR: A numerical method based on QR decomposition is presented to determine the identifiable parameters used in geometric robot calibration; it is easy to implement using available software packages, and an application to a 6-degree-of-freedom robot is given.

Journal ArticleDOI
Peter Strobach
TL;DR: Two recursive-least-squares ladder algorithms for implementation on triangular systolic arrays are presented, based entirely on numerically stable and robust covariance recursions.
Abstract: Two recursive-least-squares ladder algorithms for implementation on triangular systolic arrays are presented. The first algorithm computes transversal forward/backward predictor coefficients, ladder reflection coefficients, and forward/backward residual energies. This algorithm has a complexity of three multiplications and additions per rotational (triangular array) element. A second algorithm is presented that facilitates the computation of only the ladder reflection coefficients and the forward/backward residual energies at a cost of two multiplications and additions per rotational element. This way, both algorithms are computationally more efficient than the traditional recursive QR decomposition (Gentleman and Kung array) for any order. The second algorithm is more efficient than Cioffi's pipelineable linear array fast QR adaptive filter for an order of less than 22 in the prewindowed case, and more efficient than the fast QR for an order of less than 43 in the more general covariance case. A comparison of the presented algorithms and the prominent QR methods is given. The algorithms remain unchanged and the number of arithmetic operations is not increased when finite duration windows are used. The algorithms are based entirely on numerically stable and robust covariance recursions.

Journal ArticleDOI
TL;DR: A parallel algorithm for the calculation of the QR factorization on a hypercube architecture of the SIMD type with distributed memory is described, choosing the modified Gram-Schmidt method with pivoting as it is characterized by good numerical stability.
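
A serial sketch of the modified Gram-Schmidt method with column pivoting mentioned in the entry above is given below. It says nothing about the hypercube mapping; the function name, the relative tolerance `rel_tol`, and the returned layout (orthonormal columns stored as rows of `q`) are assumptions of this example.

```python
import math


def mgs_pivoted(a, rel_tol=1e-10):
    """Modified Gram-Schmidt with column pivoting.

    Returns (q, r, perm): q is a list of orthonormal vectors, r the
    matching rows of the triangular factor, and perm the column order,
    so that column perm[j] of a is approximately sum_k q[k] * r[k][j].
    The detected numerical rank is len(q).
    """
    m, n = len(a), len(a[0])
    cols = [[a[i][j] for i in range(m)] for j in range(n)]
    perm = list(range(n))
    q, r = [], []
    scale = None
    for k in range(min(m, n)):
        # pivot: the remaining column with the largest residual norm
        norms = [math.sqrt(sum(v * v for v in cols[j])) for j in range(k, n)]
        p = k + max(range(n - k), key=norms.__getitem__)
        pivot = norms[p - k]
        if scale is None:
            scale = pivot
        if pivot <= rel_tol * scale:
            break                            # remaining columns are negligible
        cols[k], cols[p] = cols[p], cols[k]
        perm[k], perm[p] = perm[p], perm[k]
        for row in r:                        # keep earlier rows of r consistent
            row[k], row[p] = row[p], row[k]
        qk = [v / pivot for v in cols[k]]
        row = [0.0] * n
        row[k] = pivot
        for j in range(k + 1, n):            # orthogonalize the remaining columns
            row[j] = sum(x * y for x, y in zip(qk, cols[j]))
            cols[j] = [y - row[j] * x for x, y in zip(qk, cols[j])]
        q.append(qk)
        r.append(row)
    return q, r, perm
```

The pivot search needs the norms of all remaining columns at every step, which is the global communication a distributed implementation has to organize.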


Proceedings ArticleDOI
14 Apr 1991
TL;DR: The Householder transformation outperforms the Givens rotation in numerical stability under finite-precision implementation, and requires fewer arithmetic operations than the modified Gram-Schmidt, which is promising for VLSI implementation and real-time throughput signal processing.
Abstract: The Householder transformation outperforms the Givens rotation in numerical stability under finite-precision implementation, and requires fewer arithmetic operations than the modified Gram-Schmidt. As a result, the QR decomposition using the Householder transformation is promising for VLSI implementation and real-time throughput signal processing. A recursive complex Householder transformation with a fast initializing algorithm is presented, and its associated parallel/pipelined architecture is discussed.
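
The basic building block referred to above can be sketched for real vectors as follows. The paper treats a recursive complex Householder transformation; this example covers only the textbook real case, and the function names are hypothetical.

```python
import math


def householder_vector(x):
    """Return (v, alpha) such that (I - 2 v v^T / v^T v) x = alpha * e_1.

    alpha takes the sign opposite to x[0], which avoids cancellation
    when forming v[0] = x[0] - alpha.
    """
    norm = math.sqrt(sum(t * t for t in x))
    alpha = -math.copysign(norm, x[0])
    v = list(x)
    v[0] -= alpha
    return v, alpha


def reflect(v, y):
    """Apply the reflection defined by v to the vector y."""
    vv = sum(t * t for t in v)
    if vv == 0.0:
        return list(y)                       # zero vector: identity map
    coef = 2.0 * sum(a * b for a, b in zip(v, y)) / vv
    return [b - coef * a for a, b in zip(v, y)]
```

For `x = [3.0, 4.0]` this gives `v = [8.0, 4.0]` and `alpha = -5.0`, and `reflect(v, x)` returns `[-5.0, 0.0]`. One reflection zeroes a whole column below the diagonal, where Givens rotations need one rotation per entry.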

Book ChapterDOI
01 Jan 1991
TL;DR: A survey is given of the singular value decomposition (SVD) and its use for analyzing and solving linear least squares problems and two recent algorithms for numerically rank deficient problems based instead on the QR factorization are discussed.
Abstract: A survey is first given of the singular value decomposition (SVD) and its use for analyzing and solving linear least squares problems. Refined perturbation bounds based on componentwise perturbations in the data are given. The SVD is expensive to compute, and for large sparse problems is not a practical alternative. We discuss two recent algorithms for numerically rank deficient problems based instead on the QR factorization.

Proceedings ArticleDOI
M. Bellanger
14 Apr 1991
TL;DR: A generic representation of fast least squares algorithms based on the QR decomposition technique is given, pointing out the equivalence with the normalized lattice.
Abstract: A generic representation of fast least squares algorithms based on the QR decomposition technique is given, pointing out the equivalence with the normalized lattice. The numerical stability observed in simulations is justified, and the implementation in adaptive filters is discussed, with emphasis on the computational complexity issue.


Journal ArticleDOI
TL;DR: The standard theory describing the result of the QR algorithm with k shifts on a Hessenberg matrix A is extended to the case where some of the shifts can be eigenvalues, which has a practical value in special cases such as eigenvalue allocation.
Abstract: A new approach is suggested for deriving the theory of implicit shifting in the QR algorithm applied to a Hessenberg matrix. This is less concise than Francis’ original approach ([Comput. J., 4(1961), pp. 265–271], [Comput. J., 4(1962), pp. 332–345]) but is more instructive, and extends easily to more general cases. For example, it enables us to design implicitly shifted QR algorithms for band and block Hessenberg matrices. It can also be applied to related algorithms such as the LR algorithm, and to algorithms which do not produce triangular matrices in the factorization step. The approach provides details that can be useful in designing numerically effective algorithms in various areas. In addition to the above, the standard theory describing the result of the QR algorithm with k shifts on a Hessenberg matrix A is extended to the case where some of the shifts can be eigenvalues. This has a practical value in special cases such as eigenvalue allocation. The extension is given for both the explicitly and i...
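
As a small illustration of what happens when a shift is itself an eigenvalue, the sketch below performs one explicitly shifted QR step on a Hessenberg matrix: factor H - mu I = QR with Givens rotations, then form RQ + mu I. It covers only the explicit, single-shift case and is not the implicit multi-shift theory developed in the paper; the function name and list-of-lists layout are assumptions of this example.

```python
import math


def shifted_qr_step(h, mu):
    """One explicit single-shift QR step on an upper Hessenberg matrix h.

    Returns R Q + mu I, where h - mu I = Q R. The result is orthogonally
    similar to h and is again upper Hessenberg.
    """
    n = len(h)
    r = [[h[i][j] - (mu if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    rots = []
    for i in range(n - 1):                   # reduce h - mu I to triangular R
        x, y = r[i][i], r[i + 1][i]
        d = math.hypot(x, y)
        c, s = (1.0, 0.0) if d == 0.0 else (x / d, y / d)
        rots.append((c, s))
        for k in range(n):                   # rotate rows i and i+1
            u, v = r[i][k], r[i + 1][k]
            r[i][k], r[i + 1][k] = c * u + s * v, -s * u + c * v
    for i, (c, s) in enumerate(rots):        # form R Q, one rotation at a time
        for k in range(n):                   # rotate columns i and i+1
            u, v = r[k][i], r[k][i + 1]
            r[k][i], r[k][i + 1] = c * u + s * v, -s * u + c * v
    for i in range(n):
        r[i][i] += mu                        # undo the shift
    return r
```

For the tridiagonal matrix with 2 on the diagonal and -1 on the off-diagonals (n = 3), mu = 2 is an exact eigenvalue. One step with that shift leaves the last row equal to (0, 0, 2) up to rounding, so the eigenvalue deflates immediately.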

Journal ArticleDOI
TL;DR: A novel systolic array for LS system identification based on QR factorization via Givens rotations is proposed; the back-substitution step is circumvented, and the structure is fully pipelineable.
Abstract: A novel systolic array for LS system identification based on QR factorization via Givens rotations is proposed. The back-substitution step is circumvented, and the structure is fully pipelineable. Thus, the method is appropriate for continuous, sample-by-sample mode, adaptive operation. A modification of this structure is also suggested which is suitable for a wide range of linear algebraic operations and is solely based on Givens rotations.

Journal ArticleDOI
01 Sep 1991
TL;DR: Various parallel implementations of algorithms for the QR decomposition of a matrix are compared using shared memory multiprocessors and results indicate that one version is significantly better than the others.
Abstract: Various parallel implementations of algorithms for the QR decomposition of a matrix are compared using shared memory multiprocessors. Algorithms based on both Givens and Householder transformations are considered. A number of parallelisation techniques are used with particular emphasis on algorithms which allocate work to tasks dynamically. The results indicate that one version is significantly better than the others.

Proceedings ArticleDOI
M.A. Syed
14 Apr 1991
TL;DR: The author presents QR decomposition based fast RLS (recursive least squares) algorithms for multichannel adaptive signal processing based on length-preserving orthogonal transformations which have good numerical properties and are amenable to parallel implementations using systolic and wavefront array architectures.
Abstract: The author presents QR decomposition based fast RLS (recursive least squares) algorithms for multichannel adaptive signal processing. These algorithms are based on length-preserving orthogonal transformations which have good numerical properties. Hence, these algorithms are numerically stable. Also, they are amenable to parallel implementations using systolic and wavefront array architectures. One of the algorithms is a block algorithm in the sense that it processes all the channels simultaneously. The other is a sequential algorithm.

Proceedings ArticleDOI
04 Nov 1991
TL;DR: In this paper, an updating scheme for the rank revealing QR (RRQR) algorithm described earlier by T.F. Chan was investigated and applied to the direction of arrival problem, which allows tracking of moving sources by taking advantage of the simplicity of the regular QR updating scheme and the rank-revealing property of the RRQR factorization.
Abstract: The author investigates an updating scheme for the rank revealing QR (RRQR) algorithm described earlier by T.F. Chan (see Linear Algebr. Appl., vol. 88/89, pp. 67-82, 1987) and applies it to the direction of arrival problem. This technique allows for tracking of moving sources by taking advantage of the simplicity of the regular QR updating scheme and the rank-revealing property of the RRQR factorization. Subspace methods and the RRQR technique are reviewed. It is shown that the RRQR algorithm can be used to update signal and noise subspaces from the noise-free correlation matrix. Experimental results and comparisons with eigen-based signal and noise subspaces are presented.


Journal ArticleDOI
TL;DR: A survey of generalizations of the ordinary singular value decomposition can be found in this paper, which contains all existing generalizations for two matrices (such as the product SVD and the quotient SVD) and for three matrices, such as the restricted SVD, as special cases.