
Showing papers on "QR decomposition published in 1992"


Journal ArticleDOI
TL;DR: It is shown that a different decomposition, called the URV decomposition, is equally effective in exhibiting the null space and can be updated in O(p²) time.
Abstract: In certain signal processing applications it is required to compute the null space of a matrix whose rows are samples of a signal with p components. The usual tool for doing this is the singular value decomposition. However, the singular value decomposition has the drawback that it requires O(p³) operations to recompute when a new sample arrives. It is shown that a different decomposition, called the URV decomposition, is equally effective in exhibiting the null space and can be updated in O(p²) time. The updating technique can be run on a linear array of p processors in O(p) time.
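The structure Stewart describes can be sketched in a few lines of NumPy. The snippet below builds a URV decomposition A = U R Vᵀ of a noisy rank-deficient sample matrix, here via one SVD plus one QR purely to exhibit the structure (the paper's point is that the factorization, once formed, can be *updated* in O(p²) without recomputing an SVD). Sizes and noise level are made-up demo values.

```python
import numpy as np

rng = np.random.default_rng(0)
p, r = 6, 4
# 40 signal samples with p components but only rank r, plus tiny noise
A = rng.standard_normal((40, r)) @ rng.standard_normal((r, p))
A += 1e-10 * rng.standard_normal(A.shape)

# Build a rank-revealing URV decomposition A = U R V^T:
# the right singular vectors give V, and a QR of A V gives U and R.
_, _, Vt = np.linalg.svd(A, full_matrices=False)
V = Vt.T
U, R = np.linalg.qr(A @ V)

# R is upper triangular with tiny trailing diagonal entries, so the last
# p - r columns of V span the numerical null space of A.
null_basis = V[:, r:]
print(np.linalg.norm(A @ null_basis))  # on the order of the noise level
```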

354 citations


Journal ArticleDOI
TL;DR: A constructive proof is given of the existence of the rank-revealing QR factorization (RRQR) of any matrix A of size m x n with numerical rank r, a factorization that is not obvious to obtain even when the singular value decomposition of A is known.
Abstract: T. Chan has noted that, even when the singular value decomposition of a matrix A is known, it is still not obvious how to find a rank-revealing QR factorization (RRQR) of A if A has numerical rank deficiency. This paper offers a constructive proof of the existence of the RRQR factorization of any matrix A of size m × n with numerical rank r. The bounds derived in this paper that guarantee the existence of an RRQR are all of order √(r(n−r)), in comparison with Chan's O(2^(n−r)). It has been known for some time that if A is only numerically rank-one deficient, then the column permutation Π of A that guarantees a small r_nn in the QR factorization of AΠ can be obtained by inspecting the size of the elements of the right singular vector of A corresponding to the smallest singular value of A. To some extent, our paper generalizes this well-known result. We consider the interplay between two important matrix decompositions: the singular value decomposition and the QR factorization of a matrix A. In particular, we are interested in the case when A is singular or nearly singular. It is well known that for any A ∈ R^(m×n) (a real matrix with m rows and n columns, where without loss of generality we assume m ≥ n) there are orthogonal matrices U and V such that A = UΣVᵀ, where Σ is a diagonal matrix with nonnegative diagonal elements σ₁ ≥ σ₂ ≥ ⋯ ≥ σₙ ≥ 0. This decomposition is the singular value decomposition (SVD) of A, and the σᵢ are the singular values of A. The columns of V are the right singular vectors of A, and the columns of U are the left singular vectors of A. Received December 1, 1990; revised February 8, 1991. 1991 Mathematics Subject Classification: Primary 65F30, 15A23, 15A42, 15A15.
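In practice, column-pivoted QR (Businger-Golub pivoting, as in LAPACK's xGEQP3) reveals the rank of generic matrices, even though, as the paper stresses, pivoting alone carries no worst-case guarantee (the Kahan matrix is the classic counterexample). A minimal sketch with SciPy, using made-up sizes and tolerances:

```python
import numpy as np
from scipy.linalg import qr

rng = np.random.default_rng(1)
# An 8 x 6 matrix of numerical rank 4: exact rank-4 product plus tiny noise
A = rng.standard_normal((8, 4)) @ rng.standard_normal((4, 6))
A += 1e-9 * rng.standard_normal(A.shape)

# Column-pivoted QR: A[:, piv] = Q @ R with |R[0,0]| >= |R[1,1]| >= ...
Q, R, piv = qr(A, pivoting=True)
diag = np.abs(np.diag(R))
rank = int(np.sum(diag > 1e-4 * diag[0]))  # gap in the diagonal reveals rank
```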

185 citations


Journal ArticleDOI
TL;DR: This is illustrated by showing how the rank revealing QR factorization can be used to compute solutions to rank deficient least squares problems, to perform subset selection, to compute matrix approximations of given rank, and to solve total least squares problems.
Abstract: The rank revealing QR factorization of a rectangular matrix can sometimes be used as a reliable and efficient computational alternative to the singular value decomposition for problems that involve rank determination. This is illustrated by showing how the rank revealing QR factorization can be used to compute solutions to rank deficient least squares problems, to perform subset selection, to compute matrix approximations of given rank, and to solve total least squares problems.
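For instance, a basic solution of a rank-deficient least squares problem falls out of the pivoted QR directly: solve with the leading r x r triangular block and set the remaining variables to zero. A sketch on generic random data (the tolerance is a demo value):

```python
import numpy as np
from scipy.linalg import qr, solve_triangular

rng = np.random.default_rng(2)
A = rng.standard_normal((10, 3)) @ rng.standard_normal((3, 5))  # rank 3
b = rng.standard_normal(10)

Q, R, piv = qr(A, mode='economic', pivoting=True)
r = int(np.sum(np.abs(np.diag(R)) > 1e-10 * abs(R[0, 0])))  # numerical rank

# Basic solution: use only the r pivot columns of A
y = solve_triangular(R[:r, :r], (Q.T @ b)[:r])
x = np.zeros(A.shape[1])
x[piv[:r]] = y  # attains the minimal residual with only r nonzero entries
```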

181 citations




Journal ArticleDOI
TL;DR: The special structure of the product of the Householder transformations is derived and then used to explain and bound the loss of orthogonality in MGS. This is illustrated by deriving a numerically stable algorithm based on MGS for a class of problems that includes the solution of nonsingular linear systems.
Abstract: This paper arose from a fascinating observation, apparently by Charles Sheffield, and relayed to us by Gene Golub, that the QR factorization of an $m \times n$ matrix A via the modified Gram-Schmidt ...

130 citations


Journal ArticleDOI
TL;DR: In this article, the authors developed an algorithm for adaptively estimating the noise subspace of a data matrix, as is required in signal processing applications employing the signal subspace approach, using a rank-revealing QR factorization instead of the more expensive singular value or eigenvalue decompositions.
Abstract: The authors develop an algorithm for adaptively estimating the noise subspace of a data matrix, as is required in signal processing applications employing the 'signal subspace' approach. The noise subspace is estimated using a rank-revealing QR factorization instead of the more expensive singular value or eigenvalue decompositions. Using incremental condition estimation to monitor the smallest singular values of triangular matrices, the authors can update the rank-revealing triangular factorization inexpensively when new rows are added and old rows are deleted. Experiments demonstrate that the new approach usually requires O(n²) work to update an n×n matrix, and that it accurately tracks the noise subspace.
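SciPy exposes exactly this kind of O(n²) row updating and downdating of a QR factorization (`qr_insert`/`qr_delete`), which is the building block for tracking a sliding window of samples; the incremental condition estimation that monitors the small singular values is not shown in this sketch.

```python
import numpy as np
from scipy.linalg import qr, qr_insert, qr_delete

rng = np.random.default_rng(4)
X = rng.standard_normal((10, 5))          # current window of samples
Q, R = qr(X)                              # full QR (square Q) for row updates

new_row = rng.standard_normal(5)
Q1, R1 = qr_insert(Q, R, new_row, X.shape[0], which='row')  # add newest sample
Q2, R2 = qr_delete(Q1, R1, 0, which='row')                  # drop oldest sample

window = np.vstack([X[1:], new_row])      # the factorization now tracks this
```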

108 citations


Book
18 Dec 1992
TL;DR: This book introduces the systolic mode of parallel processing, with worked examples ranging from matrix computations and LU/QR decomposition to filtering and the mapping of different filter banks onto the same fixed-size processor array.
Abstract: The Systolic Mode of Parallel Processing. Introduction to the Underlying Concept. The Original Motivation: VLSI Implementation. The Present Trend: Efficient Algorithms for Massively Parallel Computers. A List of Known Applications. Defining and Expressing Systolic Arrays and Algorithms. Using Automata Notions. Defining Systolic Automata, Arrays, and Algorithms. Expressing Systolic Algorithms. Analysis and Comparison of Systolic Algorithms. Matrix-Vector and Matrix Multiplication. Introduction to Vectors and Matrices. Matrix-Vector Multiplication. Systolic Simulation of Feedforward Artificial Neural Networks. Matrix Multiplication. Solving Systems of Linear Algebraic Equations. Introduction to Linear Systems. Gaussian Elimination. Systolic Arrays for Triangularization and LU/QR Decomposition. Systolic Algorithms for Back Substitution. Systolic Implementation of Iterative Methods. Further Problems of Linear Algebra. Computing the Inverse of a Matrix. Generalized Elimination. Computing the Characteristic Polynomial. Matrix Transposition and Related Operations. Convolution and Linear Filters. Convolution, Correlation, FIR and IIR Filters. Semi-Systolic Realizations. Unidirectional Full-Systolic Arrays. Systolic Arrays with Bidirectional Data Flow. Bit-Level Systolic Convolver. Operations with Polynomials. Introduction. Multiplication of Polynomials and Integers. Division of Polynomials. Computing the Greatest Common Divisor. Polynomial Interpolation. Evaluation of Polynomials. Comparison Problems. Sorting. Selection and Running Order Statistics. Sorting and Order Statistics for Rank Filtering. A Data Structure: Priority Queue. Dynamic Programming and its Applications. Introduction. Implementing the Dynamic Programming Recurrence in a Two-Dimensional Systolic Array. Implementation in One-Dimensional Arrays. Further Dynamic Programming Recurrences. Computational Geometry. Convex Hull. Nearest-Neighbours Problems. Systematic Design of Systolic Algorithms.
Dependence Graphs. Systolic Array Dependence Graphs. Extracting Systolic Algorithms from Dependence Graphs. Modifying the Properties of Systolic Algorithms. Partitioning of Systolic Algorithms. Partitioning, Algorithm Mapping, Design of Flexible Systolic Structures, Time Sharing. Application of c-Slow Automata to the Realization of Parallel Structures. Examples: Mapping Different Filter Banks onto the Same Fixed-Size Processor Array. A Summary of the Technique and Alternative Approaches. References and Additional Literature. Subject Index.
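The flavour of the systolic schedules catalogued in the book can be conveyed by a small software simulation: in a linear array for matrix-vector multiplication, processing element j consumes input x_j and, at global time step t, contributes to output y_(t−j). This is a serial sketch of the data movement, not hardware:

```python
import numpy as np

def systolic_matvec(A, x):
    """Simulate a linear systolic array computing y = A @ x.

    The nested loops replay the classic skewed schedule: at global time
    step t, cell j multiplies A[t - j, j] by x[j] and adds it to the
    partial sum y[t - j] streaming past in the opposite direction.
    """
    m, n = A.shape
    y = np.zeros(m)
    for t in range(m + n - 1):        # global clock
        for j in range(n):            # one iteration per array cell
            i = t - j
            if 0 <= i < m:
                y[i] += A[i, j] * x[j]
    return y
```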

93 citations


Journal ArticleDOI
TL;DR: The numerical stability of the block algorithms in the new linear algebra program library LAPACK is investigated and it is shown that these algorithms have backward error analyses in which the backward error bounds are commensurate with the error bounds for the underlying level-3 BLAS (BLAS3).
Abstract: Block algorithms are becoming increasingly popular in matrix computations. Since their basic unit of data is a submatrix rather than a scalar, they have a higher level of granularity than point algorithms, and this makes them well suited to high-performance computers. The numerical stability of the block algorithms in the new linear algebra program library LAPACK is investigated here. It is shown that these algorithms have backward error analyses in which the backward error bounds are commensurate with the error bounds for the underlying level-3 BLAS (BLAS3). One implication is that the block algorithms are as stable as the corresponding point algorithms when conventional BLAS3 are used. A second implication is that the use of BLAS3 based on fast matrix multiplication techniques affects the stability only insofar as it increases the constant terms in the normwise backward error bounds. For linear equation solvers employing LU factorization, it is shown that fixed precision iterative refinement helps to mitigate the effect of the larger error constants. Despite the positive results presented here, not all plausible block algorithms are stable; we illustrate this with the example of LU factorization with block triangular factors and describe how to check a block algorithm for stability without doing a full error analysis.
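The fixed precision iterative refinement mentioned for LU-based solvers is cheap to sketch: reuse the existing factorization to solve for a correction computed from the residual. A minimal demonstration on random data:

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(5)
n = 200
A = rng.standard_normal((n, n))
x_true = rng.standard_normal(n)
b = A @ x_true

lu, piv = lu_factor(A)          # O(n^3), done once
x = lu_solve((lu, piv), b)      # initial solve

# One step of fixed-precision iterative refinement, O(n^2):
# recompute the residual and solve for a correction with the same factors.
resid = b - A @ x
x = x + lu_solve((lu, piv), resid)
```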

65 citations


Journal ArticleDOI
TL;DR: This paper discusses multimatrix generalizations of two well-known orthogonal rank factorizations of a matrix: the generalized singular value decomposition and the generalized QR-(or URV-) decomposition.
Abstract: This paper discusses multimatrix generalizations of two well-known orthogonal rank factorizations of a matrix: the generalized singular value decomposition and the generalized QR-(or URV-) decomposition. These generalizations can be obtained for any number of matrices of compatible dimensions. This paper discusses in detail the structure of these generalizations and their mutual relations and gives a constructive proof for the generalized QR-decompositions.
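For two matrices, a generalized QR decomposition of the pair (A, B) can be assembled from an ordinary QR and an RQ factorization: A = QR, and B = QTZ with T triangular and Z orthogonal. A sketch of this textbook two-matrix construction (not the paper's general multimatrix form):

```python
import numpy as np
from scipy.linalg import qr, rq

rng = np.random.default_rng(6)
A = rng.standard_normal((5, 3))
B = rng.standard_normal((5, 5))

Q, R = qr(A)          # A = Q R with Q orthogonal, R upper triangular
T, Z = rq(Q.T @ B)    # Q^T B = T Z, hence B = Q T Z with T upper triangular
```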

60 citations


Book ChapterDOI
01 Oct 1992
TL;DR: This paper looks at three commonly used decompositions: the singular value decomposition, the pivoted QR decomposition, and the URV decomposition.
Abstract: The problem of determining rank in the presence of error occurs in a number of applications. The usual approach is to compute a rank-revealing decomposition and make a decision about the rank by examining the small elements of the decomposition. In this paper we look at three commonly used decompositions: the singular value decomposition, the pivoted QR decomposition, and the URV decomposition.

47 citations


Journal ArticleDOI
TL;DR: A general-purpose structured network is developed which can be programmed to solve all the matrix algebra problems considered in this paper, and time complexities of these structured network approaches are analyzed.

Journal ArticleDOI
01 Jun 1992
TL;DR: The concept of algorithmic engineering is introduced and discussed in the context of parallel digital signal processing, mainly relating to the use of QR decomposition by square-root-free Givens rotations as applied to adaptive filtering and beamforming.
Abstract: Algorithmic engineering provides a rigorous framework for describing and manipulating the type of building blocks commonly used to define parallel algorithms and architectures for digital signal processing. The concept is first illustrated by means of some fairly simple worked examples. These relate to the use of QR decomposition by Givens rotations for the purposes of adaptive filtering and beamforming. It is then shown how a novel modular architecture for linearly constrained adaptive beamforming has been derived by transforming an established least squares processor design using some simple algorithmic engineering techniques. This novel architecture constitutes a stable and efficient recursive realisation of the modular adaptive beamformer proposed by Liu and Van Veen.
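The Givens-rotation QR that underlies these adaptive-filtering arrays is compact enough to write out. The sketch below uses ordinary (square-root) rotations rather than the square-root-free variant discussed in the paper, and applies each 2 x 2 rotation explicitly:

```python
import numpy as np

def givens(a, b):
    """Return (c, s) with [[c, s], [-s, c]] @ [a, b] = [r, 0]."""
    r = np.hypot(a, b)
    if r == 0.0:
        return 1.0, 0.0
    return a / r, b / r

def qr_givens(A):
    """QR decomposition by Givens rotations (plain, not square-root-free)."""
    R = A.astype(float).copy()
    m, n = R.shape
    Q = np.eye(m)
    for j in range(n):
        for i in range(m - 1, j, -1):      # zero below the diagonal, bottom up
            c, s = givens(R[i - 1, j], R[i, j])
            G = np.array([[c, s], [-s, c]])
            R[[i - 1, i], :] = G @ R[[i - 1, i], :]
            Q[:, [i - 1, i]] = Q[:, [i - 1, i]] @ G.T   # accumulate Q
    return Q, R
```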

Journal ArticleDOI
TL;DR: Different methods for computing the Lyapunov-exponent spectrum from a time series are reviewed, with special attention being given to the practicality of these algorithms and their efficiency and accuracy and the number of adjustable free parameters.
Abstract: Different methods for computing the Lyapunov-exponent spectrum from a time series are reviewed. All algorithms are based on either Gram-Schmidt orthonormalization or Householder QR decomposition, and they use either the linearized map or a higher-order polynomial approximation. They also differ in implementation details. The ability to use these methods for a short time series of low precision is investigated, with special attention being given to the practicality of these algorithms; i.e., their efficiency and accuracy and the number of adjustable free parameters.
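The QR-based method reviewed here amounts to repeatedly re-orthonormalizing the product of Jacobians and accumulating the logarithms of the diagonal of R. A sketch for the Hénon map (standard parameters a = 1.4, b = 0.3; the step count is a demo choice):

```python
import numpy as np

def lyapunov_spectrum_henon(a=1.4, b=0.3, n_steps=20000):
    """Lyapunov exponents of the Henon map via the standard QR
    reorthonormalization of the product of Jacobians."""
    x, y = 0.1, 0.1
    Q = np.eye(2)
    sums = np.zeros(2)
    for _ in range(n_steps):
        J = np.array([[-2.0 * a * x, 1.0],   # Jacobian at the current point
                      [b, 0.0]])
        x, y = 1.0 - a * x * x + y, b * x    # both updates use the old x
        Q, R = np.linalg.qr(J @ Q)           # re-orthonormalize the frame
        sums += np.log(np.abs(np.diag(R)))
    return sums / n_steps

lam = lyapunov_spectrum_henon()  # lam[0] is roughly 0.42 for these parameters
```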

Proceedings ArticleDOI
01 Mar 1992
TL;DR: This paper discusses the implementation of the orthogonal transformation of a dense matrix using Givens rotations on the BDFA, which eliminates the requirement for global data communications and significantly reduces the time required for hand shaking for data transmission by using block dataflow.
Abstract: In this paper, we introduce a block dataflow architecture (BDFA) which is very efficient for some computational intensive multidimensional digital signal processing and matrix algorithms. We demonstrate the advantages of this architecture by using the Givens rotations of a dense matrix as an example. The BDFA eliminates the requirement for global data communications and significantly reduces the time required for hand shaking for data transmission by using block dataflow. In this paper, we discuss the implementation of the orthogonal transformation of a dense matrix using Givens rotations on the BDFA.

Journal ArticleDOI
01 Apr 1992
TL;DR: Efficient parallel block algorithms for the LU factorization with partial pivoting, the Cholesky factorization, and the QR factorization transportable over a range of parallel MIMD architectures are presented.
Abstract: Efficient parallel block algorithms for the LU factorization with partial pivoting, the Cholesky factorization, and the QR factorization transportable over a range of parallel MIMD architectures are presented. Parallel implementations of different block algorithms that utilize optimized uniprocessor level-3 BLAS are compared with corresponding routines of LAPACK under development. Parallelism is mainly invoked implicitly in LAPACK by replacing calls to uniprocessor level-3 kernels by calls to parallel level-3 kernels and thereby maintaining portability. However, by parallelizing at the block level explicitly it is possible to overlap and pipeline different matrix-matrix operations and thereby gain some performance. Theoretical models give upper bounds on the best possible speedup of the explicitly and implicitly parallel block algorithms for the target machine.

Proceedings ArticleDOI
10 May 1992
TL;DR: The authors present an approach to the development of fast and numerically stable recursive least squares (RLS) algorithms for adaptive nonlinear filtering using QR-decomposition of the data matrix and introduces a pair of QR-RLS adaptive algorithms for second-order Volterra filtering.
Abstract: The authors present an approach to the development of fast and numerically stable recursive least squares (RLS) algorithms for adaptive nonlinear filtering using QR-decomposition of the data matrix. They introduce a pair of QR-RLS adaptive algorithms for second-order Volterra filtering. Both algorithms are based solely on Givens rotations. Hence both are numerically stable and highly amenable to parallel implementations using arrays. One of the algorithms is a block processing algorithm in the sense that it processes all the channels simultaneously. The other processes the channels sequentially. The sequential algorithm is computationally much more efficient than the block algorithm and is comparable to fast RLS Volterra filters. Another attractive feature of sequential processing is that knowledge of the single-channel algorithm can be applied to the multichannel case.
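The core QR-RLS recursion (in its generic single-channel form, not the paper's fast Volterra version) keeps only a triangular factor R and a rotated right-hand side z, folding each new sample in by re-triangularization; a systolic array would do this with Givens rotations, sketched here with a library QR. The filter length, forgetting factor, and demo data are made up:

```python
import numpy as np
from scipy.linalg import solve_triangular

def qr_rls_update(R, z, x, d, lam=0.99):
    """One recursive-least-squares step in QR form: fold the new regressor
    row x and desired sample d into the triangular factor R and rotated
    right-hand side z, with exponential forgetting factor lam."""
    n = len(x)
    M = np.vstack([np.sqrt(lam) * np.column_stack([R, z]),
                   np.append(x, d)])
    _, T = np.linalg.qr(M)  # re-triangularize (Givens rotations on an array)
    return T[:n, :n], T[:n, n]

# Identify a 3-tap filter from a noiseless stream ('w_true' is demo data).
rng = np.random.default_rng(7)
w_true = np.array([0.5, -1.0, 2.0])
R, z = np.zeros((3, 3)), np.zeros(3)
for _ in range(200):
    x = rng.standard_normal(3)
    R, z = qr_rls_update(R, z, x, x @ w_true)
w = solve_triangular(R, z)  # back-substitute for the weight vector
```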

Journal ArticleDOI
TL;DR: A computationally efficient and stable implementation of the CFAR algorithm is given which may use either the Cholesky decomposition or the QR decomposition with a rearrangement of the computations.

Journal ArticleDOI
TL;DR: An adaptive condition estimation algorithm, “GRACE,” is developed to monitor the condition number of the matrices during the update process and requires only $O(n)$ overhead beyond the cost of updating the QR factorization.
Abstract: Many applications involve repeatedly solving linear systems of equations while the coefficient matrix is modified by a rank-one matrix at each iteration. The QR factorization is often used in such situations, and algorithms that update the QR factorizations in $O(n^2 )$ time are well known. To avoid excessive round-off error in solving equation systems, it is useful to monitor the condition number of the matrix as the iterations progress. In this paper, general (i.e., nonsymmetric) matrices undergoing rank-one changes are considered and an adaptive condition estimation algorithm, “GRACE,” is developed to monitor the condition number of the matrices during the update process. The algorithm requires only $O(n)$ overhead beyond the cost of updating the QR factorization. Potential numerical difficulties in the algorithm are analyzed and modifications to overcome these are introduced. These modifications are also applicable to the ACE algorithm of Pierce and Plemmons that handles symmetric updates. Finally, ex...

Journal ArticleDOI
TL;DR: A block algorithm for computing rank-revealing QR factorizations (RRQR factorizations) of rank-deficient matrices is presented and it is shown that the block algorithm greatly reduces the number of triangular solves and increases the computational granularity of the RRQR computation.
Abstract: We present a block algorithm for computing rank-revealing QR factorizations (RRQR factorizations) of rank-deficient matrices. The algorithm is a block generalization of the RRQR algorithm of Foster and Chan. While the unblocked algorithm reveals the rank by peeling off small singular values one by one, our algorithm identifies groups of small singular values. In our block algorithm, we use incremental condition estimation to compute approximations to the null vectors of the matrix. By applying another (in essence also rank-revealing) orthogonal factorization to the nullspace matrix thus created, we can then generate triangular blocks with small norm in the lower right part of R. This scheme is applied in an iterative fashion until the rank has been revealed in the (updated) QR factorization. We show that the algorithm produces the correct solution under very weak assumptions for the orthogonal factorization used for the nullspace matrix. We then discuss issues concerning an efficient implementation of the algorithm and present some numerical experiments. The experiments confirm the reliability of the algorithm, show that it successfully captures several small singular values, in particular in the initial block steps, and show that the block algorithm greatly reduces the number of triangular solves and increases the computational granularity of the RRQR computation.

Proceedings ArticleDOI
04 Aug 1992
TL;DR: In this article, a fault tolerant algorithm for the solution of linear systems of equations using matrix triangularization procedures suitable for implementation on array architectures is presented. But this algorithm is not fault tolerant against two transient errors occurring during the triangularization procedure.
Abstract: The authors present a fault tolerant algorithm for the solution of linear systems of equations using matrix triangularization procedures suitable for implementation on array architectures. Gaussian elimination with partial or pairwise pivoting and QR decomposition are made fault tolerant against two transient errors occurring during the triangularization procedure. The extended Euclidean algorithm is implemented to solve for the locations and values of the errors defined appropriately using the theory of error correcting codes. The Sherman-Morrison Woodbury formula is then used to obtain the correct solution vector to the linear system of equations without requiring a valid decomposition. >
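The checksum idea behind such algorithm-based fault tolerance is simple to demonstrate: append the column A·1 before factorizing, and the invariant "last column of R equals the row sums of the leading block" flags a corrupted entry. This sketch only detects a single injected error; the paper's scheme locates and corrects up to two using the extended Euclidean algorithm:

```python
import numpy as np

rng = np.random.default_rng(8)
A = rng.standard_normal((6, 4))
ones = np.ones(4)
Ac = np.column_stack([A, A @ ones])   # append a checksum column

Q, Rc = np.linalg.qr(Ac)
# Invariant preserved by orthogonal transformations: the last column of Rc
# equals the row sums of its leading 4 x 4 block.
ok_before = np.allclose(Rc[:, -1], Rc[:, :4] @ ones)

Rc[2, 3] += 1e-3                      # inject a transient error
detected = not np.allclose(Rc[:, -1], Rc[:, :4] @ ones)
```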

Journal ArticleDOI
01 Feb 1992
TL;DR: A method to implement one-dimensional systolic algorithms with data contraflow using pipelined functional units is presented, together with procedures that permit its systematic application.
Abstract: In this paper we present a method to implement one-dimensional systolic algorithms with data contraflow using pipelined functional units. Some procedures are proposed which permit the systematic application of the method. The paper includes an example of application of the method to a one-dimensional systolic algorithm with data contraflow for QR decomposition.

Journal ArticleDOI
TL;DR: In this paper, a new approach for the eigenvalue assignment of linear, first-order, time-invariant systems using output feedback is developed, which assigns the maximum allowable number of closed-loop eigenvalues through output feedback provided that the system is fully controllable and observable, and both the input influence and output influence matrices are full rank.
Abstract: A new approach for the eigenvalue assignment of linear, first-order, time-invariant systems using output feedback is developed. The approach can assign the maximum allowable number of closed-loop eigenvalues through output feedback provided that the system is fully controllable and observable, and both the input influence and output influence matrices are full rank. First, a collection of bases for the space of attainable closed-loop eigenvectors is generated using the singular value decomposition or QR decomposition techniques. Then, an algorithm based on subspace intersections is developed and used to compute the corresponding coefficients of the bases, and the required output feedback gain matrix. Moreover, the additional freedom provided by the multi-inputs and multi-outputs beyond the eigenvalue assignment is characterized for possible exploitation. A numerical example is given to demonstrate the viability of the proposed approach.
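The first step, generating bases for the attainable closed-loop eigenvectors, has a compact linear-algebra form: v is attainable at eigenvalue λ iff (A − λI)v lies in the range of the input matrix B, i.e. (v, w) solves [A − λI, B][v; w] = 0. A sketch using an SVD-based null space (random demo matrices; λ = −1 is an arbitrary target):

```python
import numpy as np
from scipy.linalg import null_space

def attainable_eigenvector_basis(A, B, lam):
    """Columns form a basis for closed-loop eigenvectors attainable at lam:
    vectors v such that (A - lam*I) v lies in range(B)."""
    n = A.shape[0]
    N = null_space(np.hstack([A - lam * np.eye(n), B]))
    return N[:n, :]          # keep the v-part of each null-space vector

rng = np.random.default_rng(9)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 2))
Vb = attainable_eigenvector_basis(A, B, -1.0)  # generically a 4 x 2 basis
```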

Patent
03 Mar 1992
TL;DR: In this article, a finite impulse response filter is used to transform delayed co-ordinates by QR decomposition and least squares fitting so that they are fitted to non-delayed coordinates.
Abstract: A dynamical system analyser (10) incorporates a computer (22) to perform a singular value decomposition of a time series of signals from a nonlinear (possibly chaotic) dynamical system (14). Relatively low-noise singular vectors from the decomposition are loaded into a finite impulse response filter (34). The time series is formed into Takens' vectors, each of which is projected onto each of the singular vectors by the filter (34). Each Takens' vector thereby provides the co-ordinates of a respective point on a trajectory of the system (14) in a phase space. A heuristic processor (44) is used to transform delayed co-ordinates by QR decomposition and least squares fitting so that they are fitted to non-delayed co-ordinates. The heuristic processor (44) generates a mathematical model to implement this transformation, which predicts future system states on the basis of respective current states. A trial system is employed to generate like co-ordinates for transformation in the heuristic processor (44). This produces estimates of the trial system's future states predicted from the comparison system's model. Alternatively, divergences between such estimates and actual behavior may be obtained. As a further alternative, mathematical models derived by the analyser (10) from different dynamical systems may be compared.

Journal ArticleDOI
TL;DR: The theoretical background for multiple error detection in matrix triangularizations is developed, and the results are summarized by proving that all the transient errors that occur in a maximum of t different columns can be detected by introducing t checksum vectors.

Journal ArticleDOI
01 Apr 1992
TL;DR: This paper gives a detailed implementation of the updating procedure of a URV decomposition in such a way that it reveals the effective rank of the matrix.
Abstract: A URV decomposition of a matrix is a factorization of the matrix into the product of a unitary matrix (U), an upper triangular matrix (R), and another unitary matrix (V). In [8] it was shown how to update a URV decomposition in such a way that it reveals the effective rank of the matrix. It was also argued that the updating procedure could be implemented in parallel on a linear array of processors; however, no specific algorithms were given. This paper gives a detailed implementation of the updating procedure.

01 Jan 1992
TL;DR: This research discusses RHSQP algorithms for solving large sparse equality constrained optimization problems, and uses the LU decomposition to compute the basis matrices for the null space of the derivatives of the constraints, instead of using the QR factorization to compute an orthonormal basis.
Abstract: We consider equality constrained optimization problems, minimizing a real function subject to a set of constraints. Reduced Hessian methods are a type of successive quadratic programming algorithm for solving this problem that only needs to store the approximation matrices to the reduced Hessian of the Lagrangian function, thus saving a significant amount of storage. In this research we discuss RHSQP algorithms for solving large sparse equality constrained optimization problems. In our implementation, we use the LU decomposition to compute the basis matrices for the null space of the derivatives of the constraints, instead of using the QR factorization to compute an orthonormal basis. This allows us to use sparse matrix decomposition techniques to make use of the sparsity of constraint derivatives. We discuss several implementation issues of RHSQP algorithms, such as reusing the basis and premodifying the quasi-Newton matrix before updating. We study the Null Space Secant update strategy and Step Secant update strategies with several update criteria for updating the quasi-Newton matrices. Without assuming the quasi-Newton matrices {B_k} are bounded, we establish global convergence, R-linear convergence and superlinear convergence for RHSQP algorithms using the l_1 and the Fletcher merit functions. For different update strategies and merit functions, we present numerical experiments with RHSQP algorithms.
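The LU-based alternative to an orthonormal null-space basis is the classic variable-reduction form: split the constraint Jacobian A = [B N] and take Z = [−B⁻¹N; I], so that AZ = 0. The sketch below assumes the leading block B is nonsingular; in a sparse implementation the pivoting of a sparse LU factorization would choose which columns form B:

```python
import numpy as np
from scipy.linalg import solve

def lu_null_basis(A):
    """Non-orthonormal null-space basis Z (with A @ Z = 0) for a
    full-row-rank m x n constraint Jacobian, m < n, by variable reduction.
    Assumes (for this sketch) the leading m x m block is nonsingular."""
    m, n = A.shape
    B, N = A[:, :m], A[:, m:]
    return np.vstack([-solve(B, N), np.eye(n - m)])

rng = np.random.default_rng(10)
A = rng.standard_normal((3, 7))   # 3 constraints on 7 variables
Z = lu_null_basis(A)              # 7 x 4 basis of the null space
```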

01 Jan 1992
TL;DR: This thesis reports on the implementation of system-level error detection mechanisms for four parallel applications on a 16-processor Intel iPSC-2/D4/MX hypercube multiprocessor, and addresses the difficult task of synthesizing algorithm-based checking techniques for general applications.
Abstract: Numerous algorithms for computationally intensive tasks that are suitable for execution on hypercube multiprocessors have been developed by researchers. In this thesis, we look at parallel algorithm design from a different perspective: the provision of on-line detection of hardware errors using software techniques without any hardware modifications. This approach is called Algorithm-based error detection. We report on the implementation of system-level error detection mechanisms for four parallel applications on a 16-processor Intel iPSC-2/D4/MX hypercube multiprocessor: (1) matrix multiplication; (2) Fast Fourier Transform; (3) QR factorization; (4) singular value decomposition. We describe extensive studies of the error coverage of our system-level error detection schemes in the presence of finite precision arithmetic which affects our system-level encodings. We also provide an in-depth study of the various issues and trade-offs available in Algorithm-based error detection. We illustrate the approach on an extremely useful computation in the field of numerical linear algebra: QR factorization. We discuss the implementation and investigation of numerous ways of applying Algorithm-based error detection using different system-level encoding strategies for QR factorization. Different schemes have been observed to result in varying error coverages and time overheads. We report the result of our studies performed on the Intel iPSC-2 hypercube. Finally, we address the difficult task of synthesizing algorithm-based checking techniques for general applications. We approach the problem at the compiler level by identifying linear transformations in Fortran DO loops, restructuring program statements to convert nonlinear transformations to linear ones, and propose system-level checks based on this property. 
We discuss the implementation of a source-to-source restructuring compiler for the synthesis of low-cost system-level checks for general numerical Fortran programs, based on the above approach. We present the results of the application of this compiler to various routines from the LINPACK and EISPACK libraries, and from the Perfect Benchmark Suite. We also provide a detailed evaluation of compiler-assisted techniques for the LINPACK routine, DGEFA, and demonstrate the feasibility of our concept experimentally on the Intel iPSC-2 hypercube.

Journal ArticleDOI
TL;DR: The individual processing elements contained in the inverse QR algorithm are outlined, after which it is demonstrated how these processing elements may be connected to implement the overall inverse QR array.

Proceedings ArticleDOI
09 Nov 1992
TL;DR: A novel Kalman filter algorithm for the discrete linear filtering problem has been developed which involves the computation of the singular value decomposition of an unsymmetric matrix without explicitly forming its left factor which has a high dimension.
Abstract: A novel Kalman filter algorithm for the discrete linear filtering problem has been developed. The crucial component of the algorithm involves the computation of the singular value decomposition of an unsymmetric matrix without explicitly forming its left factor, which has a high dimension. The proposed algorithm has good numerical stability and can handle correlated measurement noise without any additional transformations. This algorithm is formulated in the form of vector-matrix and matrix-matrix operations, so that it is also useful for parallel computers. Details of the algorithm are provided, and a numerical example is given.

Journal ArticleDOI
01 Feb 1992
TL;DR: The Householder transformation outperforms the Givens rotation and the modified Gram-Schmidt methods in numerical stability under finite-precision implementations, as well as requiring fewer arithmetical operations, which is promising for VLSI implementation and real-time high throughput modern signal processing.
Abstract: The Householder transformation is considered to be desirable among various unitary transformations due to its superior computational efficiency and robust numerical stability. Specifically, the Householder transformation outperforms the Givens rotation and the modified Gram-Schmidt methods in numerical stability under finite-precision implementations, as well as requiring fewer arithmetical operations. Consequently, the QR decomposition based on the Householder transformation is promising for VLSI implementation and real-time high throughput modern signal processing. In this paper, a recursive complex Householder transformation (CHT) with a fast initialization algorithm is proposed and its associated parallel/pipelined architecture is also considered. Then, a CHT based recursive least-squares algorithm with a fast initialization is presented. Its associated systolic array processing architecture is also considered.
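A plain (non-recursive, real-valued) Householder QR is a useful reference point for the complex recursive variant the paper develops; each reflector I − 2vvᵀ annihilates one column below the diagonal:

```python
import numpy as np

def householder_qr(A):
    """QR decomposition by Householder reflections (real, unblocked)."""
    R = A.astype(float).copy()
    m, n = R.shape
    Q = np.eye(m)
    for k in range(n):
        x = R[k:, k]
        v = x.copy()
        v[0] += np.copysign(np.linalg.norm(x), x[0])  # stable sign choice
        if np.linalg.norm(v) > 0:
            v /= np.linalg.norm(v)
            H = np.eye(m - k) - 2.0 * np.outer(v, v)  # reflector for column k
            R[k:, :] = H @ R[k:, :]                   # zero below the diagonal
            Q[:, k:] = Q[:, k:] @ H                   # accumulate Q (H = H^T)
    return Q, R
```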