Showing papers on "QR decomposition" published in 1996

Open Access

Journal Article•DOI•

Efficient algorithms for computing a strong rank-revealing QR factorization

01 Jul 1996-SIAM Journal on Scientific Computing

TL;DR: Two algorithms are presented for computing rank-revealing QR factorizations that are nearly as efficient as QR with column pivoting for most problems and take $O(mn^2)$ floating-point operations in the worst case.

Abstract: Given an $m \times n$ matrix $M$ with $m \geq n$, it is shown that there exists a permutation $\Pi$ and an integer $k$ such that the QR factorization $M\Pi = Q \begin{pmatrix} A_k & B_k \\ 0 & C_k \end{pmatrix}$ reveals the numerical rank of $M$: the $k \times k$ upper-triangular matrix $A_k$ is well conditioned, $\|C_k\|_2$ is small, and $B_k$ is linearly dependent on $A_k$ with coefficients bounded by a low-degree polynomial in $n$. Existing rank-revealing QR (RRQR) algorithms are related to such factorizations and two algorithms are presented for computing them. The new algorithms are nearly as efficient as QR with column pivoting for most problems and take $O(mn^2)$ floating-point operations in the worst case.
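For orientation, the baseline that this abstract compares against, QR with column pivoting, can be sketched in a few lines. The block below is only an illustration under stated assumptions: it uses modified Gram-Schmidt rather than Householder reflections, it is not the strong RRQR algorithm of the paper, and the function name `qr_column_pivoting` and the relative tolerance `tol` are hypothetical choices made for this example.

```python
import math

def qr_column_pivoting(M, tol=1e-10):
    """Column-pivoted QR by modified Gram-Schmidt (illustrative sketch).

    Returns (Q, R, perm, rank): Q is a list of orthonormal columns,
    R is n x n upper triangular, and column j of the permuted matrix
    is original column perm[j].
    """
    m, n = len(M), len(M[0])
    cols = [[float(M[i][j]) for i in range(m)] for j in range(n)]
    perm = list(range(n))
    R = [[0.0] * n for _ in range(n)]
    Q = []
    rank = 0
    scale = 0.0
    for k in range(min(m, n)):
        # Pivot: move the remaining column of largest norm to position k.
        norms = [math.sqrt(sum(x * x for x in cols[j])) for j in range(k, n)]
        p = k + max(range(len(norms)), key=norms.__getitem__)
        cols[k], cols[p] = cols[p], cols[k]
        perm[k], perm[p] = perm[p], perm[k]
        for row in R:
            row[k], row[p] = row[p], row[k]
        r_kk = norms[p - k]
        if k == 0:
            scale = r_kk
        # Stop once the largest remaining column is negligible.
        if r_kk <= tol * scale:
            break
        R[k][k] = r_kk
        q = [x / r_kk for x in cols[k]]
        Q.append(q)
        rank += 1
        # Remove the component along q from every remaining column.
        for j in range(k + 1, n):
            r = sum(qi * x for qi, x in zip(q, cols[j]))
            R[k][j] = r
            cols[j] = [x - r * qi for x, qi in zip(cols[j], q)]
    return Q, R, perm, rank
```

The number of accepted pivots serves as a numerical rank estimate. Column pivoting alone can misjudge the rank on some matrices, which is the gap that strong rank-revealing factorizations are designed to close.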

698 citations

Journal Article•DOI•

Multifrontal QR Factorization in a Multiprocessor Environment

Patrick R. Amestoy¹, Iain S. Duff², Chiara Puglisi•Institutions (2)

ENSEEIHT¹, Rutherford Appleton Laboratory²

01 Jul 1996-Numerical Linear Algebra With Applications

TL;DR: It is shown that the use of Level 3 BLAS can lead to very significant gains in performance and the design and implementation of a parallel QR decomposition algorithm for a large sparse matrix A is described.

Abstract: We describe the design and implementation of a parallel QR decomposition algorithm for a large sparse matrix $A$. The algorithm is based on the multifrontal approach and makes use of Householder transformations. The tasks are distributed among processors according to an assembly tree which is built from the symbolic factorization of the matrix $A^T A$. We first address uniprocessor issues and then discuss the multiprocessor implementation of the method. We consider the parallelization of both the factorization phase and the solve phase. We use relaxation of the sparsity structure of both the original matrix and the frontal matrices to improve the performance. We show that, in this case, the use of Level 3 BLAS can lead to very significant gains in performance. We use the eight-processor Alliant FX/80 at CERFACS to illustrate our discussion.

56 citations

Proceedings Article•DOI•

Designing fuzzy logic systems for uncertain environments using a singular-value-QR decomposition method

G.C. Mouzouris¹, Jerry M. Mendel•Institutions (1)

University of Southern California¹

08 Sep 1996

TL;DR: The proposed SVD-QR method selects subsets of independent basis functions which are sufficient to represent a given system, through operations on a nonsingleton fuzzy basis function matrix, and provides an estimate of the number of necessary basis functions.

Abstract: Nonsingleton fuzzy logic systems (NSFLSs) are generalizations of singleton fuzzy logic systems (FLSs), that are capable of handling set-valued input. In this paper, we extend the theory of NSFLSs by presenting an algorithm to design and train such systems. Since they generalize singleton FLSs, the algorithm is equally applicable to both types of systems. The proposed SVD-QR method selects subsets of independent basis functions which are sufficient to represent a given system, through operations on a nonsingleton fuzzy basis function matrix. In addition, it provides an estimate of the number of necessary basis functions. We present examples to illustrate the ability of the SVD-QR method to operate in uncertain environments.

54 citations

Journal Article•DOI•

The transmission of shifts and shift blurring in the QR algorithm

David S. Watkins¹•Institutions (1)

Washington State University¹

01 Jul 1996-Linear Algebra and its Applications

TL;DR: The mechanism by which the shifts are transmitted through the matrix in the course of a multishift QR iteration is identified, and numerical evidence is presented showing that the mechanism works well when m is small and poorly when m is large.

47 citations

Journal Article•DOI•

A Schur Method for Low-Rank Matrix Approximation

Alle-Jan van der Veen

01 Jan 1996-SIAM Journal on Matrix Analysis and Applications

TL;DR: This paper describes a much simpler generalized Schur-type algorithm to compute similar low-rank approximants $\hat{H}$ of a matrix $H$ such that $H - \hat{H}$ has 2-norm less than $\epsilon$.

Abstract: The usual way to compute a low-rank approximant of a matrix $H$ is to take its singular value decomposition (SVD) and truncate it by setting the small singular values equal to 0. However, the SVD is computationally expensive. This paper describes a much simpler generalized Schur-type algorithm to compute similar low-rank approximants. For a given matrix $H$ which has $d$ singular values larger than $\epsilon$, we find all rank $d$ approximants $\hat{H}$ such that $H - \hat{H}$ has 2-norm less than $\epsilon$. The set of approximants includes the truncated SVD approximation. The advantages of the Schur algorithm are that it has a much lower computational complexity (similar to a QR factorization), and directly produces a description of the column space of the approximants. This column space can be updated and downdated in an on-line scheme, amenable to implementation on a parallel array of processors.

40 citations

Book Chapter•DOI•

A Hierarchical Approach for Performance Analysis of ScaLAPACK-Based Routines Using the Distributed Linear Algebra Machine

Krister Dackland¹, Bo Kågström¹•Institutions (1)

Umeå University¹

18 Aug 1996

TL;DR: A hierarchical approach for the design of performance models for parallel algorithms in linear algebra, based on a parallel machine model and the hierarchical structure of the ScaLAPACK library, is presented.

Abstract: Performance models are important in the design and analysis of linear algebra software for scalable high performance computer systems. They can be used for estimation of the overhead in a parallel algorithm and measuring the impact of machine characteristics and block sizes on the execution time. We present a hierarchical approach for the design of performance models for parallel algorithms in linear algebra based on a parallel machine model and the hierarchical structure of the ScaLAPACK library. This suggests three levels of performance models corresponding to existing ScaLAPACK routines. As a proof of concept, a performance model of the high level QR factorization routine pdgeqrf is presented. We also derive performance models of lower level ScaLAPACK building blocks such as pdgeqr2, pdlarft, pdlarfb, pdlarfg, pdlarf, pdnrm2, and pdscal, which are used in the high level model for pdgeqrf. Predicted performance results are compared to measurements on an Intel Paragon XP/S system. The accuracy of the top level model is over 90% for measured matrix and block sizes and different process grid configurations.

37 citations

Journal Article•DOI•

On-line Structure Detection and Parameter Estimation with Exponential Windowing for Nonlinear Systems

Wan Luo¹, Stephen A. Billings², K.M. Tsang•Institutions (2)

Newcastle University¹, University of Sheffield²

01 Jan 1996-European Journal of Control

TL;DR: A new recursive orthogonal estimation algorithm is derived which updates both the model structure and the parameters of nonlinear models on-line and minimises the loss function at every selection step by selecting significant regression variables.

30 citations

Journal Article•DOI•

Pipelined RLS adaptive filtering using scaled tangent rotations (STAR)

K.J. Raghunath¹, Keshab K. Parhi²•Institutions (2)

Alcatel-Lucent¹, University of Minnesota²

01 Oct 1996-IEEE Transactions on Signal Processing

TL;DR: A new scaled tangent rotation (STAR) is used instead of the Givens rotations used in QRD-RLS, designed such that fine-grain pipelining can be accomplished with little hardware overhead.

Abstract: The QR decomposition-based recursive least-squares (RLS) adaptive filtering algorithm (referred to as QRD-RLS) is very popular because it has good numerical properties and can be mapped onto a systolic array. However, in this architecture, pipelining of the operations within the systolic array cells is difficult. Pipelining would be necessary to operate at high speeds or to reduce the power dissipation in a VLSI implementation. Pipelining QRD-RLS using look-ahead techniques leads to a large hardware overhead. The square-root free forms of QRD-RLS are also difficult to pipeline. In this paper, a new scaled tangent rotation (STAR) is used instead of the Givens rotations used in QRD-RLS. The STAR-based RLS algorithm (referred to as STAR-RLS) is designed such that fine-grain pipelining can be accomplished with little hardware overhead. The scaled tangent rotations are not exactly orthogonal transformations but tend to become orthogonal asymptotically. The STAR-RLS algorithm is square-root free and has less complexity and lower intercell communication than the QRD-RLS algorithm. The properties of the STAR-RLS algorithm, such as stability, numerical properties, and dynamic range, are examined with and without pipelining and compared with those of QRD-RLS. Simulation results are presented to compare the performance of STAR-RLS and QRD-RLS algorithms.
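The Givens-rotation step that QRD-RLS is built on, and that STAR replaces, can be illustrated as follows. This is a minimal sketch of the standard Givens row update only, not the STAR rotation and not a full adaptive filter. The names `givens` and `qr_row_update` and the forgetting-factor argument `lam` are assumptions introduced for the example.

```python
import math

def givens(a, b):
    """Return (c, s) such that [[c, s], [-s, c]] maps (a, b) to (r, 0)."""
    if b == 0.0:
        return 1.0, 0.0
    r = math.hypot(a, b)
    return a / r, b / r

def qr_row_update(R, x, lam=1.0):
    """Rotate a new data row x into the n x n upper-triangular factor R.

    The result R_new satisfies R_new^T R_new = lam * R^T R + x x^T,
    which is the triangular-factor update at the heart of QRD-RLS.
    """
    n = len(R)
    root = math.sqrt(lam)
    R = [[root * v for v in row] for row in R]  # apply exponential forgetting
    x = [float(v) for v in x]
    for i in range(n):
        # Choose the rotation that zeroes x[i] against the diagonal R[i][i].
        c, s = givens(R[i][i], x[i])
        for j in range(i, n):
            rij, xj = R[i][j], x[j]
            R[i][j] = c * rij + s * xj
            x[j] = -s * rij + c * xj
    return R
```

Each rotation needs a square root and divisions, and its output feeds the next cell, which gives some intuition for why pipelining inside the systolic cells is hard.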

26 citations

Journal Article•DOI•

Solving linear inequalities in a least squares sense

R. Bramley, B. Winnicka

01 Jan 1996-SIAM Journal on Scientific Computing

TL;DR: This paper shows that a modification of Han’s algorithm allows the iterates to be computed using QR factorization with column pivoting, which significantly reduces the computational cost and allows efficient updating/downdating techniques to be used.

Abstract: In 1980, S.-P. Han [Least-Squares Solution of Linear Inequalities, Tech. Report TR–2141, Mathematics Research Center, University of Wisconsin-Madison, 1980] described a finitely terminating algorithm for solving a system $Ax \leqslant b$ of linear inequalities in a least squares sense. The algorithm uses a singular value decomposition of a submatrix of $A$ on each iteration, making it impractical for all but the smallest problems. This paper shows that a modification of Han’s algorithm allows the iterates to be computed using QR factorization with column pivoting, which significantly reduces the computational cost and allows efficient updating/downdating techniques to be used. The effectiveness of this modification is demonstrated, implementation details are given, and the behaviour of the algorithm is discussed. Theoretical and numerical results are shown from the application of the algorithm to linear separability problems.

24 citations

Proceedings Article•DOI•

Fault tolerant matrix operations using checksum and reverse computation

Youngbae Kim¹, James S. Plank, Jack Dongarra•Institutions (1)

University of Tennessee¹

27 Mar 1996

TL;DR: A technique, based on checksum and reverse computation, that enables high-performance matrix operations to be fault-tolerant with low overhead is presented and analysis of the overhead of checkpointing and recovery confirms that this technique can provide fault tolerance.

Abstract: In this paper, we present a technique, based on checksum and reverse computation, that enables high-performance matrix operations to be fault-tolerant with low overhead. We have implemented this technique on five matrix operations: matrix multiplication, Cholesky factorization, LU factorization, QR factorization and Hessenberg reduction. The overhead of checkpointing and recovery is analyzed both theoretically and experimentally. These analyses confirm that our technique can provide fault tolerance for these high-performance matrix operations with low overhead.
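The checksum idea behind this kind of fault tolerance can be shown on matrix multiplication. The sketch below follows the classic checksum-matrix scheme (a column-checksum matrix times a row-checksum matrix gives a full-checksum product). It is an assumption that this resembles the encoding used in the paper, and the reverse-computation and checkpointing parts are not modelled. All function names are hypothetical.

```python
def with_column_checksum(A):
    """Append a row holding the sum of each column of A."""
    return A + [[sum(col) for col in zip(*A)]]

def with_row_checksum(B):
    """Append to each row of B the sum of that row."""
    return [row + [sum(row)] for row in B]

def matmul(A, B):
    """Plain dense matrix product of two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def checksums_consistent(C, tol=1e-9):
    """Check that the last row and column of C match the sums of its body."""
    body = [row[:-1] for row in C[:-1]]
    rows_ok = all(abs(sum(r) - full[-1]) <= tol
                  for r, full in zip(body, C[:-1]))
    cols_ok = all(abs(sum(col) - chk) <= tol
                  for col, chk in zip(zip(*body), C[-1][:-1]))
    return rows_ok and cols_ok
```

Multiplying `with_column_checksum(A)` by `with_row_checksum(B)` gives the product of `A` and `B` bordered by its own checksums. A single corrupted entry in the body breaks one row sum and one column sum, which locates the fault.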

20 citations

Journal Article•DOI•

Self-scaling fast rotations for stiff and equality-constrained linear least squares problems

Andrew A. Anda¹, Haesun Park¹•Institutions (1)

University of Minnesota¹

01 Feb 1996-Linear Algebra and its Applications

TL;DR: Algorithms which apply self-scaling fast plane rotations to the QR decomposition for stiff least squares problems show that both fast and standard Givens rotation-based algorithms produce accurate results, regardless of row sorting and even with extremely large weights, when equality-constrained least squares problems are solved by the weighting method.

Journal Article•DOI•

Accurate downdating of a modified Gram-Schmidt QR decomposition

K. Yoo¹, Haesun Park¹•Institutions (1)

University of Minnesota¹

01 Mar 1996-BIT Numerical Mathematics

TL;DR: An algorithm is derived that improves the Gram-Schmidt downdating algorithm when the columns in the Q factor are not orthonormal and produces far more accurate results than the Gram-Schmidt downdating algorithm for certain ill-conditioned problems.

Abstract: A new algorithm for downdating a QR decomposition is presented. We show that, when the columns in the Q factor from the Modified Gram-Schmidt QR decomposition of a matrix X are exactly orthonormal, the Gram-Schmidt downdating algorithm for the QR decomposition of X is equivalent to downdating the full Householder QR decomposition of the matrix X augmented by an n × n zero matrix on top. Using this relation, we derive an algorithm that improves the Gram-Schmidt downdating algorithm when the columns in the Q factor are not orthonormal. Numerical test results show that the new algorithm produces far more accurate results than the Gram-Schmidt downdating algorithm for certain ill-conditioned problems.

Proceedings Article•DOI•

The critically damped CORDIC algorithm for QR decomposition

Shaoyun Wang, E.E. Swartzlander

03 Nov 1996

TL;DR: The coordinate rotation digital computer (CORDIC) algorithm is an alternative solution to the traditional multiplication, division, and square root version of QR decomposition that converges faster than the conventional CORDIC algorithm with the penalty of storing all the scale factors in a ROM.

Abstract: The coordinate rotation digital computer (CORDIC) algorithm is an alternative solution to the traditional multiplication, division, and square root version of QR decomposition. This approach is better as it uses only adders and shifters to do all the calculations. The area that is saved can be used to speed up the CORDIC algorithm even further. The critically damped CORDIC (CD-CORDIC) algorithm converges faster than the conventional CORDIC algorithm with the penalty of storing all the scale factors in a ROM. The ROM size is 2[N-1/2]+1 words, where N is the word length of the processor. The CD-CORDIC algorithm is twice as fast when the word length of the processor is 24 bit.
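To make the shift-and-add idea concrete, here is a minimal sketch of the conventional CORDIC iteration in vectoring mode, the operation a QR array uses to zero one component of a pair. It is not the critically damped variant described above. Multiplication by `2.0 ** -i` stands in for a hardware right shift, and the function name and the floating-point setting are assumptions made for illustration.

```python
import math

def cordic_vectoring(x, y, iterations=24):
    """Rotate (x, y) onto the positive x-axis with conventional CORDIC.

    Assumes x > 0. Returns (magnitude, residual_y, angle), where the
    magnitude has been divided by the accumulated CORDIC gain.
    """
    angle = 0.0
    gain = 1.0
    for i in range(iterations):
        d = -1.0 if y > 0 else 1.0      # rotate toward the x-axis
        shift = 2.0 ** -i               # a right shift by i bits in hardware
        x, y = x - d * y * shift, y + d * x * shift
        angle -= d * math.atan(shift)   # in hardware this is a table lookup
        gain *= math.sqrt(1.0 + shift * shift)
    return x / gain, y, angle
```

In the conventional algorithm the gain is a fixed constant for a given iteration count, so it can be precomputed. The abstract notes that the critically damped variant instead stores its scale factors in a ROM.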

Proceedings Article•DOI•

Parallel algorithms and processing architectures for space-time adaptive processing

A. Farina, L. Timmoneri

08 Oct 1996

TL;DR: This paper describes methodologies for the on-line calculation of the weights to be used in the linear combination of the received radar data by a set of N antennas and M pulse repetition intervals (PRIs) for the derivation of the adapted space-time filter output.

Abstract: This paper describes methodologies for the on-line calculation of the weights to be used in the linear combination of the received radar data by a set of N antennas and M pulse repetition intervals (PRIs) for the derivation of the adapted space-time filter output. The numerically robust and computationally efficient QR decomposition is used to derive the so-called MVDR (minimum variance distortionless response) and lattice algorithms. Both algorithms are represented as a systolic computational flow graph. The MVDR is able to produce more than one adapted beam focused along different DOAs and Doppler frequencies in the radar surveillance volume. The lattice algorithm offers a computational saving; in fact its computational burden is $O(N^2 M)$ in lieu of $O(N^2 M^2)$. A comprehensive analysis of the numerical robustness of the algorithms is presented when the CORDIC algorithm is used to compute the QR decomposition (QRD). Benchmarks on general purpose parallel computers and on a VLSI CORDIC (co-ordinate rotation digital computer) board are presented.

Journal Article•DOI•

QR-PLSR: Reduced-rank regression for high-speed hardware implementation

Frank Westad¹, Klaus Diepold, Harald Martens•Institutions (1)

SINTEF¹

01 Sep 1996-Journal of Chemometrics

TL;DR: A version of PLS regression is described that intends to combine the computer hardware implementation advantages of the algebraic technique of ‘QR decomposition’ with the statistical, interpretative and computational advantages of PLS regression.

Abstract: A version of PLS regression is described that intends to combine the computer hardware implementation advantages of the algebraic technique of ‘QR decomposition’ with the statistical, interpretative and computational advantages of PLS regression. With a QR decomposition based on Givens rotations, the QR-PLS technique appears to be suited for hardware parallelization without sacrificing the modelling flexibility of PLSR. © 1996 by John Wiley & Sons, Ltd.

Journal Article•DOI•

Fast algorithms for direction-of-arrival finding using large ESPRIT arrays

Hongyuan Zha¹•Institutions (1)

Pennsylvania State University¹

02 Jan 1996-Signal Processing

TL;DR: This work proposes fast algorithms for direction-of-arrival (DOA) finding without computing (partial) eigendecompositions in large ESPRIT arrays and presents numerical simulation results to illustrate the efficiency of the proposed algorithms.

Proceedings Article•DOI•

One-sided algorithm for subspace projection beam-forming

Mark A.G. Smith¹, Ian K. Proudler¹•Institutions (1)

Defence Research Agency¹

22 Oct 1996

TL;DR: This paper presents an algorithm, based on QR decomposition, that can approximately reveal the rank and signal subspace of a matrix and simultaneously perform a subspace projection and has the potential for very simple parallel implementation.

Abstract: Conventional least squares minimization beamforming algorithms suffer from `weight jitter' when small data sequences are used. One method for overcoming this problem requires that the SVD of the data matrix is calculated and the `signal' and `noise' subspaces identified. A more stable beampattern can then be formed by projecting the least squares weight vector onto the appropriate subspace. The SVD is computationally expensive to perform and difficult to implement in a parallel architecture. Several approximate `rank revealing' algorithms have been presented of late (e.g. URV, RRQR) which have a much reduced computational load. However, being `two-sided' decompositions, they all suffer from implementation difficulties. In this paper we present an algorithm, based on QR decomposition, that can approximately reveal the rank and signal subspace of a matrix and simultaneously perform a subspace projection. The algorithm has the potential for very simple parallel implementation.

Journal Article•DOI•

Fault-tolerant QRD recursive least squares

M. P. Connolly, Patrick Fitzpatrick

01 Mar 1996

TL;DR: The authors present an algorithm-based fault tolerant scheme for recursive least squares, appropriate for applications in adaptive signal processing and extended to a fault-tolerant algorithm for linearly constrained QR decomposition.

Abstract: The authors present an algorithm-based fault tolerant scheme for recursive least squares, appropriate for applications in adaptive signal processing. The technique is closely focused on the Gentleman-Kung-McWhirter triangular systolic array architecture for QR decomposition. Assuming that the array is subject to transient faults, widely separated in time and each affecting a single processor, an algorithm is given that corrects the full triangular array with a computational overhead equivalent, on average, to the interpolation of a single extra vector into the data stream. No output residuals are lost in the fault recovery. The analysis is extended to a fault-tolerant algorithm for linearly constrained QR decomposition.

Journal Article•DOI•

Some Results on Structure Prediction in Sparse QR Factorization

Esmond G. Ng, Barry W. Peyton

01 Apr 1996-SIAM Journal on Matrix Analysis and Applications

TL;DR: It is shown that one can always reorder a weak Hall matrix into block upper triangular form so that there is no increase in the fill incurred by the $QR$ factorization.

Abstract: In $QR$ factorization of an $m \times n$ matrix $A$ ($m \geq n$), the orthogonal factor $Q$ is often stored implicitly as an $m \times n$ lower trapezoidal matrix $W$, known as the Householder matrix. When the sparsity of $A$ is to be exploited, the factorization is often preceded by a symbolic factorization step, which computes a data structure in which the nonzero entries of $W$ and $R$ are computed and stored. This is achieved by computing an upper bound on the nonzero structure of these factors, based solely on the nonzero structure of $A$. In this paper we use a well-known upper bound on the nonzero structure of $W$ to obtain an upper bound on the nonzero structure of $Q$. Let $U$ be the matrix consisting of the first $n$ columns of $Q$. One interesting feature of the new bound is that the bound on $W$'s structure is identical to the lower trapezoidal part of the bound on $U$'s structure. We show that if $A$ is strong Hall and has no zero entry on its main diagonal, then the bounds on the nonzero structures of $W$ and $U$ are the smallest possible based solely on the nonzero structure of $A$. We then use this result to obtain corresponding smallest upper bounds in the case where $A$ is weak Hall, is in block upper triangular form, and has no zero entry on its main diagonal. Finally, we show that one can always reorder a weak Hall matrix into block upper triangular form so that there is no increase in the fill incurred by the $QR$ factorization.

Journal Article•DOI•

Perturbation and error analyses for block downdating of a Cholesky decomposition

Lars Eldén, Haesun Park¹•Institutions (1)

University of Minnesota¹

01 Jun 1996-BIT Numerical Mathematics

TL;DR: An error analysis is given for block downdating using Corrected Seminormal Equations (CSNE), and it is shown that for ill-conditioned downdates this method gives more accurate results than the algorithms based on the LINPACK downdating algorithm or hyperbolic transformations.

Abstract: A new perturbation result is presented for the problem of block downdating a Cholesky decomposition $X^T X = R^T R$. Then, a condition number for block downdating is proposed and compared to other downdating condition numbers presented recently in the literature. This new condition number is shown to give a tighter bound in many cases. Using the perturbation theory, an error analysis is presented for the block downdating algorithms based on the LINPACK downdating algorithm and stabilized hyperbolic transformations. An error analysis is also given for block downdating using Corrected Seminormal Equations (CSNE), and it is shown that for ill-conditioned downdates this method gives more accurate results than the algorithms based on the LINPACK downdating algorithm or hyperbolic transformations. We classify the problems for which the CSNE downdating method produces a downdated upper triangular matrix which is comparable in accuracy to the upper triangular factor obtained from the QR decomposition by Householder transformations on the data matrix with the row block deleted.

Proceedings Article•DOI•

An algorithm and architecture for the parallel solution of systems of linear equations

V.C. Wilburn¹, Hak-Lim Ko¹, W.E. Alexander¹•Institutions (1)

North Carolina State University¹

27 Mar 1996

TL;DR: A paradigm for the efficient utilization of commercially available processors to implement serial algorithms on a parallel architecture as well as an algorithm for the parallel solution of a nonhomogeneous system of linear equations with constant coefficients is evaluated.

Abstract: The paper evaluates a paradigm for the efficient utilization of commercially available processors to implement serial algorithms on a parallel architecture. We present an architecture based on this paradigm as well as an algorithm for the parallel solution of a nonhomogeneous system of linear equations with constant coefficients. Major advantages stem from its systolic-like array structure and the versatility of fully programmable processor elements. The method uses a Givens rotation implementation of the well known QR factorization. Unlike other direct methods of factorization followed by backsubstitution, this implementation of the algorithm avoids the backsubstitution bottleneck. The computational complexity of this feedforward direct method of solving nonsingular systems of linear equations is similar to that of QR matrix factorization. Due to the programmability of the processor in the array, the mapping of this algorithm extends to an entire family of algorithms. We map this family of algorithms onto the novel architecture and present a comprehensive performance analysis. Performance results identify the algorithm/architecture combination as a cost effective, efficient method which exhibits speedup that is directly proportional to the number of processors used.

Proceedings Article•DOI•

A nonlinear adaptive predictor for speech compression

S. Hunt

03 Jun 1996

TL;DR: A neural nonlinear predictor for one dimensional signals is presented, based on a combination of linearization and QR decomposition that allows a fast adapting algorithm.

Abstract: A neural nonlinear predictor for one dimensional signals is presented. It is based on a combination of linearization and QR decomposition that allows a fast adapting algorithm. The predictor is used in a speech compression algorithm that has proven to be superior to linear based models. The compression and training are done simultaneously, allowing the network to continually adapt to the signal. The results presented show that this algorithm outperforms a typical LPC coding algorithm.

Proceedings Article•DOI•

Acoustic echo cancelation using a pseudo-linear regression and QR-decomposition

M. Harteneck¹, R.W. Stewart¹•Institutions (1)

University of Strathclyde¹

12 May 1996

TL;DR: The problem of acoustic echo cancelation is addressed using an adaptive IIR filtering algorithm based on a QR decomposition and a pseudo-linear regression that yields a computational complexity of $O(N^2)$ multiply-accumulates.

Abstract: In this paper the problem of acoustic echo cancelation is addressed using an adaptive IIR filtering algorithm based on a QR decomposition and a pseudo-linear regression. The proposed algorithm yields a computational complexity of $O(N^2)$ multiply-accumulates. In echo cancelation simulations it shows fast convergence in single talk and double talk periods, and proves to be stable if the near-end signal and received signal are correlated due to a far-end echo path.

Journal Article•DOI•

A family of parallel QR factorization algorithms

Gerard G. L. Meyer¹, Mike Pascale²•Institutions (2)

Johns Hopkins University¹, Westinghouse Electric²

01 Jul 1996-Concurrency and Computation: Practice and Experience

TL;DR: A family of algorithms, parameterized by the number of processors available P, arithmetic grain aggregation parameters g1, g2, …, gP, and communication grain aggregation parameter h, which compute the QR factorization of a matrix $A \in \mathbb{C}^{m \times n}$ with minimal latency is presented.

Abstract: Rapid computation of the QR factorization of a matrix is fundamental to many scientific and engineering problems. The paper presents a family of algorithms parameterized by the number of processors available P, arithmetic grain aggregation parameters g1, g2, …, gP, and communication grain aggregation parameter h, which compute the QR factorization of a matrix $A \in \mathbb{C}^{m \times n}$ with minimal latency. The approach is particularly well suited for dedicated distributed memory architectures such as linear arrays of INMOS Transputers, Texas Instruments C40s or Analog Devices 21060s.

Book Chapter•DOI•

Parallel Sparse Modified Gram-Schmidt QR Decomposition

Ramón Doallo, Basilio B. Fraguela, Juan Touriño, Emilio L. Zapata¹•Institutions (1)

University of Málaga¹

15 Apr 1996

TL;DR: A strategy to reduce fill-in in order to get memory savings and decrease the computation times of the QR decomposition with column pivoting of a sparse matrix by means of Modified Gram-Schmidt orthogonalization.

Abstract: We present a parallel computational method for the QR decomposition with column pivoting of a sparse matrix by means of Modified Gram-Schmidt orthogonalization. Nonzero elements of the matrix M to be decomposed are stored in a one-dimensional doubly linked list data structure. We discuss a strategy to reduce fill-in in order to get memory savings and decrease the computation times. As an application of QR decomposition, we describe the least squares problem. This algorithm was designed for a message passing multiprocessor and has been evaluated on a Cray T3D, using the Harwell-Boeing sparse matrix collection.

Book Chapter•DOI•

Parallel Complexity of Householder QR Factorization

Mauro Leoncini¹, Giovanni Manzini², Luciano Margara³•Institutions (3)

University of Pisa¹, University of Turin², University of Bologna³

25 Sep 1996

TL;DR: It is proved that the Householder QR factorization is likely to be inherently sequential as well and the problem of speedup vs non degeneracy and accuracy in numerical algorithms is investigated.

Abstract: Gaussian Elimination with Partial Pivoting and Householder QR factorization are two very popular methods to solve linear systems. Implementations of these two methods are provided in state-of-the-art numerical libraries and packages, such as LAPACK and MATLAB. Gaussian Elimination with Partial Pivoting was already known to be P-complete. Here we prove that the Householder QR factorization is likely to be inherently sequential as well. We also investigate the problem of speedup vs non degeneracy and accuracy in numerical algorithms.
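The sequential structure discussed here is visible in a direct implementation of Householder QR: the reflector for column k can only be formed after all earlier reflectors have been applied to that column. The sketch below is a plain dense version written for illustration. The function name and the choice to return the reflector vectors `V` alongside `R` are assumptions, not taken from LAPACK or MATLAB.

```python
import math

def householder_qr(A):
    """Householder QR of an m x n matrix (m >= n), given as a list of rows.

    Returns (V, R): V[k] is the k-th Householder vector (length m - k)
    and R is the n x n upper-triangular factor.
    """
    m, n = len(A), len(A[0])
    R = [[float(v) for v in row] for row in A]
    V = []
    for k in range(n):
        # Column k below the diagonal depends on all previous reflections.
        x = [R[i][k] for i in range(k, m)]
        norm_x = math.sqrt(sum(t * t for t in x))
        if norm_x == 0.0:
            V.append([0.0] * (m - k))
            continue
        # Pick the sign that avoids cancellation in v[0].
        alpha = -math.copysign(norm_x, x[0])
        v = x[:]
        v[0] -= alpha
        vnorm2 = sum(t * t for t in v)
        # Apply H = I - 2 v v^T / (v^T v) to columns k..n-1.
        for j in range(k, n):
            dot = sum(v[i] * R[k + i][j] for i in range(m - k))
            f = 2.0 * dot / vnorm2
            for i in range(m - k):
                R[k + i][j] -= f * v[i]
        V.append(v)
    return V, [row[:] for row in R[:n]]
```

The work inside one step (the inner products and column updates) parallelizes readily. It is the chain of dependencies from one column's reflector to the next that the complexity result concerns.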

Journal Article•DOI•

A simple proof of the transposed QR algorithm

R. R. Burnside, P. B. Guest

01 Jun 1996-SIAM Review

TL;DR: A simple proof of the transposed QR algorithm which permits the singular value decomposition of a matrix to be introduced to a first course in matrix algebra in the context of iterative procedures is presented.

Abstract: This paper presents a simple proof of the transposed QR algorithm which permits the singular value decomposition of a matrix to be introduced to a first course in matrix algebra in the context of iterative procedures.

Journal Article•DOI•

Analysis of Algorithms for Orthogonalizing Products of Unitary Matrices

Roy Mathias¹•Institutions (1)

College of William & Mary¹

01 Mar 1996-Numerical Linear Algebra With Applications

Book Chapter•DOI•

ABS methods for KT equations

Emilio Spedicato¹, Z. Chen, E. Bodon¹•Institutions (1)

University of Bergamo¹

01 Jan 1996

TL;DR: In this article, a class of ABS methods for solving the KT equations is presented, and several methods in this class are compared with the classical methods of Aasen and the method based upon the QR factorization with Householder rotations.

Abstract: In this paper we present a class of ABS methods for solving the KT equations. We compare several methods in this class with the classical methods of Aasen and the method based upon the QR factorization with Householder rotations. When the number of degrees of freedom is small two of the considered ABS methods are faster than the Aasen and the QR based methods by a factor respectively about 2 and 4. Moreover when the first block in the KT equations is diagonal and a sequence of problems have to be solved where only such a block changes, for small number of degrees of freedom the solution can be updated in order two operations by the ABS methods, while order three operations are required by the other methods. Finally, numerical testing over 300 problems has shown that the ABS methods give more accurate results in about 80

Proceedings Article•

A highly parallel multichannel fast QRD-LS adaptive algorithm

Athanasios A. Rontogiannis¹, Sergios Theodoridis¹•Institutions (1)

National and Kapodistrian University of Athens¹

01 Sep 1996

TL;DR: A new fast multichannel QR decomposition (QRD) least squares (LS) adaptive algorithm is presented in this paper that is based exclusively on numerically robust orthogonal Givens rotations and offers substantially reduced computational complexity compared to previously derived multichannel fast QRD schemes.

Abstract: A new fast multichannel QR decomposition (QRD) least squares (LS) adaptive algorithm is presented in this paper. The algorithm deals with the general case of channels with different number of delay elements and is based exclusively on numerically robust orthogonal Givens rotations. The new scheme processes each channel separately and as a result it comprises scalar operations only. Moreover, the proposed algorithm is implementable on a very regular systolic architecture and offers substantially reduced computational complexity compared to previously derived multichannel fast QRD schemes.
