scispace - formally typeset
Search or ask a question
Topic

QR decomposition

About: QR decomposition is a research topic. Over the lifetime, 3504 publications have been published within this topic receiving 100599 citations. The topic is also known as: QR factorization.


Papers
More filters
Journal ArticleDOI
TL;DR: In this paper, a technique for computing partial rank-revealing factorizations, such as a partial QR factorization or a partial singular value decomposition, is described, which is inspired by the Gram-Schmidt algorithm and has the same asymptotic flop count.
Abstract: This manuscript describes a technique for computing partial rank-revealing factorizations, such as a partial QR factorization or a partial singular value decomposition. The method takes as input a tolerance $\varepsilon$ and an $m\times n$ matrix $\boldsymbol{\mathsf{A}}$ and returns an approximate low-rank factorization of $\boldsymbol{\mathsf{A}}$ that is accurate to within precision $\varepsilon$ in the Frobenius norm (or some other easily computed norm). The rank $k$ of the computed factorization (which is an output of the algorithm) is in all examples we examined very close to the theoretically optimal $\varepsilon$-rank. The proposed method is inspired by the Gram--Schmidt algorithm and has the same $O(mnk)$ asymptotic flop count. However, the method relies on randomized sampling to avoid column pivoting, which allows it to be blocked, and hence accelerates practical computations by reducing communication. Numerical experiments demonstrate that the accuracy of the scheme is for every matrix that was...

97 citations

Journal ArticleDOI
TL;DR: This paper presents architectures and field-programmable gate-array designs of two variants of the DCD algorithm, known as cyclic and leading DCD algorithms, and proposes fixed-point designs that provide an accuracy performance that is very close to the performance of floating-point counterparts and require significantly lower FPGA resources than techniques based on QR decomposition.
Abstract: In the areas of signal processing and communications, such as antenna-array beamforming, adaptive filtering, multiuser and multiple-input-multiple-output (MIMO) detection, channel estimation and equalization, echo and interference cancellation, and others, solving linear systems of equations often provides an optimal performance. However, this is also a very complicated operation that designers try to avoid by proposing different suboptimal techniques. The dichotomous coordinate descent (DCD) algorithm allows linear systems of equations to be solved with high computational efficiency. In this paper, we present architectures and field-programmable gate-array (FPGA) designs of two variants of the DCD algorithm, which are known as cyclic and leading DCD algorithms. For each of these techniques, we present serial designs, group-2 and group-4 designs, as well as a design with parallel update of the residual vector for the cyclic DCD algorithm. These designs have different degrees of parallelism, thus enabling a tradeoff between FPGA resources and computation time. The serial designs require the smallest FPGA resources; they are well suited for applications where many parallel solvers are required, e.g., for detection in MIMO-orthogonal-frequency-division-multiplexing communication systems. The parallelism introduced in the proposed group-2 and group-4 designs allows faster convergence to the true solution at the expense of an increase in FPGA resources. The design with parallel update of the residual vector provides the fastest convergence speed; however, if the system size is high, it may result in a significant increase in FPGA resources. The proposed fixed-point designs provide an accuracy performance that is very close to the performance of floating-point counterparts and require significantly lower FPGA resources than techniques based on QR decomposition.

96 citations

Proceedings ArticleDOI
P. Luethi1, Andreas Burg1, S. Haene1, D. Perels1, Norbert Felber1, Wolfgang Fichtner1 
27 May 2007
TL;DR: The architecture and results of the first VLSI implementation of an iterative sorted QR decomposition preprocessor for MIMO receivers are described, which performs MIMM channel preprocessing using Givens rotations and provides the base for an improved layered stream decoding.
Abstract: The QR decomposition is an important, but often underestimated prerequisite for pseudo- or non-linear detection methods such as successive interference cancellation or sphere decoding for multiple-input multiple-output (MIMO) systems. The ability of concurrent iterative sorting during the QR decomposition introduces a moderate overall latency, but provides the base for an improved layered stream decoding. This paper describes the architecture and results of the first VLSI implementation of an iterative sorted QR decomposition preprocessor for MIMO receivers. The presented architecture performs MIMO channel preprocessing using Givens rotations in order to compute the minimum mean squared error QR decomposition

95 citations

Book ChapterDOI
05 Sep 2010
TL;DR: This work improves on the latest published approaches to bundle adjustment with conjugate gradients by making full use of the least squares nature of the problem and shows how a certain property of the preconditioned system allows us to reduce the work per iteration to roughly half of the standard CG algorithm.
Abstract: Bundle adjustment for multi-view reconstruction is traditionally done using the Levenberg-Marquardt algorithm with a direct linear solver, which is computationally very expensive. An alternative to this approach is to apply the conjugate gradients algorithm in the inner loop. This is appealing since the main computational step of the CG algorithm involves only a simple matrix-vector multiplication with the Jacobian. In this work we improve on the latest published approaches to bundle adjustment with conjugate gradients by making full use of the least squares nature of the problem. We employ an easy-to-compute QR factorization based block preconditioner and show how a certain property of the preconditioned system allows us to reduce the work per iteration to roughly half of the standard CG algorithm.

94 citations

Journal ArticleDOI
TL;DR: In this paper, the generalized lasso dual path algorithm given by Tibshirani and Taylor in 2011 is considered and a generic approach that covers any penalty matrix D and any (full column rank) matrix X of predictor variables is described.
Abstract: We consider efficient implementations of the generalized lasso dual path algorithm given by Tibshirani and Taylor in 2011. We first describe a generic approach that covers any penalty matrix D and any (full column rank) matrix X of predictor variables. We then describe fast implementations for the special cases of trend filtering problems, fused lasso problems, and sparse fused lasso problems, both with X = I and a general matrix X. These specialized implementations offer a considerable improvement over the generic implementation, both in terms of numerical stability and efficiency of the solution path computation. These algorithms are all available for use in the genlasso R package, which can be found in the CRAN repository.

94 citations


Network Information
Related Topics (5)
Optimization problem
96.4K papers, 2.1M citations
85% related
Network packet
159.7K papers, 2.2M citations
84% related
Robustness (computer science)
94.7K papers, 1.6M citations
83% related
Wireless network
122.5K papers, 2.1M citations
83% related
Wireless sensor network
142K papers, 2.4M citations
82% related
Performance
Metrics
No. of papers in the topic in previous years
YearPapers
202331
202273
202190
2020132
2019126
2018139