
Showing papers on "QR decomposition" published in 2007


Book
01 Jan 2007
TL;DR: In this article, the authors introduce linear algebra concepts and matrix decompositions for data mining and pattern recognition, including vectors and matrices in data mining, and linear systems and least squares for pattern recognition.
Abstract: Preface Part I. Linear Algebra Concepts and Matrix Decompositions: 1. Vectors and matrices in data mining and pattern recognition 2. Vectors and matrices 3. Linear systems and least squares 4. Orthogonality 5. QR decomposition 6. Singular value decomposition 7. Reduced rank least squares models 8. Tensor decomposition 9. Clustering and non-negative matrix factorization Part II. Data Mining Applications: 10. Classification of handwritten digits 11. Text mining 12. Page ranking for a Web search engine 13. Automatic key word and key sentence extraction 14. Face recognition using tensor SVD Part III. Computing the Matrix Decompositions: 15. Computing Eigenvalues and singular values Bibliography Index.

272 citations


Journal ArticleDOI
TL;DR: It is shown that essentially all standard linear algebra operations, including LU decomposition, QR decomposition, linear equation solving, matrix inversion, solving least squares problems, (generalized) eigenvalue problems and the singular value decomposition can also be done stably (in a normwise sense) in $O(n^{\omega+\eta})$ operations.
Abstract: In Demmel et al. (Numer. Math. 106(2), 199–224, 2007) we showed that a large class of fast recursive matrix multiplication algorithms is stable in a normwise sense, and that in fact if multiplication of n-by-n matrices can be done by any algorithm in $O(n^{\omega+\eta})$ operations for any η > 0, then it can be done stably in $O(n^{\omega+\eta})$ operations for any η > 0. Here we extend this result to show that essentially all standard linear algebra operations, including LU decomposition, QR decomposition, linear equation solving, matrix inversion, solving least squares problems, (generalized) eigenvalue problems and the singular value decomposition can also be done stably (in a normwise sense) in $O(n^{\omega+\eta})$ operations.

196 citations


Journal ArticleDOI
TL;DR: The quest for a highly accurate and efficient SVD algorithm has led to a new, superior variant of the Jacobi algorithm, which has inherited all good high accuracy properties and can outperform the QR algorithm.
Abstract: This paper is the result of concerted efforts to break the barrier between numerical accuracy and run-time efficiency in computing the fundamental decomposition of numerical linear algebra—the singular value decomposition (SVD) of general dense matrices. It is an unfortunate fact that the numerically most accurate one-sided Jacobi SVD algorithm is several times slower than generally less accurate bidiagonalization-based methods such as the QR or the divide-and-conquer algorithm. Our quest for a highly accurate and efficient SVD algorithm has led us to a new, superior variant of the Jacobi algorithm. The new algorithm has inherited all good high accuracy properties of the Jacobi algorithm, and it can outperform the QR algorithm.

154 citations


Proceedings ArticleDOI
06 Jan 2007
TL;DR: This paper proposes a fully parallel VLSI architecture under fixed-precision for the inverse computation of a real square matrix using QR decomposition with modified Gram-Schmidt (MGS) orthogonalization.
Abstract: Matrix inversion and triangularization problems are common to a wide variety of communication systems, signal processing applications and solution of a set of linear equations. Matrix inversion is a computationally intensive process and its hardware implementation based on fixed-point (FP) arithmetic is a challenging problem. This paper proposes a fully parallel VLSI architecture under fixed-precision for the inverse computation of a real square matrix using QR decomposition with modified Gram-Schmidt (MGS) orthogonalization. The MGS algorithm is stable and accurate to the integral multiples of machine precision under fixed-precision for a well-conditioned non-singular matrix. For typical matrices (4 × 4) found in MIMO communication systems, the proposed architecture was able to achieve a clock rate of 277 MHz with a latency of 18 time units and area of 72K gates using 0.18-μm CMOS technology. For a generic square matrix of order n, the latency required is 5n - 2, which is better than all previously known architectures. With the use of LUTs and log-domain computations, the total area has been reduced compared to architectures based on linear-domain computations.
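The inversion route named in this abstract (QR by modified Gram-Schmidt, then inverting through the triangular factor) can be illustrated in software. The following is a minimal floating-point sketch, not the fixed-point VLSI architecture of the paper; the function names `mgs_qr` and `invert_via_qr` are hypothetical, and the input is assumed to be a non-singular real square matrix given as a list of rows.

```python
def mgs_qr(a):
    """Modified Gram-Schmidt QR of a square matrix (list of rows).

    Returns (q, r) with a = q * r, q having orthonormal columns and
    r upper triangular. Assumes a is non-singular.
    """
    n = len(a)
    # v[j] holds the j-th column of a; it is updated in place as we go.
    v = [[a[i][j] for i in range(n)] for j in range(n)]
    q_cols = []
    r = [[0.0] * n for _ in range(n)]
    for j in range(n):
        r[j][j] = sum(x * x for x in v[j]) ** 0.5
        qj = [x / r[j][j] for x in v[j]]
        q_cols.append(qj)
        # "Modified" step: remove the q_j component from every remaining
        # column right away, instead of projecting the original columns.
        for k in range(j + 1, n):
            r[j][k] = sum(qx * vx for qx, vx in zip(qj, v[k]))
            v[k] = [vx - r[j][k] * qx for vx, qx in zip(v[k], qj)]
    q = [[q_cols[j][i] for j in range(n)] for i in range(n)]
    return q, r


def invert_via_qr(a):
    """Compute A^{-1} = R^{-1} Q^T by back substitution on R X = Q^T."""
    n = len(a)
    q, r = mgs_qr(a)
    inv = [[0.0] * n for _ in range(n)]
    for col in range(n):
        # Column `col` of Q^T is row `col` of Q.
        b = [q[col][i] for i in range(n)]
        for i in range(n - 1, -1, -1):
            s = b[i] - sum(r[i][k] * inv[k][col] for k in range(i + 1, n))
            inv[i][col] = s / r[i][i]
    return inv
```

Because R is upper triangular, each column of the inverse needs only one back substitution, which is part of why QR-based inversion maps well onto regular hardware structures.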

119 citations


Journal ArticleDOI
TL;DR: An algorithm for the QR factorization is presented in which the operations are represented as a sequence of small tasks that operate on square blocks of data (referred to as ‘tiles’); its performance is compared with the LAPACK algorithm, where parallelism can be exploited only at the level of the BLAS operations.
Abstract: As multicore systems continue to gain ground in the High Performance Computing world, linear algebra algorithms have to be reformulated or new algorithms have to be developed in order to take advantage of the architectural features on these new processors. Fine grain parallelism becomes a major requirement and introduces the necessity of loose synchronization in the parallel execution of an operation. This paper presents an algorithm for the QR factorization where the operations can be represented as a sequence of small tasks that operate on square blocks of data. These tasks can be dynamically scheduled for execution based on the dependencies among them and on the availability of computational resources. This may result in an out of order execution of the tasks which will completely hide the presence of intrinsically sequential tasks in the factorization. Performance comparisons are presented with the LAPACK algorithm for QR factorization where parallelism can only be exploited at the level of the BLAS operations.

115 citations


Proceedings ArticleDOI
P. Luethi, Andreas Burg, S. Haene, D. Perels, Norbert Felber, Wolfgang Fichtner
27 May 2007
TL;DR: The architecture and results of the first VLSI implementation of an iterative sorted QR decomposition preprocessor for MIMO receivers are described, which performs MIMO channel preprocessing using Givens rotations and provides the base for an improved layered stream decoding.
Abstract: The QR decomposition is an important, but often underestimated prerequisite for pseudo- or non-linear detection methods such as successive interference cancellation or sphere decoding for multiple-input multiple-output (MIMO) systems. The ability of concurrent iterative sorting during the QR decomposition introduces a moderate overall latency, but provides the base for an improved layered stream decoding. This paper describes the architecture and results of the first VLSI implementation of an iterative sorted QR decomposition preprocessor for MIMO receivers. The presented architecture performs MIMO channel preprocessing using Givens rotations in order to compute the minimum mean squared error QR decomposition.
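For readers unfamiliar with the building block mentioned here, a QR decomposition by Givens rotations can be sketched as follows. This is a plain, unsorted, real-valued version for illustration only; it does not include the iterative sorting or the MMSE extension described in the abstract, and the name `givens_qr` is hypothetical.

```python
import math


def givens_qr(a):
    """QR decomposition of an m x n real matrix (list of rows) by Givens rotations.

    Returns (q, r) with a = q * r, q orthogonal (m x m) and r upper triangular.
    """
    m, n = len(a), len(a[0])
    r = [row[:] for row in a]
    # qt accumulates the product of all rotations, i.e. Q^T.
    qt = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
    for j in range(n):
        # Work from the bottom of column j upward, zeroing one entry at a time.
        for i in range(m - 1, j, -1):
            x, y = r[i - 1][j], r[i][j]
            h = math.hypot(x, y)
            if h == 0.0:
                continue  # nothing to eliminate
            c, s = x / h, y / h
            # Apply the rotation [[c, s], [-s, c]] to rows i-1 and i.
            for mat in (r, qt):
                upper, lower = mat[i - 1], mat[i]
                mat[i - 1] = [c * u + s * l for u, l in zip(upper, lower)]
                mat[i] = [-s * u + c * l for u, l in zip(upper, lower)]
    q = [[qt[j][i] for j in range(m)] for i in range(m)]
    return q, r
```

Each rotation touches only two rows, which is what makes the method attractive for hardware: the same small rotation unit can be reused for every eliminated entry.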

95 citations


Proceedings ArticleDOI
27 May 2007
TL;DR: A new method is proposed using programmable hardware units which not only achieves higher performance but also consumes less silicon area and can be reused for many other operations such as complex matrix multiplication, filtering, correlation and FFT/IFFT.
Abstract: Complex matrix inversion is a very computationally demanding operation in advanced multi-antenna wireless communications. Traditionally, systolic array-based QR decomposition (QRD) is used to invert large matrices. However, the matrices involved in MIMO baseband processing in mobile handsets are generally small which means QRD is not necessarily efficient. In this paper, a new method is proposed using programmable hardware units which not only achieves higher performance but also consumes less silicon area. Furthermore, the hardware can be reused for many other operations such as complex matrix multiplication, filtering, correlation and FFT/IFFT.

78 citations


Journal ArticleDOI
TL;DR: The simulation results show that the MENSE is superior in detecting closely spaced signals with a small number of snapshots and/or at relatively low signal-to-noise ratio (SNR).
Abstract: Inspired by the computational simplicity and numerical stability of QR decomposition, a nonparametric method for estimating the number of signals without eigendecomposition (MENSE) is proposed for the coherent narrowband signals impinging on a uniform linear array (ULA). By exploiting the array geometry and its shift invariance property to decorrelate the coherency of signals through subarray averaging, the number of signals is revealed in the rank of the QR upper-trapezoidal factor of the autoproduct of a combined Hankel matrix formed from the cross correlations between some sensor data. Since the infection of additive noise is defused, signal detection capability is improved. A new detection criterion is then formulated in terms of the row elements of the QR upper-triangular factor when finite array data are available, and the number of signals is determined as a value of the running index for which this ratio criterion is maximized, where the QR decomposition with column pivoting is also used to improve detection performance. The statistical analysis clarifies that the MENSE detection criterion is asymptotically consistent. Furthermore, the proposed MENSE algorithm is robust against the array uncertainties including sensor gain and phase errors and mutual coupling and against the deviations from the spatial homogeneity of noise model. The effectiveness of the MENSE is verified through numerical examples, and the simulation results show that the MENSE is superior in detecting closely spaced signals with a small number of snapshots and/or at relatively low signal-to-noise ratio (SNR)

78 citations


Journal ArticleDOI
TL;DR: This paper discusses a hybrid Monte Carlo and numerical integration EM algorithm for computing the maximum likelihood estimates for linear and nonlinear mixed models with censored data, and uses an efficient block-sampling scheme, automated monitoring of convergence, and dimension reduction based on the QR decomposition.

62 citations


Book ChapterDOI
01 Jan 2007
TL;DR: This paper proposes a new $O(n^2)$ complexity QR algorithm for real companion matrices by representing the matrices in the iterations in their sequentially semi-separable (SSS) forms and shows high efficiency and numerical robustness of the new QR algorithm.
Abstract: It has been shown in [4, 5, 6, 31] that the Hessenberg iterates of a companion matrix under the QR iterations have low off-diagonal rank structures. Such invariant rank structures were exploited therein to design fast QR iteration algorithms for finding eigenvalues of companion matrices. These algorithms require only $O(n)$ storage and run in $O(n^2)$ time where n is the dimension of the matrix. In this paper, we propose a new $O(n^2)$ complexity QR algorithm for real companion matrices by representing the matrices in the iterations in their sequentially semi-separable (SSS) forms [9, 10]. The bulge chasing is done on the SSS form QR factors of the Hessenberg iterates. Both double shift and single shift versions are provided. Deflation and balancing are also discussed. Numerical results are presented to illustrate both high efficiency and numerical robustness of the new QR algorithm.

59 citations


Proceedings ArticleDOI
01 Nov 2007
TL;DR: A minimum-area matrix decomposition architecture that is programmable to perform QRD and SVD with variable precision is described and the associated design and implementation trade-offs are investigated.
Abstract: The singular value decomposition (SVD) and the QR decomposition (QRD) are two prominent matrix decomposition algorithms used in various signal processing applications. In the field of multiple-input multiple-output (MIMO) communication systems, the SVD and the QRD are employed for beamforming and for channel-matrix preprocessing for MIMO detection, respectively. In this paper, we describe a minimum-area matrix decomposition architecture that is programmable to perform QRD and SVD with variable precision and we investigate the associated design and implementation trade-offs. Our reference implementation achieves a hardware efficiency of up to 325 k SVDs/s/mm² and 1.92 M QRDs/s/mm² for complex-valued 4 × 4 matrices in 0.18 μm CMOS technology.

Journal ArticleDOI
TL;DR: This paper focuses on the Hessenberg matrix associated with the multiplication operator in terms of an orthogonal basis in the linear space of polynomials with complex coefficients and the LU and QR factorizations of such a matrix.

Journal ArticleDOI
TL;DR: An improved whitening scheme for estimation of signal subspace, a novel biquadratic contrast function for extraction of independent sources, and an efficient alterative method for joint implementation of a set of approximate diagonalization-structural matrices are developed.
Abstract: This paper addresses the problem of blind separation of multiple independent sources from observed array output signals. The main contributions in this paper include an improved whitening scheme for estimation of signal subspace, a novel biquadratic contrast function for extraction of independent sources, and an efficient alterative method for joint implementation of a set of approximate diagonalization-structural matrices. Specifically, an improved whitening scheme is first developed by estimating the signal subspace jointly from a set of diagonalization-structural matrices based on the proposed cyclic maximizer of an interesting cost function. Moreover, the globally asymptotical convergence of the proposed cyclic maximizer is analyzed and proved. Next, a novel biquadratic contrast function is proposed for extracting one single independent component from a slice matrix group of any order cumulant of the array signals in the presence of temporally white noise. A fast fixed-point algorithm that is a cyclic minimizer is constructed for searching a minimum point of the proposed contrast function. The globally asymptotical convergence of the proposed fixed-point algorithm is analyzed. Then, multiple independent components are obtained by using repeatedly the proposed fixed-point algorithm for extracting one single independent component, and the orthogonality among them is achieved by the well-known QR factorization. The performance of the proposed algorithms is illustrated by simulation results and is compared with three related blind source separation algorithms

Journal ArticleDOI
TL;DR: An on-line scheme is formulated for modeling a nonlinear autoregressive with exogenous input (NARX) recurrent neuro-fuzzy structure from input-output samples of a multivariable nonlinear dynamic system in a noisy environment.
Abstract: In this paper, a new algorithm for neuro-fuzzy identification of multivariable discrete-time nonlinear dynamic systems, more specifically applied to consequent parameters estimation of the neuro-fuzzy inference system, is proposed based on a decomposed form as a set of coupled multiple input and single output (MISO) Takagi-Sugeno (TS) neuro-fuzzy networks. An on-line scheme is formulated for modeling a nonlinear autoregressive with exogenous input (NARX) recurrent neuro-fuzzy structure from input-output samples of a multivariable nonlinear dynamic system in a noisy environment. The adaptive weighted instrumental variable (WIV) algorithm by QR factorization based on the numerically robust orthogonal Householder transformation is developed to modify the consequent parameters of the TS multivariable neuro-fuzzy network

Journal ArticleDOI
TL;DR: The results show that the inverse solutions recovered by the LSQR method were more accurate than those recovered by the Tikhonov and TSVD methods, and suggest that combining LSQR with genetic algorithms may provide a good scheme for solving the ECG inverse problem.
Abstract: Computing epicardial potentials from body surface potentials constitutes one form of ill-posed inverse problem of electrocardiography (ECG). To solve this ECG inverse problem, the Tikhonov regularization and truncated singular-value decomposition (TSVD) methods have been commonly used to overcome the ill-posed property by imposing constraints on the magnitudes or derivatives of the computed epicardial potentials. Such direct regularization methods, however, are impractical when the transfer matrix is large. The least-squares QR (LSQR) method, one of the iterative regularization methods based on Lanczos bidiagonalization and QR factorization, has been shown to be numerically more reliable in various circumstances than the other methods considered. This LSQR method, however, to our knowledge, has not been introduced and investigated for the ECG inverse problem. In this paper, the regularization properties of the Krylov subspace iterative method of LSQR for solving the ECG inverse problem were investigated. Due to the 'semi-convergence' property of the LSQR method, the L-curve method was used to determine the stopping iteration number. The performance of the LSQR method for solving the ECG inverse problem was also evaluated based on a realistic heart-torso model simulation protocol. The results show that the inverse solutions recovered by the LSQR method were more accurate than those recovered by the Tikhonov and TSVD methods. In addition, by combining the LSQR with genetic algorithms (GA), the performance can be improved further. It suggests that their combination may provide a good scheme for solving the ECG inverse problem.

Posted Content
TL;DR: In this paper, the authors present an algorithm for the Cholesky, LU and QR factorization where the operations can be represented as a sequence of small tasks that operate on square blocks of data.
Abstract: As multicore systems continue to gain ground in the High Performance Computing world, linear algebra algorithms have to be reformulated or new algorithms have to be developed in order to take advantage of the architectural features on these new processors. Fine grain parallelism becomes a major requirement and introduces the necessity of loose synchronization in the parallel execution of an operation. This paper presents an algorithm for the Cholesky, LU and QR factorization where the operations can be represented as a sequence of small tasks that operate on square blocks of data. These tasks can be dynamically scheduled for execution based on the dependencies among them and on the availability of computational resources. This may result in an out of order execution of the tasks which will completely hide the presence of intrinsically sequential tasks in the factorization. Performance comparisons are presented with the LAPACK algorithms where parallelism can only be exploited at the level of the BLAS operations and vendor implementations.

Patent
Yingxue Li1
31 Aug 2007
TL;DR: In this article, a method and apparatus for QR decomposition-based multiple-input multiple-output (MIMO) detection and soft bit generation is disclosed, and a tree search process is performed using the matrix and the vector to generate an approximate maximum likelihood (ML) estimate of transmitted symbols.
Abstract: A method and apparatus for QR decomposition-based multiple-input multiple-output (MIMO) detection and soft bit generation are disclosed. QR decomposition is performed on the MIMO channel matrix H to compute a Q matrix and an R matrix such that H=QR. The R matrix, or diagonal elements of the R matrix, is stored in a memory. An matrix is computed by dividing elements in each row of the R matrix with a corresponding diagonal element of the R matrix. A vector is computed by dividing each element of the received symbol vector Y with a corresponding diagonal element of the R matrix. A tree search process is performed using the matrix and the vector to generate an approximate maximum likelihood (ML) estimate of transmitted symbols.

Journal ArticleDOI
TL;DR: This work considers the computation of the Iwasawa decomposition of a symplectic matrix via the QR factorization and improves on the method recently described by T.-Y.

Patent
03 May 2007
TL;DR: In this article, a method for determining a signal vector comprising a plurality of components from a received signal vector is provided comprising performing a QR decomposition of a channel matrix characterizing the communication channel via which the signal vector was received.
Abstract: A method for determining a signal vector comprising a plurality of components from a received signal vector is provided comprising performing a QR decomposition of a channel matrix characterizing the communication channel via which the signal vector was received and being expanded by variance information about the noise on the communication channel carrying out a plurality of determination steps using the QR decomposition of the expanded channel matrix, wherein in each step a set of possible sub-vectors of the signal vector is determined and wherein in each step, the number of possible sub-vectors in the set is lower than a predefined maximum number, and selecting one vector of the set of possible sub-vectors determined in the last step of the plurality of determination steps as the signal vector.
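One common way to expand a channel matrix with noise-variance information is the MMSE extension sketched below. Whether the patent uses exactly this form is an assumption; the symbols $\underline{H}$, $\underline{Q}$, $\underline{R}$, $\sigma_n$ and $N_t$ are introduced here for illustration, and unit average symbol energy is assumed.

```latex
% H: N_r x N_t channel matrix, sigma_n^2: noise variance (unit symbol energy assumed)
\underline{H} =
\begin{bmatrix} H \\ \sigma_n I_{N_t} \end{bmatrix}
= \underline{Q}\,\underline{R},
\qquad
\underline{H}^{H}\underline{H}
= H^{H}H + \sigma_n^{2} I_{N_t}
= \underline{R}^{H}\underline{R}.
```

Under this assumption, the triangular factor $\underline{R}$ of the expanded matrix is a Cholesky-type factor of the regularized Gram matrix $H^{H}H + \sigma_n^{2} I$, so a layer-by-layer search on $\underline{R}$ accounts for the noise level.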

Journal ArticleDOI
TL;DR: A fast new algorithm for reducing an N × N quasiseparable matrix to upper Hessenberg form via a sequence of N − 2 unitary transformations, yielding lower cost and a simplification of implementation.

01 Jan 2007
TL;DR: In this paper, a new URV-type matrix decomposition is proposed for solving structured generalized eigenvalue problems, $Ax = \lambda Bx$, with methods that are guaranteed to produce eigenvalues that are paired to working precision.
Abstract: In this work numerical methods for the solution of two classes of structured generalized eigenvalue problems, $Ax = \lambda Bx$, are developed. Those classes are the palindromic ($B = A^T$) and the even ($A = A^T$, $B = -B^T$) eigenvalue problems. The spectrum of these problems is not arbitrary; rather, the eigenvalues occur in pairs. We will construct methods for palindromic and even eigenvalue problems that are of cubic complexity and that are guaranteed to produce eigenvalues that are paired to working precision. At the heart of both methods is a new URV-type matrix decomposition that simultaneously transforms three matrices to skew triangular form, i.e., to a form that is triangular with respect to the Northeast-Southwest diagonal. The algorithm to compute this URV decomposition uses several other methods to reduce a single square matrix to skew triangular form: the skew QR factorization and the skew $QRQ^T$ decomposition. Moreover, a method to compute the singular value decomposition of a complex, skew symmetric matrix is presented and used.

Journal Article
TL;DR: It is demonstrated that the proposed column-oriented QR decomposition algorithm, which uses MDA for column ordering and VPAIR for row ordering, can lead to a much faster PSSE.

Journal ArticleDOI
TL;DR: In this article, an effective numerical solution for the plane wave scattering by multilayered periodic arrays of dielectric spheres is presented, where the problem is analyzed by the mode matching method and the electromagnetic fields in the air and dielectric regions are approximated by using the Floquet harmonics and vector spherical wave functions, respectively.
Abstract: An effective numerical solution is presented for the plane wave scattering by multilayered periodic arrays of dielectric spheres. The treated structure is a fundamental model of photonic crystals having three-dimensional periodicity. The problem is analyzed by the mode matching method, where the electromagnetic fields in the air and dielectric regions are approximated by using the Floquet harmonics and vector spherical wave functions, respectively. They are matched on the junction surfaces in the least squares sense. Introduction of sequential accumulation in the process of QR decomposition reduces the computation time from $O(Q^3)$ to $O(Q^1)$ and the memory requirement from $O(Q^2)$ to $O(Q^1)$, with Q being the number of sphere layers. Numerical results are given for CPU time, speed of convergence, and some band gap characteristics.

Journal ArticleDOI
TL;DR: This paper presents an approach which is based on the compression of the partial inductance matrix utilizing the QR decomposition of the far coefficients submatrices which yields an efficient and mathematically consistent approach for reducing the storage and time requirements.
Abstract: The partial element equivalent circuit (PEEC) approach has been used in different forms for the computation of equivalent circuit elements for quasi-static and full-wave electromagnetic models. In this paper, we focus on the topic of large scale inductance computations. For many problems as part of PEEC modeling, partial inductances need to be computed to model interactions between a large number of objects. These computations can be very time and memory consuming. To date, several techniques have been devised to reduce the memory and time required to compute the partial inductance entities, as well as the time required to use them in a circuit analysis compute step. Some of the existing methods use hierarchical compression while some others are based on issues like properties of the inverse of the partial inductance matrix. However, because of inherent limitations, most of these methods are less suitable for PEEC applications. In this paper, we present an approach which is based on the compression of the partial inductance matrix utilizing the QR decomposition of the far coefficients submatrices. The QR-decomposed form is represented as a compressed SPICE-compatible circuit. This yields an efficient and mathematically consistent approach for reducing the storage and time requirements.

Journal ArticleDOI
TL;DR: A QR-based nullspace factorization of KKT matrices is described that keeps the reduced Hessian positive definite, so that the resulting quasi-Newton steps in the primal and dual variables are downhill for suitably weighted merit functions.
Abstract: For use in a total quasi-Newton NLP code [Griewank, A. and Walther, A., 2002, On constrained optimization by adjoint based quasi-Newton methods. Optimization Methods and Software, 17, 869-889.], we describe a QR-based nullspace factorization of KKT matrices. We illustrate the linear algebra in detail and present a theory for maintaining factorized matrices after low-rank updates. Each update of the whole system is incorporated with a computational effort that grows only quadratically with respect to the number of variables and active constraints. Furthermore, our method is special in making use of quasi-Newton updates for the constraint Jacobian approximation, instead of the usual way of using the exact derivative or divided differences. To avoid singularity or blow-up of the KKT matrix, we limit the variations of its determinant to a certain factor and dampen or augment the updates if necessary. We maintain the reduced Hessian positive definite, so that the resulting quasi-Newton steps in the primal and dual variables are downhill for suitably weighted merit functions.

Proceedings ArticleDOI
01 Jul 2007
TL;DR: A new initial radius (IR) selection method of Sphere Decoding (SD), called IR-ZF-OSUC, for MIMO systems is proposed; it selects the distance between the received signal and the lattice point mapped by the suboptimal solution as the IR, which allows the procedure to be embedded in the body of SD.
Abstract: In this paper, a new initial radius (IR) selection method of Sphere Decoding (SD), called IR-ZF-OSUC, for MIMO systems is proposed. A significant merit is that this method utilizes the result of the QR decomposition, which is inherent in SD, to obtain a suboptimal solution, and selects the distance between the received signal and the lattice point mapped by the suboptimal solution as the IR, which allows the procedure to be embedded in the body of SD. Unlike other approaches, it needs no extra processing, so lower computational complexity is achieved. Additionally, this method includes an ordering step, which not only brings the IR closer to the optimal value but also increases the efficiency of SD itself. The simulation results show that the computational complexity of SD with IR-ZF-OSUC is lower than that of SD using other approaches over a wide range of SNRs.

Proceedings ArticleDOI
04 Dec 2007
TL;DR: An FPGA implementation of a 4 × 4 MIMO MMSE equalizer based on the QR factorization technique is presented using only 6 unrolled coordinate rotation digital computer (CORDIC) operators, which are efficiently exploited to minimize the latency of computation.
Abstract: In this paper an FPGA implementation of a 4 × 4 MIMO MMSE equalizer based on the QR factorization technique is presented. Considering fast varying channels, a new filter is computed at each new channel realization thanks to an efficient architecture for matrix triangularization. The QR decomposition is performed with only 6 unrolled coordinate rotation digital computer (CORDIC) operators, which are efficiently exploited to minimize the latency of computation. Soft LLRs obtained at the output of the equalizer are validated on an FPGA hardware bench including a 4 × 4 Rayleigh fading channel and turbo-coding.

Proceedings ArticleDOI
04 Dec 2007
TL;DR: An adaptive QRD-M algorithm with variable number of surviving paths is proposed for MIMO systems that can offer a better tradeoff between BER performance and computational complexity.
Abstract: The QR decomposition based M algorithm (QRD-M) is a sub-optimal detection algorithm which can achieve a tradeoff between bit error rate (BER) performance and computational complexity for multiple input multiple output (MIMO) systems. In this paper, an adaptive QRD-M algorithm with variable number of surviving paths is proposed for MIMO systems. The number of surviving paths at each detection stage is adaptively determined according to the instantaneous value and the statistics of channel conditions. The required statistics of the channel conditions is directly derived and given in closed form without a large number of training observations. The proposed algorithm is simple to implement and it is verified by computer simulations that the proposed algorithm has lower computational complexity than the fixed QRD-M algorithms thus can offer a better tradeoff between BER performance and computational complexity.
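As background for this entry, the fixed-M baseline that the paper improves on can be sketched in a few lines. This is not the adaptive algorithm proposed in the paper: the number of surviving paths `m_paths` is constant here, the model is real-valued with a hypothetical BPSK alphabet, and the inputs are assumed to be the upper-triangular factor R and the rotated receive vector z = Qᵀy. The function name `qrd_m_detect` is hypothetical.

```python
def qrd_m_detect(r, z, m_paths, alphabet=(-1.0, 1.0)):
    """Fixed-M QRD-M tree search on z = R x + noise, R upper triangular (n x n).

    Detects symbols layer by layer from the last row upward, keeping only
    the m_paths partial candidates with the smallest accumulated metric.
    Returns the best full candidate as a list of n symbols.
    """
    n = len(r)
    # Each path is (accumulated metric, symbols for layers i..n-1).
    paths = [(0.0, [])]
    for i in range(n - 1, -1, -1):
        extended = []
        for metric, tail in paths:
            for s in alphabet:
                cand = [s] + tail  # cand[k] is the symbol of layer i + k
                predicted = sum(r[i][i + k] * cand[k] for k in range(len(cand)))
                extended.append((metric + (z[i] - predicted) ** 2, cand))
        # Keep only the M most promising partial paths (the "M" in QRD-M).
        extended.sort(key=lambda p: p[0])
        paths = extended[:m_paths]
    return paths[0][1]
```

With `m_paths` equal to the full number of candidates the search is exhaustive; smaller values trade error-rate performance for complexity, which is the trade-off the adaptive scheme in the abstract tunes per detection stage.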

Proceedings Article
01 Sep 2007
TL;DR: This paper reports on a highly optimized 4×4 MMSE detector implementation that resulted in a real-time FPGA based implementation on a Xilinx Virtex-II 6000 part that delivers over 420 Mbps sustained throughput, with a small 2.77 μs latency.
Abstract: This paper reports on a highly optimized 4×4 MMSE detector implementation. The work resulted in a real-time FPGA based implementation on a Xilinx Virtex-II 6000 part. It utilizes 8,513 logic slices, 64 multipliers, and 23 Block RAMs (less than 30% of the overall resources of this part). The design delivers over 420 Mbps sustained throughput, with a small 2.77 μs latency. Three main techniques are responsible for the improvements over other MIMO detectors reported in literature. They are: (a) the combination of a modified Gram-Schmidt QR decomposition algorithm with Square-Root linear MMSE detection; (b) a dynamic scaling algorithm that enhances numerical stability; and (c) an aggressive time-shared VLSI architecture. The above techniques are quite general and are readily applicable to any MIMO detector implementation.

Proceedings ArticleDOI
15 Apr 2007
TL;DR: A new algorithm for background subtraction is presented that can model the background image from a sequence of images, even if there are foreground objects in each image frame; the background is identified using the QR-decomposition method from linear algebra.
Abstract: This paper presents a new algorithm for background subtraction that can model the background image from a sequence of images, even if there are foreground objects in each image frame. In contrast with Gaussian mixture model algorithm, in our proposed method the problem of distinguishing between background and foreground kernels becomes trivial. The key idea of our method lies in the identification of the background based on QR-decomposition method in linear algebra. R-values taken from QR-decomposition can be applied to decompose a given system to indicate the degree of the significance of the decomposed parts. We split the image into small blocks and select the background blocks with the weakest contribution, according to the assigned R-values. Simulation results show the better background detection performance with respect to some others.