Showing papers on "QR decomposition" published in 2016


Journal ArticleDOI
TL;DR: In this paper, a technique for computing partial rank-revealing factorizations, such as a partial QR factorization or a partial singular value decomposition, is described, which is inspired by the Gram-Schmidt algorithm and has the same asymptotic flop count.
Abstract: This manuscript describes a technique for computing partial rank-revealing factorizations, such as a partial QR factorization or a partial singular value decomposition. The method takes as input a tolerance $\varepsilon$ and an $m\times n$ matrix $\boldsymbol{\mathsf{A}}$ and returns an approximate low-rank factorization of $\boldsymbol{\mathsf{A}}$ that is accurate to within precision $\varepsilon$ in the Frobenius norm (or some other easily computed norm). The rank $k$ of the computed factorization (which is an output of the algorithm) is in all examples we examined very close to the theoretically optimal $\varepsilon$-rank. The proposed method is inspired by the Gram--Schmidt algorithm and has the same $O(mnk)$ asymptotic flop count. However, the method relies on randomized sampling to avoid column pivoting, which allows it to be blocked, and hence accelerates practical computations by reducing communication. Numerical experiments demonstrate that the accuracy of the scheme is for every matrix that was...

97 citations
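
A minimal sketch of a blocked, randomized low-rank factorization in the spirit of the abstract above, not the authors' implementation. It builds `A ≈ Q @ B` block by block and stops once the Frobenius norm of the residual drops below the tolerance. The function name `rand_qb`, the block size, and the Gaussian test matrices are assumptions made for illustration.

```python
import numpy as np

def rand_qb(A, eps, block=5, seed=0):
    """Blocked randomized sketch: returns Q (orthonormal columns) and B with A ~= Q @ B."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    R = np.array(A, dtype=float)                 # residual, deflated in place
    Qs, Bs = [], []
    k = 0                                        # rank found so far (an output, not an input)
    while np.linalg.norm(R) > eps and k < min(m, n):
        b = min(block, min(m, n) - k)
        G = rng.standard_normal((n, b))          # random sampling replaces column pivoting
        Q, _ = np.linalg.qr(R @ G)               # orthonormal basis for a sample of range(R)
        if Qs:                                   # re-orthogonalize against earlier blocks
            Qprev = np.hstack(Qs)
            Q, _ = np.linalg.qr(Q - Qprev @ (Qprev.T @ Q))
        B = Q.T @ R
        R -= Q @ B                               # remove the captured component
        Qs.append(Q)
        Bs.append(B)
        k += b
    if not Qs:
        return np.zeros((m, 0)), np.zeros((0, n))
    return np.hstack(Qs), np.vstack(Bs)
```

Because each step works on a whole block of columns, the dominant cost is matrix-matrix products, which is what makes this kind of scheme amenable to blocking.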


Journal ArticleDOI
TL;DR: In this paper, the generalized lasso dual path algorithm given by Tibshirani and Taylor in 2011 is considered and a generic approach that covers any penalty matrix D and any (full column rank) matrix X of predictor variables is described.
Abstract: We consider efficient implementations of the generalized lasso dual path algorithm given by Tibshirani and Taylor in 2011. We first describe a generic approach that covers any penalty matrix D and any (full column rank) matrix X of predictor variables. We then describe fast implementations for the special cases of trend filtering problems, fused lasso problems, and sparse fused lasso problems, both with X = I and a general matrix X. These specialized implementations offer a considerable improvement over the generic implementation, both in terms of numerical stability and efficiency of the solution path computation. These algorithms are all available for use in the genlasso R package, which can be found in the CRAN repository.

94 citations


Journal ArticleDOI
TL;DR: The paper focuses on the small-signal stability analysis of large power systems with inclusion of multiple delayed signals and compares a Chebyshev discretization scheme of an equivalent set of partial differential equations with the well-known Padé approximants.
Abstract: The paper focuses on the small-signal stability analysis of large power systems with inclusion of multiple delayed signals. The following four techniques are compared: 1) a Chebyshev discretization scheme of an equivalent set of partial differential equations that resembles the original delay differential-algebraic equations (DDAEs); 2) an approximation of the time integration operator; 3) a linear multistep discretization of the DDAEs based on a high-order implicit time-integration scheme; and 4) the well-known Padé approximants. These techniques are compared using a GPU-based parallel implementation of the Schur method and QR factorization and tested through a real-world transmission system.

71 citations


Journal ArticleDOI
TL;DR: The paper demonstrates image steganography using the redundant discrete wavelet transform (RDWT) and QR factorization and proposes a cover selection measure based on statistical texture analysis, which helps to enhance the security of the steganographic technique.

67 citations


Journal ArticleDOI
TL;DR: The proposed ED-TAS algorithms are amalgamated with the low-complexity yet efficient power allocation (PA) technique, termed as TAS-PA, for the sake of further improving the system's performance.
Abstract: The benefits of transmit antenna selection (TAS) invoked for spatial modulation (SM) aided multiple-input multiple-output (MIMO) systems are investigated. Specifically, we commence with a brief review of the existing TAS algorithms and focus on the recently proposed Euclidean distance-based TAS (ED-TAS) schemes due to their high diversity gain. Then, a pair of novel ED-TAS algorithms, termed as the improved QR decomposition (QRD)-based TAS (QRD-TAS) and the error-vector magnitude-based TAS (EVM-TAS) are proposed, which exhibit an attractive system performance at low complexity. Moreover, the proposed ED-TAS algorithms are amalgamated with the low-complexity yet efficient power allocation (PA) technique, termed as TAS-PA, for the sake of further improving the system’s performance. Our simulation results show that the proposed TAS-PA algorithms achieve signal-to-noise ratio (SNR) gains of up to 9 dB over the conventional TAS algorithms and up to 6 dB over the TAS-PA algorithm designed for spatial multiplexing systems.

62 citations


Journal ArticleDOI
TL;DR: This work proposes a multiscale low rank modeling that represents a data matrix as a sum of block-wise low rank matrices with increasing scales of block sizes, considers the inverse problem of decomposing the data matrix into its multiscale low rank components, and approaches the problem via a convex formulation.
Abstract: We present a natural generalization of the recent low rank + sparse matrix decomposition and consider the decomposition of matrices into components of multiple scales. Such decomposition is well motivated in practice as data matrices often exhibit local correlations in multiple scales. Concretely, we propose a multiscale low rank modeling that represents a data matrix as a sum of block-wise low rank matrices with increasing scales of block sizes. We then consider the inverse problem of decomposing the data matrix into its multiscale low rank components and approach the problem via a convex formulation. Theoretically, we show that under various incoherence conditions, the convex program recovers the multiscale low rank components either exactly or approximately. Practically, we provide guidance on selecting the regularization parameters and incorporate cycle spinning to reduce blocking artifacts. Experimentally, we show that the multiscale low rank decomposition provides a more intuitive decomposition than conventional low rank methods and demonstrate its effectiveness in four applications, including illumination normalization for face images, motion separation for surveillance videos, multiscale modeling of the dynamic contrast enhanced magnetic resonance imaging, and collaborative filtering exploiting age information.

53 citations


Journal ArticleDOI
TL;DR: Experimental results show that the proposed colour image watermarking technique based on Hessenberg decomposition outperforms other watermarking methods and is robust against a wide range of attacks, e.g. image compression, filtering, cropping, rotation, adding noise, blurring, scaling, and sharpening.
Abstract: In this study, a novel blind image watermarking technique using Hessenberg decomposition is proposed to embed colour watermark image into colour host image. In the process of embedding watermark, the watermark information of colour image is embedded into the second row of the second column element and the third row of the second column element in the orthogonal matrix obtained by Hessenberg decomposition. In the process of extracting watermark, neither the original host image nor the original watermark image is needed and it is impossible to retrieve them without the authorised keys. Experimental results show that the proposed colour image watermarking technique based on Hessenberg decomposition outperforms other watermarking methods and is robust against a wide range of attacks, e.g. image compression, filtering, cropping, rotation, adding noise, blurring, scaling, and sharpening. In particular, the proposed method has lower computational complexity than other methods based on singular value decomposition or QR decomposition.

50 citations


Journal ArticleDOI
TL;DR: In this article, a fast dual-domain algorithm based on matrix rank reduction is developed for separating simultaneous-source seismic data. The algorithm operates on 3D common receiver gathers or offset-midpoint gathers.
Abstract: We have developed a fast dual-domain algorithm based on matrix rank reduction for separating simultaneous-source seismic data. Our algorithm operates on 3D common receiver gathers or offset-midpoint gathers. At a given monochromatic frequency slice in the ω-x-y domain, the spatial data of the ideal unblended common receiver or offset-midpoint gather could be represented via a low-rank matrix. The interferences from the randomly and closely fired shots increased the rank of the aforementioned matrix. Therefore, we could minimize the misfit between the blended observation and the predicted blended data subject to a low-rank constraint that was applied to the data in the ω-x-y domain. The low-rank constraint could be implemented via the classic truncated singular value decomposition (SVD) or via a randomized QR decomposition (rQRd). The rQRd yielded nearly one order of processing time improvement with respect to the truncated SVD. We have also discovered that the rQRd was less stringent on the select...

49 citations


Journal ArticleDOI
TL;DR: The implementation and tuning of the kernels for the Cholesky factorization and the forward and backward substitution are described, and comparisons against optimized multicore implementations are presented.
Abstract: Many problems in engineering and scientific computing require the solution of a large number of small systems of linear equations. Due to their high processing power, Graphics Processing Units became an attractive target for this class of problems, and routines based on the LU and the QR factorization have been provided by NVIDIA in the cuBLAS library. This work addresses the situation where the systems of equations are symmetric positive definite. The paper describes the implementation and tuning of the kernels for the Cholesky factorization and the forward and backward substitution. Targeted workloads involve the solution of thousands of linear systems of the same size, where the focus is on matrix dimensions from 5 by 5 to 100 by 100. Due to the lack of a cuBLAS Cholesky factorization, execution rates of cuBLAS LU and cuBLAS QR are used for comparison against the proposed Cholesky factorization in this work. Execution rates of forward and backward substitution routines are compared to equivalent cuBLAS routines. Comparisons against optimized multicore implementations are also presented. Superior performance is reached in all cases.

41 citations
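
As a rough CPU-side illustration of the workload described above (thousands of small symmetric positive definite systems of the same size), the following sketch uses NumPy's batched routines. It is not the GPU kernel from the paper; the function name `batched_cholesky_solve` is hypothetical, and `np.linalg.solve` is a general solver that does not exploit the triangular structure of the factors.

```python
import numpy as np

def batched_cholesky_solve(A, b):
    """Solve A[i] x[i] = b[i] for a stack of SPD matrices A with shape (batch, n, n)."""
    L = np.linalg.cholesky(A)                        # A[i] = L[i] @ L[i].T for every i
    y = np.linalg.solve(L, b[..., None])             # forward step: L y = b
    x = np.linalg.solve(np.swapaxes(L, -1, -2), y)   # backward step: L^T x = y
    return x[..., 0]
```

A dedicated triangular solver would be the natural replacement for the two `solve` calls in a tuned implementation.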


Journal ArticleDOI
TL;DR: Experimental results show that the proposed scheme is not only efficient in terms of computational cost and memory requirement but also achieves good imperceptibility and robustness against image processing operations compared to state-of-the-art techniques.
Abstract: In this paper, an efficient and robust image watermarking scheme based on lifting wavelet transform (LWT) and QR decomposition using Lagrangian support vector regression (LSVR) is presented. After performing one level decomposition of host image using LWT, the low frequency subband is divided into 4 × 4 non-overlapping blocks. Based on the correlation property of lifting wavelet coefficients, each selected block is followed by QR decomposition. The significant element of first row of R matrix of each block is set as target to LSVR for embedding the watermark. The remaining elements (called feature vector) of upper triangular matrix R act as input to LSVR. The security of the watermark is achieved by applying Arnold transformation to original watermark to get its scrambled image. This scrambled image is embedded into the output (predicted value) of LSVR compared with the target value using optimal scaling factor to reduce the tradeoff between imperceptibility and robustness. Experimental results show that the proposed scheme is not only efficient in terms of computational cost and memory requirement but also achieves good imperceptibility and robustness against image processing operations compared to state-of-the-art techniques.

38 citations


Journal ArticleDOI
TL;DR: Low-complexity compressed sensing techniques for monitoring electrocardiogram (ECG) signals in wireless body sensor network (WBSN) are presented and the design shows good hardware efficiency and is suitable for low-energy applications.
Abstract: Low-complexity compressed sensing (CS) techniques for monitoring electrocardiogram (ECG) signals in wireless body sensor network (WBSN) are presented. The prior probability of ECG sparsity in the wavelet domain is first exploited. Then, variable orthogonal multi-matching pursuit (vOMMP) algorithm that consists of two phases is proposed. In the first phase, orthogonal matching pursuit (OMP) algorithm is adopted to effectively augment the support set with reliable indices and in the second phase, the orthogonal multi-matching pursuit (OMMP) is employed to rescue the missing indices. The reconstruction performance is thus enhanced with the prior information and the vOMMP algorithm. Furthermore, the computation-intensive pseudo-inverse operation is simplified by the matrix-inversion-free (MIF) technique based on QR decomposition. The vOMMP-MIF CS decoder is then implemented in 90 nm CMOS technology. The QR decomposition is accomplished by two systolic arrays working in parallel. The implementation supports three settings for obtaining 40, 44, and 48 coefficients in the sparse vector. From the measurement result, the power consumption is 11.7 mW at 0.9 V and 12 MHz. Compared to prior chip implementations, our design shows good hardware efficiency and is suitable for low-energy applications.

Posted Content
TL;DR: In this article, the authors review low rank approximation techniques briefly and give extensive references of many techniques, including Singular Value Decomposition, QR decomposition with column pivoting, rank revealing QR factorization (RRQR), Interpolative decomposition etc.
Abstract: Low rank approximation of matrices has been well studied in literature. Singular value decomposition, QR decomposition with column pivoting, rank revealing QR factorization (RRQR), interpolative decomposition, etc. are classical deterministic algorithms for low rank approximation. But these techniques are very expensive ($O(n^{3})$ operations are required for $n\times n$ matrices). There are several randomized algorithms available in the literature which are not so expensive as the classical techniques (but the complexity is not linear in $n$). So, it is very expensive to construct the low rank approximation of a matrix if the dimension of the matrix is very large. There are alternative techniques like Cross/Skeleton approximation which give the low-rank approximation with linear complexity in $n$. In this article we review low rank approximation techniques briefly and give extensive references of many techniques.

Journal ArticleDOI
01 Oct 2016-Optik
TL;DR: The experimental results show that the proposed method outperforms singular value decomposition (SVD)-based and QR decomposition with column pivoting (QRCP)-based methods in terms of recognition rates.

Journal ArticleDOI
TL;DR: The L-curve method is applied to obtain an improved initial optimal solution by balancing the residual and the complexity of the solutions instead of manually adjusting the smoothing parameters.

Journal ArticleDOI
TL;DR: The idea behind the approach is to transform the spatial covariance matrix to be a scalar matrix $\sigma \mathbf{I}$ and to obtain the apodization weights and the beamformed output without computing the matrix inverse, and the computational complexity is reduced to $O(L^2)$.
Abstract: Adaptive beamforming methods for ultrasound imaging have been studied to improve image resolution and contrast. The most common approach is the minimum variance (MV) beamformer which minimizes the power of the beamformed output while maintaining the response from the direction of interest constant. The method achieves higher resolution and better contrast than the delay-and-sum (DAS) beamformer, but it suffers from high computational cost. This cost is mainly due to the computation of the spatial covariance matrix and its inverse, which requires $O(L^3)$ computations, where $L$ denotes the subarray size. In this study, we propose a computationally efficient MV beamformer based on QR decomposition. The idea behind our approach is to transform the spatial covariance matrix to be a scalar matrix $\sigma \mathbf{I}$ and we subsequently obtain the apodization weights and the beamformed output without computing the matrix inverse. To do that, a QR decomposition algorithm is used, which can be executed at low cost, and therefore the computational complexity is reduced to $O(L^2)$. In addition, our approach is mathematically equivalent to the conventional MV beamformer, thereby showing equivalent performance. The simulation and experimental results support the validity of our approach.
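
To make the "no explicit inverse" idea concrete, here is a textbook sketch that computes the MV weights $w = R^{-1}a / (a^H R^{-1} a)$ from a QR factorization of the data matrix. It is not the $O(L^2)$ scheme of the paper: the QR step here is still cubic in the subarray size. The function name `mv_weights_qr`, the data layout (`X` of shape subarray size by number of snapshots), and the steering vector `a` are assumptions for illustration.

```python
import numpy as np

def mv_weights_qr(X, a):
    """MV weights for covariance R = X X^H / N, computed via QR of X^H without forming R^{-1}."""
    _, T = np.linalg.qr(X.conj().T)        # X^H = Q T, hence X X^H = T^H T
    z = np.linalg.solve(T.conj().T, a)     # T^H z = a
    u = np.linalg.solve(T, z)              # T u = z, so u is proportional to R^{-1} a
    return u / np.vdot(a, u)               # the 1/N scaling cancels in this ratio
```

The returned weights satisfy the distortionless constraint $w^H a = 1$ by construction.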

Journal ArticleDOI
TL;DR: In this article, the authors proposed a Fourier basis-based approximation of the Gaussian range kernel of the bilateral filter, where the coefficients of the basis are obtained by solving a series of least-squares problems.
Abstract: It was demonstrated in earlier work that, by approximating its range kernel using shiftable functions, the non-linear bilateral filter can be computed using a series of fast convolutions. Previous approaches based on shiftable approximation have, however, been restricted to Gaussian range kernels. In this work, we propose a novel approximation that can be applied to any range kernel, provided it has a pointwise-convergent Fourier series. More specifically, we propose to approximate the Gaussian range kernel of the bilateral filter using a Fourier basis, where the coefficients of the basis are obtained by solving a series of least-squares problems. The coefficients can be efficiently computed using a recursive form of the QR decomposition. By controlling the cardinality of the Fourier basis, we can obtain a good tradeoff between the run-time and the filtering accuracy. In particular, we are able to guarantee sub-pixel accuracy for the overall filtering, which is not provided by most existing methods for fast bilateral filtering. We present simulation results to demonstrate the speed and accuracy of the proposed algorithm.
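
The least-squares step mentioned above can be illustrated with a small sketch that fits a cosine series to a Gaussian range kernel by QR. It uses a plain (non-recursive) QR solve, so it only shows the fitting idea, not the paper's recursive update. The interval half-width `T`, the number of terms `K`, the sampling grid, and the value of `sigma` are all assumed for the example.

```python
import numpy as np

def fit_cosine_series(kernel, T, K, num_samples=511):
    """Least-squares fit of kernel(t) on [-T, T] by cos(k*pi*t/T), k = 0..K, via QR."""
    t = np.linspace(-T, T, num_samples)
    basis = np.cos(np.outer(t, np.arange(K + 1)) * np.pi / T)
    Q, R = np.linalg.qr(basis)                       # basis = Q R
    coeffs = np.linalg.solve(R, Q.T @ kernel(t))     # solve R c = Q^T f
    return t, basis, coeffs

sigma = 30.0
gauss = lambda t: np.exp(-t**2 / (2 * sigma**2))
t, basis, c = fit_cosine_series(gauss, T=255.0, K=15)
max_err = np.max(np.abs(basis @ c - gauss(t)))       # worst-case fitting error on the grid
```

Raising `K` lowers `max_err` at the price of more convolutions in the filter, which is the run-time versus accuracy tradeoff the abstract refers to.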

Journal ArticleDOI
TL;DR: In this article, the authors presented error analysis of the Cholesky QR algorithm in an oblique inner product defined by a positive definite matrix, and showed that by repeating the algorithm twice, the stability is greatly improved.
Abstract: The Cholesky QR algorithm is an ideal QR decomposition algorithm for high performance computing, but known to be unstable. We present error analysis of the Cholesky QR algorithm in an oblique inner product defined by a positive definite matrix, and show that by repeating the algorithm twice (called CholeskyQR2), its stability is greatly improved.
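
A minimal sketch of Cholesky QR and its repeated variant in the standard Euclidean inner product (the paper treats the more general oblique inner product, which is not shown here). The function names are illustrative.

```python
import numpy as np

def cholesky_qr(A):
    """One pass of Cholesky QR: A = Q R with R taken from the Cholesky factor of A^T A."""
    L = np.linalg.cholesky(A.T @ A)       # A^T A = L L^T, so R = L^T
    Q = np.linalg.solve(L, A.T).T         # Q = A R^{-1} = (L^{-1} A^T)^T
    return Q, L.T

def cholesky_qr2(A):
    """Repeat Cholesky QR on the first Q factor to restore orthogonality."""
    Q1, R1 = cholesky_qr(A)
    Q, R2 = cholesky_qr(Q1)
    return Q, R2 @ R1                     # product of upper triangular factors
```

The first pass loses orthogonality roughly in proportion to the squared condition number of `A`; the second pass operates on a nearly orthonormal `Q1`, which is why it repairs most of that loss.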

Journal ArticleDOI
TL;DR: An algorithm for computing the polar decomposition of a 3 × 3 real matrix that is based on the connection between orthogonal matrices and quaternions is proposed, which is numerically reliable and requires fewer arithmetic operations than the alternative of computing the polar decomposition via the singular value decomposition.
Abstract: We propose an algorithm for computing the polar decomposition of a 3 × 3 real matrix that is based on the connection between orthogonal matrices and quaternions. An important application is to 3D transformations in the level 3 Cascading Style Sheets specification used in web browsers. Our algorithm is numerically reliable and requires fewer arithmetic operations than the alternative of computing the polar decomposition via the singular value decomposition.
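
For reference, the SVD-based alternative that the abstract compares against can be sketched in a few lines. This is the baseline, not the quaternion-based algorithm proposed in the paper; the function name `polar_svd` is illustrative.

```python
import numpy as np

def polar_svd(A):
    """Polar decomposition A = U @ H with U orthogonal and H symmetric positive semidefinite."""
    W, s, Vt = np.linalg.svd(A)           # A = W diag(s) Vt
    U = W @ Vt                            # orthogonal polar factor
    H = Vt.T @ np.diag(s) @ Vt            # symmetric factor, eigenvalues are the singular values
    return U, H
```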

Journal ArticleDOI
01 Dec 2016-Optik
TL;DR: In this paper, the authors proposed a new three-dimensional fractional-order chaotic system with four nonlinear terms, which is solved as a discrete map by employing the Adomian Decomposition Method (ADM). The corresponding phase portraits for the commensurate system are investigated, and all Lyapunov characteristic exponents are calculated based on the QR decomposition method.

Proceedings ArticleDOI
22 May 2016
TL;DR: A fast compressive sensing reconstruction algorithm implemented on FPGA using Orthogonal Matching Pursuit (OMP) that achieves 50% lower complexity than other existing algorithms.
Abstract: This paper presents a fast compressive sensing reconstruction algorithm implemented on FPGA using Orthogonal Matching Pursuit (OMP). The algorithm is optimized with QR decomposition to solve the least squares problem and avoids square root operations to facilitate the hardware implementation. The implementation results show that this design can run at a frequency of 100 MHz and the proposed algorithm achieves 50% lower complexity than other existing algorithms.
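
A software sketch of OMP in which the least-squares subproblem is handled by an incrementally updated QR factorization. Unlike the hardware design described above, this version does use a square root (inside `np.linalg.norm`), so it illustrates only the QR-based update, not the square-root-free variant. The function name `omp_qr` and the stopping rule (a fixed number `k` of atoms) are assumptions.

```python
import numpy as np

def omp_qr(A, y, k):
    """Orthogonal Matching Pursuit: select k columns of A, with an incremental QR of the selection."""
    m, n = A.shape
    Q = np.zeros((m, k))
    R = np.zeros((k, k))
    support = []
    r = np.array(y, dtype=float)                 # current residual
    for j in range(k):
        corr = np.abs(A.T @ r)
        if support:
            corr[support] = 0.0                  # never reselect an atom
        idx = int(np.argmax(corr))
        support.append(idx)
        v = np.array(A[:, idx], dtype=float)
        for i in range(j):                       # Gram-Schmidt against earlier columns of Q
            R[i, j] = Q[:, i] @ v
            v -= R[i, j] * Q[:, i]
        R[j, j] = np.linalg.norm(v)
        Q[:, j] = v / R[j, j]
        r -= (Q[:, j] @ r) * Q[:, j]             # residual update without a fresh solve
    x = np.zeros(n)
    x[support] = np.linalg.solve(R, Q.T @ y)     # R is upper triangular
    return x, support
```

Each iteration adds one column to `Q` and `R` instead of re-solving the whole least-squares problem, which is the main saving of the QR formulation.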

Proceedings ArticleDOI
01 Jul 2016
TL;DR: The architecture and implementation of a high performance QR decomposition IEEE754 single precision floating point core, based on a modified Gram-Schmidt algorithm, is described; synthesis for Intel's new floating point Arria 10 FPGAs is used to generate column high functional units, giving $O(n^2)$ processing times.
Abstract: This paper describes the architecture and implementation of a high performance QR decomposition IEEE754 single precision floating point core, using a modified Gram-Schmidt algorithm. Using Intel's new floating point Arria 10 FPGAs, synthesis is used to generate column high functional units, giving $O(n^2)$ processing times. The modified Gram-Schmidt algorithm is expressed in a different order to combine the elements of the column calculations at a later stage, which reduces the data dependencies in a deeply pipelined hardware implementation. Special vector structures in the tools and hardware architecture are supported to maximize performance. The matrix sizes are parameterized, with both square and rectangular complex matrices supported. Two versions of the QRD core, using an example matrix size of 50×100, are presented, one on a conventional FPGA architecture (Stratix V) and the other on a floating point FPGA (Arria 10), with comparative data on logic, registers, DSP, memory resources and Fmax. Matrix throughput and GFLOPS/W results are also provided. In addition, results for specific matrix sizes common in wireless MIMO processing are also presented.
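
For readers unfamiliar with the algorithm named above, here is a plain software sketch of modified Gram-Schmidt QR that also handles complex input. It follows the textbook ordering, not the reordered pipeline-friendly form used in the FPGA core, and assumes a tall matrix with full column rank.

```python
import numpy as np

def mgs_qr(A):
    """Modified Gram-Schmidt QR of a tall, full-column-rank (real or complex) matrix."""
    m, n = A.shape
    Q = np.array(A, dtype=complex if np.iscomplexobj(A) else float)
    R = np.zeros((n, n), dtype=Q.dtype)
    for j in range(n):
        R[j, j] = np.linalg.norm(Q[:, j])
        Q[:, j] /= R[j, j]
        # Remove the q_j component from every remaining column in one step.
        R[j, j + 1:] = Q[:, j].conj() @ Q[:, j + 1:]
        Q[:, j + 1:] -= np.outer(Q[:, j], R[j, j + 1:])
    return Q, R
```

The inner update touches whole columns at once, which hints at why column-wide functional units are a natural fit in hardware.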

Journal ArticleDOI
TL;DR: A new adaptive version of the split Bregman method for finding online sparse solutions that is numerically more stable and easily amenable to multivariate implementation due to the use of a QR decomposition-based RLS algorithm for implementation.

Book ChapterDOI
12 Jun 2016
TL;DR: This paper presents the application of Givens rotations in the process of learning feedforward artificial neural networks based on QR decomposition, and describes the mathematical background that needs to be considered during the application.
Abstract: This paper presents the application of Givens rotations in the process of learning feedforward artificial neural networks. This approach is based on QR decomposition. The paper describes the mathematical background that needs to be considered during the application of the Givens rotations. The paper concludes with results of example simulations.
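
The building block mentioned above, QR decomposition by Givens rotations, can be sketched as follows for a real matrix. This shows only the factorization itself, not its use inside a neural network training procedure; the function name `givens_qr` is illustrative.

```python
import numpy as np

def givens_qr(A):
    """QR decomposition of a real matrix by Givens rotations: A = Q @ R, Q orthogonal."""
    m, n = A.shape
    R = np.array(A, dtype=float)
    Q = np.eye(m)
    for j in range(n):
        for i in range(m - 1, j, -1):            # zero R[i, j] using rows i-1 and i
            a, b = R[i - 1, j], R[i, j]
            r = np.hypot(a, b)
            if r == 0.0:
                continue                         # nothing to eliminate
            c, s = a / r, b / r
            G = np.array([[c, s], [-s, c]])      # maps (a, b) to (r, 0)
            R[[i - 1, i], :] = G @ R[[i - 1, i], :]
            Q[:, [i - 1, i]] = Q[:, [i - 1, i]] @ G.T
    return Q, R
```

Each rotation touches only two rows, so the factorization can be updated cheaply when a new row of data arrives.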

Journal ArticleDOI
TL;DR: Simulation results show that the novel simple interference cancellation scheme can perform better than DFE, and its complexity is very low because of the absence of QR decomposition.
Abstract: The faster-than-Nyquist (FTN) signalling is a bandwidth-efficient technology which has drawn attention in the bandwidth-starved world. However, inter-symbol interference is introduced by transmitting signals at a higher signalling rate than allowed by the Nyquist criterion. Decision feedback equalisation (DFE) cancellation is an efficient signal detection scheme for FTN-based communication system. However, since the DFE interference cancellation is executed by overall matrix computation, the required memory size is very large. In order to reduce the complexity of the interference cancellation process, a novel simple interference cancellation is proposed. Simulation results show that the scheme can perform better than DFE, and its complexity is very low because of the absence of QR decomposition.

Journal ArticleDOI
P. Sawyer
TL;DR: In this paper, the QR decomposition is used to compute the Iwasawa decomposition for all classical Lie groups of non-compact type, and this approach can also be used for the exceptional Lie groups.

Journal ArticleDOI
TL;DR: Numerical tests show that the LSQR algorithm can be successfully applied to retrieve the ASD with high stability in the presence of random noise and low susceptibility to the shape of distributions, and the experimental measurement ASD over Harbin in China is recovered reasonably.

Journal ArticleDOI
TL;DR: Numerical results have corroborated claims, demonstrating the sensitivity of the Gram–Schmidt algorithm, as well as the deterioration of the large-scale MIMO detection performance under highly correlated channel scenarios.
Abstract: In this work, some aspects of the sorted QR decomposition are addressed for ordered successive-interference-cancellation detection in large-scale antenna systems. An analysis of the sorted QR decomposition behavior, including its impact on the performance of symbol detection in large ill-conditioned MIMO channel matrices, has been presented. As the correlation in the channel matrix grows, the sorted QR decomposition may not ensure its requirements, causing misleading symbol estimation. In this context, it is shown that the orthogonality condition may be broken, depending on the matrix condition, which comes from propagation errors in the norm updating of the modified Gram-Schmidt method. Numerical results have corroborated our claims, demonstrating the sensitivity of the Gram-Schmidt algorithm, as well as the deterioration of the large-scale MIMO detection performance under highly correlated channel scenarios.
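
A sketch of a sorted QR decomposition built on modified Gram-Schmidt with norm updating, restricted to real matrices for brevity. At each step the remaining column with the smallest updated norm is moved to the front. This is a commonly used formulation and is assumed here; the exact variant analysed in the paper may differ. The running `norms` array is the quantity whose update errors the abstract refers to.

```python
import numpy as np

def sorted_qr(H):
    """Sorted QR via modified Gram-Schmidt: H[:, perm] = Q @ R for a real, tall matrix H."""
    m, n = H.shape
    Q = np.array(H, dtype=float)
    R = np.zeros((n, n))
    perm = np.arange(n)
    norms = np.sum(Q**2, axis=0)                 # squared column norms, updated in place
    for i in range(n):
        k = i + int(np.argmin(norms[i:]))        # remaining column with the smallest norm
        Q[:, [i, k]] = Q[:, [k, i]]
        R[:, [i, k]] = R[:, [k, i]]
        perm[[i, k]] = perm[[k, i]]
        norms[[i, k]] = norms[[k, i]]
        R[i, i] = np.sqrt(norms[i])              # uses the updated norm, not a recomputed one
        Q[:, i] /= R[i, i]
        for l in range(i + 1, n):
            R[i, l] = Q[:, i] @ Q[:, l]
            Q[:, l] -= R[i, l] * Q[:, i]
            norms[l] -= R[i, l]**2               # norm update; rounding errors accumulate here
    return Q, R, perm
```

For a well-conditioned `H` the updated norms track the true column norms closely; for highly correlated columns the subtraction in the last line can cancel badly, which is consistent with the loss of orthogonality described above.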

Journal ArticleDOI
TL;DR: Numerical simulation results demonstrate that the new multiple color-image authentication system based on HSI (Hue–Saturation–Intensity) color space and QR decomposition in gyrator domains is superior to the existing techniques.
Abstract: A new multiple color-image authentication system based on HSI (Hue–Saturation–Intensity) color space and QR decomposition in gyrator domains is proposed. In this scheme, original color images are converted from RGB (Red–Green–Blue) color spaces to HSI color spaces, divided into their H, S, and I components, and then obtained corresponding phase-encoded components. All the phase-encoded H, S, and I components are individually multiplied, and then modulated by random phase functions. The modulated H, S, and I components are convoluted into a single gray image with asymmetric cryptosystem. The resulting image is segregated into Q and R parts by QR decomposition. Finally, they are independently gyrator transformed to get their encoded parts. The encoded Q and R parts should be gathered without missing anyone for decryption. The angles of gyrator transform afford sensitive keys. The protocol based on QR decomposition of encoded matrix and getting back decoded matrix after multiplying matrices Q and R, ...

Proceedings ArticleDOI
20 Mar 2016
TL;DR: Simulations demonstrate that using the proposed scheme, the QRD overhead is reduced by almost 50% for very high order MIMO, without incurring any performance degradation.
Abstract: In this paper, low-complexity multiple-input multiple-output (MIMO) subspace detection schemes are studied, which decompose a channel into multiple decoupled streams to be detected disjointly. Existing schemes require a number of matrix decomposition operations equal to the number of detected streams, which is computationally complex, especially in high-order MIMO systems. We propose two computationally efficient detection algorithms, based on a preprocessing stage that consists of special layer ordering, followed by permutation-robust QR decomposition (QRD) and elementary matrix operations. The algorithms are illustrated in the context of a 4-layer MIMO system, and their complexity is studied. Simulations demonstrate that using the proposed scheme, the QRD overhead is reduced by almost 50% for very high order MIMO, without incurring any performance degradation.

Journal ArticleDOI
TL;DR: In this article, the authors demonstrate the application of QR factorization in solving the regional P-wave structure and computing the full resolution matrix with 267,520 model parameters, and demonstrate that direct methods are becoming feasible for large seismic tomography problems, based upon recent developments in sparse algorithms and high performance computing resources.
Abstract: Abbreviated title: Toward using direct methods in seismic tomography. Summary: For more than two decades, the number of data and model parameters in seismic tomography problems has exceeded the available computational resources required for application of direct computational methods, leaving iterative solvers the only option. One disadvantage of the iterative techniques is that the inverse of the matrix that defines the system is not explicitly formed, and as a consequence, the model resolution and covariance matrices cannot be computed. Despite the significant effort in finding computationally affordable approximations of these matrices, challenges remain, and methods such as the checkerboard resolution tests continue to be used. Based upon recent developments in sparse algorithms and high performance computing resources, we show that direct methods are becoming feasible for large seismic tomography problems. We demonstrate the application of QR factorization in solving the regional P-wave structure and computing the full resolution matrix with 267,520 model parameters.