
Showing papers on "Sparse matrix published in 2007"




Journal ArticleDOI
TL;DR: In this paper, the authors consider the problem of maximizing the variance explained by a particular linear combination of the input variables while constraining the number of nonzero coefficients in this combination.
Abstract: Given a covariance matrix, we consider the problem of maximizing the variance explained by a particular linear combination of the input variables while constraining the number of nonzero coefficients in this combination. This problem arises in the decomposition of a covariance matrix into sparse factors or sparse principal component analysis (PCA), and has wide applications ranging from biology to finance. We use a modification of the classical variational representation of the largest eigenvalue of a symmetric matrix, where cardinality is constrained, and derive a semidefinite programming-based relaxation for our problem. We also discuss Nesterov's smooth minimization technique applied to the semidefinite program arising in the semidefinite relaxation of the sparse PCA problem. The method has complexity $O(n^4 \sqrt{\log(n)}/\epsilon)$, where $n$ is the size of the underlying covariance matrix and $\epsilon$ is the desired absolute accuracy on the optimal value of the problem.

699 citations
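The cardinality-constrained variance maximization at the heart of sparse PCA can be illustrated with a much simpler heuristic than the paper's SDP relaxation: power iteration with hard thresholding to the k largest coefficients. The sketch below is illustrative only; the function name and the toy covariance are ours, not the paper's.

```python
import numpy as np

def truncated_power_iteration(C, k, iters=200):
    """Heuristic for max x^T C x subject to ||x||_2 = 1 and card(x) <= k:
    power iteration with hard thresholding to the k largest-magnitude
    entries. (Illustrative only; the paper instead solves an SDP
    relaxation of this problem.)"""
    n = C.shape[0]
    x = np.ones(n) / np.sqrt(n)          # deterministic start, mass everywhere
    for _ in range(iters):
        x = C @ x
        small = np.argsort(np.abs(x))[:n - k]   # indices of the n-k smallest
        x[small] = 0.0
        x /= np.linalg.norm(x)
    return x

# Toy covariance with one strong 3-sparse factor plus isotropic noise
n, k = 10, 3
u = np.zeros(n)
u[:3] = 1.0 / np.sqrt(3.0)
C = 5.0 * np.outer(u, u) + 0.1 * np.eye(n)

x = truncated_power_iteration(C, k)
explained = float(x @ C @ x)   # variance captured by the sparse component
```

On this toy problem the heuristic recovers the planted 3-sparse factor and captures essentially all of its variance (5.1 here), which is exactly the trade-off the cardinality constraint formalizes.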


Journal ArticleDOI
TL;DR: In this article, a three-phase ac-ac sparse matrix converter with no energy storage elements and employing only 15 IGBTs, as opposed to 18 IGBT switches, was proposed.
Abstract: A novel three-phase ac-ac sparse matrix converter having no energy storage elements and employing only 15 IGBTs, as opposed to 18 IGBTs of a functionally equivalent conventional ac-ac matrix converter, is proposed. It is shown that the realization effort could be further reduced to only nine IGBTs in an ultra sparse matrix converter (USMC) in the case where only unidirectional power flow is required and the fundamental phase displacement at the input and at the output is limited to ±π/6. The dependency of the voltage and current transfer ratios of the sparse matrix converters on the operating parameters is analyzed and a space vector modulation scheme is described in combination with a zero current commutation method. Finally, the sparse matrix concept is verified by simulation and experimentally using a 6.8-kW/400-V very sparse matrix converter, which is implemented with 12 IGBT switches, and USMC prototypes.

398 citations


Proceedings ArticleDOI
26 Aug 2007
TL;DR: It is shown that Toeplitz-structured matrices with entries drawn independently from the same distributions are also sufficient to recover x from y with high probability, and the performance of such matrices is compared with that of fully independent and identically distributed ones.
Abstract: The problem of recovering a sparse signal x ∈ R^n from a relatively small number of its observations of the form y = Ax ∈ R^k, where A is a known matrix and k « n, has recently received a lot of attention under the rubric of compressed sensing (CS) and has applications in many areas of signal processing such as data compression, image processing, dimensionality reduction, etc. Recent work has established that if A is a random matrix with entries drawn independently from certain probability distributions then exact recovery of x from these observations can be guaranteed with high probability. In this paper, we show that Toeplitz-structured matrices with entries drawn independently from the same distributions are also sufficient to recover x from y with high probability, and we compare the performance of such matrices with that of fully independent and identically distributed ones. The use of Toeplitz matrices in CS applications has several potential advantages: (i) they require the generation of only O(n) independent random variables; (ii) multiplication with Toeplitz matrices can be efficiently implemented using fast Fourier transform, resulting in faster acquisition and reconstruction algorithms; and (iii) Toeplitz-structured matrices arise naturally in certain application areas such as system identification.

339 citations
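Two of the advantages named in the abstract are easy to demonstrate: a k × n Toeplitz measurement matrix is determined by only k + n - 1 random entries, and its matrix-vector product is a convolution that can be computed via FFT. A small sketch (variable names are ours):

```python
import numpy as np
from scipy.linalg import toeplitz
from scipy.signal import fftconvolve

rng = np.random.default_rng(1)
n, k = 64, 16                        # signal length, number of measurements

# Only k + n - 1 random variables generate the whole k x n matrix
t = rng.choice([-1.0, 1.0], size=k + n - 1)
col, row = t[:k], np.concatenate(([t[0]], t[k:]))
A = toeplitz(col, row)               # first column col, first row row

# A sparse signal and its direct measurement
x = np.zeros(n)
x[rng.choice(n, size=4, replace=False)] = rng.standard_normal(4)
y_direct = A @ x

# FFT-based multiplication: a Toeplitz matvec is a convolution with the
# vector of diagonals of A, read from the top-right to the bottom-left
g = np.concatenate((row[::-1], col[1:]))
y_fft = fftconvolve(g, x)[n - 1:n - 1 + k]
```

Both products agree to floating-point accuracy, while the FFT route costs O((n + k) log(n + k)) instead of O(kn).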


Journal ArticleDOI
TL;DR: A new algorithm, "local K" (LK), is proposed for fast evaluation of the exchange Fock matrix in case the Cholesky decomposition of the electron repulsion integrals is used; it can also be used within the class of methods that employ auxiliary basis set expansions for representing the electron repulsion integrals.
Abstract: The authors propose a new algorithm, “local K” (LK), for fast evaluation of the exchange Fock matrix in case the Cholesky decomposition of the electron repulsion integrals is used. The novelty lies in the fact that rigorous upper bounds to the contribution from each occupied orbital to the exchange Fock matrix are employed. By formulating these inequalities in terms of localized orbitals, the scaling of computing the exchange Fock matrix is reduced from quartic to quadratic with only negligible prescreening overhead and strict error control. Compared to the unscreened Cholesky algorithm, the computational saving is substantial for systems of medium and large sizes. By virtue of its general formulation, the LK algorithm can be used also within the class of methods that employ auxiliary basis set expansions for representing the electron repulsion integrals.

287 citations


Journal ArticleDOI
TL;DR: Two fourth-order cumulant-based techniques for the estimation of the mixing matrix in underdetermined independent component analysis based on a simultaneous matrix diagonalization and a simultaneous off-diagonalization are studied.
Abstract: In this paper we study two fourth-order cumulant-based techniques for the estimation of the mixing matrix in underdetermined independent component analysis. The first method is based on a simultaneous matrix diagonalization. The second is based on a simultaneous off-diagonalization. The number of sources that can be allowed is roughly quadratic in the number of observations. For both methods, explicit expressions for the maximum number of sources are given. Simulations illustrate the performance of the techniques.

283 citations


Proceedings ArticleDOI
24 Sep 2007
TL;DR: A compressive sensing scheme with deterministic performance guarantees using expander-graphs-based measurement matrices is proposed and it is shown that the signal recovery can be achieved with complexity O(n) even if the number of nonzero elements k grows linearly with n.
Abstract: Compressive sensing is an emerging technology which can recover a sparse signal vector of dimension n via a much smaller number of measurements than n. However, the existing compressive sensing methods may still suffer from relatively high recovery complexity, such as O(n3), or can only work efficiently when the signal is super sparse, sometimes without deterministic performance guarantees. In this paper, we propose a compressive sensing scheme with deterministic performance guarantees using expander-graphs-based measurement matrices and show that the signal recovery can be achieved with complexity O(n) even if the number of nonzero elements k grows linearly with n. We also investigate compressive sensing for approximately sparse signals using this new method. Moreover, explicit constructions of the considered expander graphs exist. Simulation results are given to show the performance and complexity of the new method.

224 citations


Proceedings ArticleDOI
25 Apr 2007
TL;DR: A novel distributed algorithm based on sparse random projections, which requires no global coordination or knowledge, solves the problem of recovering an approximation of n data values by querying any L sensors, so that the reconstruction error is comparable to the optimal k-term approximation.
Abstract: Consider a large-scale wireless sensor network measuring compressible data, where n distributed data values can be well-approximated using only k

196 citations


Proceedings ArticleDOI
26 Aug 2007
TL;DR: A compressed sensing framework is introduced for efficient representation of multichannel, multiple trial EEG data and a simultaneous orthogonal matching pursuit algorithm is shown to be effective in the joint recovery of the original multiple trial EEG signals from a small number of projections.
Abstract: Many applications in signal processing require the efficient representation and processing of data. The traditional approach to efficient signal representation is compression. In recent years, there has been a new approach to compression at the sensing level. Compressed sensing (CS) is an emerging field which is based on the revelation that a small collection of linear projections of a sparse signal contains enough information for reconstruction. In this paper, we propose an application of compressed sensing in the field of biomedical signal processing, particularly electroencephalogram (EEG) collection and storage. A compressed sensing framework is introduced for efficient representation of multichannel, multiple trial EEG data. The proposed framework is based on the revelation that EEG signals are sparse in a Gabor frame. The sparsity of EEG signals in a Gabor frame is utilized for compressed sensing of these signals. A simultaneous orthogonal matching pursuit algorithm is shown to be effective in the joint recovery of the original multiple trial EEG signals from a small number of projections.

183 citations
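The recovery step named in the abstract is a matching-pursuit algorithm. As a sketch of the basic idea, here is plain (single-signal) orthogonal matching pursuit; the paper's simultaneous variant instead shares one support across multiple EEG trials. All names and the toy data below are ours.

```python
import numpy as np

def omp(A, y, k):
    """Orthogonal matching pursuit: greedily add the column of A most
    correlated with the residual, then re-fit on the chosen support by
    least squares. (Single-signal sketch; the paper applies a
    *simultaneous* OMP across multiple EEG trials.)"""
    support = []
    residual = y.astype(float)
    coef = np.array([])
    for _ in range(k):
        j = int(np.argmax(np.abs(A.T @ residual)))
        if j not in support:
            support.append(j)
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x = np.zeros(A.shape[1])
    x[support] = coef
    return x

rng = np.random.default_rng(2)
m, n, k = 60, 128, 4
A = rng.standard_normal((m, n)) / np.sqrt(m)    # random projection matrix
x_true = np.zeros(n)
support_true = rng.choice(n, size=k, replace=False)
x_true[support_true] = rng.choice([-1.0, 1.0], size=k) * (1.0 + rng.random(k))
y = A @ x_true            # a small number of projections (m << n)
x_hat = omp(A, y, k)      # exact recovery in the noiseless case
```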


Journal ArticleDOI
TL;DR: This study uses performance profiles as a tool for evaluating and comparing the performance of serial sparse direct solvers on an extensive set of symmetric test problems taken from a range of practical applications.
Abstract: In recent years a number of solvers for the direct solution of large sparse symmetric linear systems of equations have been developed. These include solvers that are designed for the solution of positive definite systems as well as those that are principally intended for solving indefinite problems. In this study, we use performance profiles as a tool for evaluating and comparing the performance of serial sparse direct solvers on an extensive set of symmetric test problems taken from a range of practical applications.

174 citations


Journal ArticleDOI
TL;DR: A boundary element method is formulated with basis functions that incorporate the asymptotic behavior of the solution at high frequencies; combining this hybrid method with very effective quadrature rules for oscillatory integrals yields a sparse discretization matrix for the oscillatory problem.
Abstract: We consider two-dimensional scattering problems, formulated as an integral equation defined on the boundary of the scattering obstacle. The oscillatory nature of high-frequency scattering problems necessitates a large number of unknowns in classical boundary element methods. In addition, the corresponding discretization matrix of the integral equation is dense. We formulate a boundary element method with basis functions that incorporate the asymptotic behavior of the solution at high frequencies. The method exhibits the effectiveness of asymptotic methods at high frequencies with only few unknowns, but retains accuracy for lower frequencies. New in our approach is that we combine this hybrid method with very effective quadrature rules for oscillatory integrals. As a result, we obtain a sparse discretization matrix for the oscillatory problem. Moreover, numerical experiments indicate that the accuracy of the solution actually increases with increasing frequency. The sparse discretization applies to problems where the phase of the solution can be predicted a priori, for example in the case of smooth and convex scatterers.

01 Jan 2007
TL;DR: In this article, decay bounds for the entries of f(A), where A is a sparse (in particular, banded) n × n diagonalizable matrix and f is smooth on a subset of the complex plane containing the spectrum of A, are established.
Abstract: We establish decay bounds for the entries of f(A), where A is a sparse (in particular, banded) n × n diagonalizable matrix and f is smooth on a subset of the complex plane containing the spectrum of A. Combined with techniques from approximation theory, the bounds are used to compute sparse (or banded) approximations to f(A), resulting in algorithms that under appropriate conditions have linear complexity in the matrix dimension. Applications to various types of problems are discussed and illustrated by numerical examples.
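The decay phenomenon the bounds describe is easy to observe numerically: for a banded A and smooth f, the entries of f(A) fall off exponentially away from the diagonal, so a banded truncation is a very accurate approximation. A small illustration with f(z) = exp(-z) (our choice of function and matrix, not the paper's examples):

```python
import numpy as np
from scipy.linalg import expm

# A banded (tridiagonal) symmetric matrix: the 1-D discrete Laplacian
n = 50
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

F = expm(-A)          # f(A) for the smooth function f(z) = exp(-z)

# Entries of f(A) decay exponentially away from the diagonal
i = n // 2
ratio = abs(F[i, i + 20]) / abs(F[i, i])

# So truncating F to a modest bandwidth changes it very little
mask = np.abs(np.subtract.outer(np.arange(n), np.arange(n))) <= 10
rel_err = np.linalg.norm(F - F * mask) / np.linalg.norm(F)
```

Here an entry 20 positions off the diagonal is many orders of magnitude below the diagonal entry, and the bandwidth-10 truncation is accurate to well below single precision, which is what makes linear-complexity sparse approximation of f(A) possible.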

Book ChapterDOI
26 Sep 2007
TL;DR: By combining recent GPU programming techniques with supercomputing strategies, this work implements a sparse general-purpose linear solver which outperforms leading-edge CPU counterparts (MKL / ACML).
Abstract: A wide class of geometry processing and PDE resolution methods needs to solve a linear system, where the non-zero pattern of the matrix is dictated by the connectivity matrix of the mesh. The advent of GPUs with their ever-growing amount of parallel horsepower makes them a tempting resource for such numerical computations. This can be helped by new APIs (CTM from ATI and CUDA from NVIDIA) which give a direct access to the multithreaded computational resources and associated memory bandwidth of GPUs; CUDA even provides a BLAS implementation but only for dense matrices (CuBLAS). However, existing GPU linear solvers are restricted to specific types of matrices, or use non-optimal compressed row storage strategies. By combining recent GPU programming techniques with supercomputing strategies (namely block compressed row storage and register blocking), we implement a sparse general-purpose linear solver which outperforms leading-edge CPU counterparts (MKL / ACML).
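The block compressed row storage the abstract mentions stores small dense blocks instead of individual nonzeros, which is what enables register blocking on the GPU. A minimal CPU-side sketch of a BCSR matrix-vector product (our data layout and names, shown in Python for clarity rather than on a GPU):

```python
import numpy as np

def bcsr_matvec(block_vals, block_cols, row_ptr, x, b=2):
    """y = A @ x for A stored in block compressed sparse row (BCSR)
    format with b x b dense blocks: block_vals[p] is the p-th block,
    block_cols[p] its block-column index, and row_ptr delimits the
    blocks belonging to each block-row."""
    n_block_rows = len(row_ptr) - 1
    y = np.zeros(n_block_rows * b)
    for i in range(n_block_rows):
        for p in range(row_ptr[i], row_ptr[i + 1]):
            j = block_cols[p]
            y[i * b:(i + 1) * b] += block_vals[p] @ x[j * b:(j + 1) * b]
    return y

# A 4 x 4 matrix with two nonzero 2 x 2 blocks, both in block-column 0
A = np.zeros((4, 4))
A[0:2, 0:2] = [[1.0, 2.0], [3.0, 4.0]]
A[2:4, 0:2] = [[5.0, 0.0], [0.0, 6.0]]

block_vals = [np.array([[1.0, 2.0], [3.0, 4.0]]),
              np.array([[5.0, 0.0], [0.0, 6.0]])]
block_cols = [0, 0]
row_ptr = [0, 1, 2]

x = np.ones(4)
y = bcsr_matvec(block_vals, block_cols, row_ptr, x)
```

Compared with scalar CSR, each stored block index amortizes over b² values, and the inner b × b product maps naturally onto registers, which is the design point the paper exploits.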

Journal ArticleDOI
TL;DR: In this article, Chen and Dunson proposed a modified Cholesky decomposition of the form Σ = DLL'D for a covariance matrix, where D is a diagonal matrix with entries proportional to the square roots of the diagonal entries of Σ and L is a unit lower-triangular matrix solely determining its correlation matrix.
Abstract: Chen & Dunson (2003) have proposed a modified Cholesky decomposition of the form Σ = DLL'D for a covariance matrix, where D is a diagonal matrix with entries proportional to the square roots of the diagonal entries of Σ and L is a unit lower-triangular matrix solely determining its correlation matrix. This total separation of variance and correlation is definitely a major advantage over the more traditional modified Cholesky decomposition of the form LD²L' (Pourahmadi, 1999). We show that, though the variance and correlation parameters of the former decomposition are separate, they are not asymptotically orthogonal and that the estimation of the new parameters could be more demanding computationally. We also provide statistical interpretation for the entries of L and D as certain moving average parameters and innovation variances and indicate how the existing likelihood procedures can be employed to estimate the new parameters.
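The "total separation of variance and correlation" can be checked numerically. In the sketch below, L is taken as the lower Cholesky factor of the correlation matrix (so LL' is exactly the correlation matrix) rather than the unit-diagonal L of Chen & Dunson's statistical parametrization; the covariance matrix is an arbitrary example of ours.

```python
import numpy as np

# A small positive-definite covariance matrix
Sigma = np.array([[4.0, 1.2, 0.8],
                  [1.2, 9.0, 2.1],
                  [0.8, 2.1, 1.0]])

# D carries the standard deviations (the "variance" parameters) ...
sd = np.sqrt(np.diag(Sigma))
D = np.diag(sd)

# ... while L is a lower-triangular factor of the correlation matrix,
# so L L' is exactly the correlation matrix of Sigma
R = Sigma / np.outer(sd, sd)
L = np.linalg.cholesky(R)

reconstructed = D @ L @ L.T @ D   # recovers Sigma = D (L L') D
```

Changing D rescales the variances without touching the correlation structure encoded in L, and vice versa, which is the separation the decomposition is designed to achieve.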

Journal ArticleDOI
TL;DR: A new software code for computing selected eigenvalues and associated eigenvectors of a real symmetric matrix is described, which combines the Jacobi–Davidson method with efficient multilevel incomplete LU (ILU) preconditioning.

Journal ArticleDOI
TL;DR: In this paper, the performance of the combination technique is analyzed using a projection framework and the C/S decomposition, and modified combination coefficients are derived which are optimal in a certain sense.

Journal ArticleDOI
TL;DR: A performance model for Cell is introduced and applied to several key numerical kernels: dense matrix multiply, sparse matrix vector multiply, stencil computations, and 1D/2D FFTs; the model is validated by comparing results against published hardware data, as well as the authors' own Cell blade implementations.
Abstract: In this work, we examine the potential of using the recently-released STI Cell processor as a building block for future high-end scientific computing systems. Our work contains several novel contributions. First, we introduce a performance model for Cell and apply it to several key numerical kernels: dense matrix multiply, sparse matrix vector multiply, stencil computations, and 1D/2D FFTs. Next, we validate our model by comparing results against published hardware data, as well as our own Cell blade implementations. Additionally, we compare Cell performance to benchmarks run on leading superscalar (AMD Opteron), VLIW (Intel Itanium2), and vector (Cray X1E) architectures. Our work also explores several different kernel implementations and demonstrates a simple and effective programming model for Cell's unique architecture. Finally, we propose modest microarchitectural modifications that could significantly increase the efficiency of double-precision calculations. Overall results demonstrate the tremendous potential of the Cell architecture for scientific computations in terms of both raw performance and power efficiency.

Journal ArticleDOI
TL;DR: A spectral-element time-domain (SETD) method is proposed to solve 3-D transient electromagnetic problems based on Gauss-Lobatto-Legendre polynomials that requires only a trivial sparse matrix-vector product at each time step, thus significantly reducing CPU time and memory requirement.
Abstract: A spectral-element time-domain (SETD) method is proposed to solve 3-D transient electromagnetic problems based on Gauss-Lobatto-Legendre polynomials. It has the advantages of spectral accuracy and block-diagonal mass matrix. With the inexpensive inversion of the block-diagonal mass matrix, the proposed method requires only a trivial sparse matrix-vector product at each time step, thus significantly reducing CPU time and memory requirement. Galerkin's method is used for spatial discretization, and a fourth-order Runge-Kutta scheme is employed for the time integration. The perfectly matched layer (PML) is employed to truncate the boundary in unbounded problems. The pseudospectral time-domain method is used to simplify the treatment of the PML inside the proposed SETD method. Numerical examples are shown to verify the efficiency and the spectral accuracy with the order of basis functions.

Journal ArticleDOI
TL;DR: This paper proposes a novel natural gradient method for complex sparse representation based on the maximum a posteriori (MAP) criterion, and develops a new CBSS method that works in the frequency domain.
Abstract: Convolutive blind source separation (CBSS) that exploits the sparsity of source signals in the frequency domain is addressed in this paper. We assume the sources follow a complex Laplacian-like distribution for complex random variables, in which the real part and imaginary part of complex-valued source signals are not necessarily independent. Based on the maximum a posteriori (MAP) criterion, we propose a novel natural gradient method for complex sparse representation. Moreover, a new CBSS method is further developed based on complex sparse representation. The developed CBSS algorithm works in the frequency domain, where we assume that the source signals are sufficiently sparse. If the sources are sufficiently sparse in the frequency domain and the filter length of mixing channels is relatively small and can be estimated, we can even achieve underdetermined CBSS. We illustrate the validity and performance of the proposed learning algorithm by several simulation examples.

Proceedings ArticleDOI
15 Oct 2007
TL;DR: This study examines the use of graph ordering algorithms for visual analysis of data sets using visual similarity matrices, considering three general classes of algorithms for generating orderings: simple graph theoretic algorithms, symbolic sparse matrix reordering algorithms, and spectral decomposition algorithms.
Abstract: In this study, we examine the use of graph ordering algorithms for visual analysis of data sets using visual similarity matrices. Visual similarity matrices display the relationships between data items in a dot-matrix plot format, with the axes labeled with the data items and points drawn if there is a relationship between two data items. The biggest challenge for displaying data using this representation is finding an ordering of the data items that reveals the internal structure of the data set. Poor orderings are indistinguishable from noise whereas a good ordering can reveal complex and subtle features of the data. We consider three general classes of algorithms for generating orderings: simple graph theoretic algorithms, symbolic sparse matrix reordering algorithms, and spectral decomposition algorithms. We apply each algorithm to synthetic and real world data sets and evaluate each algorithm for interpretability (i.e., does the algorithm lead to images with usable visual features?) and stability (i.e., does the algorithm consistently produce similar results?). We also provide a detailed discussion of the results for each algorithm across the different graph types and include a discussion of some strategies for using ordering algorithms for data analysis based on these results.
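The effect of the symbolic sparse-matrix reordering algorithms the study evaluates is easy to demonstrate: a good ordering gathers the dots of a similarity matrix near the diagonal so the structure becomes visible. A sketch using reverse Cuthill-McKee on a deliberately scrambled path graph (the graph and seed are our toy example):

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import reverse_cuthill_mckee

# A path graph on 8 vertices, listed in a scrambled order: plotted as a
# dot-matrix, the raw ordering hides the chain structure
n = 8
rng = np.random.default_rng(3)
perm = rng.permutation(n)
A = np.zeros((n, n), dtype=int)
for u in range(n - 1):
    A[perm[u], perm[u + 1]] = A[perm[u + 1], perm[u]] = 1

# A bandwidth-reducing reordering gathers the dots near the diagonal,
# making the underlying chain visible in the visual similarity matrix
order = reverse_cuthill_mckee(csr_matrix(A), symmetric_mode=True)
B = A[np.ix_(order, order)]
bandwidth = max(abs(i - j) for i, j in zip(*np.nonzero(B)))
```

After reordering, all nonzeros lie within a narrow band, whereas the scrambled matrix scatters them; this is exactly the interpretability criterion (usable visual features) that the study measures.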

Journal ArticleDOI
TL;DR: A recently proposed implicit discretisation scheme is used which generalises the standard approach based on gridding and improves the reconstruction quality when the sampling scheme or the weights are less regular.
Abstract: In magnetic resonance imaging (MRI), methods that use a non-Cartesian grid in k-space are becoming increasingly important. In this paper, we use a recently proposed implicit discretisation scheme which generalises the standard approach based on gridding. While the latter succeeds for sufficiently uniform sampling sets and accurate estimated density compensation weights, the implicit method further improves the reconstruction quality when the sampling scheme or the weights are less regular. Both approaches can be solved efficiently with the nonequispaced FFT. Due to several new techniques for the storage of an involved sparse matrix, our examples include also the reconstruction of a large 3D data set. We present four case studies and report on efficient implementation of the related algorithms.

Journal ArticleDOI
TL;DR: In this article, a three-step solution technique is presented for solving two-dimensional and three-dimensional (3D) nonhomogeneous material problems using the multi-domain boundary element method.
Abstract: A three-step solution technique is presented for solving two-dimensional (2D) and three-dimensional (3D) nonhomogeneous material problems using the multi-domain boundary element method. The discretized boundary element formulation expressed in terms of normalized displacements and tractions is written for each sub-domain. The first step is to eliminate internal variables at the individual domain level. The second step is to eliminate boundary unknowns defined over nodes used only by the domain itself. And the third step is to establish the system of equations according to the compatibility of displacements and equilibrium of tractions at common interface nodes. Discontinuous elements are utilized to model the traction discontinuity across corner nodes. The distinct feature of the three-step solver is that only interface displacements are unknowns in the final system of equations and the coefficient matrix is blocked sparse. As a result, large-scale 3D problems can be solved efficiently. Three numerical examples for 2D and 3D problems are given to demonstrate the effectiveness of the presented technique.

Proceedings ArticleDOI
23 Apr 2007
TL;DR: This paper introduces the innovative SpMxV solver designed for FPGAs (SSF), which achieves up to 20x speedup and minimizes the control logic by taking advantage of the data flow via an innovative accumulation circuit which uses pipelined floating point adders.
Abstract: Creating a high throughput sparse matrix vector multiplication (SpMxV) implementation depends on a balanced system design. In this paper, we introduce the innovative SpMxV solver designed for FPGAs (SSF). Besides high computational throughput, system performance is optimized by reducing initialization time and overheads, minimizing and overlapping I/O operations, and increasing scalability. SSF accepts any matrix size and can be easily adapted to different data formats. SSF minimizes the control logic by taking advantage of the data flow via an innovative accumulation circuit which uses pipelined floating point adders. Compared to optimized software codes on a Pentium 4 microprocessor, our design achieves up to 20x speedup.

Journal ArticleDOI
TL;DR: It is demonstrated that the suggested two-step approach for model reduction in flexible multibody dynamics has very good approximation capabilities in the time as well as in the frequency domain and can help to reduce the computation time of a numerical simulation significantly.
Abstract: In this work, a two-step approach for model reduction in flexible multibody dynamics is proposed. This technique is a combination of the Krylov-subspace method and a Gramian matrix based reduction approach that is particularly suited if a small reduced-order model of a system charged with many force-inputs has to be generated. The proposed methodology can be implemented efficiently using sparse matrix techniques and is therefore applicable to large-scale systems too. By a numerical example, it is demonstrated that the suggested two-step approach has very good approximation capabilities in the time as well as in the frequency domain and can help to reduce the computation time of a numerical simulation significantly.

Patent
12 Feb 2007
TL;DR: In this paper, the authors present a sparse matrix processing system and method which uses sparse matrices that are compressed to reduce memory traffic and improve the performance of computations using sparse matrices.
Abstract: The present invention involves a sparse matrix processing system and method which uses sparse matrices that are compressed to reduce memory traffic and improve performance of computations using sparse matrices.

Proceedings ArticleDOI
11 Jun 2007
TL;DR: This paper argues that the single-table approach is a necessary component of self-managing database systems because it frees users from a tedious and potentially ineffective schema-design phase when managing sparse data sets.
Abstract: A "sparse" data set typically has hundreds or even thousands of attributes, but most objects have non-null values for only a small number of these attributes. A popular view about sparse data is that it arises merely as the result of poor schema design. In this paper, we argue that rather than being the result of inept schema design, storing a sparse data set in a single table is the right way to proceed. However, for this to be the case, RDBMSs must provide sparse data management facilities that go beyond the previously studied requirement of storing such data sets efficiently. In particular, an RDBMS must 1) enable users to effectively build ad hoc queries over a very large number of attributes, and 2) support efficient evaluation of these queries over a wide, sparse table. We propose techniques that provide these capabilities, and argue that the single-table approach is a necessary component of self-managing database systems because it frees users from a tedious and potentially ineffective schema-design phase when managing sparse data sets.

Journal ArticleDOI
TL;DR: A diverse set of test calculations shows that the size of system where significant computational savings can be achieved depends strongly on the dimensionality of the system, and the extent of localizability of the molecular orbitals.
Abstract: The scaled opposite spin Møller–Plesset method (SOS-MP2) is an economical way of obtaining correlation energies that are computationally cheaper, and yet, in a statistical sense, of higher quality than standard MP2 theory, by introducing one empirical parameter. But SOS-MP2 still has a fourth-order scaling step that makes the method inapplicable to very large molecular systems. We reduce the scaling of SOS-MP2 by exploiting the sparsity of expansion coefficients and local integral matrices, by performing local auxiliary basis expansions for the occupied-virtual product distributions. To exploit sparsity of 3-index local quantities, we use a blocking scheme in which entire zero-rows and columns, for a given third global index, are deleted by comparison against a numerical threshold. This approach minimizes sparse matrix book-keeping overhead, and also provides sufficiently large submatrices after blocking, to allow efficient matrix–matrix multiplies. The resulting algorithm is formally cubic scaling, and requires only moderate computational resources (quadratic memory and disk space) and, in favorable cases, is shown to yield effective quadratic scaling behavior in the size regime we can apply it to. Errors associated with local fitting using the attenuated Coulomb metric and numerical thresholds in the blocking procedure are found to be insignificant in terms of the predicted relative energies. A diverse set of test calculations shows that the size of system where significant computational savings can be achieved depends strongly on the dimensionality of the system, and the extent of localizability of the molecular orbitals. © 2007 Wiley Periodicals, Inc. J Comput Chem 2007

Journal ArticleDOI
TL;DR: A multiresolution (MR) basis is described for the method of moments (MoM) analysis of a generic 3-D conductor discretized with triangular cells, constructed as linear combinations of Rao-Wilton-Glisson basis functions, thus allowing direct applicability to existing MoM codes.
Abstract: A multiresolution (MR) basis is described for the method of moments (MoM) analysis of a generic 3-D conductor discretized with triangular cells. The MR basis functions are constructed as linear combinations of Rao-Wilton-Glisson (RWG) basis functions, thus allowing direct applicability on existing MoM codes. With respect to previous work by the authors, the generation of the MR functions is significantly simplified, yet keeping similar applicability and performance, in terms of conditioning of the MoM matrix, convergence of the iterative solvers, and possibility of sparsifying the MoM matrix by clipping. Moreover, with the proposed basis the basis-change matrix from RWG to MR is highly sparse at all levels; this allows a more efficient generation of the MR-MoM matrix.

Journal ArticleDOI
TL;DR: In this article, the authors consider the inverse problem of reconstructing the absorption and scattering coefficients of the radiative transfer equation (RTE) from measurements of photon current transmitted through a scattering medium in the frequency domain.
Abstract: In this paper, we consider the inverse problem of reconstructing the absorption and scattering coefficients of the radiative transfer equation (RTE) from measurements of photon current transmitted through a scattering medium in the frequency domain. We consider an output least-squares formulation of this problem and derive the appropriate forward operators and their Fréchet derivatives. For efficient implementation, we use the second-order form of the RTE and discuss its solution using a finite element method. The PN approximation is used to expand the radiance in spherical harmonics, which leads to a large sparse matrix system that can be efficiently solved. Examples are shown in the low-scattering case where the diffusion approximation fails.

Proceedings ArticleDOI
17 Jun 2007
TL;DR: The design of the Matrix Template Library (MTL) has been independently extended to provide recursators, to support block-recursive algorithms, supplementing MTL's iterators, and these techniques generalize to other applications in linear algebra.
Abstract: Positive results from new object-oriented tools for scientific programming are reported. Using template classes, abstractions of matrix representations are available that subsume conventional row-major, column-major, either Z- or I-Morton-order, as well as block-wise combinations of these. Moreover, the design of the Matrix Template Library (MTL) has been independently extended to provide recursators, to support block-recursive algorithms, supplementing MTL's iterators. Data types modeling both concepts enable the programmer to implement both iterative and recursive algorithms (or even both) on all of the aforementioned matrix representations at once for a wide family of important scientific operations. We illustrate the unrestricted applicability of our matrix-recursator on matrix multiplication. The same generic block-recursive function, unaltered, is instantiated on different triplets of matrix types. Within a base block, either a library multiplication or a user-provided, platform-specific code provides excellent performance. We achieve 77% of peak performance using hand-tuned base cases without explicit prefetching. This excellent performance becomes available over a wide family of matrix representations from a single program. The techniques generalize to other applications in linear algebra.