
Showing papers on "Sparse approximation published in 1996"


Journal ArticleDOI
Masato Okada
TL;DR: This paper summarizes associative memory models and the sparse representation of memory in these models, and discusses the biological plausibility of these associative memory models and of sparse coding.

148 citations


Book ChapterDOI
19 Aug 1996
TL;DR: Experimental results on sparse matrices confirm both the validity of the proposed hypergraph models and the appropriateness of the multilevel approach to hypergraph partitioning.
Abstract: In this work, we show the deficiencies of the graph model for decomposing sparse matrices for parallel matrix-vector multiplication. We then propose two hypergraph models which avoid all deficiencies of the graph model. The proposed models reduce the decomposition problem to the well-known hypergraph partitioning problem widely encountered in circuit partitioning in VLSI. We have implemented fast Kernighan-Lin based graph and hypergraph partitioning heuristics and used the successful multilevel graph partitioning tool (Metis) for the experimental evaluation of the validity of the proposed hypergraph models. We have also developed a multilevel hypergraph partitioning heuristic for evaluating the performance of the multilevel approach on hypergraph partitioning. Experimental results on sparse matrices, selected from the Harwell-Boeing collection and the NETLIB suite, confirm both the validity of our proposed hypergraph models and the appropriateness of the multilevel approach to hypergraph partitioning.
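
As a concrete (hypothetical) illustration of how such a hypergraph model can be set up, the sketch below builds a column-net hypergraph from a matrix in CSR form: rows become vertices and every column becomes a net connecting the rows with a nonzero in that column. The function names and the choice of the column-net variant are assumptions for illustration, not the paper's code.

```python
# Sketch: build a column-net hypergraph from a CSR sparse matrix.
# Rows become vertices; every column becomes a net (hyperedge) that
# connects all rows holding a nonzero in that column.  This mirrors the
# idea of reducing rowwise matrix decomposition to hypergraph
# partitioning; the paper's two models may differ in detail.
from collections import defaultdict

def column_net_hypergraph(indptr, indices, n_rows):
    """indptr/indices: CSR structure of an n_rows x n_cols matrix."""
    nets = defaultdict(set)          # column id -> set of row vertices
    for row in range(n_rows):
        for k in range(indptr[row], indptr[row + 1]):
            nets[indices[k]].add(row)
    # vertex weights = nonzeros per row (work contributed to y = A*x)
    weights = [indptr[r + 1] - indptr[r] for r in range(n_rows)]
    return dict(nets), weights

# Example: 3x3 matrix with nonzeros at (0,0),(0,2),(1,1),(2,0),(2,2)
nets, w = column_net_hypergraph([0, 2, 3, 5], [0, 2, 1, 0, 2], 3)
print(nets)   # {0: {0, 2}, 2: {0, 2}, 1: {1}}
```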

100 citations


Journal ArticleDOI
TL;DR: The method presented in this paper delays data structure selection until the compile phase, thereby allowing the compiler to combine code optimization with explicit data structure selection, and enables the compilation of efficient code for sparse computations.
Abstract: The problem of compiler optimization of sparse codes is well known and no satisfactory solutions have been found yet. One of the major obstacles is formed by the fact that sparse programs explicitly deal with particular data structures selected for storing sparse matrices. This explicit data structure handling obscures the functionality of a code to such a degree that optimization of the code is prohibited, for instance, by the introduction of indirect addressing. The method presented in this paper delays data structure selection until the compile phase, thereby allowing the compiler to combine code optimization with explicit data structure selection. This method enables the compiler to generate efficient code for sparse computations. Moreover, the task of the programmer is greatly reduced in complexity.
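
To make the obstacle concrete, the hypothetical fragment below contrasts a dense loop with the same computation over compressed-row (CSR) storage; the indirect access through `indices` is precisely the kind of construct that blocks classical loop analysis and that the proposed approach avoids by letting the compiler pick the data structure.

```python
# Dense form: regular subscripts, easy for a compiler to analyze.
def matvec_dense(A, x, y):
    n = len(x)
    for i in range(n):
        for j in range(n):
            y[i] += A[i][j] * x[j]

# Sparse form: the same product on CSR storage.  The access x[indices[k]]
# is indirect, so the compiler can no longer reason about the reference
# pattern -- the obscuring effect the paper describes.
def matvec_csr(values, indices, indptr, x, y):
    for i in range(len(indptr) - 1):
        for k in range(indptr[i], indptr[i + 1]):
            y[i] += values[k] * x[indices[k]]
```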

60 citations


Proceedings ArticleDOI
07 May 1996
TL;DR: An iterative algorithm for computing sparse solutions (or sparse approximate solutions) to linear inverse problems is presented and shown to converge to the local minima of a function of the form used for picking out sparse solutions.
Abstract: We present an iterative algorithm for computing sparse solutions (or sparse approximate solutions) to linear inverse problems. The algorithm is intended to supplement the existing arsenal of techniques. It is shown to converge to the local minima of a function of the form used for picking out sparse solutions, and its connection with existing techniques is explained. Finally, it is demonstrated on subset selection and deconvolution examples. The fact that the proposed algorithm is sometimes successful when existing greedy algorithms fail is also demonstrated.
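
The abstract does not spell out the iteration, but algorithms in this family are often organized as iteratively reweighted least squares, where each pass solves a weighted problem whose weights shrink the small coefficients. The sketch below is a generic version of that idea under assumed parameter names, not the paper's exact method.

```python
import numpy as np

def irls_sparse(A, b, p=0.5, iters=50, eps=1e-8):
    """Generic iteratively reweighted least squares for minimizing a
    sparsity-promoting cost sum |x_i|^p (p < 1) subject to A x ~ b.
    Illustrative only; not the algorithm proposed in the paper."""
    x = np.linalg.lstsq(A, b, rcond=None)[0]      # start from min-norm fit
    for _ in range(iters):
        w = (np.abs(x) + eps) ** (1.0 - p / 2.0)  # large entries keep weight, small shrink
        z = np.linalg.lstsq(A * w, b, rcond=None)[0]   # solve with scaled columns
        x = w * z                                  # undo the scaling
    return x
```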

58 citations



Journal ArticleDOI
TL;DR: This paper shows that dynamic exploitation of the sparsity inherent in derivative computation can result in dramatic runtime and memory savings, and reports on the runtime and memory requirements of computing the gradients with ADIFOR.
Abstract: Automatic differentiation (AD) is a technique that augments computer codes with statements for the computation of derivatives. The computational workhorse of AD-generated codes for first-order derivatives is the linear combination of vectors. For many large-scale problems, the vectors involved in this operation are inherently sparse. If the underlying function is a partially separable one (e.g., if its Hessian is sparse), many of the intermediate gradient vectors computed by AD will also be sparse, even though the final gradient is likely to be dense. For large Jacobian computations, every intermediate derivative vector is usually at least as sparse as the least sparse row of the final Jacobian. In this paper, we show that dynamic exploitation of the sparsity inherent in derivative computation can result in dramatic runtime and memory savings. For a set of gradient problems exhibiting implicit sparsity, we report on the runtime and memory requirements of computing the gradients with the ADIFOR (...
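
The workhorse operation mentioned above, a linear combination of gradient vectors, is where the sparsity pays off: if only (index, value) pairs are stored, work is proportional to the number of nonzeros rather than to the vector length. A hedged sketch of such a kernel (the names are illustrative, not ADIFOR's API):

```python
# Sparse linear combination: w = a*u + b*v, where u, v, w are gradient
# vectors stored as {index: value} dictionaries.  Only the union of the
# nonzero index sets is touched, which is the source of the savings the
# paper measures for partially separable functions.
def sparse_linc(a, u, b, v):
    w = {}
    for i, ui in u.items():
        w[i] = a * ui
    for i, vi in v.items():
        w[i] = w.get(i, 0.0) + b * vi
    return w

# e.g. gradient of f = 3*x2 + 5*x7 accumulated from unit seed gradients
g = sparse_linc(3.0, {2: 1.0}, 5.0, {7: 1.0})   # {2: 3.0, 7: 5.0}
```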

43 citations


31 Dec 1996
TL;DR: This paper discusses these issues and reports on experience with PSPARSLIB, an on-going project for building a library of parallel iterative sparse matrix solvers; finding efficient ways to precondition the system is among the main challenges.
Abstract: Solving sparse irregularly structured linear systems on parallel platforms poses several challenges. First, sparsity makes it difficult to exploit data locality, whether in a distributed or shared memory environment. A second, perhaps more serious, challenge is to find efficient ways to precondition the system. Preconditioning techniques which have a large degree of parallelism, such as multicolor SSOR, often have a slower rate of convergence than their sequential counterparts. Finally, a number of other computational kernels such as inner products could ruin any gains from parallel speed-ups, and this is especially true on workstation clusters where start-up times may be high. In this paper we discuss these issues and report on our experience with PSPARSLIB, an on-going project for building a library of parallel iterative sparse matrix solvers.

27 citations


Journal ArticleDOI
TL;DR: Some recent developments in frontal and multifrontal schemes for solving sparse linear systems, including variants that exploit parallelism and matrix structure are reviewed.

26 citations


Proceedings ArticleDOI
01 Jan 1996
TL;DR: This paper analyzes the use of Blocking, Data Precopying and Software Pipelining to improve the performance of sparse matrix computations on superscalar workstations and shows that there is a clear difference between the dense case and the sparse case in terms of the compromises to be adopted to optimize the algorithms.
Abstract: In this paper we analyze the use of Blocking (tiling), Data Precopying and Software Pipelining to improve the performance of sparse matrix computations on superscalar workstations. In particular, we analyze the case of the Sparse Matrix by dense Matrix operation. The analysis focuses on the practical aspects that can be observed when programming such a problem on present workstations with several memory levels. The problem is studied on the Alpha 21064 based workstation DEC 3000/800. Simulations of the memory hierarchy are also used to understand the behaviour of the algorithms. The results obtained show that there is a clear difference between the dense case and the sparse case in terms of the compromises to be adopted to optimize the algorithms. The analysis can be of interest to numerical library and compiler designers.
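
As a concrete illustration of the kernel being tuned, here is a hedged sketch of a sparse-matrix by dense-matrix product on CSR storage, with the dense matrix traversed in column blocks so the reused slice of B stays cache-resident across a row of A. The block size and layout are assumptions, not the paper's tuned parameters.

```python
import numpy as np

def csr_spmm_blocked(values, indices, indptr, B, block=64):
    """C = A @ B with A in CSR form and B dense.  Columns of B are
    processed in tiles so the reused slice of B remains in cache.
    A sketch of the blocking idea studied in the paper, not its code."""
    n_rows = len(indptr) - 1
    n_cols = B.shape[1]
    C = np.zeros((n_rows, n_cols))
    for j0 in range(0, n_cols, block):          # tile over columns of B
        j1 = min(j0 + block, n_cols)
        for i in range(n_rows):
            acc = np.zeros(j1 - j0)
            for k in range(indptr[i], indptr[i + 1]):
                acc += values[k] * B[indices[k], j0:j1]
            C[i, j0:j1] = acc
    return C
```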

23 citations


Journal ArticleDOI
TL;DR: New lower bounds on the number of non-zeros of sparse polynomials are obtained, and a fully polynomial time (ε, δ) approximation algorithm is given for the number of non-zeros of multivariate sparse polynomials over a finite field of q elements and of degree less than q − 1.

23 citations


Journal ArticleDOI
TL;DR: The estimation of the sparse signal and the optimization of the Gaussian mixture are combined in the proposed algorithm: in each iteration a new signal estimate and a new model (which approximates the distribution of the new estimate) are obtained.

Proceedings ArticleDOI
01 Jan 1996
TL;DR: A run-time system called RAPID is described that provides a set of library functions for specifying irregular data objects and tasks that access these objects; the system extracts a task dependence graph from data access patterns and executes tasks efficiently on a distributed memory machine.
Abstract: Run-time compilation techniques have been shown effective for automating the parallelization of loops with unstructured indirect data accessing patterns. However, it is still an open problem to efficiently parallelize sparse matrix factorizations commonly used in iterative numerical problems. The difficulty is that a factorization process contains irregularly-interleaved communication and computation with varying granularities and it is hard to obtain scalable performance on distributed memory machines. In this paper, we present an inspector/executor approach for parallelizing such applications by embodying automatic graph scheduling techniques to optimize interleaved communication and computation. We describe a run-time system called RAPID that provides a set of library functions for specifying irregular data objects and tasks that access these objects. The system extracts a task dependence graph from data access patterns, and executes tasks efficiently on a distributed memory machine. We discuss a set of optimization strategies used in this system and demonstrate the application of this system in parallelizing sparse Cholesky and LU factorizations.

Proceedings ArticleDOI
05 Aug 1996
TL;DR: A general, practical method for handling sparse data that avoids held-out data and iterative reestimation is derived from first principles and has been tested on a part-of-speech tagging task and out-performed interpolation with context-independent weights.
Abstract: A general, practical method for handling sparse data that avoids held-out data and iterative reestimation is derived from first principles. It has been tested on a part-of-speech tagging task and out-performed (deleted) interpolation with context-independent weights, even when the latter used a globally optimal parameter setting determined a posteriori.
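
For context, the baseline mentioned is (deleted) interpolation, which smooths a sparse higher-order estimate with lower-order ones using fixed, context-independent mixture weights. The sketch below illustrates that baseline for a trigram tag model; the weights and names are illustrative, and the paper's own method differs and is not reproduced here.

```python
# Deleted-interpolation-style smoothing of a trigram tag model with
# context-independent weights l3, l2, l1 (l3 + l2 + l1 = 1).
# Illustrative baseline only -- not the method proposed in the paper.
def interpolated_prob(t1, t2, t3, tri_c, bi_c, uni_c, total,
                      l3=0.6, l2=0.3, l1=0.1):
    """P(t3 | t1, t2) mixed from trigram, bigram and unigram estimates."""
    p3 = tri_c.get((t1, t2, t3), 0) / (bi_c.get((t1, t2), 0) or 1)
    p2 = bi_c.get((t2, t3), 0) / (uni_c.get(t2, 0) or 1)
    p1 = uni_c.get(t3, 0) / total
    return l3 * p3 + l2 * p2 + l1 * p1
```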

Proceedings ArticleDOI
01 Jan 1996
TL;DR: Alternative semi-regular distribution strategies are evaluated which trade off the quality of load balance and locality for lower decomposition overheads and efficient lookup.
Abstract: Sparse matrix problems are difficult to parallelize efficiently on distributed memory machines since non-zero elements are unevenly scattered and are accessed via multiple levels of indirection. Irregular distributions that achieve good load balance and locality are hard to compute, have high memory overheads and also lead to further indirection in locating distributed data. This paper evaluates alternative semi-regular distribution strategies which trade off the quality of load balance and locality for lower decomposition overheads and efficient lookup. The proposed techniques are compared to an irregular sparse matrix partitioning, and the relative merits of each distribution method are outlined.

Journal ArticleDOI
TL;DR: In Saad's book, the classical relaxation methods such as the Jacobi, Gauss-Seidel, and SOR iterations are given a scant treatment in favor of an extensive treatment of Krylov subspace methods, which reflects the state of the art in iterative methods.
Abstract: For several decades, since the publication of early classics such as R. Varga's Matrix Iterative Analysis and D.M. Young's Iterative Solution of Large Linear Systems, few books were published on iterative methods as a subject on its own. Certainly, Gene Golub and Charles Van Loan's Matrix Computations offered a very nice but short discussion. But there really wasn't a follow-up to the classics that gave iterative methods a comprehensive, mathematical treatment. The situation took a happy turn in the last few years when several books with similar titles appeared on the subject. First, there was Wolfgang Hackbusch's Iterative Solution of Large Sparse Systems of Equations, followed by Owe Axelsson's Iterative Solution Methods, and finally the book under review here. I have twice used the books by Hackbusch and Saad (separately) in graduate classes at UCLA and at the Chinese University of Hong Kong; Axelsson was a reference in each case. One cannot review any one of these three books without referring to the other two. A perusal of Varga's and Saad's tables of contents reveals how much the field has progressed over the more than 30 years between their publication. The chapter headings have almost nothing in common. Certainly, the solution of partial differential equations remains a central and important application of iterative methods, but chapters such as the one on Domain Decomposition Methods have no counterparts in the earlier classic. In Saad's book, the classical relaxation methods such as the Jacobi, Gauss-Seidel, and SOR iterations are given a scant treatment in favor of an extensive treatment of Krylov subspace methods (the conjugate gradient method and its nonsymmetric counterparts BCG, GMRES, QMR, etc.). Preconditioning techniques such as incomplete LU and domain decomposition take up most of the last five chapters. Parallel implementation of standard iterative methods, as well as methods specifically motivated by parallelism, such as polynomial preconditioners and multicoloring ideas, receive an extended two-chapter treatment. Dramatic as this difference in emphasis may seem, it merely reflects the state of the art in iterative methods. Many industrial codes still employ relaxation methods, but these days many are used as preconditioners wrapped within a Krylov subspace method such as GMRES (created about 10 years ago by Saad and M. Schultz). For those readers who want to know more about these new techniques, both in terms of the underlying mathematics and the many algorithmic variants, Saad's book is a Godsend and a must-have. All …

Journal ArticleDOI
TL;DR: This work proposes efficient hybrid methods, for various representations of sparse problems, that improve the Gauss-Newton method in the case of large-residual or ill-conditioned nonlinear least-squares problems.
Abstract: Hybrid methods are developed for improving the Gauss-Newton method in the case of large-residual or ill-conditioned nonlinear least-squares problems. These methods are usually stated in a form suitable for dense problems, but in the sparse case some standard approaches become unsuitable and some new possibilities appear. We propose efficient hybrid methods for various representations of the sparse problems. After describing the basic ideas that help in deriving new hybrid methods, we are concerned with designing hybrid methods for sparse Jacobian and sparse Hessian representations of the least-squares problems. The efficiency of the hybrid methods is demonstrated by extensive numerical experiments.

Journal ArticleDOI
TL;DR: This article demonstrates how portable matrix libraries for iterative solution of linear systems may be written using data encapsulation techniques, and defines an interface (the calling sequences) for the functions that act on the data.
Abstract: Over the past few years several proposals have been made for the standardization of sparse matrix storage formats in order to allow for the development of portable matrix libraries for the iterative solution of linear systems. We believe that this is the wrong approach. Rather than define one standard (or a small number of standards) for matrix storage, the community should define an interface (i.e., the calling sequences) for the functions that act on the data. In addition, we cannot ignore the interface to the vector operations because, in many applications, vectors may not be stored as consecutive elements in memory. With the acceptance of shared memory, distributed memory, and cluster memory parallel machines, the flexibility of the distribution of the elements of vectors is also extremely important. This issue is ignored in most proposed standards. In this article we demonstrate how such libraries may be written using data encapsulation techniques.
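
A hedged sketch of the data-encapsulation idea: the solver below only calls methods of the objects it is handed (an operator apply, a dot product, an axpy), so any storage format or vector distribution can sit behind them. The names and signatures are illustrative, not the article's actual interface.

```python
class Operator:
    """Anything that can apply a matrix to a vector; storage stays hidden."""
    def apply(self, x):                       # y = A @ x
        raise NotImplementedError

def conjugate_gradient(A, b, x, dot, axpy, iters=100, tol=1e-10):
    """Solve A x = b (A symmetric positive definite) using only the
    operator interface plus user-supplied dot(u, v) and axpy(a, u, v)
    (returning a*u + v), so vectors may live in any layout or memory."""
    r = axpy(-1.0, A.apply(x), b)             # r = b - A x
    p = r
    rr = dot(r, r)
    for _ in range(iters):
        Ap = A.apply(p)
        alpha = rr / dot(p, Ap)
        x = axpy(alpha, p, x)                 # x = x + alpha p
        r = axpy(-alpha, Ap, r)               # r = r - alpha A p
        rr_new = dot(r, r)
        if rr_new < tol:
            break
        p = axpy(rr_new / rr, p, r)           # p = r + beta p
        rr = rr_new
    return x
```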

Proceedings ArticleDOI
11 Jun 1996
TL;DR: The studies reveal that the proposed sparse matrix inversion algorithm significantly reduces the time taken for obtaining the solution of the snake problem.
Abstract: In this paper, we present techniques for inverting sparse, symmetric and positive definite matrices on parallel and distributed computers. We propose two algorithms, one for SIMD implementation and the other for MIMD implementation. These algorithms are modified versions of Gaussian elimination and they take into account the sparseness of the matrix. Our algorithms perform better than the general parallel Gaussian elimination algorithm. In order to demonstrate the usefulness of our technique, we implemented the snake problem using our sparse matrix algorithm. Our studies reveal that the proposed sparse matrix inversion algorithm significantly reduces the time taken for obtaining the solution of the snake problem. In this paper, we present the results of our experimental work.

Journal Article
TL;DR: In this article, it was shown that if P has sparse hard sets under logspace many-one reductions, then P is a subset of DSPACE[log^2 n].
Abstract: In 1978, Hartmanis conjectured that there exist no sparse complete sets for P under logspace many-one reductions. In this paper, in support of the conjecture, it is shown that if P has sparse hard sets under logspace many-one reductions, then P is a subset of DSPACE[log^2 n].

Journal ArticleDOI
01 Dec 1996
TL;DR: The methods and implementation techniques used for the nonsymmetric sparse linear system solver, mcsparse on the Cedar system are described and a novel reordering scheme upon which the solver is based is presented.
Abstract: In this paper, the methods and implementation techniques used for the nonsymmetric sparse linear system solver, mcsparse on the Cedar system are described. A novel reordering scheme (H∗) upon which the solver is based is presented. The tradeoffs discussed include stability and fill-in control, hierarchical parallelism, and load balancing. Experimental results demonstrating the effectiveness of the solver with respect to each of these issues are presented. We also address the implications of this work for other parallel processing systems.

Book ChapterDOI
19 Aug 1996
TL;DR: Some initial performance results are presented which suggest the usefulness of parspai for tackling such large size systems on present day dmpps in a reasonable time.
Abstract: A parallel implementation of a sparse approximate inverse (spai) preconditioner for distributed memory parallel processors (dmpp) is presented. The fundamental spai algorithm is known to be a useful tool for improving the convergence of iterative solvers for ill-conditioned linear systems. The parallel implementation (parspai) exploits the inherent parallelism in the spai algorithm and the data locality on the dmpps, to solve structurally symmetric (but non-symmetric) matrices, which typically arise when solving partial differential equations (pdes). Some initial performance results are presented which suggest the usefulness of parspai for tackling such large size systems on present day dmpps in a reasonable time.
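
At the heart of a spai-type preconditioner, the Frobenius-norm objective ||AM − I||_F decouples into one small least-squares problem per column of M, which is what makes the algorithm naturally parallel. Below is a hedged dense toy sketch of that column-by-column construction; the pattern handling and names are assumptions, and a real parspai-style code works on reduced submatrices and distributes the columns.

```python
import numpy as np

def spai(A, pattern):
    """Sparse approximate inverse M minimizing ||A M - I||_F, one small
    least-squares problem per column.  `pattern[j]` lists the rows allowed
    to be nonzero in column j of M.  Illustrative dense sketch only."""
    n = A.shape[0]
    M = np.zeros((n, n))
    for j in range(n):
        J = pattern[j]                         # allowed nonzeros of column j
        e_j = np.zeros(n)
        e_j[j] = 1.0
        # least squares over the selected columns of A
        m_J, *_ = np.linalg.lstsq(A[:, J], e_j, rcond=None)
        M[J, j] = m_J
    return M
```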

Journal ArticleDOI
TL;DR: Exploiting the zeros inside the sparse global stiffness matrix of a large space structure, a general sparse algorithm has been developed for optimization of large structures; it is substantially more efficient than an algorithm based on the banded matrix approach.
Abstract: Exploiting the zeros inside the sparse global stiffness matrix of a large space structure, a general sparse algorithm has been developed for optimization of large structures. An indirect reference data structure has been used to store the nonzero elements of the stiffness matrix in compact form. The optimization approach used is the optimality criteria method with stress, displacement, and fabricational constraints. The algorithm has been applied to minimum weight design of three space structures, with the largest one having 2024 members. The performance of the algorithm is compared with the widely used banded matrix approach. The new algorithm is substantially more efficient than that based on the banded matrix approach.

Proceedings ArticleDOI
07 May 1996
TL;DR: The key result is a theorem which shows a simple condition that a sequence has to satisfy for it to converge to a sparse limiting solution, and three approaches to incorporate this condition into optimization problems are presented.
Abstract: This paper presents affine scaling transformation based methods for finding low complexity sparse solutions to optimization problems. The methods achieve sparse solutions in a more general context, and generalize our earlier work on FOCUSS developed to deal with the underdetermined linear inverse problem. The key result is a theorem which shows a simple condition that a sequence has to satisfy for it to converge to a sparse limiting solution. Three approaches to incorporate this condition into optimization problems are presented. These consist of either imposing the condition as an additional optimization constraint, or suitably modifying the cost function, or using a combination of the two. The benefits of the methodology when applied to the linear inverse problem are twofold. Firstly, it allows for the treatment of the overdetermined problem in addition to the underdetermined problem, and secondly it enables establishing sufficient conditions under which regularized versions of FOCUSS are assured of convergence to sparse solutions.
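
For reference, a regularized FOCUSS-style step rescales the unknowns by a diagonal matrix built from the current estimate and then solves a damped least-squares problem, which handles both under- and over-determined systems. The sketch below illustrates that general idea under assumed parameter choices; it is not the paper's specific algorithms.

```python
import numpy as np

def focuss_regularized(A, b, p=1.0, lam=1e-3, iters=30, eps=1e-9):
    """Regularized FOCUSS-style iteration: each step rescales the unknowns
    by W = diag(|x|^(1-p/2)) and solves a damped least-squares problem.
    Sketch of the general idea only, not the paper's algorithms."""
    x = np.linalg.lstsq(A, b, rcond=None)[0]
    for _ in range(iters):
        w = (np.abs(x) + eps) ** (1.0 - p / 2.0)
        AW = A * w                                # scale columns: A @ diag(w)
        # damped normal equations: (AW^T AW + lam I) q = AW^T b
        q = np.linalg.solve(AW.T @ AW + lam * np.eye(A.shape[1]), AW.T @ b)
        x = w * q
    return x
```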

Proceedings ArticleDOI
01 Jun 1996
TL;DR: A definition of electrostatic potential is proposed that can be used to formulate sparse approximations of the electrostatic potential matrix in both uniform and multilayered planar dielectrics; the resulting approximations are provably positive definite for the troublesome cases with a uniform dielectric and without a groundplane.
Abstract: Boundary element methods (BEM) are often used for complex 3D capacitance extraction because of their efficiency, ease of data preparation, and automatic handling of open regions. BEM capacitance extraction, however, yields a dense set of linear equations that makes solving via direct matrix methods such as Gaussian elimination prohibitive for large problem sizes. Although iterative, multipole-accelerated techniques have produced dramatic improvements in BEM capacitance extraction, accurate sparse approximations of the electrostatic potential matrix are still desirable for the following reasons. First, the corresponding capacitance models are sufficient for a large number of analysis and design applications. Moreover, even when the utmost accuracy is required, sparse approximations can be used to precondition iterative solution methods. We propose a definition of electrostatic potential that can be used to formulate sparse approximations of the electrostatic potential matrix in both uniform and multilayered planar dielectrics. Any degree of sparsity can be obtained, and unlike conventional techniques which discard the smallest matrix terms, these approximations are provably positive definite for the troublesome cases with a uniform dielectric and without a groundplane.

Journal ArticleDOI
TL;DR: A method is presented which discretizes elliptic differential equations on curvilinear bounded domains with adaptive sparse grids and which has the same convergence behaviour as the sparse grid discretization on the unit square.
Abstract: Elliptic differential equations can be discretized with bilinear finite elements. Using sparse grids instead of full grids, the dimension of the finite element space for the 2D problem reduces from O(N^2) to O(N log N), while the approximation properties are nearly the same for smooth functions. A method is presented which discretizes elliptic differential equations on curvilinear bounded domains with adaptive sparse grids. The grid is generated by a transformation of the domain. This method has the same convergence behaviour as the sparse grid discretization on the unit square.
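
For context, the dimension counts behind the O(N^2) versus O(N log N) comparison can be stated with the standard hierarchical sparse-grid space on the unit square; the notation below is the usual one and is assumed here, not taken from the paper.

```latex
% Level-n grids on [0,1]^2 with mesh width h_n = 2^{-n}, N = 2^n,
% and hierarchical increment spaces W_{\boldsymbol{\ell}}:
\dim V_n^{\mathrm{full}} = \mathcal{O}(N^2), \qquad
V_n^{\mathrm{sparse}} = \bigoplus_{|\boldsymbol{\ell}|_1 \le n+1} W_{\boldsymbol{\ell}}, \qquad
\dim V_n^{\mathrm{sparse}} = \mathcal{O}(N \log N).
```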

Book ChapterDOI
Edith Cohen
03 Jun 1996
TL;DR: The essence of the nonzero structure, and hence a near-optimal order of multiplications, can be determined in near-linear time in the number of nonzero entries, which is much smaller than the time required for the multiplications.
Abstract: We consider the problem of predicting the nonzero structure of a product of two or more matrices. Prior knowledge of the nonzero structure can be applied to optimize memory allocation and to determine the optimal multiplication order for a chain product of sparse matrices. We adapt a recent algorithm by the author and show that the essence of the nonzero structure, and hence a near-optimal order of multiplications, can be determined in near-linear time in the number of nonzero entries, which is much smaller than the time required for the multiplications. An experimental evaluation of the algorithm demonstrates that it is practical for matrices of order 10^3 with 10^4 nonzeros (or larger). A relatively small pre-computation results in a large saving in the computation-intensive multiplication.
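
A hedged sketch in the spirit of such structure prediction: propagate random exponential ranks through the nonzero structure and estimate, for every row of the product, how many nonzero columns it will have, without forming the product. The names and the specific estimator are assumptions for illustration, not the paper's algorithm.

```python
import random

def estimate_product_row_nnz(A_rows, B_rows, n_cols, reps=20):
    """Estimate the number of nonzero columns in every row of C = A*B
    without computing C.  Random exponential ranks are assigned to the
    columns; the minimum rank over a union of sets reveals the set size.
    A_rows[i] / B_rows[k] are lists of nonzero column indices.  Sketch only."""
    n = len(A_rows)
    sums = [0.0] * n
    for _ in range(reps):
        col_rank = [random.expovariate(1.0) for _ in range(n_cols)]
        # min rank over the nonzero columns of each row of B
        b_min = [min((col_rank[j] for j in row), default=float("inf"))
                 for row in B_rows]
        for i, arow in enumerate(A_rows):
            # min over the union = structure of row i of C
            m = min((b_min[k] for k in arow), default=float("inf"))
            sums[i] += m
    return [0 if s == float("inf") else (reps - 1) / s for s in sums]
```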


Journal ArticleDOI
TL;DR: This paper reports on experiences with the direct sparse matrix solvers MA28 by Duff, Y12M by Zlatev et al., and one special-purpose matrix solver, all embedded in the parallel ODE solver PSODE by Sommeijer.
Abstract: The use of implicit methods for numerically solving stiff systems of differential equations requires the solution of systems of nonlinear equations. Normally these are solved by a Newton-type process, in which we have to solve systems of linear equations. The Jacobian of the derivative function determines the structure of the matrices of these linear systems. Since it often occurs that the components of the derivative function only depend on a small number of variables, the system can be considerably sparse. Hence, it can be worth the effort to use a sparse matrix solver instead of a dense LU-decomposition. This paper reports on experiences with the direct sparse matrix solvers MA28 by Duff [1], Y12M by Zlatev et al. [2], and one special-purpose matrix solver, all embedded in the parallel ODE solver PSODE by Sommeijer [3].

Book ChapterDOI
01 Jan 1996
TL;DR: Some criteria, such as the average data parallel computation ratio, are introduced to evaluate and compare data parallel algorithms, and a data parallel preconditioned conjugate gradient algorithm using the proposed matrix-vector operations is presented.
Abstract: We first present data parallel algorithms for classical linear algebra methods. We analyze some of the main problems that a user has to solve. As examples, we propose data parallel algorithms for the Gauss and Gauss-Jordan methods. We then introduce some criteria, such as the average data parallel computation ratio, to evaluate and compare data parallel algorithms. Our studies include both dense and sparse matrix computations. We describe in detail a data parallel structure to map general sparse matrices and we present data parallel sparse matrix-vector multiplication. Then, we propose a data parallel preconditioned conjugate gradient algorithm using these matrix-vector operations.
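
A hedged sketch of the kind of data-parallel kernel the chapter builds on: with the matrix mapped to flat coordinate-style arrays, the sparse matrix-vector product becomes one elementwise product plus one segmented (per-row) sum. The array names and the NumPy-based segmented sum are assumptions for the example, not the chapter's structure.

```python
import numpy as np

def segmented_sums(values, seg_ids, n_segments):
    """Sum `values` within the segments given by seg_ids (one id per entry)."""
    out = np.zeros(n_segments)
    np.add.at(out, seg_ids, values)     # unbuffered scatter-add per segment
    return out

def spmv_data_parallel(values, col_idx, row_idx, x, n_rows):
    """y = A @ x with A stored as flat coordinate-style arrays: the
    elementwise product is one data-parallel operation and the reduction
    is a segmented sum -- roughly the shape of a data-parallel SpMV."""
    prod = values * x[col_idx]          # elementwise, fully parallel
    return segmented_sums(prod, row_idx, n_rows)
```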

Journal ArticleDOI
TL;DR: Wavelet bases are used to create a sparse approximation of the fully populated matrix that one obtains using an integral formulation such as charge simulation or surface charge simulation for numerically solving Laplace's equation with mixed boundary conditions.
Abstract: This paper describes the use of wavelet bases to create a sparse approximation of the fully populated matrix that one obtains using an integral formulation like charge simulation or surface charge simulation for numerically solving Laplace's equation with mixed boundary conditions. The sparse approximation is formed by a similarity transform of the N×N coefficient matrix, and the cost of the one employed here is of optimal order N^2. We must emphasize that the benefits of computing with a sparse matrix typically do not justify the costs of the transformation, unless the problem has multiple right-hand sides, i.e. one wants to simulate multiple excitation modes. The special orthogonal matrices we need for the similarity transform are built from wavelet bases. Wavelets are a well studied and mature topic in pure and applied mathematics; however, the fundamental ideas are probably new to many researchers interested in electrostatic field computation. Towards this end, an important purpose of this paper is to describe some of the basic concepts of multiresolution analysis using wavelet bases.
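
A hedged toy version of that similarity-transform idea: build an orthogonal wavelet matrix W (a simple Haar construction is assumed here for illustration), form W A W^T, and drop entries below a threshold to obtain the sparse approximation. A practical code would apply fast wavelet transforms rather than explicit matrices.

```python
import numpy as np

def haar_matrix(n):
    """Orthonormal Haar wavelet matrix of size n x n (n a power of two)."""
    if n == 1:
        return np.array([[1.0]])
    H = haar_matrix(n // 2)
    top = np.kron(H, [1.0, 1.0])                 # scaling functions
    bot = np.kron(np.eye(n // 2), [1.0, -1.0])   # wavelets
    return np.vstack([top, bot]) / np.sqrt(2.0)

def wavelet_sparsify(A, threshold):
    """Similarity transform B = W A W^T followed by dropping small entries.
    Toy sketch of the idea; not the paper's wavelet construction."""
    n = A.shape[0]
    W = haar_matrix(n)
    B = W @ A @ W.T
    B[np.abs(B) < threshold] = 0.0
    return B, W
```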