
Showing papers on "Sparse approximation published in 1998"


Journal ArticleDOI
TL;DR: If the data are noiseless, the modified version of basis pursuit denoising proposed in this article is equivalent to SVM in the following sense: if applied to the same data set, the two techniques give the same solution, which is obtained by solving the same quadratic programming problem.
Abstract: In the first part of this paper we show a similarity between the Structural Risk Minimization (SRM) principle (Vapnik, 1982) and the idea of Sparse Approximation, as defined by Chen, Donoho and Saunders (1995) and Olshausen and Field (1996). We then focus on two specific (approximate) implementations of SRM and Sparse Approximation, which have been used to solve the problem of function approximation. For SRM we consider the Support Vector Machine technique proposed by V. Vapnik and his team at AT&T Bell Labs, and for Sparse Approximation we consider a modification of the Basis Pursuit De-Noising algorithm proposed by Chen, Donoho and Saunders (1995). We show that, under certain conditions, these two techniques are equivalent: they give the same solution and they require the solution of the same quadratic programming problem.

538 citations
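The equivalence above hinges on both methods solving an L1-penalized least-squares problem. As a purely illustrative sketch (not the paper's quadratic programming formulation), basis pursuit denoising can be approximated by iterative soft thresholding in NumPy; the dictionary Phi, the penalty lam, and the iteration count below are placeholder choices:

```python
import numpy as np

def bpdn_ista(Phi, y, lam, n_iter=500):
    """Approximately minimize 0.5*||y - Phi @ c||^2 + lam*||c||_1
    by iterative soft thresholding (ISTA)."""
    L = np.linalg.norm(Phi, 2) ** 2            # Lipschitz constant of the gradient
    c = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        grad = Phi.T @ (Phi @ c - y)
        z = c - grad / L
        c = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)   # soft threshold
    return c

# toy usage: recover a sparse coefficient vector from a random dictionary
rng = np.random.default_rng(0)
Phi = rng.standard_normal((50, 200))
c_true = np.zeros(200)
c_true[[3, 17, 90]] = [1.0, -2.0, 0.5]
y = Phi @ c_true + 0.01 * rng.standard_normal(50)
c_hat = bpdn_ista(Phi, y, lam=0.05)
```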


Proceedings ArticleDOI
16 Aug 1998
TL;DR: It is shown how sparse coding can be used to extract wavelet-like features from natural image data and how to apply a soft-thresholding operator on the components of sparse coding in order to reduce Gaussian noise.
Abstract: Sparse coding is a method for finding a representation of data in which each of the components of the representation is only rarely significantly active. Such a representation is closely related to the techniques of independent component analysis and blind source separation. In this paper, we investigate the application of sparse coding for image feature extraction. We show how sparse coding can be used to extract wavelet-like features from natural image data. As an application of such a feature extraction scheme, we show how to apply a soft-thresholding operator on the components of sparse coding in order to reduce Gaussian noise. Methods based on sparse coding have the important benefit over wavelet methods that the features are determined solely by the statistical properties of the data, while the wavelet transformation relies heavily on certain abstract mathematical properties that may be only weakly related to the properties of the natural data.

95 citations
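As a rough illustration of the denoising step described in the abstract (not the authors' code), the following NumPy sketch assumes a sparse-coding transform W and its (pseudo)inverse are already available, for example learned from noise-free data, and applies a soft threshold t to the components:

```python
import numpy as np

def sparse_code_denoise(x, W, W_inv, t):
    """Transform a noisy signal to sparse components, soft-threshold, and reconstruct."""
    s = W @ x                                          # sparse components
    s = np.sign(s) * np.maximum(np.abs(s) - t, 0.0)    # soft thresholding
    return W_inv @ s                                   # back to signal space
```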


Journal ArticleDOI
TL;DR: A new general representation for a function is derived as a linear combination of local correlation kernels at optimal sparse locations (and scales) and its relation to principal component analysis, regularization, sparsity principles, and support vector machines is characterized.
Abstract: We derive a new general representation for a function as a linear combination of local correlation kernels at optimal sparse locations (and scales) and characterize its relation to principal component analysis, regularization, sparsity principles, and support vector machines.

88 citations


Journal Article
TL;DR: This work discusses approaches for efficient handling of the correction equation in the Jacobi-Davidson method, including how to restrict a preconditioner for the given matrix effectively to the subspace orthogonal to the current eigenvector approximation.
Abstract: We discuss approaches for an efficient handling of the correction equation in the Jacobi-Davidson method. The correction equation is effective in a subspace orthogonal to the current eigenvector approximation. The operator in the correction equation is a dense matrix, but it is composed from three factors that allow for a sparse representation. If the given matrix eigenproblem is sparse then one often aims for the construction of a preconditioner for that matrix. We discuss how to restrict this preconditioner effectively to the subspace orthogonal to the current eigenvector. The correction equation itself is formulated in terms of approximations for an eigenpair. In order to avoid misconvergence one has to make the right selection for the approximations, and this aspect will be discussed as well.

85 citations
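For reference, the correction equation referred to above is usually written with the current eigenpair approximation (theta, u): solve (I - u u^T)(A - theta I)(I - u u^T) t = -r with t orthogonal to u, where r = A u - theta u. A minimal SciPy sketch (real symmetric case, without the preconditioning that the paper actually discusses) might look like this:

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, gmres

def jd_correction(A, u, theta):
    """Solve the projected Jacobi-Davidson correction equation with unpreconditioned GMRES.
    A may be a dense array or a SciPy sparse matrix."""
    n = len(u)
    u = u / np.linalg.norm(u)
    r = A @ u - theta * u                      # eigenpair residual
    def projected_op(v):
        v = v - u * (u @ v)                    # project out u
        w = A @ v - theta * v                  # apply A - theta*I
        return w - u * (u @ w)                 # project again
    M = LinearOperator((n, n), matvec=projected_op)
    t, _ = gmres(M, -r)
    return t - u * (u @ t)                     # enforce orthogonality to u
```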


Proceedings Article
01 Dec 1998
TL;DR: This paper shows how sparse coding can be used for denoising, and uses maximum likelihood estimation of nongaussian variables corrupted by gaussian noise to apply a shrinkage nonlinearity on the components of sparse coding so as to reduce noise.
Abstract: Sparse coding is a method for finding a representation of data in which each of the components of the representation is only rarely significantly active. Such a representation is closely related to redundancy reduction and independent component analysis, and has some neurophysiological plausibility. In this paper, we show how sparse coding can be used for denoising. Using maximum likelihood estimation of nongaussian variables corrupted by gaussian noise, we show how to apply a shrinkage nonlinearity on the components of sparse coding so as to reduce noise. Furthermore, we show how to choose the optimal sparse coding basis for denoising. Our method is closely related to the method of wavelet shrinkage, but has the important benefit over wavelet methods that both the features and the shrinkage parameters are estimated directly from the data.

69 citations
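The shrinkage nonlinearity depends on the assumed sparse density of the components. As one concrete instance (a sketch, not the paper's general estimator), a Laplacian component of standard deviation d observed in Gaussian noise of standard deviation sigma gives soft thresholding with a data-derived threshold:

```python
import numpy as np

def laplacian_shrinkage(y, sigma, d):
    """MAP estimate of a Laplacian-distributed component (std d) observed in
    Gaussian noise (std sigma): soft thresholding at sqrt(2)*sigma^2/d."""
    t = np.sqrt(2.0) * sigma**2 / d
    return np.sign(y) * np.maximum(np.abs(y) - t, 0.0)
```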


Proceedings ArticleDOI
04 May 1998
TL;DR: This work shows how sparse coding can be used for denoising, using methods reminiscent of wavelet theory to apply a soft-thresholding operator on the components of sparse coding in order to reduce Gaussian noise.
Abstract: Sparse coding is a method for finding a representation of data in which each of the components of the representation is only rarely significantly active. Such a representation is closely related to redundancy reduction and independent component analysis, and has some neurophysiological plausibility. We show how sparse coding can be used for denoising. Using methods reminiscent of wavelet theory, we show how to apply a soft-thresholding operator on the components of sparse coding in order to reduce Gaussian noise. Our method has the important benefit over wavelet methods that the transformation is determined solely by the statistical properties of the data. The wavelet transformation, on the other hand, relies heavily on certain abstract mathematical properties that may be only weakly related to the properties of the natural data. Experiments on image data are reported.

59 citations


Proceedings Article
01 Jan 1998
TL;DR: This work uses the principles of maximum likelihood estimation to construct a method for decomposing signals into a weighted sum of chirped Gabor functions; it proposes sub-optimal estimators for the chirp parameters and presents a novel method for estimating chirp rate.
Abstract: We use the principles of maximum likelihood estimation to construct a method for decomposing signals into a weighted sum of chirped Gabor functions. This method provides a sparse representation of the signal similar to basis and matching pursuit methods. However since the parameters of the chirps are estimated rather than discretized, the "dictionary" is essentially of infinite size. Since the maximum likelihood estimator requires excessive computations, we propose sub-optimal estimators for the chirp parameters, and present a novel method for estimating chirp rate.

57 citations
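For orientation, a chirped Gabor atom and a single greedy selection step could be sketched as below. Note that the paper estimates the chirp parameters continuously by (approximate) maximum likelihood rather than scanning a finite candidate set, so this discretized version is only illustrative:

```python
import numpy as np

def chirped_gabor(n, t0, f0, c, s):
    """Unit-norm chirped Gabor atom of length n: Gaussian envelope centred at t0
    with width s, start frequency f0 and chirp rate c (instantaneous frequency
    f0 + c*(t - t0))."""
    t = np.arange(n)
    tau = t - t0
    g = np.exp(-0.5 * (tau / s) ** 2) * np.exp(2j * np.pi * (f0 + 0.5 * c * tau) * tau)
    return g / np.linalg.norm(g)

def greedy_step(x, candidates):
    """Pick the candidate atom with the largest correlation and subtract its contribution."""
    scores = [abs(np.vdot(g, x)) for g in candidates]
    k = int(np.argmax(scores))
    w = np.vdot(candidates[k], x)              # weight of the selected atom
    return k, w, x - w * candidates[k]         # index, weight, residual
```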


Journal ArticleDOI
TL;DR: A significant speed improvement is realized over that of the standard LU factorization of this matrix, and the method presented is referred to as the LU sparse integral factored representation (LUSIFER).
Abstract: All of the matrices which arise in the method-of-moments solution of scattering and antenna problems have a hidden structure. This structure is due to the physics of electromagnetic interactions. Matrix-algebra routines are used to uncover this structure in moment-method matrices, after they have been calculated. This structure is used to create a sparse representation of the matrix. Although this step involves an approximation, the error involved can be nearly as small as the precision of the calculation. Then, without further approximation, a sparse representation of the LU factorization of this matrix is computed. A significant speed improvement is realized over that of the standard LU factorization of this matrix. The resulting method can be added to any of a variety of moment-method programs to solve the matrix problem more quickly, and with less computer memory. For large problems this is the time-critical operation, so this allows larger problems to be solved. The computer program we have written can be used immediately with most moment-method programs, since it amounts to simply a better matrix-inversion package. The method presented is referred to as the LU sparse integral factored representation (LUSIFER).

56 citations
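The actual compression in LUSIFER is derived from the physics of the moment-method matrix; as a purely generic stand-in for the "sparsify, then factor" idea, one can drop entries below a relative tolerance and factor the sparse result with SciPy. This is not the paper's algorithm, only a toy illustration:

```python
import numpy as np
from scipy.sparse import csc_matrix
from scipy.sparse.linalg import splu

def sparsify_and_factor(Z, rel_tol=1e-6):
    """Zero entries that are small relative to the largest one, then LU-factor
    the resulting sparse matrix."""
    mask = np.abs(Z) >= rel_tol * np.abs(Z).max()
    Zs = csc_matrix(Z * mask)
    return splu(Zs)          # use lu.solve(b) as an approximate solver for Z x = b
```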


Proceedings ArticleDOI
12 May 1998
TL;DR: This paper addresses image and signal processing problems where the result most consistent with prior knowledge is the minimum order, or "maximally sparse" solution.
Abstract: This paper addresses image and signal processing problems where the result most consistent with prior knowledge is the minimum order, or "maximally sparse" solution. These problems arise in such diverse areas as astronomical star image deblurring, neuromagnetic image reconstruction, seismic deconvolution, and thinned array beamformer design. An optimization theoretic formulation for sparse solutions is presented, and its relationship to the MUSIC algorithm is discussed. Two algorithms for sparse inverse problems are introduced, and examples of their application to beamforming array design and star image deblurring are presented.

53 citations


Proceedings ArticleDOI
30 Mar 1998
TL;DR: This work shows that a rather simple sequential cache-efficient algorithm provides significantly better performance than existing algorithms for sparse matrix multiplication, and describes a multithreaded implementation of this simple algorithm that scales well with the number of threads and CPUs.
Abstract: Several fast sequential algorithms have been proposed in the past to multiply sparse matrices. These algorithms do not explicitly address the impact of caching on performance. We show that a rather simple sequential cache-efficient algorithm provides significantly better performance than existing algorithms for sparse matrix multiplication. We then describe a multithreaded implementation of this simple algorithm and show that its performance scales well with the number of threads and CPUs. For 10% sparse, 500×500 matrices, the multithreaded version running on 4-CPU systems provides more than a 411-fold speed increase over the well-known BLAS routine, and a 146-fold and 446-fold speed increase over two other recent techniques for fast sparse matrix multiplication, both of which are relatively difficult to parallelize efficiently.

52 citations
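The sequential kernel such work starts from is essentially the classical row-by-row (Gustavson-style) sparse product; a plain-Python sketch over SciPy CSR matrices is given below. The paper's contributions, cache blocking and multithreading, are not reproduced here, and in practice `A @ B` on CSR matrices performs this computation in compiled code:

```python
import numpy as np
from scipy.sparse import csr_matrix

def spmm_csr(A: csr_matrix, B: csr_matrix) -> csr_matrix:
    """Row-by-row sparse matrix product C = A @ B with a dense accumulator per row."""
    n = A.shape[0]
    m = B.shape[1]
    indptr, indices, data = [0], [], []
    acc = np.zeros(m)
    seen = np.zeros(m, dtype=bool)
    for i in range(n):
        cols = []
        for jj in range(A.indptr[i], A.indptr[i + 1]):
            j, a = A.indices[jj], A.data[jj]
            for kk in range(B.indptr[j], B.indptr[j + 1]):
                c = B.indices[kk]
                if not seen[c]:
                    seen[c] = True
                    cols.append(c)
                acc[c] += a * B.data[kk]
        for c in sorted(cols):
            indices.append(c)
            data.append(acc[c])
            acc[c] = 0.0
            seen[c] = False
        indptr.append(len(indices))
    return csr_matrix((data, indices, indptr), shape=(n, m))
```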


Journal ArticleDOI
TL;DR: Different methods for computing a sparse approximate inverse M for a given sparse matrix A by minimizing ||AM - E|| in the Frobenius norm are investigated, and it is shown how to take full advantage of the sparsity of A.
Abstract: We investigate different methods for computing a sparse approximate inverse M for a given sparse matrix A by minimizing ||AM - E|| in the Frobenius norm. Such methods are very useful for deriving preconditioners in iterative solvers, especially in a parallel environment. We compare different strategies for choosing the sparsity structure of M and different ways of solving the small least-squares problems that arise in the computation of each column of M. In particular, we show how to take full advantage of the sparsity of A. Furthermore, we offer guidance on how to design and apply an algorithm for computing sparse approximate inverses for a general sparse matrix.
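A bare-bones sketch of the Frobenius-norm minimization (with E the identity) is shown below: for each column, a least-squares problem is solved over a prescribed sparsity pattern. The paper's key points, shrinking each small problem by exploiting the sparsity of A and choosing the patterns adaptively, are omitted, and a dense A is assumed purely for brevity:

```python
import numpy as np

def sparse_approximate_inverse(A, patterns):
    """For each column k, minimize ||A m_k - e_k||_2 over entries of m_k restricted
    to the index set patterns[k]; the columns m_k form M with ||A M - I||_F small."""
    n = A.shape[0]
    M = np.zeros((n, n))
    for k, J in enumerate(patterns):
        J = list(J)
        e_k = np.zeros(n)
        e_k[k] = 1.0
        m, *_ = np.linalg.lstsq(A[:, J], e_k, rcond=None)
        M[J, k] = m
    return M

# usage sketch: take the sparsity pattern of A itself as the pattern of M
# patterns = [np.nonzero(A[:, k])[0] for k in range(A.shape[1])]
```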

Proceedings ArticleDOI
12 May 1998
TL;DR: A general affine scaling optimization algorithm is given that converges to a sparse solution for diversity measures chosen from the concave subclass of the Schur-concave functions.
Abstract: A general framework based on majorization, Schur-concavity, and concavity is given that facilitates the analysis of algorithm performance and clarifies the relationships between existing proposed diversity measures useful for best basis selection. Admissible sparsity measures are given by the Schur-concave functions, which are the class of functions consistent with the partial ordering on vectors known as majorization. Concave functions form an important subclass of the Schur-concave functions which attain their minima at sparse solutions to the basis selection problem. Based on a particular functional factorization of the gradient, we give a general affine scaling optimization algorithm that converges to a sparse solution for measures chosen from within this subclass.
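The paper's algorithm covers a broad class of concave diversity measures; the familiar l_p (FOCUSS-type) special case of the affine-scaling iteration, sketched below, gives the flavor. This is not the paper's full method, and the choice p = 0.5 and the iteration count are arbitrary:

```python
import numpy as np

def focuss(A, b, p=0.5, n_iter=50, eps=1e-12):
    """Iteratively reweighted minimum-norm (affine scaling) iteration that tends to a
    sparse solution of A x = b under an l_p diversity measure, 0 < p <= 1."""
    x = np.linalg.pinv(A) @ b                      # minimum 2-norm starting point
    for _ in range(n_iter):
        w = np.abs(x) ** (1.0 - p / 2.0) + eps     # affine-scaling weights
        q = np.linalg.pinv(A * w) @ b              # min-norm solution in scaled space (A*w == A @ diag(w))
        x = w * q
    return x
```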

Journal ArticleDOI
TL;DR: The data structures and communication parameters required by these utilities for performing unstructured sparse matrix–vector multiplications on distributed-memory message-passing computers are presented, for general sparse unstructured matrices with data locality.
Abstract: In this paper we describe general software utilities for performing unstructured sparse matrix–vector multiplications on distributed-memory message-passing computers. The matrix–vector multiply comprises an important kernel in the solution of large sparse linear systems by iterative methods. Our focus is to present the data structures and communication parameters required by these utilities for general sparse unstructured matrices with data locality. These types of matrices are commonly produced by finite difference and finite element approximations to systems of partial differential equations. In this discussion we also present representative examples and timings which demonstrate the utility and performance of the software. © 1998 John Wiley & Sons, Ltd.
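The compute phase of such a utility, on one process owning a block of rows, reduces to a CSR matrix-vector kernel over locally owned and previously gathered "ghost" vector entries. The sketch below shows only that local kernel under an assumed local/ghost column numbering; the communication scheduling that the paper actually describes is omitted:

```python
import numpy as np

def local_spmv(indptr, indices, data, x_local, x_ghost, n_local_cols):
    """Per-process kernel of a row-distributed sparse matrix-vector product.
    Column indices < n_local_cols refer to locally owned entries of x; larger
    indices refer to ghost values received from other processes before the call."""
    n_rows = len(indptr) - 1
    y = np.zeros(n_rows)
    for i in range(n_rows):
        s = 0.0
        for jj in range(indptr[i], indptr[i + 1]):
            j = indices[jj]
            s += data[jj] * (x_local[j] if j < n_local_cols else x_ghost[j - n_local_cols])
        y[i] = s
    return y
```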

Proceedings ArticleDOI
01 Aug 1998
TL;DR: An efficient algorithm is given to compute a certificate of inconsistency for a black box linear system over a field, and for certifying that a sparse Diophantine linear system of integer equations has no integer solutions, even when it may have rational solutions.
Abstract: Randomized black box algorithms provide a very efficient means for solving sparse linear systems over arbitrary fields. However, when these probabilistic algorithms fail, it is not revealed whether no solution exists or whether the algorithm simply made unlucky random choices. Here we give an efficient algorithm to compute a certificate of inconsistency for a black box linear system over a field. Our method requires a black box for the transpose of the matrix. The cost of producing the certificate is shown to be about the same as that of solving the system in the black box model, while the cost of applying a given certificate to prove inconsistency is much smaller. We also give an efficient algorithm for certifying that a sparse Diophantine linear system of integer equations has no integer solutions, even when it may have rational solutions.
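Over a field, a certificate of inconsistency for A x = b is simply a vector u with u^T A = 0 and u^T b != 0; verifying it needs only products with the transpose of A, which is why the transpose black box is required. A floating-point sketch of the (cheap) verification step is given below; the paper itself works in exact arithmetic, so the tolerance here is an artifact of the illustration:

```python
import numpy as np

def verifies_inconsistency(matvec_transpose, b, u, tol=1e-10):
    """Check that u is a certificate of inconsistency for A x = b:
    A^T u = 0 (so u^T A = 0) while u . b != 0."""
    return np.linalg.norm(matvec_transpose(u)) <= tol and abs(np.dot(u, b)) > tol
```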

Journal ArticleDOI
TL;DR: A new method, based on the nested dissection heuristic, provides significantly better orderings than the most commonly used ordering method, minimum degree, on a variety of large-scale linear programming problems.
Abstract: The main cost of solving a linear programming problem using an interior point method is usually the cost of solving a series of sparse, symmetric linear systems of equations, AΘA^T x = b. These systems are typically solved using a sparse direct method. The first step in such a method is a reordering of the rows and columns of the matrix to reduce fill in the factor and/or reduce the required work. This article evaluates several methods for performing fill-reducing ordering on a variety of large-scale linear programming problems. We find that a new method, based on the nested dissection heuristic, provides significantly better orderings than the most commonly used ordering method, minimum degree.
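Neither ordering heuristic is implemented here, but the quantity being compared, fill in the factor of AΘA^T, can be measured for any candidate ordering by symbolic elimination, as in the following sketch (the adjacency structure and the orderings are assumed to be supplied):

```python
def fill_in(adjacency, order):
    """Count fill edges produced by symbolic elimination of an undirected graph
    (the nonzero pattern of the symmetric matrix) in the given elimination order."""
    adj = {v: set(nbrs) for v, nbrs in adjacency.items()}
    eliminated, fill = set(), 0
    for v in order:
        nbrs = [u for u in adj[v] if u not in eliminated]
        for i, a in enumerate(nbrs):           # remaining neighbours become a clique
            for b in nbrs[i + 1:]:
                if b not in adj[a]:
                    adj[a].add(b)
                    adj[b].add(a)
                    fill += 1
        eliminated.add(v)
    return fill

# usage: fill_in(pattern, nested_dissection_order) vs fill_in(pattern, minimum_degree_order)
```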

Journal ArticleDOI
TL;DR: In this article, sparse matrix representations of pseudodifferential operators are derived for the Bremmer series solution of the wave equation in generally inhomogeneous media, and an optimization procedure is followed to minimize the errors, in the high-frequency limit, for a given discretization rate.
Abstract: The Bremmer series solution of the wave equation in generally inhomogeneous media requires the introduction of pseudodifferential operators. In this paper, sparse matrix representations of these pseudodifferential operators are derived. The authors focus on designing sparse matrices, keeping the accuracy high at the cost of ignoring any critical scattering-angle phenomena. Such matrix representations follow from rational approximations of the vertical slowness and the transverse Laplace operator symbols, and of the vertical derivative, as they appear in the parabolic equation method. Sparse matrix representations lead to a fast algorithm. An optimization procedure is followed to minimize the errors, in the high-frequency limit, for a given discretization rate. The Bremmer series solver consists of three steps: directional decomposition into up- and downgoing waves, one-way propagation, and interaction of the counterpropagating constituents. Each of these steps is represented by a sparse matrix equation. The resulting algorithm provides an improvement of the parabolic equation method, in particular for transient wave phenomena, and extends the latter method, systematically, for backscattered waves.


Book ChapterDOI
14 Jun 1998
TL;DR: An overview of the algorithms, implementation aspects, performance results, and the user interface of WSSMP is given.
Abstract: The Watson Symmetric Sparse Matrix Package, WSSMP, is a high-performance, robust, and easy-to-use software package for solving large sparse symmetric systems of linear equations. It can be used as a serial package, in a shared-memory multiprocessor environment, or as a scalable parallel solver in a message-passing environment, where each node can be either a uniprocessor or a shared-memory multiprocessor. WSSMP uses scalable parallel multifrontal algorithms for sparse symmetric factorization and triangular solves. Sparse symmetric factorization in WSSMP has been clocked at up to 210 MFLOPS on an RS6000/590, 500 MFLOPS on an RS6000/397, and in excess of 20 GFLOPS on a 64-node SP with RS6000/397 nodes. This paper gives an overview of the algorithms, implementation aspects, performance results, and the user interface of WSSMP.

Proceedings ArticleDOI
01 Jun 1998
TL;DR: A probabilistic model for the prediction of the number of misses on a K-way associative cache memory considering sparse matrices with a uniform or banded distribution is presented.
Abstract: While much work has been devoted to the study of cache behavior during the execution of codes with regular access patterns, little attention has been paid to irregular codes. An important portion of these codes are scientific applications that handle compressed sparse matrices. In this work a probabilistic model for the prediction of the number of misses on a K-way associative cache memory considering sparse matrices with a uniform or banded distribution is presented. Two different irregular kernels are considered: the sparse matrix-vector product and the transposition of a sparse matrix. The model was validated with simulations on synthetic uniform matrices and banded matrices from the Harwell-Boeing collection.
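The paper derives the miss counts analytically; a trace-driven simulator of a K-way associative LRU cache, like the sketch below, is the kind of reference such a model is validated against. The line size and the way the address trace is produced (for example from the index, value, and vector accesses of a CSR sparse matrix-vector product) are assumptions of the illustration:

```python
from collections import deque

def count_misses(trace, n_sets, assoc, line_bytes=32):
    """Count misses of a byte-address trace on a K-way set-associative LRU cache."""
    sets = [deque(maxlen=assoc) for _ in range(n_sets)]
    misses = 0
    for addr in trace:
        line = addr // line_bytes
        s = sets[line % n_sets]
        if line in s:
            s.remove(line)
            s.append(line)        # mark as most recently used
        else:
            misses += 1
            s.append(line)        # bounded deque evicts the least recently used line
    return misses
```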

Journal ArticleDOI
TL;DR: This article discusses experiences with a completely different approach to the generation of sparse primitives: the implementation of a "sparse compiler" that is capable of automatically converting a dense program into sparse code.
Abstract: Primitives in mathematical software are usually written and optimized by hand. With the implementation of a “sparse compiler” that is capable of automatically converting a dense program into sparse code, however, a completely different approach to the generation of sparse primitives can be taken. A dense implementation of a particular primitive is supplied to the sparse compiler, after which it can be converted into many different sparse versions of this primitive. Each version is specifically tailored to a class of sparse matrices having a specific nonzero structure. In this article, we discuss some of our experiences with this new approach.
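To make the idea concrete, the transformation such a compiler performs can be pictured on a matrix-vector product: first the dense two-loop "source" kernel, then the kind of CSR-specialized code that could be generated from it. This is a hand-written illustration, not output of the compiler described in the article:

```python
import numpy as np

# dense "source" kernel that a sparse compiler would start from
def matvec_dense(A, x):
    y = np.zeros(A.shape[0])
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            y[i] += A[i, j] * x[j]
    return y

# the kind of specialized version it could emit for a CSR-stored matrix:
# the inner loop now runs only over the stored nonzeros of row i
def matvec_csr(indptr, indices, data, x, n_rows):
    y = np.zeros(n_rows)
    for i in range(n_rows):
        for k in range(indptr[i], indptr[i + 1]):
            y[i] += data[k] * x[indices[k]]
    return y
```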

Proceedings ArticleDOI
04 May 1998
TL;DR: It is shown that a sparse distributed memory (SDM) model using sparse codes and a suitable activation mechanism can be used as a clean-up memory and it is proved that the retrieval of the constituent parts can be made arbitrarily exact with a growing memory.
Abstract: An important property for any memory system is the ability to form higher-level concepts from lower-level ones in a robust way. In this article, this process is called chunking. It is also important that such higher-level concepts can be analyzed, i.e., broken down into their constituent parts. This is called probing and clean-up. These issues have previously been treated for vectors of real numbers and for dense binary patterns. Using sparse codes instead of dense ones has many advantages. The paper shows how to define robust chunking operations for such sparse codes. It is shown that a sparse distributed memory (SDM) model using sparse codes and a suitable activation mechanism can be used as a clean-up memory. It is proved that the retrieval of the constituent parts can be made arbitrarily exact with a growing memory. This is so even if we let the load increase to infinity.


Proceedings ArticleDOI
25 Aug 1998
TL;DR: A probabilistic model to estimate the number of misses on a set associative cache with an LRU replacement algorithm is introduced, and new results are presented focusing on the types of distributions that usually appear in well-known real matrix suites, such as the Harwell-Boeing or NEP.
Abstract: A probabilistic model to estimate the number of misses on a set associative cache with an LRU replacement algorithm is introduced. Such modeling has been used by our group in previous work for sparse matrices with a uniform distribution of the non-zero elements. We present some new results focusing on different types of distributions that usually appear in some well-known real matrix suites, such as the Harwell-Boeing or NEP.

Book ChapterDOI
01 Sep 1998
TL;DR: A probabilistic model for the prediction of the number of misses on a direct mapped cache memory considering sparse matrices with a uniform distribution is presented.
Abstract: Many scientific applications handle compressed sparse matrices. Cache behavior during the execution of codes with irregular access patterns, such as those generated by these matrices, has not been widely studied. In this work a probabilistic model for the prediction of the number of misses on a direct mapped cache memory considering sparse matrices with a uniform distribution is presented. As an example of the potential usability of such models, and taking into account the state of the art with respect to high-performance superscalar and/or superpipelined CPUs with a multilevel memory hierarchy, we have modeled the cache behavior of an optimized sparse matrix-dense matrix product algorithm including blocking at the memory and register levels.

01 Dec 1998
TL;DR: A new paradigm for signal reconstruction and superresolution, Correlation Kernel Analysis (CKA), is presented, based on the selection of a sparse set of bases from a large dictionary of class-specific basis functions; the paper concludes that, when used with a sparse representation technique, the correlation function is an effective kernel for image reconstruction and superresolution.
Abstract: This paper presents a new paradigm for signal reconstruction and superresolution, Correlation Kernel Analysis (CKA), that is based on the selection of a sparse set of bases from a large dictionary of class-specific basis functions. The basis functions that we use are the correlation functions of the class of signals we are analyzing. To choose the appropriate features from this large dictionary, we use Support Vector Machine (SVM) regression and compare this to traditional Principal Component Analysis (PCA) for the tasks of signal reconstruction, superresolution, and compression. The testbed we use in this paper is a set of images of pedestrians. This paper also presents results of experiments in which we use a dictionary of multiscale basis functions and then use Basis Pursuit De-Noising to obtain a sparse, multiscale approximation of a signal. The results are analyzed and we conclude that 1) when used with a sparse representation technique, the correlation function is an effective kernel for image reconstruction and superresolution, 2) for image compression, PCA and SVM have different tradeoffs, depending on the particular metric that is used to evaluate the results, 3) in sparse representation techniques, L_1 is not a good proxy for the true measure of sparsity, L_0, and 4) the L_epsilon norm may be a better error metric for image reconstruction and compression than the L_2 norm, though the exact psychophysical metric should take into account high order structure in images.

Book ChapterDOI
02 Sep 1998
TL;DR: This paper shows how sparse coding can be used for denoising, and shows how to apply a shrinkage nonlinearity on the components of sparse coding so as to reduce noise.
Abstract: Sparse coding is a method for finding a representation of data in which each of the components of the representation is only rarely significantly active. Such a representation is closely related to redundancy reduction and independent component analysis, and has some neurophysiological plausibility. In this paper, we show how sparse coding can be used for denoising. Using maximum likelihood estimation of nongaussian variables corrupted by gaussian noise, we show how to apply a shrinkage nonlinearity on the components of sparse coding so as to reduce noise. A theoretical analysis of the denoising capability of the method is given, and it is shown how to choose the optimal basis for sparse coding.

Journal ArticleDOI
TL;DR: Throughput approaching that of large dense matrix factorizations is demonstrated on two data-parallel systems, the MasPar MP-2 and the Thinking Machines CM-5.
Abstract: Sparse matrix factorization is a computational bottleneck in many scientific and engineering problems. This paper examines the problem of factoring large sparse matrices on data-parallel computers. A multifrontal approach is presented in which only the fine-grain concurrency found within the elimination of each supernode is exploited. Throughput approaching that of large dense matrix factorizations is demonstrated on two data-parallel systems, the MasPar MP-2 and the Thinking Machines CM-5.

Journal ArticleDOI
TL;DR: It is proved that on most local memory machines with p processors, computing the product y = Ax requires Ω((n/p) log p) time on the average, and that the same lower bound also holds, in the worst case, for matrices with only 2n or 3n nonzero elements.
Abstract: In this paper we consider the problem of computing on a local memory machine the product y = Ax, where A is a random n×n sparse matrix with Θ(n) nonzero elements. To study the average case communication cost of this problem, we introduce four different probability measures on the set of sparse matrices. We prove that on most local memory machines with p processors, this computation requires Ω((n/p) log p) time on the average. We prove that the same lower bound also holds, in the worst case, for matrices with only 2n or 3n nonzero elements.

Book ChapterDOI
07 Aug 1998
TL;DR: A small set of new extensions to HPF-2 is proposed to parallelize sparse matrix computations, with part of the new capabilities supported by a runtime library; the approach is evaluated on a Cray T3E.
Abstract: There is a class of sparse matrix computations, such as direct solvers of systems of linear equations, that change the fill-in (nonzero entries) of the coefficient matrix and involve row and column operations (pivoting). This paper addresses the problem of parallelizing these sparse computations from the point of view of the parallel language and the compiler. Dynamic data structures for sparse matrix storage are analyzed, making it possible to deal efficiently with fill-in and pivoting. Each of the data representations considered requires handling indirections for data accesses, pointer referencing, and dynamic data creation, all of which go beyond current data-parallel compilation technology. We propose a small set of new extensions to HPF-2 to parallelize these codes, supporting part of the new capabilities in a runtime library. This approach has been evaluated on a Cray T3E, implementing, in particular, the sparse LU factorization.

Journal ArticleDOI
TL;DR: This work reformulates the selection of a few responding units tuned to patterns of activity from the Fourier spectrum as a multilayer system composed of four stages of different kinds of feature detectors, producing a sparse representation of images organized according to object power discrimination.