Showing papers on "Sparse approximation published in 2007"


Journal ArticleDOI
TL;DR: An algorithm based on an enhanced sparse representation in transform domain and a specially developed collaborative Wiener filtering achieves state-of-the-art denoising performance in terms of both peak signal-to-noise ratio and subjective visual quality.
Abstract: We propose a novel image denoising strategy based on an enhanced sparse representation in transform domain. The enhancement of the sparsity is achieved by grouping similar 2D image fragments (e.g., blocks) into 3D data arrays which we call "groups." Collaborative filtering is a special procedure developed to deal with these 3D groups. We realize it using three successive steps: 3D transformation of a group, shrinkage of the transform spectrum, and inverse 3D transformation. The result is a 3D estimate that consists of the jointly filtered grouped image blocks. By attenuating the noise, the collaborative filtering reveals even the finest details shared by grouped blocks and, at the same time, it preserves the essential unique features of each individual block. The filtered blocks are then returned to their original positions. Because these blocks are overlapping, for each pixel, we obtain many different estimates which need to be combined. Aggregation is a particular averaging procedure which is exploited to take advantage of this redundancy. A significant improvement is obtained by a specially developed collaborative Wiener filtering. An algorithm based on this novel denoising strategy and its efficient implementation are presented in full detail; an extension to color-image denoising is also developed. The experimental results demonstrate that this computationally scalable algorithm achieves state-of-the-art denoising performance in terms of both peak signal-to-noise ratio and subjective visual quality.
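A minimal numpy/scipy sketch of the grouping, 3D-transform shrinkage, and aggregation steps described above, not the authors' full BM3D implementation: the block size, matching rule, hard-threshold level (thr*sigma), and aggregation weights are illustrative assumptions, and similar blocks are searched over a coarse grid of the whole image rather than a local window.

```python
import numpy as np
from scipy.fft import dctn, idctn

def collaborative_hard_threshold(noisy, block=8, step=4, n_match=16, thr=2.7, sigma=25.0):
    """Illustrative sketch: group similar blocks, shrink in a 3D transform domain,
    and aggregate the overlapping block estimates by weighted averaging."""
    H, W = noisy.shape
    est = np.zeros_like(noisy, dtype=float)
    weight = np.zeros_like(noisy, dtype=float)
    positions = [(y, x) for y in range(0, H - block + 1, step)
                         for x in range(0, W - block + 1, step)]
    patches = {p: noisy[p[0]:p[0] + block, p[1]:p[1] + block].astype(float) for p in positions}
    for ref in positions:
        # grouping: stack the n_match blocks most similar to the reference block
        ranked = sorted(positions, key=lambda p: np.sum((patches[p] - patches[ref]) ** 2))
        group_pos = ranked[:n_match]
        group = np.stack([patches[p] for p in group_pos], axis=0)
        # collaborative filtering: 3D transform, hard-threshold the spectrum, invert
        spec = dctn(group, norm='ortho')
        spec[np.abs(spec) < thr * sigma] = 0.0
        filtered = idctn(spec, norm='ortho')
        # aggregation: sparser groups (fewer retained coefficients) get larger weight
        w = 1.0 / (1.0 + np.count_nonzero(spec))
        for k, (y, x) in enumerate(group_pos):
            est[y:y + block, x:x + block] += w * filtered[k]
            weight[y:y + block, x:x + block] += w
    return est / np.maximum(weight, 1e-12)
```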

7,912 citations


Journal ArticleDOI
TL;DR: Practical incoherent undersampling schemes are developed and analyzed by means of their aliasing interference and demonstrate improved spatial resolution and accelerated acquisition for multislice fast spin‐echo brain imaging and 3D contrast enhanced angiography.
Abstract: The sparsity which is implicit in MR images is exploited to significantly undersample k-space. Some MR images such as angiograms are already sparse in the pixel representation; other, more complicated images have a sparse representation in some transform domain–for example, in terms of spatial finite-differences or their wavelet coefficients. According to the recently developed mathematical theory of compressed sensing, images with a sparse representation can be recovered from randomly undersampled k-space data, provided an appropriate nonlinear recovery scheme is used. Intuitively, artifacts due to random undersampling add as noise-like interference. In the sparse transform domain the significant coefficients stand out above the interference. A nonlinear thresholding scheme can recover the sparse coefficients, effectively recovering the image itself. In this article, practical incoherent undersampling schemes are developed and analyzed by means of their aliasing interference. Incoherence is introduced by pseudo-random variable-density undersampling of phase-encodes. The reconstruction is performed by minimizing the l1 norm of a transformed image, subject to data fidelity constraints.
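As a toy illustration of the recovery principle (not the nonlinear reconstruction scheme used in the paper), the sketch below recovers a 1D signal that is sparse in the sample domain from randomly undersampled Fourier measurements by iterative soft-thresholding; the sampling rate, threshold lam, and iteration count are arbitrary choices.

```python
import numpy as np

def ista_fourier_cs(y, mask, lam=0.05, n_iter=200):
    """Recover a signal that is sparse in the sample domain from randomly
    undersampled Fourier measurements via iterative soft-thresholding.
    y    : measured Fourier coefficients (zero where not sampled)
    mask : boolean array, True where k-space was actually sampled"""
    x = np.zeros(y.size, dtype=complex)
    for _ in range(n_iter):
        # gradient step on 0.5*||mask*(F x) - y||^2 (unitary FFT, so step size 1)
        residual = mask * np.fft.fft(x, norm='ortho') - y
        x = x - np.fft.ifft(mask * residual, norm='ortho')
        # proximal step: complex soft-thresholding enforces sparsity
        mag = np.abs(x)
        x = np.where(mag > lam, (1 - lam / np.maximum(mag, 1e-12)) * x, 0)
    return x

# toy usage: a few spikes, 25% random k-space samples
rng = np.random.default_rng(0)
x_true = np.zeros(256)
x_true[rng.choice(256, 8, replace=False)] = rng.normal(size=8)
mask = rng.random(256) < 0.25
y = mask * np.fft.fft(x_true, norm='ortho')
x_hat = ista_fourier_cs(y, mask)
```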

6,653 citations


Journal ArticleDOI
TL;DR: This paper considers the optimization of compressed sensing projections, and targets an average measure of the mutual coherence of the effective dictionary, and shows that this leads to better CS reconstruction performance.
Abstract: Compressed sensing (CS) offers a joint compression and sensing process, based on the existence of a sparse representation of the treated signal and a set of projected measurements. Work on CS thus far typically assumes that the projections are drawn at random. In this paper, we consider the optimization of these projections. Since such a direct optimization is prohibitive, we target an average measure of the mutual coherence of the effective dictionary, and demonstrate that this leads to better CS reconstruction performance. Both the basis pursuit (BP) and the orthogonal matching pursuit (OMP) are shown to benefit from the newly designed projections, with a reduction of the error rate by a factor of 10 and beyond.
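For concreteness, here is a small numpy sketch of the quantity being targeted: the mutual coherence of the effective dictionary D = P·Psi, together with a thresholded "average" coherence. The threshold t and the averaging rule here are illustrative assumptions; the paper defines its own t-averaged measure and then optimizes the projection matrix P against it.

```python
import numpy as np

def mutual_coherence(P, Psi, t=0.2):
    """Mutual coherence of the effective dictionary D = P @ Psi.
    Returns the worst-case coherence and a t-averaged coherence
    (mean of off-diagonal Gram entries whose magnitude exceeds t)."""
    D = P @ Psi
    D = D / np.linalg.norm(D, axis=0, keepdims=True)   # normalize columns
    G = np.abs(D.T @ D)                                 # Gram matrix magnitudes
    off = G[~np.eye(G.shape[0], dtype=bool)]            # off-diagonal entries
    mu_max = off.max()
    mu_avg = off[off >= t].mean() if np.any(off >= t) else 0.0
    return mu_max, mu_avg

# toy usage: random projection of a 64x128 overcomplete dictionary
rng = np.random.default_rng(0)
Psi = rng.normal(size=(64, 128))
P = rng.normal(size=(24, 64))
print(mutual_coherence(P, Psi))
```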

834 citations


Journal ArticleDOI
TL;DR: The experimental results illustrate that the proposed sparse NMF algorithm often achieves better clustering performance with shorter computing time compared to other existing NMF algorithms.
Abstract: Motivation: Many practical pattern recognition problems require non-negativity constraints. For example, pixels in digital images and chemical concentrations in bioinformatics are non-negative. Sparse non-negative matrix factorizations (NMFs) are useful when the degree of sparseness in the non-negative basis matrix or the non-negative coefficient matrix in an NMF needs to be controlled in approximating high-dimensional data in a lower dimensional space. Results: In this article, we introduce a novel formulation of sparse NMF and show how the new formulation leads to a convergent sparse NMF algorithm via alternating non-negativity-constrained least squares. We apply our sparse NMF algorithm to cancer-class discovery and gene expression data analysis and offer biological analysis of the results obtained. Our experimental results illustrate that the proposed sparse NMF algorithm often achieves better clustering performance with shorter computing time compared to other existing NMF algorithms. Availability: The software is available as supplementary material. Contact:hskim@cc.gatech.edu, hpark@acc.gatech.edu Supplementary information: Supplementary data are available at Bioinformatics online.
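A rough sketch of the alternating non-negativity-constrained least-squares idea, with an l1-type penalty on the columns of H to control sparseness. It calls scipy's single-right-hand-side NNLS solver column by column, whereas the paper uses a much faster solver for multiple right-hand sides; the penalty weights beta and eta, the initialization, and the fixed iteration count are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import nnls

def sparse_nmf(A, k, beta=0.1, eta=0.1, n_iter=100, seed=0):
    """Sketch of sparse NMF via alternating non-negativity-constrained least squares.
    Sparsity of the columns of H is encouraged by an l1-type penalty (weight beta);
    eta keeps W from growing to compensate."""
    m, n = A.shape
    rng = np.random.default_rng(seed)
    W = rng.random((m, k))
    H = rng.random((k, n))

    def nnls_columns(M, B):
        # solve min ||M X - B||_F with X >= 0, one right-hand side at a time
        return np.column_stack([nnls(M, B[:, j])[0] for j in range(B.shape[1])])

    for _ in range(n_iter):
        # update H: a sqrt(beta) row of ones penalizes (sum_i H[i, j])^2 = ||H[:, j]||_1^2
        M = np.vstack([W, np.sqrt(beta) * np.ones((1, k))])
        B = np.vstack([A, np.zeros((1, n))])
        H = nnls_columns(M, B)
        # update W: a sqrt(eta) * I block penalizes ||W||_F^2
        M = np.vstack([H.T, np.sqrt(eta) * np.eye(k)])
        B = np.vstack([A.T, np.zeros((k, m))])
        W = nnls_columns(M, B).T
    return W, H
```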

813 citations


Journal ArticleDOI
TL;DR: Based on the concept of automatic relevance determination, this paper uses an empirical Bayesian prior to estimate a convenient posterior distribution over candidate basis vectors and consistently places its prominent posterior mass on the appropriate region of weight-space necessary for simultaneous sparse recovery.
Abstract: Given a large overcomplete dictionary of basis vectors, the goal is to simultaneously represent L>1 signal vectors using coefficient expansions marked by a common sparsity profile. This generalizes the standard sparse representation problem to the case where multiple responses exist that were putatively generated by the same small subset of features. Ideally, the associated sparse generating weights should be recovered, which can have physical significance in many applications (e.g., source localization). The generic solution to this problem is intractable and, therefore, approximate procedures are sought. Based on the concept of automatic relevance determination, this paper uses an empirical Bayesian prior to estimate a convenient posterior distribution over candidate basis vectors. This particular approximation enforces a common sparsity profile and consistently places its prominent posterior mass on the appropriate region of weight-space necessary for simultaneous sparse recovery. The resultant algorithm is then compared with multiple response extensions of matching pursuit, basis pursuit, FOCUSS, and Jeffreys prior-based Bayesian methods, finding that it often outperforms the others. Additional motivation for this particular choice of cost function is also provided, including the analysis of global and local minima and a variational derivation that highlights the similarities and differences between the proposed algorithm and previous approaches.
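A schematic EM-style loop in the spirit of the approach, not the exact update rule analyzed in the paper: one relevance hyperparameter gamma_i is shared by row i of the coefficient matrix across all L responses, which is what enforces the common sparsity profile. The noise level lam, iteration count, and pruning threshold are assumptions.

```python
import numpy as np

def msbl(Phi, Y, lam=1e-2, n_iter=100, prune=1e-6):
    """Sketch of an EM-style multiple-response sparse Bayesian learning loop.
    Phi : (N, M) dictionary, Y : (N, L) response matrix."""
    N, M = Phi.shape
    gamma = np.ones(M)                                       # relevance hyperparameters
    for _ in range(n_iter):
        G_PhiT = gamma[:, None] * Phi.T                      # Gamma @ Phi.T
        Sigma_y = lam * np.eye(N) + Phi @ G_PhiT             # (N, N) marginal covariance
        Mu = G_PhiT @ np.linalg.solve(Sigma_y, Y)            # posterior mean, (M, L)
        # diagonal of posterior covariance: gamma_i - gamma_i^2 * phi_i^T Sigma_y^{-1} phi_i
        S = np.linalg.solve(Sigma_y, Phi)
        diag_post = gamma - gamma ** 2 * np.einsum('nm,nm->m', Phi, S)
        # EM update of the shared relevance hyperparameters
        gamma = np.mean(Mu ** 2, axis=1) + diag_post
        gamma[gamma < prune] = 0.0                           # prune irrelevant basis vectors
    return Mu, gamma
```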

796 citations


Journal ArticleDOI
TL;DR: In this paper, the authors consider the problem of maximizing the variance explained by a particular linear combination of the input variables while constraining the number of nonzero coefficients in this combination.
Abstract: Given a covariance matrix, we consider the problem of maximizing the variance explained by a particular linear combination of the input variables while constraining the number of nonzero coefficients in this combination. This problem arises in the decomposition of a covariance matrix into sparse factors or sparse principal component analysis (PCA), and has wide applications ranging from biology to finance. We use a modification of the classical variational representation of the largest eigenvalue of a symmetric matrix, where cardinality is constrained, and derive a semidefinite programming-based relaxation for our problem. We also discuss Nesterov's smooth minimization technique applied to the semidefinite program arising in the semidefinite relaxation of the sparse PCA problem. The method has complexity $O(n^4 \sqrt{\log(n)}/\epsilon)$, where $n$ is the size of the underlying covariance matrix and $\epsilon$ is the desired absolute accuracy on the optimal value of the problem.
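A compact sketch of the semidefinite relaxation, handed to a generic conic solver through cvxpy rather than solved with Nesterov's smooth scheme as in the paper; it assumes an SDP-capable solver (e.g., SCS) is installed, and the entrywise l1 bound k serves as the convex surrogate for the cardinality constraint.

```python
import cvxpy as cp
import numpy as np

def sparse_pca_sdp(Sigma, k):
    """Semidefinite relaxation of sparse PCA: maximize explained variance
    Tr(Sigma X) over PSD matrices X with unit trace, with the entrywise
    l1 norm of X bounding the cardinality surrogate."""
    n = Sigma.shape[0]
    X = cp.Variable((n, n), PSD=True)
    objective = cp.Maximize(cp.trace(Sigma @ X))
    constraints = [cp.trace(X) == 1, cp.sum(cp.abs(X)) <= k]
    cp.Problem(objective, constraints).solve()
    # leading eigenvector of the relaxed solution gives an approximate sparse loading
    w, V = np.linalg.eigh(X.value)
    return V[:, -1]

# toy usage on a random covariance matrix
rng = np.random.default_rng(0)
A = rng.normal(size=(50, 10))
Sigma = np.cov(A, rowvar=False)
loading = sparse_pca_sdp(Sigma, k=4)
```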

699 citations


Journal ArticleDOI
TL;DR: It is shown that shearlets, an affine-like system of functions recently introduced by the authors and their collaborators, are essentially optimal in representing 2-dimensional functions f which are $C^2$ except for discontinuities along $C^2$ curves.
Abstract: In this paper we show that shearlets, an affine-like system of functions recently introduced by the authors and their collaborators, are essentially optimal in representing 2-dimensional functions f which are $C^2$ except for discontinuities along $C^2$ curves. More specifically, if $f_N^S$ is the N-term reconstruction of f obtained by using the N largest coefficients in the shearlet representation, then the asymptotic approximation error decays as $\|f-f_N^S\|_2^2 \asymp N^{-2} (\log N)^3$ as $N \to \infty$, which is essentially optimal, and greatly outperforms the corresponding asymptotic approximation rate $N^{-1}$ associated with wavelet approximations. Unlike curvelets, which have similar sparsity properties, shearlets form an affine-like system and have a simpler mathematical structure. In fact, the elements of this system form a Parseval frame and are generated by applying dilations, shear transformations, and translations to a single well-localized window function.

698 citations


Journal ArticleDOI
TL;DR: This paper presents a new k-means type algorithm for clustering high-dimensional objects in sub-spaces that can generate better clustering results than other subspace clustering algorithms and is also scalable to large data sets.
Abstract: This paper presents a new k-means type algorithm for clustering high-dimensional objects in sub-spaces. In high-dimensional data, clusters of objects often exist in subspaces rather than in the entire space. For example, in text clustering, clusters of documents of different topics are categorized by different subsets of terms or keywords. The keywords for one cluster may not occur in the documents of other clusters. This is a data sparsity problem faced in clustering high-dimensional data. In the new algorithm, we extend the k-means clustering process to calculate a weight for each dimension in each cluster and use the weight values to identify the subsets of important dimensions that categorize different clusters. This is achieved by including the weight entropy in the objective function that is minimized in the k-means clustering process. An additional step is added to the k-means clustering process to automatically compute the weights of all dimensions in each cluster. The experiments on both synthetic and real data have shown that the new algorithm can generate better clustering results than other subspace clustering algorithms. The new algorithm is also scalable to large data sets.
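A short numpy sketch of the idea: alongside the usual assignment and center updates, each cluster keeps a weight per dimension, and the closed-form weight update that falls out of the entropy-regularized objective gives larger weight to dimensions with small within-cluster scatter. The entropy parameter gamma, the initialization, and the fixed iteration count are illustrative assumptions.

```python
import numpy as np

def entropy_weighted_kmeans(X, k, gamma=1.0, n_iter=50, seed=0):
    """Sketch of a k-means variant that learns a per-cluster weight for every
    dimension; the entropy term (controlled by gamma) keeps the weights from
    collapsing onto a single dimension."""
    n, d = X.shape
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(n, k, replace=False)].astype(float)
    weights = np.full((k, d), 1.0 / d)
    for _ in range(n_iter):
        # assignment step: weighted squared distance to each cluster center
        dist = np.array([((X - centers[l]) ** 2 * weights[l]).sum(axis=1) for l in range(k)])
        labels = dist.argmin(axis=0)
        for l in range(k):
            members = X[labels == l]
            if len(members) == 0:
                continue
            # center update: per-dimension mean of the cluster members
            centers[l] = members.mean(axis=0)
            # weight update: small within-cluster scatter -> large weight
            D = ((members - centers[l]) ** 2).sum(axis=0)
            e = np.exp(-(D - D.min()) / gamma)      # shifted for numerical stability
            weights[l] = e / e.sum()
    return labels, centers, weights
```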

591 citations


Proceedings ArticleDOI
03 Sep 2007
TL;DR: An effective video denoising method based on highly sparse signal representation in local 3D transform domain that achieves state-of-the-art denoising performance in terms of both peak signal-to-noise ratio and subjective visual quality is proposed.
Abstract: We propose an effective video denoising method based on highly sparse signal representation in local 3D transform domain. A noisy video is processed in a blockwise manner, and for each processed block we form a 3D data array that we call a “group” by stacking together blocks found similar to the currently processed one. This grouping is realized as a spatio-temporal predictive-search block-matching, similar to techniques used for motion estimation. Each formed 3D group is filtered by a 3D transform-domain shrinkage (hard-thresholding and Wiener filtering), the results of which are estimates of all grouped blocks. This filtering — that we term “collaborative filtering” — exploits the correlation between grouped blocks and the corresponding highly sparse representation of the true signal in the transform domain. Since, in general, the obtained block estimates are mutually overlapping, we aggregate them by a weighted average in order to form a non-redundant estimate of the video. Significant improvement of this approach is achieved by using a two-step algorithm where an intermediate estimate is produced by grouping and collaborative hard-thresholding and then used both for improving the grouping and for applying collaborative empirical Wiener filtering. We develop an efficient realization of this video denoising algorithm. The experimental results show that at reasonable computational cost it achieves state-of-the-art denoising performance in terms of both peak signal-to-noise ratio and subjective visual quality.

496 citations


Proceedings ArticleDOI
12 Nov 2007
TL;DR: The results demonstrate the effectiveness of the proposed grouping constraint and show that the developed denoising algorithm achieves state-of-the-art performance in terms of both peak signal-to-noise ratio and visual quality.
Abstract: We propose an effective color image denoising method that exploits filtering in highly sparse local 3D transform domain in each channel of a luminance-chrominance color space. For each image block in each channel, a 3D array is formed by stacking together blocks similar to it, a process that we call "grouping". The high similarity between grouped blocks in each 3D array enables a highly sparse representation of the true signal in a 3D transform domain and thus a subsequent shrinkage of the transform spectra results in effective noise attenuation. The peculiarity of the proposed method is the application of a "grouping constraint" on the chrominances by reusing exactly the same grouping as for the luminance. The results demonstrate the effectiveness of the proposed grouping constraint and show that the developed denoising algorithm achieves state-of-the-art performance in terms of both peak signal-to-noise ratio and visual quality.

464 citations


Proceedings Article
03 Dec 2007
TL;DR: This paper furnishes an alternative means of expressing the ARD cost function using auxiliary functions that naturally addresses both of these issues and suggest alternative cost functions and update procedures for selecting features and promoting sparse solutions in a variety of general situations.
Abstract: Automatic relevance determination (ARD) and the closely-related sparse Bayesian learning (SBL) framework are effective tools for pruning large numbers of irrelevant features leading to a sparse explanatory subset. However, popular update rules used for ARD are either difficult to extend to more general problems of interest or are characterized by non-ideal convergence properties. Moreover, it remains unclear exactly how ARD relates to more traditional MAP estimation-based methods for learning sparse representations (e.g., the Lasso). This paper furnishes an alternative means of expressing the ARD cost function using auxiliary functions that naturally addresses both of these issues. First, the proposed reformulation of ARD can naturally be optimized by solving a series of re-weighted l1 problems. The result is an efficient, extensible algorithm that can be implemented using standard convex programming toolboxes and is guaranteed to converge to a local minimum (or saddle point). Secondly, the analysis reveals that ARD is exactly equivalent to performing standard MAP estimation in weight space using a particular feature- and noise-dependent, non-factorial weight prior. We then demonstrate that this implicit prior maintains several desirable advantages over conventional priors with respect to feature selection. Overall these results suggest alternative cost functions and update procedures for selecting features and promoting sparse solutions in a variety of general situations. In particular, the methodology readily extends to handle problems such as non-negative sparse coding and covariance component estimation.
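A generic reweighted-l1 loop in the spirit described above. Note this is only a sketch: the paper derives a specific, covariance-dependent weight update for ARD, which is replaced here by the simpler 1/(|x_i| + eps) reweighting for illustration, and each weighted lasso is solved by rescaling dictionary columns and calling scikit-learn's Lasso (alpha is rescaled to match an unnormalized least-squares objective).

```python
import numpy as np
from sklearn.linear_model import Lasso

def reweighted_l1(A, y, lam=0.05, n_outer=5, eps=1e-3):
    """Generic reweighted-l1 loop: repeatedly solve a weighted lasso, with each
    weight shrinking toward coefficients that were large in the previous pass.
    The weighted problem is solved by rescaling the dictionary columns."""
    n, m = A.shape
    w = np.ones(m)
    x = np.zeros(m)
    for _ in range(n_outer):
        # min ||y - A x||^2 + lam * sum_i w_i |x_i|  via substitution z_i = w_i * x_i
        A_scaled = A / w
        model = Lasso(alpha=lam / (2 * n), fit_intercept=False, max_iter=10000)
        model.fit(A_scaled, y)
        x = model.coef_ / w                   # undo the column scaling
        w = 1.0 / (np.abs(x) + eps)           # illustrative reweighting rule
    return x
```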

Journal ArticleDOI
TL;DR: This work proposes a versatile convex variational formulation for optimization over orthonormal bases that covers a wide range of problems, and establishes the strong convergence of a proximal thresholding algorithm to solve it.
Abstract: The notion of soft thresholding plays a central role in problems from various areas of applied mathematics, in which the ideal solution is known to possess a sparse decomposition in some orthonormal basis. Using convex-analytical tools, we extend this notion to that of proximal thresholding and investigate its properties, providing, in particular, several characterizations of such thresholders. We then propose a versatile convex variational formulation for optimization over orthonormal bases that covers a wide range of problems, and we establish the strong convergence of a proximal thresholding algorithm to solve it. Numerical applications to signal recovery are demonstrated.

Proceedings ArticleDOI
24 Sep 2007
TL;DR: A compressive sensing scheme with deterministic performance guarantees using expander-graphs-based measurement matrices is proposed and it is shown that the signal recovery can be achieved with complexity O(n) even if the number of nonzero elements k grows linearly with n.
Abstract: Compressive sensing is an emerging technology which can recover a sparse signal vector of dimension n via a much smaller number of measurements than n. However, the existing compressive sensing methods may still suffer from relatively high recovery complexity, such as O(n^3), or can only work efficiently when the signal is super sparse, sometimes without deterministic performance guarantees. In this paper, we propose a compressive sensing scheme with deterministic performance guarantees using expander-graphs-based measurement matrices and show that the signal recovery can be achieved with complexity O(n) even if the number of nonzero elements k grows linearly with n. We also investigate compressive sensing for approximately sparse signals using this new method. Moreover, explicit constructions of the considered expander graphs exist. Simulation results are given to show the performance and complexity of the new method.

Journal ArticleDOI
TL;DR: A large class of admissible sparseness measures is introduced, and sufficient conditions are given for having a unique sparse representation of a signal from the dictionary w.r.t. such a sparseness measure.

Proceedings ArticleDOI
28 Oct 2007
TL;DR: This paper proposes a novel dimensionality reduction framework, called Unified Sparse Subspace Learning (USSL), for learning sparse projections; it casts the problem of learning the projective functions into a regression framework, which facilitates the use of different kinds of regularizers.
Abstract: Recently the problem of dimensionality reduction (or, subspace learning) has received a lot of interest in many fields of information processing, including data mining, information retrieval, and pattern recognition. Some popular methods include principal component analysis (PCA), linear discriminant analysis (LDA) and locality preserving projection (LPP). However, a disadvantage of all these approaches is that the learned projective functions are linear combinations of all the original features, thus it is often difficult to interpret the results. In this paper, we propose a novel dimensionality reduction framework, called Unified Sparse Subspace Learning (USSL), for learning sparse projections. USSL casts the problem of learning the projective functions into a regression framework, which facilitates the use of different kinds of regularizers. By using an L1-norm regularizer (lasso), the sparse projections can be efficiently computed. Experimental results on real world classification and clustering problems demonstrate the effectiveness of our method.

Journal ArticleDOI
TL;DR: The paper describes a method for controlling the population of the information matrix, whereby the Exactly Sparse Extended Information Filter (ESEIF) performs inference over a model that is conservative relative to the standard Gaussian distribution.
Abstract: Recent research concerning the Gaussian canonical form for Simultaneous Localization and Mapping (SLAM) has given rise to a handful of algorithms that attempt to solve the SLAM scalability problem for arbitrarily large environments. One such estimator that has received due attention is the Sparse Extended Information Filter (SEIF) proposed by Thrun et al., which is reported to be nearly constant time, irrespective of the size of the map. The key to the SEIF's scalability is to prune weak links in what is a dense information (inverse covariance) matrix to achieve a sparse approximation that allows for efficient, scalable SLAM. We demonstrate that the SEIF sparsification strategy yields error estimates that are overconfident when expressed in the global reference frame, while empirical results show that relative map consistency is maintained. In this paper, we propose an alternative scalable estimator based on an information form that maintains sparsity while preserving consistency. The paper describes a method for controlling the population of the information matrix, whereby we track a modified version of the SLAM posterior, essentially by ignoring a small fraction of temporal measurements. In this manner, the Exactly Sparse Extended Information Filter (ESEIF) performs inference over a model that is conservative relative to the standard Gaussian distribution. We compare our algorithm to the SEIF and standard EKF both in simulation as well as on two nonlinear datasets. The results convincingly show that our method yields conservative estimates for the robot pose and map that are nearly identical to those of the EKF.

01 Jan 2007
TL;DR: If sparsity in the recognition problem is properly harnessed, the choice of features is no longer critical and the differences in performance between different features become insignificant as the feature-space dimension is sufficiently large.
Abstract: In this paper, we examine the role of feature selection in face recognition from the perspective of sparse representation. We cast the recognition problem as finding a sparse representation of the test image features w.r.t. the training set. The sparse representation can be accurately and efficiently computed by l1-minimization. The proposed simple algorithm generalizes conventional face recognition classifiers such as nearest neighbors and nearest subspaces. Using face recognition under varying illumination and expression as an example, we show that if sparsity in the recognition problem is properly harnessed, the choice of features is no longer critical. What is critical, however, is whether the number of features is sufficient and whether the sparse representation is correctly found. We conduct extensive experiments to validate the significance of imposing sparsity using the Extended Yale B database and the AR database. Our thorough evaluation shows that, using conventional features such as Eigenfaces and facial parts, the proposed algorithm achieves much higher recognition accuracy on face images with variation in either illumination or expression. Furthermore, other unconventional features such as severely downsampled images and randomly projected features perform almost equally well with the increase of feature dimensions. The differences in performance between different features become insignificant as the feature-space dimension is sufficiently large.
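A toy scikit-learn sketch of classification by sparse representation: the test sample is expressed as a sparse combination of all training samples and assigned to the class whose samples best reconstruct it. A lasso solve stands in for the equality-constrained l1-minimization used in the paper, and the regularization strength alpha is an arbitrary choice.

```python
import numpy as np
from sklearn.linear_model import Lasso

def src_classify(A, labels, y, alpha=1e-3):
    """Sketch of sparse-representation classification.
    A      : (d, m) matrix whose columns are training samples (e.g., face features)
    labels : (m,) class label of each training column
    y      : (d,) test sample"""
    A = A / np.linalg.norm(A, axis=0, keepdims=True)     # unit-norm training columns
    y = y / np.linalg.norm(y)
    model = Lasso(alpha=alpha, fit_intercept=False, max_iter=10000)
    model.fit(A, y)                                      # sparse coefficients over all samples
    x = model.coef_
    classes = np.unique(labels)
    residuals = []
    for c in classes:
        xc = np.where(labels == c, x, 0.0)               # keep only coefficients of class c
        residuals.append(np.linalg.norm(y - A @ xc))     # class-wise reconstruction residual
    return classes[int(np.argmin(residuals))]
```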

Journal ArticleDOI
TL;DR: Two fast sparse approximation schemes for least squares support vector machine (LS-SVM) are presented to overcome the limitation of LS-S VM that it is not applicable to large data sets and to improve test speed.
Abstract: In this paper, we present two fast sparse approximation schemes for least squares support vector machine (LS-SVM), named FSALS-SVM and PFSALS-SVM, to overcome the limitation of LS-SVM that it is not applicable to large data sets and to improve test speed. FSALS-SVM iteratively builds the decision function by adding one basis function from a kernel-based dictionary at a time. The process is terminated by using a flexible and stable epsilon insensitive stopping criterion. A probabilistic speedup scheme is employed to further improve the speed of FSALS-SVM and the resulting classifier is named PFSALS-SVM. Our algorithms have two compelling features: low complexity and sparse solutions. Experiments on benchmark data sets show that our algorithms obtain sparse classifiers at a rather low cost without sacrificing the generalization performance.

Journal ArticleDOI
TL;DR: This study uses performance profiles as a tool for evaluating and comparing the performance of serial sparse direct solvers on an extensive set of symmetric test problems taken from a range of practical applications.
Abstract: In recent years a number of solvers for the direct solution of large sparse symmetric linear systems of equations have been developed. These include solvers that are designed for the solution of positive definite systems as well as those that are principally intended for solving indefinite problems. In this study, we use performance profiles as a tool for evaluating and comparing the performance of serial sparse direct solvers on an extensive set of symmetric test problems taken from a range of practical applications.
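For readers unfamiliar with the evaluation tool, a performance profile reports, for each solver, the fraction of test problems it solves within a factor tau of the fastest solver on that problem. A minimal numpy sketch follows; the timing data are synthetic and purely for illustration.

```python
import numpy as np

def performance_profile(times, taus):
    """Performance profile in the Dolan-More style.
    times : (n_problems, n_solvers) array of run times (np.inf for failures)
    taus  : ratios at which to evaluate the profile
    Returns (len(taus), n_solvers): for each solver, the fraction of problems
    solved within a factor tau of the best solver on that problem."""
    best = np.min(times, axis=1, keepdims=True)        # fastest time per problem
    ratios = times / best                              # performance ratio r_{p,s}
    return np.array([np.mean(ratios <= tau, axis=0) for tau in taus])

# toy usage: 3 solvers on 100 synthetic problems, solver 2 fails on a few
rng = np.random.default_rng(0)
times = rng.lognormal(mean=0.0, sigma=0.5, size=(100, 3))
times[rng.choice(100, 5, replace=False), 2] = np.inf
profile = performance_profile(times, taus=np.linspace(1, 10, 50))
```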

Journal ArticleDOI
TL;DR: This work presents a family of iterative least squares based dictionary learning algorithms (ILS-DLA), including algorithms for design of signal dependent block based dictionaries and overlapping dictionaries, as generalizations of transforms and filter banks, respectively.

Journal ArticleDOI
TL;DR: Two subspace-based algorithms for TF-nondisjoint sources are proposed: one uses quadratic TFDs and the other a linear TFD, and the numerical performance of the proposed methods is provided, highlighting their performance gain compared to existing ones.
Abstract: This paper considers the blind separation of nonstationary sources in the underdetermined case, when there are more sources than sensors. A general framework for this problem is to work on sources that are sparse in some signal representation domain. Recently, two methods have been proposed with respect to the time-frequency (TF) domain. The first uses quadratic time-frequency distributions (TFDs) and a clustering approach, and the second uses a linear TFD. Both of these methods assume that the sources are disjoint in the TF domain; i.e., there is, at most, one source present at a point in the TF domain. In this paper, we relax this assumption by allowing the sources to be TF-nondisjoint to a certain extent. In particular, the number of sources present at a point is strictly less than the number of sensors. The separation can still be achieved due to subspace projection that allows us to identify the sources present and to estimate their corresponding TFD values. In particular, we propose two subspace-based algorithms for TF-nondisjoint sources: one uses quadratic TFDs and the other a linear TFD. Another contribution of this paper is a new estimation procedure for the mixing matrix. Finally, the numerical performance of the proposed methods is presented, highlighting their performance gain compared to existing ones.

Book ChapterDOI
09 Sep 2007
TL;DR: Four variants of the nonparametric Bayesian extension of Independent Components Analysis are described, with Gaussian or Laplacian priors on X and the one or two-parameter IBPs, and Bayesian inference under these models is demonstrated using a Markov Chain Monte Carlo algorithm.
Abstract: A nonparametric Bayesian extension of Independent Components Analysis (ICA) is proposed where observed data Y is modelled as a linear superposition, G, of a potentially infinite number of hidden sources, X. Whether a given source is active for a specific data point is specified by an infinite binary matrix, Z. The resulting sparse representation allows increased data reduction compared to standard ICA. We define a prior on Z using the Indian Buffet Process (IBP). We describe four variants of the model, with Gaussian or Laplacian priors on X and the one or two-parameter IBPs. We demonstrate Bayesian inference under these models using a Markov Chain Monte Carlo (MCMC) algorithm on synthetic and gene expression data and compare to standard ICA algorithms.

Journal ArticleDOI
TL;DR: The sparse grid methodology is applied and demonstrated to work efficiently for up to 10 proteins, and error bounds are provided which confirm the effectiveness of sparse grid approximations for smooth high-dimensional probability distributions.

Dissertation
31 Jul 2007
TL;DR: Several new techniques are developed to reduce the O(N^3) training complexity of Gaussian process models and to relax the Gaussianity assumption of the process by learning a nonlinear transformation of the output space.
Abstract: Gaussian process (GP) models are widely used to perform Bayesian nonlinear regression and classification tasks that are central to many machine learning problems. A GP is nonparametric, meaning that the complexity of the model grows as more data points are received. Another attractive feature is the behaviour of the error bars. They naturally grow in regions away from training data where we have high uncertainty about the interpolating function. In their standard form GPs have several limitations, which can be divided into two broad categories: computational difficulties for large data sets, and restrictive modelling assumptions for complex data sets. This thesis addresses various aspects of both of these problems. The training cost for a GP has O(N^3) complexity, where N is the number of training data points. This is due to an inversion of the N × N covariance matrix. In this thesis we develop several new techniques to reduce this complexity to O(NM^2), where M is a user chosen number much smaller than N. The sparse approximation we use is based on a set of M 'pseudo-inputs' which are optimised together with hyperparameters at training time. We develop a further approximation based on clustering inputs that can be seen as a mixture of local and global approximations. Standard GPs assume a uniform noise variance. We use our sparse approximation described above as a way of relaxing this assumption. By making a modification of the sparse covariance function, we can model input dependent noise. To handle high dimensional data sets we use supervised linear dimensionality reduction. As another extension of the standard GP, we relax the Gaussianity assumption of the process by learning a nonlinear transformation of the output space. All these techniques further increase the applicability of GPs to real complex data sets. We present empirical comparisons of our algorithms with various competing techniques, and suggest problem dependent strategies to follow in practice.
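To illustrate where the O(NM^2) cost comes from, here is a bare-bones sparse GP regression sketch in which only an M × M system is ever solved. It uses M randomly chosen inducing inputs and a subset-of-regressors style predictive mean, whereas the thesis optimises the pseudo-input locations jointly with the hyperparameters and uses a richer approximation; the kernel, noise level, and M are arbitrary choices.

```python
import numpy as np

def rbf(X1, X2, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel matrix."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / lengthscale ** 2)

def sparse_gp_predict(X, y, Xstar, M=20, noise=0.1, seed=0):
    """O(N M^2) sparse GP regression sketch using M randomly selected
    inducing inputs (subset-of-regressors style predictive mean)."""
    rng = np.random.default_rng(seed)
    Z = X[rng.choice(len(X), M, replace=False)]     # inducing inputs
    Kmm = rbf(Z, Z) + 1e-8 * np.eye(M)              # jitter for stability
    Kmn = rbf(Z, X)
    Ksm = rbf(Xstar, Z)
    A = Kmm + Kmn @ Kmn.T / noise ** 2              # only an M x M system is factorized
    return Ksm @ np.linalg.solve(A, Kmn @ y) / noise ** 2

# toy usage
rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(500, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=500)
Xstar = np.linspace(-3, 3, 100)[:, None]
mu = sparse_gp_predict(X, y, Xstar)
```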

Book ChapterDOI
09 Sep 2007
TL;DR: It is experimentally shown that the proposed algorithm for Sparse Component Analysis (SCA), or atomic decomposition on over-complete dictionaries, is about two orders of magnitude faster than the state-of-the-art l1-magic, while providing the same (or better) accuracy.
Abstract: In this paper, a new algorithm for Sparse Component Analysis (SCA) or atomic decomposition on over-complete dictionaries is presented. The algorithm is essentially a method for obtaining sufficiently sparse solutions of underdetermined systems of linear equations. The solution obtained by the proposed algorithm is compared with the minimum l1-norm solution achieved by Linear Programming (LP). It is experimentally shown that the proposed algorithm is about two orders of magnitude faster than the state-of-the-art l1-magic, while providing the same (or better) accuracy.

01 Jan 2007
TL;DR: In this article, decay bounds for the entries of f(A), where A is a sparse (in particular, banded) n × n diagonalizable matrix and f is smooth on a subset of the complex plane containing the spectrum of A, are established.
Abstract: We establish decay bounds for the entries of f(A), where A is a sparse (in particular, banded) n × n diagonalizable matrix and f is smooth on a subset of the complex plane containing the spectrum of A. Combined with techniques from approximation theory, the bounds are used to compute sparse (or banded) approximations to f(A), resulting in algorithms that under appropriate conditions have linear complexity in the matrix dimension. Applications to various types of problems are discussed and illustrated by numerical examples.

Journal ArticleDOI
TL;DR: This paper deals with the l1-norm penalization of complex-valued variables, which brings satisfactory prior modeling for the estimation of spectral lines, and proposes an efficient optimization strategy.
Abstract: We address the problem of estimating spectral lines from irregularly sampled data within the framework of sparse representations. Spectral analysis is formulated as a linear inverse problem, which is solved by minimizing an l1-norm penalized cost function. This approach can be viewed as a basis pursuit de-noising (BPDN) problem using a dictionary of cisoids with high frequency resolution. In the studied case, however, usual BPDN characterizations of uniqueness and sparsity do not apply. This paper deals with the l1-norm penalization of complex-valued variables, that brings satisfactory prior modeling for the estimation of spectral lines. An analytical characterization of the minimizer of the criterion is given and geometrical properties are derived about the uniqueness and the sparsity of the solution. An efficient optimization strategy is proposed. Convergence properties of the iterative coordinate descent (ICD) and iterative reweighted least-squares (IRLS) algorithms are first examined. Then, both strategies are merged in a convergent procedure, that takes advantage of the specificities of ICD and IRLS, considerably improving the convergence speed. The computation of the resulting spectrum estimator can be implemented efficiently for any sampling scheme. Algorithm performance and estimation quality are illustrated throughout the paper using an artificial data set, typical of some astrophysical problems, where sampling irregularities are caused by day/night alternation. We show that accurate frequency location is achieved with high resolution. In particular, compared with sequential Matching Pursuit methods, the proposed approach is shown to achieve more robustness regarding sampling artifacts.

Journal ArticleDOI
TL;DR: A novel multiclass support vector machine, which performs classification and variable selection simultaneously through an L1-norm penalized sparse representation, and is compared against some competitors in terms of accuracy of prediction.
Abstract: Binary support vector machines (SVMs) have been proven to deliver high performance. In multiclass classification, however, issues remain with respect to variable selection. One challenging issue is classification and variable selection in the presence of variables in the magnitude of thousands, greatly exceeding the size of training sample. This often occurs in genomics classification. To meet the challenge, this article proposes a novel multiclass support vector machine, which performs classification and variable selection simultaneously through an L1-norm penalized sparse representation. The proposed methodology, together with the developed regularization solution path, permits variable selection in such a situation. For the proposed methodology, a statistical learning theory is developed to quantify the generalization error in an attempt to gain insight into the basic structure of sparse learning, permitting the number of variables to greatly exceed the sample size. The operating characteristics of the...

01 Jan 2007
TL;DR: A novel unsupervised method for learning sparse, overcomplete features using a linear encoder, and a linear decoder preceded by a sparsifying non-linearity that turns a code vector into a quasi-binary sparse code vector.
Abstract: We describe a novel unsupervised method for learning sparse, overcomplete features. The model uses a linear encoder, and a linear decoder preceded by a sparsifying non-linearity that turns a code vector into a quasi-binary sparse code vector. Given an input, the optimal code minimizes the distance between the output of the decoder and the input patch while being as similar as possible to the encoder output. Learning proceeds in a two-phase EM-like fashion: (1) compute the minimum-energy code vector, (2) adjust the parameters of the encoder and decoder so as to decrease the energy. The model produces “stroke detectors” when trained on handwritten numerals, and Gabor-like filters when trained on natural image patches. Inference and learning are very fast, requiring no preprocessing, and no expensive sampling. Using the proposed unsupervised method to initialize the first layer of a convolutional network, we achieved an error rate slightly lower than the best reported result on the MNIST dataset. Finally, an extension of the method is described to learn topographical filter maps.

Journal ArticleDOI
TL;DR: A novel iterative EEG source imaging algorithm, Lp norm iterative sparse solution (LPISS), which was applied to a real evoked potential collected in a study of inhibition of return (IOR), and the result was consistent with the previously suggested activated areas involved in an IOR process.
Abstract: How to localize the neural electric activities effectively and precisely from the scalp EEG recordings is a critical issue for clinical neurology and cognitive neuroscience. In this paper, based on the spatially sparse assumption of brain activities, we propose a novel iterative EEG source imaging algorithm, Lp norm iterative sparse solution (LPISS). In LPISS, the lp (p ≤ 1) norm constraint for sparse solution is integrated into the iterative weighted minimum norm solution of the underdetermined EEG inverse problem, and it is this constraint and the iteratively renewed weight that force the inverse problem to converge to a sparse solution effectively. The conducted simulation studies with comparison to LORETA and FOCUSS for various dipole configurations confirmed the validity of LPISS for sparse EEG source localization. Finally, LPISS was applied to a real evoked potential collected in a study of inhibition of return (IOR), and the result was consistent with the previously suggested activated areas involved in an IOR process.
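A FOCUSS-style sketch of the iterative reweighted minimum-norm idea behind this family of methods; it is not the paper's exact LPISS update (the precise weighting, handling of p, and convergence safeguards differ), and p, the iteration count, and eps are illustrative assumptions.

```python
import numpy as np

def iterative_reweighted_min_norm(A, b, p=0.8, n_iter=30, eps=1e-8):
    """FOCUSS-style sketch of an l_p (p <= 1) regularized solution to the
    underdetermined inverse problem A x = b: each pass solves a weighted
    minimum-norm problem, and weights renewed from the current estimate
    concentrate energy on a few sources."""
    x = np.linalg.pinv(A) @ b                   # start from the minimum-norm solution
    for _ in range(n_iter):
        w = np.abs(x) ** (1 - p / 2) + eps      # reweighting driven by the l_p prior
        AW = A * w                              # scale columns of the lead-field matrix
        x = w * (np.linalg.pinv(AW) @ b)        # weighted minimum-norm update
    return x
```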