
Showing papers on "Sparse approximation" published in 2009


Journal ArticleDOI
TL;DR: This work considers the problem of automatically recognizing human faces from frontal views with varying expression and illumination, as well as occlusion and disguise, and proposes a general classification algorithm for (image-based) object recognition based on a sparse representation computed by ℓ1-minimization.
Abstract: We consider the problem of automatically recognizing human faces from frontal views with varying expression and illumination, as well as occlusion and disguise. We cast the recognition problem as one of classifying among multiple linear regression models and argue that new theory from sparse signal representation offers the key to addressing this problem. Based on a sparse representation computed by ℓ1-minimization, we propose a general classification algorithm for (image-based) object recognition. This new framework provides new insights into two crucial issues in face recognition: feature extraction and robustness to occlusion. For feature extraction, we show that if sparsity in the recognition problem is properly harnessed, the choice of features is no longer critical. What is critical, however, is whether the number of features is sufficiently large and whether the sparse representation is correctly computed. Unconventional features such as downsampled images and random projections perform just as well as conventional features such as eigenfaces and Laplacianfaces, as long as the dimension of the feature space surpasses a certain threshold predicted by the theory of sparse representation. This framework can handle errors due to occlusion and corruption uniformly by exploiting the fact that these errors are often sparse with respect to the standard (pixel) basis. The theory of sparse representation helps predict how much occlusion the recognition algorithm can handle and how to choose the training images to maximize robustness to occlusion. We conduct extensive experiments on publicly available databases to verify the efficacy of the proposed algorithm and corroborate the above claims.
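
For concreteness, here is a minimal sketch of the classification rule described above: solve for a sparse code of the test sample over the stacked training samples, then pick the class whose coefficients give the smallest reconstruction residual. This is not the authors' code; scikit-learn's Lasso stands in for the ℓ1-minimization step, and all names and parameters are illustrative.

```python
import numpy as np
from sklearn.linear_model import Lasso

def src_classify(A, labels, y, alpha=0.01):
    """Sparse-representation classification (SRC) decision rule.

    A      : (d, n) matrix whose columns are normalized training samples
    labels : length-n array of class labels, one per column of A
    y      : length-d test sample
    """
    # Sparse coding: l1-regularized least squares as a surrogate for
    # the l1-minimization step of the paper.
    x = Lasso(alpha=alpha, fit_intercept=False, max_iter=10000).fit(A, y).coef_

    # Classify by the class whose coefficients best reconstruct y.
    residuals = {}
    for c in np.unique(labels):
        x_c = np.where(labels == c, x, 0.0)  # keep only class-c coefficients
        residuals[c] = np.linalg.norm(y - A @ x_c)
    return min(residuals, key=residuals.get)
```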

9,658 citations


Journal ArticleDOI
TL;DR: The aim of this paper is to introduce a few key notions and applications connected to sparsity, targeting newcomers interested in either the mathematical aspects of this area or its applications.
Abstract: A full-rank matrix ${\bf A}\in \mathbb{R}^{n\times m}$ with $n<m$ generates an underdetermined system of linear equations ${\bf Ax} = {\bf b}$ having infinitely many solutions. Suppose we seek the sparsest solution, i.e., the one with the fewest nonzero entries. Can it ever be unique? If so, when? As optimization of sparsity is combinatorial in nature, are there efficient methods for finding the sparsest solution?

2,372 citations


Proceedings ArticleDOI
14 Jun 2009
TL;DR: A new online optimization algorithm for dictionary learning is proposed, based on stochastic approximations, which scales up gracefully to large datasets with millions of training samples, and leads to faster performance and better dictionaries than classical batch algorithms for both small and large datasets.
Abstract: Sparse coding---that is, modelling data vectors as sparse linear combinations of basis elements---is widely used in machine learning, neuroscience, signal processing, and statistics. This paper focuses on learning the basis set, also called dictionary, to adapt it to specific data, an approach that has recently proven to be very effective for signal reconstruction and classification in the audio and image processing domains. This paper proposes a new online optimization algorithm for dictionary learning, based on stochastic approximations, which scales up gracefully to large datasets with millions of training samples. A proof of convergence is presented, along with experiments with natural images demonstrating that it leads to faster performance and better dictionaries than classical batch algorithms for both small and large datasets.
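
A rough sketch of the online scheme the abstract describes, under its usual formulation: alternate a sparse-coding step on each incoming sample with a block-coordinate update of the dictionary driven by accumulated sufficient statistics. Parameter names and the use of scikit-learn's Lasso for the coding step are assumptions, not the paper's implementation.

```python
import numpy as np
from sklearn.linear_model import Lasso

def online_dictionary_learning(samples, n_atoms, lam=0.1, n_passes=1):
    """Sketch of online dictionary learning via stochastic approximation.

    samples : list of 1-D signals of dimension d
    Returns a (d, n_atoms) dictionary D with columns of norm at most one.
    """
    rng = np.random.default_rng(0)
    d = len(samples[0])
    D = rng.standard_normal((d, n_atoms))
    D /= np.linalg.norm(D, axis=0)
    A = np.zeros((n_atoms, n_atoms))   # accumulated x x^T statistics
    B = np.zeros((d, n_atoms))         # accumulated y x^T statistics

    for _ in range(n_passes):
        for y in samples:
            # Sparse coding step (Lasso) with the current dictionary.
            x = Lasso(alpha=lam, fit_intercept=False).fit(D, y).coef_
            A += np.outer(x, x)
            B += np.outer(y, x)
            # Dictionary update: block coordinate descent on the columns.
            for j in range(n_atoms):
                if A[j, j] > 1e-10:
                    u = (B[:, j] - D @ A[:, j]) / A[j, j] + D[:, j]
                    D[:, j] = u / max(np.linalg.norm(u), 1.0)
    return D
```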

2,313 citations


Journal ArticleDOI
TL;DR: The presented analysis shows that in the noiseless setting, the proposed algorithm can exactly reconstruct arbitrary sparse signals provided that the sensing matrix satisfies the restricted isometry property with a constant parameter.
Abstract: We propose a new method for reconstruction of sparse signals with and without noisy perturbations, termed the subspace pursuit algorithm. The algorithm has two important characteristics: low computational complexity, comparable to that of orthogonal matching pursuit techniques when applied to very sparse signals, and reconstruction accuracy of the same order as that of linear programming (LP) optimization methods. The presented analysis shows that in the noiseless setting, the proposed algorithm can exactly reconstruct arbitrary sparse signals provided that the sensing matrix satisfies the restricted isometry property with a constant parameter. In the noisy setting and in the case that the signal is not exactly sparse, it can be shown that the mean-squared error of the reconstruction is upper-bounded by constant multiples of the measurement and signal perturbation energies.
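
The algorithm itself admits a compact sketch (a plain NumPy rendering under standard notation, not the authors' reference code): maintain a candidate support of size K, expand it with the K columns most correlated with the residual, and prune back by least squares.

```python
import numpy as np

def subspace_pursuit(Phi, y, K, max_iter=50):
    """Sketch of the subspace pursuit algorithm for K-sparse recovery."""
    m, n = Phi.shape
    # Initialization: the K columns most correlated with y.
    T = np.argsort(np.abs(Phi.T @ y))[-K:]
    x_T, *_ = np.linalg.lstsq(Phi[:, T], y, rcond=None)
    r = y - Phi[:, T] @ x_T

    for _ in range(max_iter):
        # Expand the support with the K columns most correlated with the residual.
        T_new = np.union1d(T, np.argsort(np.abs(Phi.T @ r))[-K:])
        x, *_ = np.linalg.lstsq(Phi[:, T_new], y, rcond=None)
        # Prune back to the K largest coefficients.
        T_next = T_new[np.argsort(np.abs(x))[-K:]]
        x_T, *_ = np.linalg.lstsq(Phi[:, T_next], y, rcond=None)
        r_next = y - Phi[:, T_next] @ x_T
        if np.linalg.norm(r_next) >= np.linalg.norm(r):
            break                      # residual stopped decreasing
        T, r = T_next, r_next

    x_hat = np.zeros(n)
    x_hat[T] = np.linalg.lstsq(Phi[:, T], y, rcond=None)[0]
    return x_hat
```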

2,235 citations


Book
12 Oct 2009
TL;DR: This book provides a broad survey of models and efficient algorithms for Nonnegative Matrix Factorization (NMF), including NMF's various extensions and modifications, especially Nonnegative Tensor Factorizations (NTF) and Nonnegative Tucker Decompositions (NTD).
Abstract: This book provides a broad survey of models and efficient algorithms for Nonnegative Matrix Factorization (NMF). This includes NMF's various extensions and modifications, especially Nonnegative Tensor Factorizations (NTF) and Nonnegative Tucker Decompositions (NTD). NMF/NTF and their extensions are increasingly used as tools in signal and image processing and data analysis, having garnered interest due to their capability to provide new insights and relevant information about the complex latent relationships in experimental data sets. It is suggested that NMF can provide meaningful components with physical interpretations; for example, in bioinformatics, NMF and its extensions have been successfully applied to gene expression, sequence analysis, the functional characterization of genes, clustering and text mining. As such, the authors focus on the algorithms that are most useful in practice, looking at the fastest, most robust, and most suitable for large-scale models. Key features: Acts as a single source reference guide to NMF, collating information that is widely dispersed in current literature, including the authors' own recently developed techniques in the subject area. Uses generalized cost functions such as Bregman, Alpha and Beta divergences to present practical implementations of several types of robust algorithms, in particular Multiplicative, Alternating Least Squares, Projected Gradient and Quasi-Newton algorithms. Provides a comparative analysis of the different methods in order to identify approximation error and complexity. Includes pseudo-code and optimized MATLAB source code for almost all algorithms presented in the book. The increasing interest in nonnegative matrix and tensor factorizations, as well as decompositions and sparse representation of data, will ensure that this book is essential reading for engineers, scientists, researchers, industry practitioners and graduate students across signal and image processing; neuroscience; data mining and data analysis; computer science; bioinformatics; speech processing; biomedical engineering; and multimedia.
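
As a taste of the simplest algorithm family the book covers, here is the classical multiplicative update for the Frobenius cost (one of many variants treated there; hyperparameters are illustrative):

```python
import numpy as np

def nmf_multiplicative(Y, rank, n_iter=200, eps=1e-9):
    """Sketch of Lee-Seung multiplicative updates for Y ~ A @ X with A, X >= 0."""
    rng = np.random.default_rng(0)
    m, n = Y.shape
    A = rng.random((m, rank))
    X = rng.random((rank, n))
    for _ in range(n_iter):
        # Each update is guaranteed not to increase the Frobenius cost.
        X *= (A.T @ Y) / (A.T @ A @ X + eps)
        A *= (Y @ X.T) / (A @ X @ X.T + eps)
    return A, X
```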

2,136 citations


Journal ArticleDOI
TL;DR: This work proposes iterative methods in which each step is obtained by solving an optimization subproblem involving a quadratic term with diagonal Hessian plus the original sparsity-inducing regularizer, and proves convergence of the proposed iterative algorithm to a minimum of the objective function.
Abstract: Finding sparse approximate solutions to large underdetermined linear systems of equations is a common problem in signal/image processing and statistics. Basis pursuit, the least absolute shrinkage and selection operator (LASSO), wavelet-based deconvolution and reconstruction, and compressed sensing (CS) are a few well-known areas in which problems of this type appear. One standard approach is to minimize an objective function that includes a quadratic (ℓ2) error term added to a sparsity-inducing (usually ℓ1) regularizer. We present an algorithmic framework for the more general problem of minimizing the sum of a smooth convex function and a nonsmooth, possibly nonconvex regularizer. We propose iterative methods in which each step is obtained by solving an optimization subproblem involving a quadratic term with diagonal Hessian (i.e., separable in the unknowns) plus the original sparsity-inducing regularizer; our approach is suitable for cases in which this subproblem can be solved much more rapidly than the original problem. Under mild conditions (namely convexity of the regularizer), we prove convergence of the proposed iterative algorithm to a minimum of the objective function. In addition to solving the standard ℓ2-ℓ1 case, our framework yields efficient solution techniques for other regularizers, such as an ℓ∞ norm and group-separable regularizers. It also generalizes immediately to the case in which the data is complex rather than real. Experiments with CS problems show that our approach is competitive with the fastest known methods for the standard ℓ2-ℓ1 problem, as well as being efficient on problems with other separable regularization terms.
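
For the standard ℓ2-ℓ1 instance, the subproblem with a diagonal Hessian has a closed-form solution, elementwise soft thresholding, which yields the familiar iterative shrinkage scheme. A minimal sketch (the step-size rule and names are illustrative, not the paper's exact algorithm):

```python
import numpy as np

def soft_threshold(v, t):
    """Closed-form solution of min_x 0.5*(x - v)^2 + t*|x|, elementwise."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ist_l2_l1(A, y, tau, alpha=None, n_iter=500):
    """Sketch of iterative shrinkage for min 0.5*||y - Ax||^2 + tau*||x||_1.

    Each iteration solves the quadratic-plus-regularizer subproblem with a
    diagonal (alpha * I) Hessian, which separates across the unknowns.
    """
    if alpha is None:
        # alpha must dominate the curvature of the quadratic term.
        alpha = np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)
        x = soft_threshold(x - grad / alpha, tau / alpha)
    return x
```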

1,723 citations


Journal ArticleDOI
TL;DR: This work analyzes the behavior of ℓ1-constrained quadratic programming (QP), also referred to as the Lasso, for recovering the sparsity pattern of a vector β* based on observations contaminated by noise, and establishes precise conditions on the problem dimension p, the number k of nonzero elements in β*, and the number of observations n.
Abstract: The problem of consistently estimating the sparsity pattern of a vector β* ∈ ℝ^p based on observations contaminated by noise arises in various contexts, including signal denoising, sparse approximation, compressed sensing, and model selection. We analyze the behavior of ℓ1-constrained quadratic programming (QP), also referred to as the Lasso, for recovering the sparsity pattern. Our main result is to establish precise conditions on the problem dimension p, the number k of nonzero elements in β*, and the number of observations n that are necessary and sufficient for sparsity pattern recovery using the Lasso. We first analyze the case of observations made using deterministic design matrices and sub-Gaussian additive noise, and provide sufficient conditions for support recovery and ℓ∞-error bounds, as well as results showing the necessity of incoherence and bounds on the minimum value. We then turn to the case of random designs, in which each row of the design is drawn from a N(0, Σ) ensemble. For a broad class of Gaussian ensembles satisfying mutual incoherence conditions, we compute explicit values of thresholds 0 < θ_ℓ(Σ) ≤ θ_u(Σ) < +∞ with the following property: for any δ > 0, if n > 2(θ_u + δ) k log(p − k), then the Lasso succeeds in recovering the sparsity pattern with probability converging to one for large problems, whereas for n < 2(θ_ℓ − δ) k log(p − k), the probability of successful recovery converges to zero. For the special case of the uniform Gaussian ensemble (Σ = I_{p×p}), we show that θ_ℓ = θ_u = 1.
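
The threshold n ≈ 2k log(p − k) can be probed empirically. The following hypothetical simulation (all parameter choices illustrative; scikit-learn's Lasso with a standard regularization scaling stands in for the paper's QP) estimates the exact-support-recovery rate at a given multiple theta of the threshold; theta well above 1 should mostly succeed and theta well below 1 should mostly fail as p grows:

```python
import numpy as np
from sklearn.linear_model import Lasso

def support_recovery_rate(p=512, k=8, theta=1.5, trials=20, sigma=0.25):
    """Empirical probability that the Lasso recovers the true support when
    n = theta * 2k log(p - k). Illustrative only; finite-size effects blur
    the asymptotic 0/1 transition."""
    n = int(theta * 2 * k * np.log(p - k))
    rng = np.random.default_rng(0)
    hits = 0
    for _ in range(trials):
        X = rng.standard_normal((n, p))            # uniform Gaussian ensemble
        beta = np.zeros(p)
        support = rng.choice(p, k, replace=False)
        beta[support] = 1.0                        # beta_min bounded away from 0
        y = X @ beta + sigma * rng.standard_normal(n)
        lam = sigma * np.sqrt(2 * np.log(p) / n)   # standard theoretical scaling
        est = Lasso(alpha=lam, fit_intercept=False, max_iter=5000).fit(X, y).coef_
        hits += set(np.flatnonzero(est)) == set(support)
    return hits / trials
```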

1,438 citations


Proceedings ArticleDOI
20 Jun 2009
TL;DR: This work proposes a method based on sparse representation (SR) to cluster data drawn from multiple low-dimensional linear or affine subspaces embedded in a high-dimensional space and applies this method to the problem of segmenting multiple motions in video.
Abstract: We propose a method based on sparse representation (SR) to cluster data drawn from multiple low-dimensional linear or affine subspaces embedded in a high-dimensional space. Our method is based on the fact that each point in a union of subspaces has a SR with respect to a dictionary formed by all other data points. In general, finding such a SR is NP-hard. Our key contribution is to show that, under mild assumptions, the SR can be obtained 'exactly' by using ℓ1 optimization. The segmentation of the data is obtained by applying spectral clustering to a similarity matrix built from this SR. Our method can handle noise, outliers as well as missing data. We apply our subspace clustering algorithm to the problem of segmenting multiple motions in video. Experiments on 167 video sequences show that our approach significantly outperforms state-of-the-art methods.
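
A compact sketch of the pipeline, assuming the noisy (Lasso) variant rather than the exact ℓ1 program, with scikit-learn providing both the coder and the spectral clustering step (names illustrative):

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.linear_model import Lasso

def sparse_subspace_clustering(X, n_clusters, lam=0.01):
    """Sketch of SSC: each point is sparsely coded over all other points,
    and spectral clustering is run on the resulting similarity matrix.

    X : (d, N) data matrix, one point per column.
    """
    d, N = X.shape
    C = np.zeros((N, N))
    for i in range(N):
        others = np.delete(np.arange(N), i)
        # l1-regularized surrogate for the exact l1 program in the paper.
        c = Lasso(alpha=lam, fit_intercept=False).fit(X[:, others], X[:, i]).coef_
        C[others, i] = c
    W = np.abs(C) + np.abs(C).T          # symmetrized similarity
    return SpectralClustering(n_clusters=n_clusters,
                              affinity='precomputed').fit_predict(W)
```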

1,411 citations


Proceedings Article
15 Apr 2009
TL;DR: A variational formulation for sparse approximations that jointly infers the inducing inputs and the kernel hyperparameters by maximizing a lower bound of the true log marginal likelihood.
Abstract: Sparse Gaussian process methods that use inducing variables require the selection of the inducing inputs and the kernel hyperparameters. We introduce a variational formulation for sparse approximations that jointly infers the inducing inputs and the kernel hyperparameters by maximizing a lower bound of the true log marginal likelihood. The key property of this formulation is that the inducing inputs are defined to be variational parameters which are selected by minimizing the Kullback-Leibler divergence between the variational distribution and the exact posterior distribution over the latent function values. We apply this technique to regression and we compare it with other approaches in the literature.
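
The collapsed bound being maximized can be written down directly. A sketch under an RBF kernel, with `noise` denoting the noise variance (a plain NumPy rendering, not the paper's code); in practice one would maximize this over Z and the hyperparameters with a gradient optimizer:

```python
import numpy as np

def rbf(X1, X2, lengthscale=1.0, variance=1.0):
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def variational_lower_bound(X, y, Z, noise=0.1, **kern):
    """Sketch of the collapsed variational bound:
    F = log N(y | 0, Qnn + noise*I) - trace(Knn - Qnn) / (2*noise),
    where Qnn = Knm Kmm^{-1} Kmn and Z holds the inducing inputs.
    Maximizing F over Z and the kernel hyperparameters selects both jointly.
    """
    n = X.shape[0]
    Kmm = rbf(Z, Z, **kern) + 1e-8 * np.eye(Z.shape[0])
    Knm = rbf(X, Z, **kern)
    Qnn = Knm @ np.linalg.solve(Kmm, Knm.T)
    cov = Qnn + noise * np.eye(n)
    _, logdet = np.linalg.slogdet(cov)
    quad = y @ np.linalg.solve(cov, y)
    log_marginal = -0.5 * (n * np.log(2 * np.pi) + logdet + quad)
    trace_term = (np.trace(rbf(X, X, **kern)) - np.trace(Qnn)) / (2 * noise)
    return log_marginal - trace_term
```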

1,350 citations


Journal ArticleDOI
TL;DR: A fast algorithm for overcomplete sparse decomposition, called SL0, is proposed, which tries to directly minimize the ℓ0 norm.
Abstract: In this paper, a fast algorithm for overcomplete sparse decomposition, called SL0, is proposed. The algorithm is essentially a method for obtaining sparse solutions of underdetermined systems of linear equations, and its applications include underdetermined sparse component analysis (SCA), atomic decomposition on overcomplete dictionaries, compressed sensing, and decoding real field codes. Contrary to previous methods, which usually solve this problem by minimizing the ℓ1 norm using linear programming (LP) techniques, our algorithm tries to directly minimize the ℓ0 norm. It is experimentally shown that the proposed algorithm is about two to three orders of magnitude faster than the state-of-the-art interior-point LP solvers, while providing the same (or better) accuracy.
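
A sketch of the smoothed-ℓ0 iteration as usually presented: a Gaussian surrogate of the ℓ0 norm is optimized by small gradient steps followed by projection onto the constraint set, while the smoothing width σ is gradually decreased (schedule constants are illustrative):

```python
import numpy as np

def sl0(A, b, sigma_min=1e-4, sigma_decrease=0.7, mu=2.0, inner_iters=3):
    """Sketch of smoothed-l0 minimization: maximize sum(exp(-x^2 / 2 sigma^2)),
    a smooth surrogate of -||x||_0, over {x : Ax = b}, for decreasing sigma."""
    A_pinv = np.linalg.pinv(A)
    x = A_pinv @ b                          # minimum-l2-norm feasible start
    sigma = 2.0 * np.max(np.abs(x))
    while sigma > sigma_min:
        for _ in range(inner_iters):
            # Gradient step on the smooth surrogate ...
            x = x - mu * x * np.exp(-x**2 / (2 * sigma**2))
            # ... followed by projection back onto {x : Ax = b}.
            x = x - A_pinv @ (A @ x - b)
        sigma *= sigma_decrease
    return x
```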

1,033 citations


Journal ArticleDOI
TL;DR: This paper finds a simple regularized version of Orthogonal Matching Pursuit (ROMP) which has advantages of both approaches: the speed and transparency of OMP and the strong uniform guarantees of L1-minimization.
Abstract: This paper seeks to bridge the two major algorithmic approaches to sparse signal recovery from an incomplete set of linear measurements—L1-minimization methods and iterative methods (Matching Pursuits). We find a simple regularized version of Orthogonal Matching Pursuit (ROMP) which has advantages of both approaches: the speed and transparency of OMP and the strong uniform guarantees of L1-minimization. Our algorithm, ROMP, reconstructs a sparse signal in a number of iterations linear in the sparsity, and the reconstruction is exact provided the linear measurements satisfy the uniform uncertainty principle.
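
A sketch of the two-stage selection rule, assuming the usual statement of the algorithm (identify the K largest correlations, then keep a maximal-energy subset whose magnitudes are within a factor of two of each other); the stopping criteria here are illustrative:

```python
import numpy as np

def romp(Phi, y, K, max_iter=None):
    """Sketch of Regularized Orthogonal Matching Pursuit (ROMP)."""
    m, n = Phi.shape
    support = np.array([], dtype=int)
    r = y.copy()
    for _ in range(max_iter or K):
        u = Phi.T @ r
        # Identify: the K largest coordinates of the observation vector.
        J = np.argsort(np.abs(u))[-K:][::-1]          # sorted descending
        mags = np.abs(u[J])
        # Regularize: among subsets with comparable magnitudes (within a
        # factor of 2), keep the one with maximal energy.
        best, best_energy = None, -1.0
        for s in range(len(J)):
            e = s
            while e + 1 < len(J) and mags[s] <= 2 * mags[e + 1]:
                e += 1
            energy = np.sum(mags[s:e + 1] ** 2)
            if energy > best_energy:
                best, best_energy = J[s:e + 1], energy
        support = np.union1d(support, best)
        x_s, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
        r = y - Phi[:, support] @ x_s
        if np.linalg.norm(r) < 1e-10 or len(support) >= 2 * K:
            break
    x = np.zeros(n)
    x[support] = np.linalg.lstsq(Phi[:, support], y, rcond=None)[0]
    return x
```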

Journal ArticleDOI
TL;DR: A simple algorithm for selecting a subset of coordinates with largest sample variances is provided, and it is shown that if PCA is done on the selected subset, then consistency is recovered, even if p(n) ≫ n.
Abstract: Principal components analysis (PCA) is a classic method for the reduction of dimensionality of data in the form of n observations (or cases) of a vector with p variables. Contemporary datasets often have p comparable with or even much larger than n. Our main assertions, in such settings, are (a) that some initial reduction in dimensionality is desirable before applying any PCA-type search for principal modes, and (b) the initial reduction in dimensionality is best achieved by working in a basis in which the signals have a sparse representation. We describe a simple asymptotic model in which the estimate of the leading principal component vector via standard PCA is consistent if and only if p(n)/n → 0. We provide a simple algorithm for selecting a subset of coordinates with largest sample variances, and show that if PCA is done on the selected subset, then consistency is recovered, even if p(n) ≫ n.
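
The proposed two-step estimator is short enough to state in code (a hypothetical rendering, with n_keep standing in for the paper's variance-based selection rule):

```python
import numpy as np

def pca_on_high_variance_subset(X, n_keep, n_components=1):
    """Sketch of the two-step estimator: keep the n_keep coordinates with
    the largest sample variances, then run ordinary PCA on that subset.

    X : (n, p) data matrix with n observations of a p-vector.
    """
    variances = X.var(axis=0)
    keep = np.argsort(variances)[-n_keep:]          # coordinates to retain
    Xs = X[:, keep] - X[:, keep].mean(axis=0)
    # Leading principal directions of the reduced data.
    _, _, Vt = np.linalg.svd(Xs, full_matrices=False)
    components = np.zeros((n_components, X.shape[1]))
    components[:, keep] = Vt[:n_components]         # embed back into R^p
    return components
```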

Proceedings ArticleDOI
Nathan Bell, Michael Garland
14 Nov 2009
TL;DR: This work explores SpMV methods that are well-suited to throughput-oriented architectures like the GPU and which exploit several common sparsity classes, including structured grid and unstructured mesh matrices.
Abstract: Sparse matrix-vector multiplication (SpMV) is of singular importance in sparse linear algebra. In contrast to the uniform regularity of dense linear algebra, sparse operations encounter a broad spectrum of matrices ranging from the regular to the highly irregular. Harnessing the tremendous potential of throughput-oriented processors for sparse operations requires that we expose substantial fine-grained parallelism and impose sufficient regularity on execution paths and memory access patterns. We explore SpMV methods that are well-suited to throughput-oriented architectures like the GPU and which exploit several common sparsity classes. The techniques we propose are efficient, successfully utilizing large percentages of peak bandwidth. Furthermore, they deliver excellent total throughput, averaging 16 GFLOP/s and 10 GFLOP/s in double precision for structured grid and unstructured mesh matrices, respectively, on a GeForce GTX 285. This is roughly 2.8 times the throughput previously achieved on Cell BE and more than 10 times that of a quad-core Intel Clovertown system.
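
For reference, here is the scalar CSR product that GPU kernels of this kind parallelize (assigning rows to threads or warps); this sketch gives only the sequential semantics:

```python
import numpy as np

def spmv_csr(data, indices, indptr, x):
    """Reference CSR sparse matrix-vector product y = A @ x."""
    n_rows = len(indptr) - 1
    y = np.zeros(n_rows)
    for row in range(n_rows):
        start, end = indptr[row], indptr[row + 1]   # nonzeros of this row
        y[row] = np.dot(data[start:end], x[indices[start:end]])
    return y
```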

Proceedings ArticleDOI
01 Sep 2009
TL;DR: In this paper, a robust visual tracking method was proposed by casting tracking as a sparse approximation problem in a particle filter framework, where each target candidate is sparsely represented in the space spanned by target templates and trivial templates.
Abstract: In this paper we propose a robust visual tracking method by casting tracking as a sparse approximation problem in a particle filter framework. In this framework, occlusion, corruption and other challenging issues are addressed seamlessly through a set of trivial templates. Specifically, to find the tracking target at a new frame, each target candidate is sparsely represented in the space spanned by target templates and trivial templates. The sparsity is achieved by solving an ℓ1-regularized least squares problem. Then the candidate with the smallest projection error is taken as the tracking target. After that, tracking is continued using a Bayesian state inference framework in which a particle filter is used for propagating sample distributions over time. Two additional components further improve the robustness of our approach: 1) the nonnegativity constraints that help filter out clutter that is similar to tracked targets in reversed intensity patterns, and 2) a dynamic template update scheme that keeps track of the most representative templates throughout the tracking procedure. We test the proposed approach on five challenging sequences involving heavy occlusions, drastic illumination changes, and large pose variations. The proposed approach shows excellent performance in comparison with three previously proposed trackers.
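
A sketch of the per-candidate scoring step under this formulation: code the candidate over target templates plus positive and negative trivial (identity) templates with a nonnegativity constraint, and score by the residual against the target-template part. scikit-learn's positive Lasso stands in for the ℓ1-regularized least squares solver; all names are illustrative.

```python
import numpy as np
from sklearn.linear_model import Lasso

def l1_tracking_score(T, y, alpha=0.01):
    """Score one target candidate y against target templates T (d, k):
    smaller projection error means a better candidate."""
    d, k = T.shape
    B = np.hstack([T, np.eye(d), -np.eye(d)])       # [templates, trivial]
    lasso = Lasso(alpha=alpha, fit_intercept=False, positive=True)
    c = lasso.fit(B, y).coef_                       # nonnegativity constraint
    a = c[:k]                                       # target-template part
    return np.linalg.norm(y - T @ a)                # projection error
```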

Journal ArticleDOI
TL;DR: Several commonly-used sparsity measures are compared based on whether or not they satisfy six intuitive properties; only two of the measures satisfy all six: the pq-mean with p ≤ 1, q > 1, and the Gini index.
Abstract: Sparsity of representations of signals has been shown to be a key concept of fundamental importance in fields such as blind source separation, compression, sampling and signal analysis. The aim of this paper is to compare several commonly-used sparsity measures based on intuitive attributes. Intuitively, a sparse representation is one in which a small number of coefficients contain a large proportion of the energy. In this paper, six properties that a sparsity measure should have are discussed (Robin Hood, Scaling, Rising Tide, Cloning, Bill Gates, and Babies). The main contributions of this paper are the proofs and the associated summary table which classify commonly-used sparsity measures based on whether or not they satisfy these six propositions. Only two of these measures satisfy all six: the pq-mean with p ≤ 1, q > 1, and the Gini index.
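
Of the two measures that pass all six tests, the Gini index is the easier to state; a sketch using the standard sorted-coefficient form:

```python
import numpy as np

def gini_index(x):
    """Gini index of a coefficient vector, in [0, 1): higher means sparser."""
    c = np.sort(np.abs(np.asarray(x, dtype=float)))   # ascending magnitudes
    N = len(c)
    if c.sum() == 0:
        return 0.0
    k = np.arange(1, N + 1)
    # 1 - 2 * sum_k (c_k / ||c||_1) * ((N - k + 1/2) / N)
    return 1.0 - 2.0 * np.sum((c / c.sum()) * (N - k + 0.5) / N)
```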

Journal ArticleDOI
TL;DR: A framework for the joint design and optimization, from a set of training images, of the nonparametric dictionary and the sensing matrix is introduced and it is shown that this joint optimization outperforms both the use of random sensing matrices and those matrices that are optimized independently of the learning of the dictionary.
Abstract: Sparse signal representation, analysis, and sensing have received a lot of attention in recent years from the signal processing, optimization, and learning communities. On one hand, learning overcomplete dictionaries that facilitate a sparse representation of the data as a linear combination of a few atoms from such a dictionary leads to state-of-the-art results in image and video restoration and classification. On the other hand, the framework of compressed sensing (CS) has shown that sparse signals can be recovered from far fewer samples than required by the classical Shannon-Nyquist theorem. The samples used in CS correspond to linear projections obtained by a sensing projection matrix. It has been shown that, for example, a nonadaptive random sampling matrix satisfies the fundamental theoretical requirements of CS, enjoying the additional benefit of universality. However, a projection sensing matrix that is optimally designed for a certain class of signals can further improve the reconstruction accuracy or further reduce the necessary number of samples. In this paper, we introduce a framework for the joint design and optimization, from a set of training images, of the nonparametric dictionary and the sensing matrix. We show that this joint optimization outperforms both the use of random sensing matrices and those matrices that are optimized independently of the learning of the dictionary. Particular cases of the proposed framework include the optimization of the sensing matrix for a given dictionary as well as the optimization of the dictionary for a predefined sensing environment. The presentation of the framework and its efficient numerical optimization is complemented with numerous examples on classical image datasets.

Journal ArticleDOI
TL;DR: This work presents sparse additive models, a new class of methods for high-dimensional non-parametric regression and classification that combines ideas from sparse linear modelling and additive non-parametric regression, and derives an algorithm for fitting the models that is practical and effective even when the number of covariates is larger than the sample size.
Abstract: We present a new class of methods for high dimensional non-parametric regression and classification called sparse additive models. Our methods combine ideas from sparse linear modelling and additive non-parametric regression. We derive an algorithm for fitting the models that is practical and effective even when the number of covariates is larger than the sample size. Sparse additive models are essentially a functional version of the grouped lasso of Yuan and Lin. They are also closely related to the COSSO model of Lin and Zhang but decouple smoothing and sparsity, enabling the use of arbitrary non-parametric smoothers. We give an analysis of the theoretical properties of sparse additive models and present empirical results on synthetic and real data, showing that they can be effective in fitting sparse non-parametric models in high dimensional data.
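
The decoupling of smoothing and sparsity shows up directly in the backfitting loop: each component function is fitted by an arbitrary smoother and then its empirical norm is soft-thresholded. A sketch with a Nadaraya-Watson smoother (bandwidth and penalty values illustrative, not the paper's implementation):

```python
import numpy as np

def spam_backfit(X, y, lam=0.1, n_iter=20, bw=0.3):
    """Sketch of a sparse-additive-model backfitting loop.

    X : (n, p) covariates, y : (n,) centered response.
    Returns F, the (n, p) matrix of fitted component functions at the data.
    """
    n, p = X.shape
    F = np.zeros((n, p))
    for _ in range(n_iter):
        for j in range(p):
            R = y - F.sum(axis=1) + F[:, j]   # partial residual
            # Kernel smoother of R on X[:, j] (any smoother would do).
            W = np.exp(-0.5 * ((X[:, j][:, None] - X[:, j][None, :]) / bw) ** 2)
            P = (W @ R) / W.sum(axis=1)
            s = np.sqrt(np.mean(P ** 2))
            # Soft-threshold the component's norm: this is the sparsity step.
            F[:, j] = max(0.0, 1.0 - lam / s) * P if s > 0 else 0.0
            F[:, j] -= F[:, j].mean()         # center for identifiability
    return F
```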

Journal ArticleDOI
TL;DR: This paper investigates a new model reduction criterion that makes computationally demanding sparsification procedures unnecessary and incorporates the coherence criterion into a new kernel-based affine projection algorithm for time series prediction.
Abstract: Kernel-based algorithms have been a topic of considerable interest in the machine learning community over the last ten years. Their attractiveness resides in their elegant treatment of nonlinear problems. They have been successfully applied to pattern recognition, regression and density estimation. A common characteristic of kernel-based methods is that they deal with kernel expansions whose number of terms equals the number of input data, making them unsuitable for online applications. Recently, several solutions have been proposed to circumvent this computational burden in time series prediction problems. Nevertheless, most of them require excessively elaborate and costly operations. In this paper, we investigate a new model reduction criterion that makes computationally demanding sparsification procedures unnecessary. The increase in the number of variables is controlled by the coherence parameter, a fundamental quantity that characterizes the behavior of dictionaries in sparse approximation problems. We incorporate the coherence criterion into a new kernel-based affine projection algorithm for time series prediction. We also derive the kernel-based normalized LMS algorithm as a particular case. Finally, experiments are conducted to compare our approach to existing methods.
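
The coherence rule itself is nearly a one-liner: a candidate input enters the dictionary only if its kernel similarity to every stored element stays below a threshold μ0, which caps the growth of the kernel expansion. A sketch assuming a unit-norm (Gaussian) kernel; names are illustrative:

```python
import numpy as np

def gaussian_kernel(x, y, bw=1.0):
    return np.exp(-np.sum((x - y) ** 2) / (2 * bw**2))

def coherence_admit(dictionary, x_new, mu0=0.5, kernel=gaussian_kernel):
    """Return True if x_new should be added to the kernel expansion, i.e.,
    its coherence with every stored element is at most mu0."""
    if not dictionary:
        return True
    coherence = max(abs(kernel(x_new, d)) for d in dictionary)
    return coherence <= mu0
```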

Book ChapterDOI
01 Dec 2009
TL;DR: This chapter presents in a self-contained manner recent advances in the design and analysis of gradient-based schemes for specially structured smooth and nonsmooth minimization problems.
Abstract: This chapter presents in a self-contained manner recent advances in the design and analysis of gradient-based schemes for specially structured smooth and nonsmooth minimization problems. We focus on the mathematical elements and ideas for building fast gradient-based methods and derive their complexity bounds. Throughout the chapter, the resulting schemes and results are illustrated and applied on a variety of problems arising in several specific key applications such as sparse approximation of signals, total variation-based image processing problems, and sensor location problems.
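
A generic accelerated proximal-gradient scheme of the kind the chapter analyzes (a standard rendering, not the chapter's exact pseudocode); plugging in soft thresholding for prox_g gives a fast solver for the ℓ2-ℓ1 sparse approximation problem:

```python
import numpy as np

def fista(grad_f, prox_g, L, x0, n_iter=200):
    """Fast proximal gradient method for F = f + g, with f smooth
    (gradient Lipschitz constant L) and g nonsmooth with an easy prox;
    the function values converge at the accelerated O(1/k^2) rate."""
    x = x0.copy()
    z = x0.copy()
    t = 1.0
    for _ in range(n_iter):
        x_next = prox_g(z - grad_f(z) / L, 1.0 / L)
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        z = x_next + ((t - 1.0) / t_next) * (x_next - x)   # momentum step
        x, t = x_next, t_next
    return x
```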

Proceedings ArticleDOI
28 Jun 2009
TL;DR: A new formulation of appearance-only SLAM suitable for very large scale navigation that naturally incorporates robustness against perceptual aliasing is described and demonstrated performing reliable online appearance mapping and loop closure detection over a 1,000 km trajectory.
Abstract: We describe a new formulation of appearance-only SLAM suitable for very large scale navigation. The system navigates in the space of appearance, assigning each new observation to either a new or previously visited location, without reference to metric position. The system is demonstrated performing reliable online appearance mapping and loop closure detection over a 1,000 km trajectory, with mean filter update times of 14 ms. The 1,000 km experiment is more than an order of magnitude larger than any previously reported result. The scalability of the system is achieved by defining a sparse approximation to the FAB-MAP model suitable for implementation using an inverted index. Our formulation of the problem is fully probabilistic and naturally incorporates robustness against perceptual aliasing. The 1,000 km data set comprising almost a terabyte of omni-directional and stereo imagery is available for use, and we hope that it will serve as a benchmark for future systems.

Journal ArticleDOI
TL;DR: An easy and efficient sparse-representation-based iterative algorithm for image inpainting that allows a high degree of flexibility to recover different structural components in the image (piecewise smooth, curvilinear, texture, etc.).
Abstract: Representing the image to be inpainted in an appropriate sparse representation dictionary, and combining elements from Bayesian statistics and modern harmonic analysis, we introduce an expectation maximization (EM) algorithm for image inpainting and interpolation. From a statistical point of view, the inpainting/interpolation can be viewed as an estimation problem with missing data. Toward this goal, we propose the idea of using the EM mechanism in a Bayesian framework, where a sparsity promoting prior penalty is imposed on the reconstructed coefficients. The EM framework gives a principled way to establish formally the idea that missing samples can be recovered/interpolated based on sparse representations. We first introduce an easy and efficient sparse-representation-based iterative algorithm for image inpainting. Additionally, we derive its theoretical convergence properties. Compared to its competitors, this algorithm allows a high degree of flexibility to recover different structural components in the image (piecewise smooth, curvilinear, texture, etc.). We also suggest some guidelines to automatically tune the regularization parameter.
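
When the dictionary is an orthogonal transform pair, the iteration reduces to a simple alternation in the spirit of the paper's algorithm: threshold in the transform domain, then re-impose the observed pixels. A simplified sketch under that assumption (the paper's method is more general; the regularization value is illustrative):

```python
import numpy as np

def em_inpaint(y, mask, synthesize, analyze, lam=0.1, n_iter=100):
    """Sketch of an EM-flavoured inpainting loop: fill in missing pixels from
    the current sparse estimate, then update the coefficients by soft
    thresholding in the transform domain.

    y    : observed image with arbitrary values at missing pixels
    mask : boolean array, True where pixels are observed
    synthesize / analyze : forward and adjoint transforms (e.g. a wavelet pair)
    """
    x = np.where(mask, y, 0.0)
    for _ in range(n_iter):
        coeffs = analyze(x)
        coeffs = np.sign(coeffs) * np.maximum(np.abs(coeffs) - lam, 0.0)
        x_model = synthesize(coeffs)
        x = np.where(mask, y, x_model)     # keep observed data, fill the rest
    return x
```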

Journal ArticleDOI
TL;DR: This work establishes conditions under which a regularized least squares estimator enjoys a nonasymptotic property, called the weak oracle property, where the dimensionality can grow exponentially with sample size and proposes the sequentially and iteratively reweighted squares (SIRS) algorithm for sparse recovery.
Abstract: Model selection and sparse recovery are two important problems for which many regularization methods have been proposed. We study the properties of regularization methods in both problems under the unified framework of regularized least squares with concave penalties. For model selection, we establish conditions under which a regularized least squares estimator enjoys a nonasymptotic property, called the weak oracle property, where the dimensionality can grow exponentially with sample size. For sparse recovery, we present a sufficient condition that ensures the recoverability of the sparsest solution. In particular, we approach both problems by considering a family of penalties that give a smooth homotopy between L0 and L1 penalties. We also propose the sequentially and iteratively reweighted squares (SIRS) algorithm for sparse recovery. Numerical studies support our theoretical results and demonstrate the advantage of our new methods for model selection and sparse recovery.

Journal ArticleDOI
TL;DR: Experimental results demonstrate the effectiveness of the proposed generic framework compared to existing algorithms, including iterative reweighted least-squares methods, and several algorithms in the literature dealing with nonconvex penalties are particular instances of the algorithm.
Abstract: This paper considers the problem of recovering a sparse signal representation according to a signal dictionary. This problem could be formalized as a penalized least-squares problem in which sparsity is usually induced by an ℓ1-norm penalty on the coefficients. Such an approach, known as the Lasso or Basis Pursuit Denoising, has been shown to perform reasonably well in some situations. However, it was also proved that nonconvex penalties like the pseudo ℓq-norm with q < 1 or the smoothly clipped absolute deviation (SCAD) penalty are able to recover sparsity in a more efficient way than the Lasso. Several algorithms have been proposed for solving the resulting nonconvex least-squares problem. This paper proposes a generic algorithm to address such a sparsity recovery problem for some class of nonconvex penalties. Our main contribution is that the proposed methodology is based on an iterative algorithm which solves at each iteration a convex weighted Lasso problem. It relies on the family of nonconvex penalties which can be decomposed as a difference of convex functions (DC). This allows us to apply DC programming, which is a generic and principled way of solving nonsmooth and nonconvex optimization problems. We also show that several algorithms in the literature dealing with nonconvex penalties are particular instances of our algorithm. Experimental results demonstrate the effectiveness of the proposed generic framework compared to existing algorithms, including iterative reweighted least-squares methods.
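
A sketch of the resulting outer loop for an ℓq-like penalty (q < 1): each pass solves a convex weighted Lasso, implemented here by column rescaling, with weights obtained by re-linearizing the concave part of the penalty at the current iterate. scikit-learn's Lasso and all constants are illustrative, not the paper's implementation:

```python
import numpy as np
from sklearn.linear_model import Lasso

def dc_reweighted_lasso(A, y, lam=0.1, q=0.5, eps=1e-3, n_outer=10):
    """Iteratively solve convex weighted Lasso problems as a surrogate for
    a nonconvex lq-like penalized least-squares problem (q < 1)."""
    n = A.shape[1]
    w = np.ones(n)                       # first pass is a plain Lasso
    x = np.zeros(n)
    for _ in range(n_outer):
        # Weighted Lasso via column rescaling: penalizing w_j * |x_j| is
        # equivalent to a plain Lasso on A with columns divided by w_j.
        z = Lasso(alpha=lam, fit_intercept=False).fit(A / w, y).coef_
        x = z / w
        # Re-linearize the concave penalty |x|^q around the current iterate:
        # small coefficients get large weights, pushing them toward zero.
        w = q * (np.abs(x) + eps) ** (q - 1.0)
    return x
```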

Proceedings ArticleDOI
20 Jun 2009
TL;DR: Without requiring any prior information of the blur kernel as the input, the proposed approach is able to recover high-quality images from given blurred images and the new sparsity constraints under tight frame systems enable the application of a fast algorithm called linearized Bregman iteration to efficiently solve the proposed minimization problem.
Abstract: Restoring a clear image from a single motion-blurred image due to camera shake has long been a challenging problem in digital imaging. Existing blind deblurring techniques either only remove simple motion blurring, or need user interactions to work on more complex cases. In this paper, we present an approach to remove motion blurring from a single image by formulating the blind deblurring as a new joint optimization problem, which simultaneously maximizes the sparsity of the blur kernel and the sparsity of the clear image under certain suitable redundant tight frame systems (curvelet system for kernels and framelet system for images). Without requiring any prior information of the blur kernel as the input, our proposed approach is able to recover high-quality images from given blurred images. Furthermore, the new sparsity constraints under tight frame systems enable the application of a fast algorithm called linearized Bregman iteration to efficiently solve the proposed minimization problem. Experiments on both simulated and real images showed that our algorithm can effectively remove complex motion blurring from natural images.

Proceedings Article
08 Sep 2009
TL;DR: This work presents an extension of sparse PCA, or sparse dictionary learning, where the sparsity patterns of all dictionary elements are structured and constrained to belong to a prespecified set of shapes.
Abstract: We present an extension of sparse PCA, or sparse dictionary learning, where the sparsity patterns of all dictionary elements are structured and constrained to belong to a prespecified set of shapes. This structured sparse PCA is based on a structured regularization recently introduced by [1]. While classical sparse priors only deal with cardinality, the regularization we use encodes higher-order information about the data. We propose an efficient and simple optimization procedure to solve this problem. Experiments with two practical tasks, face recognition and the study of the dynamics of a protein complex, demonstrate the benefits of the proposed structured approach over unstructured approaches.

Proceedings Article
01 Jan 2009
TL;DR: This paper proposes a semi-supervised learning framework based on the ℓ1 graph to utilize both labeled and unlabeled data for inference on a graph, and demonstrates the superiority of this framework over counterparts based on traditional graphs.
Abstract: In this paper, we present a novel semi-supervised learning framework based on the ℓ1 graph. The ℓ1 graph is motivated by the observation that each datum can be reconstructed by a sparse linear superposition of the training data. The sparse reconstruction coefficients, used to deduce the weights of the directed ℓ1 graph, are derived by solving an ℓ1 optimization problem on sparse representation. Different from conventional graph construction processes, which are generally divided into two independent steps, i.e., adjacency searching and weight selection, the graph adjacency structure as well as the graph weights of the ℓ1 graph are derived simultaneously and in a parameter-free manner. Illuminated by the validated discriminating power of sparse representation in [16], we propose a semi-supervised learning framework based on the ℓ1 graph to utilize both labeled and unlabeled data for inference on a graph. Extensive experiments on semi-supervised face recognition and image classification demonstrate the superiority of our proposed semi-supervised learning framework based on the ℓ1 graph over counterparts based on traditional graphs.

Journal ArticleDOI
TL;DR: A novel method for dictionary learning and extends the learning problem by introducing different constraints on the dictionary by using the majorization method, an optimization method that substitutes the original objective function with a surrogate function that is updated in each optimization step.
Abstract: In order to find sparse approximations of signals, an appropriate generative model for the signal class has to be known. If the model is unknown, it can be adapted using a set of training samples. This paper presents a novel method for dictionary learning and extends the learning problem by introducing different constraints on the dictionary. The convergence of the proposed method to a fixed point is guaranteed, unless the accumulation points form a continuum. This holds for different sparsity measures. The majorization method is an optimization method that substitutes the original objective function with a surrogate function that is updated in each optimization step. This method has been used successfully in sparse approximation and statistical estimation [e.g., expectation-maximization (EM)] problems. This paper shows that the majorization method can be used for the dictionary learning problem too. The proposed method is compared with other methods on both synthetic and real data and different constraints on the dictionary are compared. Simulations show the advantages of the proposed method over other currently available dictionary learning methods not only in terms of average performance but also in terms of computation time.

Proceedings Article
07 Dec 2009
TL;DR: Mixed-norm regularization is used to achieve sparsity at the image level as well as a small overall dictionary and can be used to encourage using the same dictionary words for all the images in a class, providing a discriminative signal in the construction of image representations.
Abstract: Bag-of-words document representations are often used in text, image and video processing. While it is relatively easy to determine a suitable word dictionary for text documents, there is no simple mapping from raw images or videos to dictionary terms. The classical approach builds a dictionary using vector quantization over a large set of useful visual descriptors extracted from a training set, and uses a nearest-neighbor algorithm to count the number of occurrences of each dictionary word in documents to be encoded. More robust approaches have been proposed recently that represent each visual descriptor as a sparse weighted combination of dictionary words. While favoring a sparse representation at the level of visual descriptors, those methods however do not ensure that images have sparse representation. In this work, we use mixed-norm regularization to achieve sparsity at the image level as well as a small overall dictionary. This approach can also be used to encourage using the same dictionary words for all the images in a class, providing a discriminative signal in the construction of image representations. Experimental results on a benchmark image classification dataset show that when compact image or dictionary representations are needed for computational efficiency, the proposed approach yields better mean average precision in classification.