Journal ArticleDOI

A note on the complexity of L p minimization

01 Oct 2011-Mathematical Programming (Springer-Verlag)-Vol. 129, Iss: 2, pp 285-299
TL;DR: It is proved that finding the global minimal value of the problem is strongly NP-hard, but computing a local minimizer of the problem can be done in polynomial time.
Abstract: We discuss the $L_p$ (0 ≤ p < 1) minimization problem arising from sparse solution construction and compressed sensing. For any fixed 0 < p < 1, we prove that finding the global minimal value of the problem is strongly NP-hard, but computing a local minimizer of the problem can be done in polynomial time. We also develop an interior-point potential reduction algorithm with a provable complexity bound and present preliminary computational results demonstrating the effectiveness of the algorithm.
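For orientation, here is a sketch of the problem the abstract refers to, written in the standard constrained form from the sparse-recovery literature (the exact formulation used in the paper, e.g. the precise constraint set $Ax = b$, $x \ge 0$, is not reproduced on this page and is assumed here):

$$\min_{x \in \mathbb{R}^n} \ \|x\|_p^p := \sum_{j=1}^n |x_j|^p \quad \text{subject to} \quad Ax = b, \; x \ge 0, \qquad 0 \le p < 1,$$

with the convention $0^0 = 0$, so that for $p = 0$ the objective counts the nonzero entries of $x$, while for $0 < p < 1$ it is a nonconvex surrogate for that count.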


Citations
Journal ArticleDOI
TL;DR: This paper starts with a preliminary yet novel analysis for unconstrained $\ell_q$ minimization, which includes convergence, error bound, and local convergence behavior, and extends the algorithm and analysis to the recovery of low-rank matrices.
Abstract: In this paper, we first study $\ell_q$ minimization and its associated iterative reweighted algorithm for recovering sparse vectors. Unlike most existing work, we focus on unconstrained $\ell_q$ minimization, for which we show a few advantages on noisy measurements and/or approximately sparse vectors. Inspired by the results in [Daubechies et al., Comm. Pure Appl. Math., 63 (2010), pp. 1--38] for constrained $\ell_q$ minimization, we start with a preliminary yet novel analysis for unconstrained $\ell_q$ minimization, which includes convergence, error bound, and local convergence behavior. Then, the algorithm and analysis are extended to the recovery of low-rank matrices. The algorithms for both vector and matrix recovery have been compared to some state-of-the-art algorithms and show superior performance on recovering sparse vectors and low-rank matrices.
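To make the "iterative reweighted algorithm for unconstrained $\ell_q$ minimization" concrete, below is a minimal iteratively reweighted least-squares (IRLS) sketch for $\min_x \tfrac12\|Ax-b\|_2^2 + \lambda\sum_j |x_j|^q$. It only illustrates the general idea; the function name, the fixed smoothing parameter eps, and the iteration budget are my own choices and not necessarily the algorithm analyzed in the cited paper.

    import numpy as np

    def irls_lq(A, b, lam, q=0.5, eps=1e-3, iters=50):
        """Minimal IRLS sketch for min_x 0.5*||Ax-b||^2 + lam*sum_j |x_j|^q (0 < q < 1),
        using the smoothed penalty sum_j (x_j^2 + eps^2)^(q/2).
        Illustrative only; not the exact algorithm from the cited paper."""
        AtA, Atb = A.T @ A, A.T @ b
        x = np.linalg.lstsq(A, b, rcond=None)[0]        # least-squares starting point
        for _ in range(iters):
            # weights from the current iterate: w_j = (x_j^2 + eps^2)^(q/2 - 1)
            w = (x**2 + eps**2) ** (q / 2 - 1)
            # solve the reweighted normal equations (A^T A + lam*q*diag(w)) x = A^T b
            x = np.linalg.solve(AtA + lam * q * np.diag(w), Atb)
        return x

    # tiny usage example on a random sparse-recovery instance
    rng = np.random.default_rng(0)
    A = rng.standard_normal((40, 100))
    x_true = np.zeros(100); x_true[[3, 17, 42]] = [1.0, -2.0, 0.5]
    x_hat = irls_lq(A, A @ x_true, lam=1e-3, q=0.5)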

367 citations


Cites background from "A note on the complexity of L p min..."

  • ...However, the $\ell_q$ quasi-norm is nonconvex for q < 1, and $\ell_q$ minimization is generally NP-hard [17]....

Journal ArticleDOI
TL;DR: In this article, the authors consider a class of smoothing methods for minimization problems where the feasible set is convex but the objective function is not convex, not differentiable and perhaps not even locally Lipschitz at the solutions.
Abstract: We consider a class of smoothing methods for minimization problems where the feasible set is convex but the objective function is not convex, not differentiable and perhaps not even locally Lipschitz at the solutions. Such optimization problems arise from wide applications including image restoration, signal reconstruction, variable selection, optimal control, stochastic equilibrium and spherical approximations. In this paper, we focus on smoothing methods for solving such optimization problems, which use the structure of the minimization problems and the composition of smoothing functions for the plus function $(x)_+$. Many existing optimization algorithms and codes can be used in the inner iteration of the smoothing methods. We present properties of the smoothing functions and the gradient consistency of subdifferentials associated with a smoothing function. Moreover, we describe how to update the smoothing parameter in the outer iteration of the smoothing methods to guarantee convergence of the smoothing methods to a stationary point of the original minimization problem.
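As a concrete instance of the ingredients described above, the sketch below implements one widely used smoothing of the plus function $(x)_+ = \max\{x, 0\}$ together with its derivative; as the smoothing parameter $\mu \downarrow 0$ the smoothed function recovers $(x)_+$ and its derivative values stay inside the subdifferential of $(x)_+$, which is the gradient-consistency property mentioned in the abstract. This particular smoothing function is just one member of the class the paper studies, not the paper's method itself.

    import numpy as np

    def plus_smooth(t, mu):
        """A standard smoothing of (t)_+ = max(t, 0):
        s(t, mu) = (t + sqrt(t^2 + 4*mu^2)) / 2, with s(t, mu) -> (t)_+ as mu -> 0."""
        return 0.5 * (t + np.sqrt(t**2 + 4.0 * mu**2))

    def plus_smooth_grad(t, mu):
        """Derivative of s(., mu); its limits as mu -> 0 lie in the subdifferential of (t)_+
        (e.g. the value at t = 0 stays in [0, 1]), i.e. gradient consistency."""
        return 0.5 * (1.0 + t / np.sqrt(t**2 + 4.0 * mu**2))

    # shrinking mu drives the smoothed value toward (t)_+
    for mu in (1.0, 0.1, 0.01):
        print(mu, plus_smooth(-0.3, mu), plus_smooth(0.3, mu))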

270 citations


Cites methods from "A note on the complexity of L p min..."

  • ...Numerical methods for solving nonsmooth, nonconvex optimization problems have been studied extensively [7,10,12,35,37,51,59,68,77,102]....


Proceedings Article
01 Jan 2018
TL;DR: It is shown that gradient descent on full-width linear convolutional networks of depth $L$ converges to a linear predictor related to the $\ell_{2/L}$ bridge penalty in the frequency domain, in contrast to linear fully connected networks, where gradient descent converges to the hard-margin linear support vector machine solution regardless of depth.
Abstract: We show that gradient descent on full-width linear convolutional networks of depth $L$ converges to a linear predictor related to the $\ell_{2/L}$ bridge penalty in the frequency domain. This is in contrast to linear fully connected networks, where gradient descent converges to the hard-margin linear SVM solution, regardless of depth.
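For readers unfamiliar with the term, the $\ell_{2/L}$ "bridge penalty in the frequency domain" named in the abstract is the bridge ($\ell_p$) penalty with $p = 2/L$ applied to the discrete Fourier transform $\hat\beta$ of the linear predictor $\beta$, i.e. roughly $\sum_k |\hat\beta_k|^{2/L}$ (up to the normalization conventions of the paper). The sketch below only evaluates that quantity; it is not the paper's analysis. Note that depth $L = 2$ gives an $\ell_1$-type penalty, while $L > 2$ pushes the exponent below 1 and makes the penalty nonconvex, which is why the citation contexts below point to the NP-hardness result.

    import numpy as np

    def bridge_penalty_freq(beta, depth_L):
        """Illustrative evaluation of the l_{2/L} bridge penalty of the DFT of beta."""
        p = 2.0 / depth_L                 # depth 2 gives p = 1; deeper nets give p < 1
        beta_hat = np.fft.fft(beta)       # frequency-domain representation of the predictor
        return np.sum(np.abs(beta_hat) ** p)

    beta = np.array([1.0, 0.0, -0.5, 0.0])
    print(bridge_penalty_freq(beta, depth_L=2), bridge_penalty_freq(beta, depth_L=4))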

226 citations


Cites background from "A note on the complexity of L p min..."


  • ...First, the sequence of gradients $\nabla_\beta L(P(w))$ converges in direction to a positive span of support vectors of $\beta^\infty = \lim_{t\to\infty} P(w(t))/\|P(w(t))\|$ (Lemma 8 in Gunasekar et al. [2018]), and this result relies on the loss function $\ell$ being exponential tailed....

  • ...When L > 2, and thus p = 2/L < 1, problem (10) is non-convex and intractable Ge et al. [2011]. Hence, we cannot expect to ensure convergence to a global minimum. What we do show is convergence to a first order stationary point of (10) in the sense of sub-stationary points introduced in Rockafellar [1979] for optimization problems with non-smooth and non-convex objectives....


  • ...Similarly, and as we shall see in this paper, changing to a different parameterization of the same model class can also dramatically change the implicit bias Gunasekar et al. [2017]. In particular, we study the implicit bias of optimizing multi-layer fully connected linear networks, and linear convolutional networks (multiple full width convolutional layers followed by a single fully connected layer) using gradient descent....


Journal ArticleDOI
TL;DR: Gives an overview of nonconvex regularization-based sparse and low-rank recovery across signal processing, statistics, and machine learning, including compressive sensing, sparse regression and variable selection, sparse signal separation, sparse principal component analysis (PCA), large covariance and inverse covariance matrix estimation, matrix completion, and robust PCA.
Abstract: In the past decade, sparse and low-rank recovery has drawn much attention in many areas such as signal/image processing, statistics, bioinformatics, and machine learning. To induce sparsity and/or low-rankness, the $\ell _{1}$ norm and the nuclear norm are among the most popular regularization penalties due to their convexity. While the $\ell _{1}$ and nuclear norms are convenient because the related convex optimization problems are usually tractable, it has been shown in many applications that nonconvex penalties can yield significantly better performance. Recently, nonconvex regularization-based sparse and low-rank recovery has attracted considerable interest and is in fact a main driver of the recent progress in nonconvex and nonsmooth optimization. This paper gives an overview of this topic in various fields in signal processing, statistics, and machine learning, including compressive sensing, sparse regression and variable selection, sparse signal separation, sparse principal component analysis (PCA), large covariance and inverse covariance matrix estimation, matrix completion, and robust PCA. We present recent developments of nonconvex regularization-based sparse and low-rank recovery in these fields, addressing the issues of penalty selection, applications, and the convergence of nonconvex algorithms. Code is available at https://github.com/FWen/ncreg.git.
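The survey's point that nonconvex penalties can outperform the convex $\ell_1$ norm is easiest to see through proximal operators. The sketch below contrasts the two extremes, soft-thresholding for $\ell_1$ and hard-thresholding for $\ell_0$: the convex penalty shrinks every surviving coefficient (biasing large entries), while the nonconvex one keeps large entries untouched. Penalties covered in the survey, such as $\ell_p$ (0 < p < 1) or SCAD, interpolate between these behaviors. This code is an illustrative aside, not taken from the survey's repository.

    import numpy as np

    def prox_l1(z, lam):
        """Soft-thresholding: proximal operator of the convex l1 penalty lam*|x|."""
        return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

    def prox_l0(z, lam):
        """Hard-thresholding: proximal operator of the nonconvex l0 penalty lam*1[x != 0]."""
        return np.where(np.abs(z) > np.sqrt(2.0 * lam), z, 0.0)

    z = np.linspace(-3, 3, 7)
    print(prox_l1(z, 1.0))   # shrinks every entry toward zero (bias on large entries)
    print(prox_l0(z, 1.0))   # keeps large entries exactly, zeroing small ones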

132 citations

Posted Content
TL;DR: This article showed that gradient descent on full-width linear convolutional networks of depth $L$ converges to a linear predictor related to the $\ell_{2/L}$ bridge penalty in the frequency domain.
Abstract: We show that gradient descent on full-width linear convolutional networks of depth $L$ converges to a linear predictor related to the $\ell_{2/L}$ bridge penalty in the frequency domain. This is in contrast to linear fully connected networks, where gradient descent converges to the hard-margin linear SVM solution, regardless of depth.

129 citations

References
Book
01 Jan 1979
TL;DR: The second edition of a quarterly column that provides a continuing update to the list of problems (NP-complete and harder) presented by M. R. Garey and D. S. Johnson in the book "Computers and Intractability: A Guide to the Theory of NP-Completeness," W. H. Freeman & Co., San Francisco, 1979.
Abstract: This is the second edition of a quarterly column the purpose of which is to provide a continuing update to the list of problems (NP-complete and harder) presented by M. R. Garey and myself in our book ‘‘Computers and Intractability: A Guide to the Theory of NP-Completeness,’’ W. H. Freeman & Co., San Francisco, 1979 (hereinafter referred to as ‘‘[G&J]’’; previous columns will be referred to by their dates). A background equivalent to that provided by [G&J] is assumed. Readers having results they would like mentioned (NP-hardness, PSPACE-hardness, polynomial-time-solvability, etc.), or open problems they would like publicized, should send them to David S. Johnson, Room 2C355, Bell Laboratories, Murray Hill, NJ 07974, including details, or at least sketches, of any new proofs (full papers are preferred). In the case of unpublished results, please state explicitly that you would like the results mentioned in the column. Comments and corrections are also welcome. For more details on the nature of the column and the form of desired submissions, see the December 1981 issue of this journal.

40,020 citations

Book
01 Jan 1995

12,671 citations


"A note on the complexity of L p min..." refers background in this paper

  • ...Thus, x∗ ≥ 0 satisfies the following necessary conditions ([2])....


Journal ArticleDOI
TL;DR: In this article, penalized likelihood approaches are proposed to handle variable selection problems, and it is shown that the newly proposed estimators perform as well as the oracle procedure in variable selection; namely, they work as well as if the correct submodel were known.
Abstract: Variable selection is fundamental to high-dimensional statistical modeling, including nonparametric regression. Many approaches in use are stepwise selection procedures, which can be computationally expensive and ignore stochastic errors in the variable selection process. In this article, penalized likelihood approaches are proposed to handle these kinds of problems. The proposed methods select variables and estimate coefficients simultaneously. Hence they enable us to construct confidence intervals for estimated parameters. The proposed approaches are distinguished from others in that the penalty functions are symmetric, nonconcave on (0, ∞), and have singularities at the origin to produce sparse solutions. Furthermore, the penalty functions should be bounded by a constant to reduce bias and satisfy certain conditions to yield continuous solutions. A new algorithm is proposed for optimizing penalized likelihood functions. The proposed ideas are widely applicable. They are readily applied to a variety of ...
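The canonical example of the penalty class described in this abstract is the SCAD penalty of Fan and Li; a small sketch of it is given below. It is symmetric, singular at the origin (which produces exactly sparse estimates), and constant (hence bounded) beyond $a\lambda$ (which reduces bias on large coefficients); Fan and Li recommend $a \approx 3.7$. The closed form below is the standard one from the literature and is included only for illustration, not as the paper's estimation algorithm.

    import numpy as np

    def scad_penalty(theta, lam, a=3.7):
        """SCAD penalty of Fan and Li (requires a > 2; a = 3.7 is the recommended default)."""
        t = np.abs(theta)
        linear = lam * t                                           # |theta| <= lam
        quad = -(t**2 - 2 * a * lam * t + lam**2) / (2 * (a - 1))  # lam < |theta| <= a*lam
        const = (a + 1) * lam**2 / 2.0                             # |theta| > a*lam (bounded)
        return np.where(t <= lam, linear, np.where(t <= a * lam, quad, const))

    print(scad_penalty(np.array([0.0, 0.5, 2.0, 10.0]), lam=1.0))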

8,314 citations


"A note on the complexity of L p min..." refers background in this paper

  • ...Thus, one may consider sparse recovery by solving relaxation problem (1) or (2) for a fixed p, 0 ≤ p < 1....