# Optimization with Sparsity-Inducing Penalties

##### Citations

[...]

3,627 citations

### Cites background from "Optimization with Sparsity-Inducing..."

...For instance, when G is tree-structured, meaning that either two groups g, g′ ∈ G are disjoint or one is a subset of the other, the proximal operator can still be evaluated in linear time, as discussed in [109, 4]....
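The tree-structured case admits a particularly simple exact algorithm: if the groups are ordered from leaves to root, the proximal operator of the sum of ℓ2 group norms is the composition of the individual group soft-thresholdings. A minimal sketch of that composition (the group lists, weights, and toy data below are hypothetical, chosen only for illustration):

```python
# Sketch: proximal operator of lam * sum_g w_g ||v_g||_2 for
# tree-structured groups, computed as a single leaves-to-root pass of
# group soft-thresholdings (exact in the tree-structured case).
import math

def group_soft_threshold(v, idx, t):
    """Scale the sub-vector v[idx] toward zero by threshold t (l2 prox)."""
    norm = math.sqrt(sum(v[i] ** 2 for i in idx))
    scale = max(0.0, 1.0 - t / norm) if norm > 0 else 0.0
    for i in idx:
        v[i] *= scale
    return v

def tree_prox(v, groups, weights, lam):
    """`groups` must list a child group before any group containing it
    (leaves-to-root order); then one pass over the groups is exact."""
    v = list(v)
    for idx, w in zip(groups, weights):
        group_soft_threshold(v, idx, lam * w)
    return v

# Toy tree: leaf groups {2} and {1}, then the root group {0, 1, 2}.
groups = [[2], [1], [0, 1, 2]]
weights = [1.0, 1.0, 1.0]
x = tree_prox([3.0, 0.5, 2.0], groups, weights, 1.0)
```

Each coordinate is touched once per group containing it, which is what makes the evaluation linear-time in the total size of the groups.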

[...]

...There is a wide literature on applying various proximal algorithms to particular problems or problem domains, such as nuclear norm problems [183], max norm problems [119], sparse inverse covariance selection [178], MAP inference in undirected graphical models [168], loss minimization in machine learning [32, 73, 110, 4], optimal control [155], energy management [116], and signal processing [61]....

[...]


### Cites background from "Optimization with Sparsity-Inducing..."

...problem is known [Martens and Grosse, 2015]. In contrast, communities who focus on sparsity tend to favor very different approaches [Donoho, 2006, Bach et al., 2012]. This is even more the case for combinatorial optimization, for which relaxations are often the norm [Nemhauser and Wolsey, 1988]. [Figure: optimizer/optimizee loop with parameter updates and error signal] ...

[...]


##### References


### "Optimization with Sparsity-Inducing..." refers background or methods in this paper

...This leads for instance to the Lasso [134] or basis pursuit [37] with the square loss and to ℓ1-regularized logistic regression (see, for instance, [76, 128]) with the logistic loss....
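The ℓ1 penalty behind the Lasso and basis pursuit has a closed-form proximal operator, coordinate-wise soft-thresholding, which is the workhorse of the proximal methods surveyed in the paper. A minimal sketch:

```python
# prox_{t * ||.||_1}(w): shrink each coordinate of w toward zero by t,
# setting coordinates with |w_j| <= t exactly to zero.
def soft_threshold(w, t):
    return [max(abs(x) - t, 0.0) * (1 if x > 0 else -1) for x in w]

# Coordinates smaller than the threshold vanish; the rest are shrunk.
shrunk = soft_threshold([3.0, -0.4, 1.2], 1.0)
```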

[...]

...Combined with the square loss, it leads to the group Lasso formulation [142, 156]....
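For the group Lasso penalty Σ_g ‖w_g‖₂ with disjoint groups, the proximal operator acts block-wise: each group's sub-vector is scaled toward zero, and an entire group vanishes when its ℓ2 norm falls below the threshold. A sketch with hypothetical toy groups:

```python
# Block soft-thresholding: the proximal operator of t * sum_g ||w_g||_2
# for disjoint groups of indices.
import math

def group_lasso_prox(w, groups, t):
    out = list(w)
    for idx in groups:
        norm = math.sqrt(sum(out[i] ** 2 for i in idx))
        scale = max(0.0, 1.0 - t / norm) if norm > 0 else 0.0
        for i in idx:
            out[i] *= scale
    return out

# The second group has norm 0.5 < t = 1, so it is zeroed out entirely.
out = group_lasso_prox([3.0, 4.0, 0.3, 0.4], [[0, 1], [2, 3]], 1.0)
```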

[...]

...Graph Lasso....

[...]

...(4.2) (Block) Coordinate Descent Algorithms: Lasso case....

[...]

...Section 6.2 focuses on the homotopy algorithm, which can efficiently construct the entire regularization path of the Lasso....
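The homotopy algorithm exploits the fact that the Lasso solution is piecewise linear in the regularization parameter. For an orthonormal design the path is even explicit: each coefficient is the soft-threshold of the corresponding correlation (X⊤y)_j, with breakpoints at the values |X⊤y|_j. A minimal sketch of this special case only (not the general homotopy/LARS algorithm):

```python
# Lasso path for min 0.5*||y - Xw||^2 + lam*||w||_1 when X^T X = I:
# w_j(lam) is the soft-threshold of c_j = (X^T y)_j at level lam.
def lasso_path_orthonormal(c, lambdas):
    def st(x, t):  # scalar soft-threshold
        return max(abs(x) - t, 0.0) * (1 if x > 0 else -1)
    return {lam: [st(cj, lam) for cj in c] for lam in lambdas}

c = [2.0, -1.0, 0.5]  # hypothetical correlations X^T y
path = lasso_path_orthonormal(c, [0.0, 0.5, 1.0, 2.0])
# At lam = 2.0 every coefficient is zero; the support grows one
# coordinate at a time as lam decreases past each breakpoint |c_j|.
```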

[...]

[...]


### "Optimization with Sparsity-Inducing..." refers background in this paper

...Note that such a scheme also appears in statistics in boosting procedures [46]....

[...]


### "Optimization with Sparsity-Inducing..." refers background or methods in this paper

...nces therein. 1.4 Optimization Tools. The tools used in this paper are relatively basic and should be accessible to a broad audience. Most of them can be found in classical books on convex optimization [18, 20, 25, 91], but for self-containedness, we present here a few of them related to non-smooth unconstrained optimization. In particular, these tools allow the derivation of rigorous approximate optimality conditi...

[...]

...s met, it is easy to see that these procedures stop in a finite number of iterations. This class of algorithms is typically applied to linear programming and quadratic programming problems (see, e.g., [91]), and here takes specific advantage of sparsity from a computational point of view [9, 56, 69, 92, 102, 104, 113], since the subproblems that need to be solved are typically much smaller than the orig...

[...]

...‖α‖²_K = α⊤Kα. Chapter 5, Reweighted-ℓ2 Algorithms: Approximating a nonsmooth or constrained optimization problem by a series of smooth unconstrained problems is common in optimization (see, e.g., [25, 88, 91]). In the context of objective functions regularized by sparsity-inducing norms, it is natural to consider variational formulations of these norms in terms of squared ℓ2-norms, since many efficient meth...
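The variational formulation alluded to here is the standard identity ‖w‖₁ = min_{η > 0} (1/2) Σ_j (w_j²/η_j + η_j), attained at η_j = |w_j|; alternating between minimizing over w and over η replaces the nonsmooth ℓ1 penalty by a sequence of smooth squared-ℓ2 problems. A quick numerical check of the identity:

```python
# Check: (1/2) * sum_j (w_j**2 / eta_j + eta_j) upper-bounds ||w||_1
# for any eta > 0, with equality at eta_j = |w_j|.
def variational_bound(w, eta):
    return 0.5 * sum(wj ** 2 / ej + ej for wj, ej in zip(w, eta))

w = [1.5, -0.3, 2.0]
l1 = sum(abs(x) for x in w)
tight = variational_bound(w, [abs(x) for x in w])  # equals ||w||_1
loose = variational_bound(w, [1.0, 1.0, 1.0])      # any other eta is larger
```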

[...]