A note on the complexity of L p minimization
Citations
367 citations
Cites background from "A note on the complexity of L p min..."
...However, the ℓq quasi-norm is nonconvex for q < 1, and ℓq minimization is generally NP-hard [17]....
[...]
270 citations
Cites methods from "A note on the complexity of L p min..."
...Numerical methods for solving nonsmooth, nonconvex optimization problems have been studied extensively [7,10,12,35,37,51,59,68,77,102]....
[...]
226 citations
Cites background from "A note on the complexity of L p min..."
...First, the sequence of gradients ∇_β L(P(w(t))) converge in direction to a positive span of support vectors of β∞ = lim_{t→∞} P(w(t)) / ‖P(w(t))‖ (Lemma 8 in Gunasekar et al. [2018]), and this result relies on the loss function ℓ being exponential tailed....
[...]
...When L > 2, and thus p = 2/L < 1, problem (10) is non-convex and intractable Ge et al. [2011]. Hence, we cannot expect to ensure convergence to a global minimum. What we do show is convergence to a first order stationary point of (10) in the sense of sub-stationary points introduced in Rockafellar [1979] for optimization problems with non-smooth and non-convex objectives....
[...]
...Similarly, and as we shall see in this paper, changing to a different parameterization of the same model class can also dramatically change the implicit bias Gunasekar et al. [2017]. In particular, we study the implicit bias of optimizing multi-layer fully connected linear networks, and linear convolutional networks (multiple full width convolutional layers followed by a single fully connected layer) using gradient descent....
[...]
132 citations
129 citations
References
42,654 citations
40,020 citations
12,671 citations
"A note on the complexity of L p min..." refers background in this paper
...Thus, x∗ ≥ 0 satisfies the following necessary conditions ([2])....
[...]
12,336 citations
8,314 citations
"A note on the complexity of L p min..." refers background in this paper
...Thus, one may consider sparse recovery by solving relaxation problem (1) or (2) for a fixed p, 0 5 ]....
[...]