
Showing papers by "Dmitriy Drusvyatskiy published in 2017"


Posted Content
TL;DR: In this paper, the authors consider a nonsmooth formulation of the real phase retrieval problem and show that under standard statistical assumptions, a simple subgradient method converges linearly when initialized within a constant relative distance of an optimal solution.
Abstract: We consider a popular nonsmooth formulation of the real phase retrieval problem. We show that under standard statistical assumptions, a simple subgradient method converges linearly when initialized within a constant relative distance of an optimal solution. Seeking to understand the distribution of the stationary points of the problem, we complete the paper by proving that as the number of Gaussian measurements increases, the stationary points converge to a codimension two set, at a controlled rate. Experiments on image recovery problems illustrate the developed algorithm and theory.
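A minimal Python sketch of the subgradient step described above, for the objective f(x) = (1/m) * sum_i |(a_i^T x)^2 - b_i| with the Polyak step size, assuming noiseless measurements so the optimal value is zero (function names and data are illustrative, not the authors' code):

import numpy as np

def polyak_subgradient(A, b, x0, iters=500):
    # Minimize f(x) = mean_i |(a_i^T x)^2 - b_i| with Polyak-stepped subgradients.
    # Assumes the noiseless setting, so min f = 0 and the Polyak step applies.
    m = A.shape[0]
    x = x0.copy()
    for _ in range(iters):
        Ax = A @ x
        r = Ax ** 2 - b                              # residuals (a_i^T x)^2 - b_i
        fx = np.mean(np.abs(r))                      # current objective value
        g = (2.0 / m) * (A.T @ (np.sign(r) * Ax))    # a subgradient of f at x
        if fx == 0.0 or g @ g == 0.0:
            break
        x = x - (fx / (g @ g)) * g                   # Polyak step: f(x) / ||g||^2
    return x

# Illustrative usage: Gaussian measurements, initialization near the signal.
rng = np.random.default_rng(0)
n, m = 50, 400
x_star = rng.standard_normal(n)
A = rng.standard_normal((m, n))
b = (A @ x_star) ** 2
x_hat = polyak_subgradient(A, b, x_star + 0.1 * rng.standard_normal(n))
print(min(np.linalg.norm(x_hat - x_star), np.linalg.norm(x_hat + x_star)))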

59 citations


Book
19 Dec 2017
TL;DR: In this article, the authors describe various reasons for the loss of strict feasibility, whether due to poor modelling choices or (more interestingly) rich underlying structure, and discuss ways to cope with it and, in many pronounced cases, how to use it as an advantage.
Abstract: Slater's condition -- existence of a "strictly feasible solution" -- is a common assumption in conic optimization. Without strict feasibility, first-order optimality conditions may be meaningless, the dual problem may yield little information about the primal, and small changes in the data may render the problem infeasible. Hence, failure of strict feasibility can negatively impact off-the-shelf numerical methods, such as primal-dual interior point methods, in particular. New optimization modelling techniques and convex relaxations for hard nonconvex problems have shown that the loss of strict feasibility is a more pronounced phenomenon than has previously been realized. In this text, we describe various reasons for the loss of strict feasibility, whether due to poor modelling choices or (more interestingly) rich underlying structure, and discuss ways to cope with it and, in many pronounced cases, how to use it as an advantage. In large part, we emphasize the facial reduction preprocessing technique due to its mathematical elegance, geometric transparency, and computational potential.
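A toy illustration of the facial reduction idea emphasized here (a sketch with hypothetical data, not from the text): when a constraint matrix is itself positive semidefinite and its right-hand side is zero, it exposes a proper face of the PSD cone, and restricting the problem to that face restores strict feasibility.

import numpy as np

# Feasibility region: X PSD with <A1, X> = 0, where A1 = diag(1, 0) is PSD.
# Every feasible X has X_11 = 0, so no feasible X is positive definite and
# Slater's condition fails; A1 exposes the face {V T V^T : T PSD}.
A1 = np.diag([1.0, 0.0])

# One facial reduction step: a basis V for the null space of the exposing
# vector A1 parameterizes the face containing all feasible points.
w, U = np.linalg.eigh(A1)
V = U[:, w < 1e-12]                 # here V = [0, 1]^T

# Reduced problem in the smaller variable T: X = V T V^T with T PSD.
# The reduced set {T PSD} is strictly feasible, e.g. T = [[1]].
T = np.array([[1.0]])
X = V @ T @ V.T
print(X)                            # [[0, 0], [0, 1]]: feasible for the original
print(np.trace(A1 @ X))             # 0: the equality constraint holds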

53 citations


Posted Content
TL;DR: The role of the proximal point method in large-scale optimization is revisited, focusing on three recent examples: a proximally guided subgradient method for weakly convex stochastic approximation, the prox-linear algorithm for minimizing compositions of convex functions and smooth maps, and Catalyst generic acceleration for regularized Empirical Risk Minimization.
Abstract: In this short survey, I revisit the role of the proximal point method in large scale optimization. I focus on three recent examples: a proximally guided subgradient method for weakly convex stochastic approximation, the prox-linear algorithm for minimizing compositions of convex functions and smooth maps, and Catalyst generic acceleration for regularized Empirical Risk Minimization.
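For concreteness, a minimal sketch of the proximal point iteration that the survey revisits, with each subproblem handed to a generic smooth solver (the objective, step parameter, and helper name below are hypothetical):

import numpy as np
from scipy.optimize import minimize

def proximal_point(f, x0, lam=1.0, iters=50):
    # Proximal point method: x_{k+1} = argmin_y f(y) + (1/(2*lam)) * ||y - x_k||^2.
    # Each prox subproblem is solved here by a generic local solver.
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        x = minimize(lambda y, xc=x: f(y) + np.sum((y - xc) ** 2) / (2 * lam), x).x
    return x

# Illustrative usage on a smooth convex function.
f = lambda y: 0.5 * np.sum((y - np.array([3.0, -1.0])) ** 2)
print(proximal_point(f, np.zeros(2)))   # approaches [3, -1]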

47 citations


Posted Content
TL;DR: A generic scheme is introduced to solve nonconvex optimization problems using gradient-based algorithms originally designed for minimizing convex functions; promising experimental results are shown for the approach applied to SVRG and SAGA for sparse matrix factorization and for learning neural networks.
Abstract: We introduce a generic scheme to solve nonconvex optimization problems using gradient-based algorithms originally designed for minimizing convex functions. When the objective is convex, the proposed approach enjoys the same properties as the Catalyst approach of Lin et al., 2015. When the objective is nonconvex, it achieves the best known convergence rate to stationary points for first-order methods. Specifically, the proposed algorithm does not require knowledge about the convexity of the objective; yet, it obtains an overall worst-case efficiency of $$\mathcal{O}(\varepsilon^{-2})$$ and, if the function is convex, the complexity reduces to the near-optimal rate $$\mathcal{O}(\varepsilon^{-2/3})$$. We conclude the paper by showing promising experimental results obtained by applying the proposed approach to SVRG and SAGA for sparse matrix factorization and for learning neural networks.
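A schematic sketch of the outer loop of such a scheme (Catalyst-style smoothing plus extrapolation; this simplified version omits the paper's automatic adaptation and acceptance tests, and all parameters are illustrative):

import numpy as np
from scipy.optimize import minimize, rosen

def catalyst_style(f, x0, kappa=1.0, outer_iters=30):
    # Each outer step approximately minimizes the quadratically regularized
    # model f(z) + (kappa/2) * ||z - y_k||^2, which is better conditioned for a
    # convex-style inner method; the anchor y_k is updated by Nesterov-type
    # extrapolation.
    x_prev = x = np.asarray(x0, dtype=float)
    for k in range(1, outer_iters + 1):
        y = x + ((k - 1.0) / (k + 2.0)) * (x - x_prev)   # extrapolated anchor
        sub = lambda z, yc=y: f(z) + 0.5 * kappa * np.sum((z - yc) ** 2)
        x_prev, x = x, minimize(sub, x).x                # generic inner solver
    return x

# Illustrative usage on a smooth nonconvex test function.
print(catalyst_style(rosen, np.array([-1.2, 1.0]), kappa=0.5))   # tends toward [1, 1]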

41 citations


Journal ArticleDOI
TL;DR: It is observed that Sturm's error bounds readily imply that for semidefinite feasibility problems, the method of alternating projections converges at a rate of O(k^{-1/(2^{d+1}-2)}), where d is the singularity degree of the problem -- the minimal number of facial reduction iterations needed to induce Slater's condition.
Abstract: We observe that Sturm's error bounds readily imply that for semidefinite feasibility problems, the method of alternating projections converges at a rate of $$\mathcal{O}\Big(k^{-\frac{1}{2^{d+1}-2}}\Big)$$, where d is the singularity degree of the problem -- the minimal number of facial reduction iterations needed to induce Slater's condition. Consequently, for almost all such problems (in the sense of Lebesgue measure), alternating projections converge at a worst-case rate of $$\mathcal{O}\Big(\frac{1}{\sqrt{k}}\Big)$$.
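A minimal sketch of the alternating projections iteration that the rate above concerns, for a feasibility problem {X PSD : <A_i, X> = b_i} (the data below are hypothetical):

import numpy as np

def proj_psd(X):
    # Project a symmetric matrix onto the PSD cone: zero out negative eigenvalues.
    w, U = np.linalg.eigh((X + X.T) / 2)
    return (U * np.maximum(w, 0)) @ U.T

def proj_affine(X, mats, b):
    # Project onto {X : <A_i, X> = b_i} via the normal equations; assumes the
    # constraint matrices A_i are linearly independent.
    M = np.stack([Ai.ravel() for Ai in mats])          # m x n^2 constraint matrix
    corr = np.linalg.solve(M @ M.T, M @ X.ravel() - b)
    return X - (M.T @ corr).reshape(X.shape)

def alternating_projections(mats, b, n, iters=1000):
    X = np.eye(n)
    for _ in range(iters):
        X = proj_psd(proj_affine(X, mats, b))
    return X

# Illustrative usage: find X PSD with trace(X) = 2 and X_12 = 0.5.
mats = [np.eye(2), np.array([[0.0, 0.5], [0.5, 0.0]])]
b = np.array([2.0, 0.5])
X = alternating_projections(mats, b, 2)
print(X, np.linalg.eigvalsh(X))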

34 citations


Journal ArticleDOI
TL;DR: In this article, the Euclidean distance degree of an orthogonally invariant matrix variety is shown to equal the Euclidean distance degree of its restriction to diagonal matrices.
Abstract: The Euclidean distance degree of a real variety is an important invariant arising in distance minimization problems. We show that the Euclidean distance degree of an orthogonally invariant matrix variety equals the Euclidean distance degree of its restriction to diagonal matrices. We illustrate how this result can greatly simplify calculations in concrete circumstances.
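A small numerical illustration of the transfer principle behind this result (an illustration of the diagonal restriction, not of the degree count itself): for the orthogonally invariant variety of matrices of rank at most one, the diagonal restriction is the set of vectors with at most one nonzero entry, and projecting the singular values onto that set recovers the Eckart-Young rank-one approximation.

import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((4, 3))
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Diagonal restriction: the nearest vector with at most one nonzero entry
# keeps only the largest singular value.
s_proj = np.zeros_like(s)
s_proj[0] = s[0]

# Lift back to matrix space through the singular vectors.
X1 = (U * s_proj) @ Vt

# Agrees with the direct rank-one approximation from the top singular pair.
X1_direct = s[0] * np.outer(U[:, 0], Vt[0])
print(np.linalg.norm(X1 - X1_direct))   # ~0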

28 citations


Journal ArticleDOI
TL;DR: Two algorithms for large-scale noisy low-rank Euclidean distance matrix completion problems, based on semidefinite optimization, are presented: a combinatorial procedure relating cliques in the graph of known distances to faces of the positive semidefinite cone, and a first-order method for maximizing the trace -- a popular low-rank-inducing regularizer -- in a formulation of the problem with a constrained misfit.
Abstract: We present two algorithms for large-scale noisy low-rank Euclidean distance matrix completion problems, based on semidefinite optimization. Our first method works by relating cliques in the graph of the known distances to faces of the positive semidefinite cone, yielding a combinatorial procedure that is provably robust and partly parallelizable. Our second algorithm is a first-order method for maximizing the trace---a popular low-rank inducing regularizer---in the formulation of the problem with a constrained misfit. Both of the methods output a point configuration that can serve as a high-quality initialization for local optimization techniques. Numerical experiments on large-scale sensor localization problems illustrate the two approaches.
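As background for the Gram-matrix viewpoint underlying both algorithms, a minimal classical multidimensional scaling (MDS) sketch: the identity D_ij = G_ii + G_jj - 2 G_ij links a squared Euclidean distance matrix to a PSD Gram matrix, and the resulting configuration can seed local refinement (hypothetical data; this is the standard building block, not the paper's clique-based procedure).

import numpy as np

def classical_mds(D, dim):
    # Recover points from a complete squared-distance matrix D through the
    # centered Gram matrix G = -0.5 * J D J with J = I - (1/n) * 11^T.
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    G = -0.5 * J @ D @ J                     # PSD whenever D is a true EDM
    w, U = np.linalg.eigh(G)
    w, U = w[::-1], U[:, ::-1]               # eigenvalues in descending order
    return U[:, :dim] * np.sqrt(np.maximum(w[:dim], 0.0))

# Illustrative usage: reconstruct 2-D points up to rigid motion.
P = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 1.0]])
D = np.sum((P[:, None, :] - P[None, :, :]) ** 2, axis=-1)
Q = classical_mds(D, 2)
D_rec = np.sum((Q[:, None, :] - Q[None, :, :]) ** 2, axis=-1)
print(np.max(np.abs(D - D_rec)))             # ~0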

18 citations


Posted Content
TL;DR: Various reasons for the loss of strict feasibility are described, whether due to poor modelling choices or rich underlying structure, along with ways to cope with it and, in many pronounced cases, to use it as an advantage.
Abstract: Slater's condition -- existence of a "strictly feasible solution" -- is a common assumption in conic optimization. Without strict feasibility, first-order optimality conditions may be meaningless, the dual problem may yield little information about the primal, and small changes in the data may render the problem infeasible. Hence, failure of strict feasibility can negatively impact off-the-shelf numerical methods, such as primal-dual interior point methods, in particular. New optimization modelling techniques and convex relaxations for hard nonconvex problems have shown that the loss of strict feasibility is a more pronounced phenomenon than has previously been realized. In this text, we describe various reasons for the loss of strict feasibility, whether due to poor modelling choices or (more interestingly) rich underlying structure, and discuss ways to cope with it and, in many pronounced cases, how to use it as an advantage. In large part, we emphasize the facial reduction preprocessing technique due to its mathematical elegance, geometric transparency, and computational potential.

16 citations


Posted Content
TL;DR: In this article, gauge duality is explained using a modern approach to duality based on a perturbation framework, putting it on equal footing with Fenchel-Rockafellar duality, explaining gauge dual variables as sensitivity measures, and showing how to recover primal solutions from those of the gauge dual.
Abstract: We revisit the foundations of gauge duality and demonstrate that it can be explained using a modern approach to duality based on a perturbation framework. We therefore put gauge duality and Fenchel-Rockafellar duality on equal footing, including explaining gauge dual variables as sensitivity measures, and showing how to recover primal solutions from those of the gauge dual. This vantage point allows a direct proof that optimal solutions of the Fenchel-Rockafellar dual of the gauge dual are precisely the primal solutions rescaled by the optimal value. We extend the gauge duality framework to the setting in which the functional components are general nonnegative convex functions, including problems with piecewise linear quadratic functions and constraints that arise from generalized linear models used in regression.
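A numerical check of the gauge duality being revisited, in the special case kappa = ||.||_1 with equality constraints (hypothetical data; both problems are posed as linear programs): the primal p* = min ||x||_1 s.t. Ax = b and its gauge dual d* = min ||A^T y||_inf s.t. <b, y> >= 1 should satisfy p* * d* = 1.

import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(2)
m, n = 3, 6
A = rng.standard_normal((m, n))
b = A @ rng.standard_normal(n)            # ensures primal feasibility

# Primal as an LP in (x, u): min 1^T u  s.t.  x <= u, -x <= u, Ax = b.
c = np.concatenate([np.zeros(n), np.ones(n)])
A_ub = np.block([[np.eye(n), -np.eye(n)], [-np.eye(n), -np.eye(n)]])
A_eq = np.hstack([A, np.zeros((m, n))])
p = linprog(c, A_ub=A_ub, b_ub=np.zeros(2 * n), A_eq=A_eq, b_eq=b,
            bounds=[(None, None)] * (2 * n)).fun

# Gauge dual as an LP in (y, t): min t  s.t.  +-(A^T y) <= t, <b, y> >= 1.
c_d = np.concatenate([np.zeros(m), [1.0]])
A_ub_d = np.block([[A.T, -np.ones((n, 1))],
                   [-A.T, -np.ones((n, 1))],
                   [-b.reshape(1, -1), np.zeros((1, 1))]])
b_ub_d = np.concatenate([np.zeros(2 * n), [-1.0]])
d = linprog(c_d, A_ub=A_ub_d, b_ub=b_ub_d,
            bounds=[(None, None)] * (m + 1)).fun

print(p * d)                              # ~1.0: strong gauge duality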

15 citations


Journal ArticleDOI
TL;DR: No abstract was indexed for this entry; only funding acknowledgments were extracted.
Abstract: [Funding acknowledgments only] BASAL PFB-03 (Chile); FONDECYT 1171854 (Chile); MTM2014-59179-C2-1-P (MINECO of Spain and ERDF of EU); AFOSR YIP FA9550-15-1-0237.

1 citation