Open Access · Posted Content
First-order Methods Almost Always Avoid Saddle Points
Jason D. Lee, Ioannis Panageas, Georgios Piliouras, Max Simchowitz, Michael I. Jordan, Benjamin Recht +5 more
TL;DR: In this article, it is shown that first-order methods avoid saddle points for almost all initializations, and that neither access to second-order derivative information nor randomness beyond initialization is necessary to provably avoid saddle points.
Abstract: We establish that first-order methods avoid saddle points for almost all initializations. Our results apply to a wide variety of first-order methods, including gradient descent, block coordinate descent, mirror descent, and variants thereof. The connecting thread is that such algorithms can be studied from a dynamical systems perspective in which appropriate instantiations of the Stable Manifold Theorem allow for a global stability analysis. Thus, neither access to second-order derivative information nor randomness beyond initialization is necessary to provably avoid saddle points.
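The abstract's claim can be illustrated numerically. Below is a minimal sketch (not code from the paper): gradient descent on f(x, y) = x² − y², which has a strict saddle point at the origin, escapes the saddle from almost any random initialization — the only initializations that converge to the saddle lie on the measure-zero stable manifold y = 0.

```python
import numpy as np

# Illustrative sketch (not from the paper): gradient descent on
# f(x, y) = x^2 - y^2, which has a strict saddle point at the origin.
def grad(p):
    x, y = p
    return np.array([2.0 * x, -2.0 * y])

rng = np.random.default_rng(0)
p = 0.1 * rng.standard_normal(2)  # random initialization near the saddle
eta = 0.1                         # step size

for _ in range(200):
    p = p - eta * grad(p)

# The stable (x) direction contracts toward 0, while any nonzero
# component along the unstable (y) direction is amplified, so the
# iterate leaves every neighborhood of the saddle.
print(abs(p[0]), abs(p[1]))
```

Since a random initialization has a nonzero y-component with probability one, gradient descent escapes this saddle almost surely, matching the paper's measure-theoretic statement.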
Citations
Journal Article (DOI)
A high-bias, low-variance introduction to Machine Learning for physicists
Pankaj Mehta, Marin Bukov, Ching-Hao Wang, Alexandre G. R. Day, Charles C. Richardson, Charles K. Fisher, David J. Schwab +6 more
TL;DR: The review begins by covering fundamental concepts in ML and modern statistics such as the bias-variance tradeoff, overfitting, regularization, generalization, and gradient descent before moving on to more advanced topics in both supervised and unsupervised learning.
Journal Article (DOI)
Nonconvex Optimization Meets Low-Rank Matrix Factorization: An Overview
Yuejie Chi, Yue Lu, Yuxin Chen +2 more
TL;DR: This tutorial-style overview highlights the important role of statistical models in enabling efficient nonconvex optimization with performance guarantees and reviews two contrasting approaches: two-stage algorithms, which consist of a tailored initialization step followed by successive refinement; and global landscape analysis and initialization-free algorithms.
Journal Article (DOI)
Denoising Prior Driven Deep Neural Network for Image Restoration
TL;DR: Zhang et al. propose a convolutional neural network (CNN)-based denoiser that exploits the multi-scale redundancies of natural images and leverages the prior of the observation model.
Proceedings Article
A Lyapunov-based Approach to Safe Reinforcement Learning
TL;DR: In this paper, the authors propose a method for constructing Lyapunov functions, which provide an effective way to guarantee the global safety of a behavior policy during training via a set of local linear constraints.
Posted Content
Optimistic mirror descent in saddle-point problems: Going the extra (gradient) mile
Panayotis Mertikopoulos, Bruno Lecouat, Houssam Zenati, Chuan-Sheng Foo, Vijay Chandrasekhar, Georgios Piliouras +10 more
TL;DR: This paper showed that mirror descent may fail to converge even in bilinear models with a unique solution, but this deficiency is mitigated by optimism: by taking an extra-gradient step, optimistic mirror descent (OMD) converges in all coherent problems.
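The bilinear failure mode and its extra-gradient fix described above can be sketched numerically. This is an illustrative example, not code from the paper, and it uses the plain Euclidean extra-gradient update rather than the paper's mirror-descent setting: on f(x, y) = x·y, simultaneous gradient descent-ascent spirals away from the equilibrium at the origin, while the extra-gradient variant converges to it.

```python
from math import hypot

eta = 0.1  # step size

def gda_step(x, y):
    # simultaneous gradient descent-ascent on f(x, y) = x * y:
    # x descends on f (grad_x f = y), y ascends on f (grad_y f = x)
    return x - eta * y, y + eta * x

def eg_step(x, y):
    # extra-gradient: take a lookahead step, then update from the
    # current point using the gradient evaluated at the lookahead
    xh, yh = x - eta * y, y + eta * x
    return x - eta * yh, y + eta * xh

x1 = y1 = 1.0  # plain GDA iterate
x2 = y2 = 1.0  # extra-gradient iterate
for _ in range(500):
    x1, y1 = gda_step(x1, y1)
    x2, y2 = eg_step(x2, y2)

print(hypot(x1, y1))  # grows without bound: GDA spirals outward
print(hypot(x2, y2))  # shrinks toward 0: extra-gradient converges
```

A short calculation shows why: each GDA step multiplies the squared distance to the origin by 1 + η², while each extra-gradient step multiplies it by (1 − η²)² + η² < 1 for small η.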
References
Book
Understanding Machine Learning: From Theory To Algorithms
TL;DR: The aim of this textbook is to introduce machine learning, and the algorithmic paradigms it offers, in a principled way in an advanced undergraduate or beginning graduate course.
Book
Optimization Algorithms on Matrix Manifolds
TL;DR: Optimization Algorithms on Matrix Manifolds offers techniques with broad applications in linear algebra, signal processing, data mining, computer vision, and statistical analysis and will be of interest to applied mathematicians, engineers, and computer scientists.
Book
Differential Equations and Dynamical Systems
TL;DR: The third edition of this text presents linear systems and local theory together with nonlinear systems and global theory, extending the material of the second edition.
Journal Article (DOI)
Matrix Completion From a Few Entries
TL;DR: OptSpace, as presented in this paper, reconstructs an nα × n matrix from a uniformly random subset of its entries with probability larger than 1 − 1/n³, generalizing a result of Friedman-Kahn-Szemeredi and Feige-Ofek.
Journal Article (DOI)
Some NP-complete problems in quadratic and nonlinear programming
Katta G. Murty, Santosh N. Kabadi +1 more
TL;DR: A special class of indefinite quadratic programs is constructed, with simple constraints and integer data, and it is shown that checking (a) or (b) on this class is NP-complete.