Open Access
Posted Content
Newton Sketch: A Linear-time Optimization Algorithm with Linear-Quadratic Convergence
TLDR
A randomized second-order method for optimization known as the Newton Sketch, based on performing an approximate Newton step using a randomly projected or sub-sampled Hessian, is proposed; it achieves super-linear convergence with exponentially high probability, with convergence and complexity guarantees that are independent of condition numbers and related problem-dependent quantities.
Abstract
We propose a randomized second-order method for optimization known as the Newton Sketch: it is based on performing an approximate Newton step using a randomly projected or sub-sampled Hessian. For self-concordant functions, we prove that the algorithm has super-linear convergence with exponentially high probability, with convergence and complexity guarantees that are independent of condition numbers and related problem-dependent quantities. Given a suitable initialization, similar guarantees also hold for strongly convex and smooth objectives without self-concordance. When implemented using randomized projections based on a sub-sampled Hadamard basis, the algorithm typically has substantially lower complexity than Newton's method. We also describe extensions of our methods to programs involving convex constraints that are equipped with self-concordant barriers. We discuss and illustrate applications to linear programs, quadratic programs with convex constraints, logistic regression and other generalized linear models, as well as semidefinite programs.
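To make the construction concrete, here is a minimal sketch of one Newton Sketch iteration for logistic regression (one of the applications listed above). It assumes a dense Gaussian sketching matrix and a fixed step size for simplicity; the paper emphasizes faster embeddings such as the sub-sampled randomized Hadamard transform together with a line search, and the function name and parameters below are ours, not the authors' implementation.

```python
import numpy as np

def newton_sketch_step(A, y, x, m, step=1.0, rng=None):
    """One approximate Newton step: sketch the Hessian square root,
    form the d x d sketched Hessian, and solve for the direction."""
    rng = np.random.default_rng() if rng is None else rng
    n, d = A.shape
    p = 1.0 / (1.0 + np.exp(-(A @ x)))            # sigmoid probabilities
    grad = A.T @ (p - y) / n                      # exact gradient
    # Hessian square root for logistic loss: D^{1/2} A, D = diag(p(1-p))/n
    sqrt_hess = np.sqrt(p * (1.0 - p) / n)[:, None] * A
    S = rng.standard_normal((m, n)) / np.sqrt(m)  # m x n Gaussian sketch
    SA = S @ sqrt_hess                            # sketched square root
    H_sketch = SA.T @ SA                          # approximate Hessian
    return x - step * np.linalg.solve(H_sketch, grad)
```

Note that with a dense Gaussian sketch the product S @ sqrt_hess itself costs O(mnd), which is why the paper relies on Hadamard-based projections that bring the sketching cost down to roughly O(nd log m) while still requiring m to be at least on the order of d.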
Citations
Posted Content
Optimization Methods for Large-Scale Machine Learning
TL;DR: A major theme of this study is that large-scale machine learning represents a distinctive setting in which the stochastic gradient method has traditionally played a central role while conventional gradient-based nonlinear optimization techniques typically falter, leading to a discussion about the next generation of optimization methods for large-scale machine learning.
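As a point of reference for the central role this survey assigns to it, a stochastic gradient step touches only one of the n summands per update. The loop below is a generic textbook sketch, not the survey's pseudocode, and the decaying step size is one common choice among many.

```python
import numpy as np

def sgd(grad_i, x0, n, steps=1000, lr=0.1, rng=None):
    """Minimize (1/n) * sum_i f_i(x) using one randomly chosen
    summand's gradient per iteration."""
    rng = np.random.default_rng() if rng is None else rng
    x = x0.copy()
    for t in range(steps):
        i = rng.integers(n)                         # pick a random summand
        x = x - lr / np.sqrt(t + 1) * grad_i(i, x)  # decaying step size
    return x
```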
Journal Article
Randomized Sketches of Convex Programs With Sharp Guarantees
TL;DR: This work analyzes random projection (RP)-based approximations of convex programs, in which the original optimization problem is approximated by solving a lower-dimensional problem, and proves that the approximation ratio of this procedure can be bounded in terms of the geometry of the constraint set.
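The simplest instance of this "sketch-and-solve" scheme is unconstrained least squares: project the data into m dimensions and solve the smaller problem. This toy version, assuming a Gaussian sketch, is ours for illustration; the paper's analysis covers general convex constraint sets.

```python
import numpy as np

def sketch_and_solve(A, b, m, rng=None):
    """Replace min ||Ax - b||^2 by the lower-dimensional problem
    min ||S(Ax - b)||^2 and solve that instead."""
    rng = np.random.default_rng() if rng is None else rng
    S = rng.standard_normal((m, A.shape[0])) / np.sqrt(m)
    x_hat, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
    return x_hat
```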
Journal Article
Exact and inexact subsampled Newton methods for optimization
TL;DR: This paper analyzes an inexact Newton method that solves linear systems approximately using the conjugate gradient (CG) method, and that samples the Hessian and not the gradient (the gradient is assumed to be exact).
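A rough sketch of that recipe, assuming a logistic-regression finite sum and a subsample large enough for the estimated Hessian to be positive definite; the helper below is illustrative, not the paper's algorithm.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

def subsampled_newton_direction(A, y, x, sample_size, rng=None):
    """Exact full gradient, Hessian estimated from a random subsample,
    direction computed inexactly by conjugate gradients."""
    rng = np.random.default_rng() if rng is None else rng
    n, d = A.shape
    p = 1.0 / (1.0 + np.exp(-(A @ x)))
    grad = A.T @ (p - y) / n                     # exact full gradient
    idx = rng.choice(n, size=sample_size, replace=False)
    As = A[idx]
    w = (p * (1.0 - p))[idx] / sample_size       # subsampled Hessian weights

    def hess_vec(v):
        # Subsampled Hessian-vector product, never forming the d x d matrix
        return As.T @ (w * (As @ v))

    H = LinearOperator((d, d), matvec=hess_vec)
    direction, _ = cg(H, grad)                   # inexact CG solve
    return direction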
Journal Article
Iterative hessian sketch: fast and accurate solution approximation for constrained least-squares
TL;DR: In this paper, the authors study randomized sketching methods for approximately solving least-squares problems with a general convex constraint, and provide a general lower bound on any randomized method that sketches both the data matrix and vector.
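For the unconstrained case the iteration has a closed form, shown below under the assumption of a fresh Gaussian sketch per pass. Note that only the data matrix A is sketched, never the vector b; sketching both is exactly the setting the paper's lower bound applies to.

```python
import numpy as np

def iterative_hessian_sketch(A, b, m, iters=10, rng=None):
    """Each pass solves a sketched quadratic model around the current
    iterate, driven by the exact residual A^T (b - A x)."""
    rng = np.random.default_rng() if rng is None else rng
    n, d = A.shape
    x = np.zeros(d)
    for _ in range(iters):
        S = rng.standard_normal((m, n)) / np.sqrt(m)  # fresh sketch
        SA = S @ A
        # x <- x + argmin_u 0.5 * ||S A u||^2 - <A^T (b - A x), u>
        x = x + np.linalg.solve(SA.T @ SA, A.T @ (b - A @ x))
    return x
```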
Posted Content
Newton-Type Methods for Non-Convex Optimization Under Inexact Hessian Information
TL;DR: The canonical problem of finite-sum minimization is considered; appropriate uniform and non-uniform sub-sampling strategies are provided to construct such Hessian approximations, and optimal iteration complexity is obtained for the corresponding sub-sampled trust-region and adaptive cubic regularization methods.
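For orientation, the adaptive cubic regularization step referred to here minimizes a cubic model built from the inexact Hessian. The following is the generic form of that subproblem in our notation (with H_t the sub-sampled Hessian approximation and sigma_t an adaptive regularization weight), not the paper's exact statement.

```latex
m_t(s) = f(x_t) + \nabla f(x_t)^\top s + \tfrac{1}{2}\, s^\top H_t\, s
         + \tfrac{\sigma_t}{3}\, \|s\|^{3},
\qquad x_{t+1} = x_t + \arg\min_{s} m_t(s)
```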
References
Journal Article
Regression Shrinkage and Selection via the Lasso
TL;DR: A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute values of the coefficients being less than a constant, is proposed.
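In symbols, the constrained form described in this TL;DR is:

```latex
\hat{\beta}^{\text{lasso}} = \arg\min_{\beta}\;
  \sum_{i=1}^{n} \Bigl( y_i - \sum_{j} x_{ij}\,\beta_j \Bigr)^{2}
\quad \text{subject to} \quad \sum_{j} \lvert \beta_j \rvert \le t
```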
Book
Convex Optimization
Stephen Boyd, Lieven Vandenberghe +1 more
TL;DR: A comprehensive introduction to convex optimization, with the focus on recognizing convex optimization problems and then finding the most appropriate technique for solving them.
Journal Article
Least angle regression
Bradley Efron, Trevor Hastie, Iain M. Johnstone, Robert Tibshirani, Hemant Ishwaran, Keith Knight, Jean-Michel Loubes, Pascal Massart, David Madigan, Greg Ridgeway, Saharon Rosset, Ji Zhu, Robert A. Stine, Berwin A. Turlach, Sanford Weisberg +19 more
TL;DR: A publicly available algorithm that requires only the same order of magnitude of computational effort as ordinary least squares applied to the full set of covariates is described.
Book Chapter
Introduction to the non-asymptotic analysis of random matrices
TL;DR: This is a tutorial on some basic non-asymptotic methods and concepts in random matrix theory, particularly for the problem of estimating covariance matrices in statistics and for validating probabilistic constructions of measurement matrices in compressed sensing.
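A representative bound of the kind this tutorial develops, stated here from memory with unspecified constants C, c depending on the sub-gaussian norm: for an n x d matrix A with independent sub-gaussian isotropic rows, the extreme singular values concentrate as

```latex
\sqrt{n} - C\sqrt{d} - t \;\le\; s_{\min}(A) \;\le\; s_{\max}(A) \;\le\; \sqrt{n} + C\sqrt{d} + t
\quad \text{with probability at least } 1 - 2e^{-ct^{2}}
```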