Open Access
Posted Content
Newton Sketch: A Linear-time Optimization Algorithm with Linear-Quadratic Convergence
TLDR
A randomized second-order method for optimization known as the Newton Sketch, based on performing an approximate Newton step using a randomly projected or sub-sampled Hessian, is proposed; it achieves super-linear convergence with exponentially high probability, with convergence and complexity guarantees that are independent of condition numbers and related problem-dependent quantities.
Abstract
We propose a randomized second-order method for optimization known as the Newton Sketch: it is based on performing an approximate Newton step using a randomly projected or sub-sampled Hessian. For self-concordant functions, we prove that the algorithm has super-linear convergence with exponentially high probability, with convergence and complexity guarantees that are independent of condition numbers and related problem-dependent quantities. Given a suitable initialization, similar guarantees also hold for strongly convex and smooth objectives without self-concordance. When implemented using randomized projections based on a sub-sampled Hadamard basis, the algorithm typically has substantially lower complexity than Newton's method. We also describe extensions of our methods to programs involving convex constraints that are equipped with self-concordant barriers. We discuss and illustrate applications to linear programs, quadratic programs with convex constraints, logistic regression and other generalized linear models, as well as semidefinite programs.
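To make the construction concrete, here is a minimal sketch of one Newton Sketch iteration for logistic regression (one of the applications listed above). It assumes a dense Gaussian sketching matrix and a fixed step size for simplicity; the paper emphasizes faster embeddings such as the sub-sampled randomized Hadamard transform together with a line search, and the function name and parameters below are ours, not the authors' implementation.

```python
import numpy as np

def newton_sketch_step(A, y, x, m, step=1.0, rng=None):
    """One approximate Newton step: sketch the Hessian square root,
    form the d x d sketched Hessian, and solve for the direction."""
    rng = np.random.default_rng() if rng is None else rng
    n, d = A.shape
    p = 1.0 / (1.0 + np.exp(-(A @ x)))            # sigmoid probabilities
    grad = A.T @ (p - y) / n                      # exact gradient
    # Hessian square root for logistic loss: D^{1/2} A, D = diag(p(1-p))/n
    sqrt_hess = np.sqrt(p * (1.0 - p) / n)[:, None] * A
    S = rng.standard_normal((m, n)) / np.sqrt(m)  # m x n Gaussian sketch
    SA = S @ sqrt_hess                            # sketched square root
    H_sketch = SA.T @ SA                          # approximate Hessian
    return x - step * np.linalg.solve(H_sketch, grad)
```

Note that with a dense Gaussian sketch the product S @ sqrt_hess itself costs O(mnd), which is why the paper relies on Hadamard-based projections that bring the sketching cost down to roughly O(nd log m) while still requiring m to be at least on the order of d.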
Citations
Posted Content
Optimization Methods for Large-Scale Machine Learning
TL;DR: A major theme of this study is that large-scale machine learning represents a distinctive setting in which the stochastic gradient method has traditionally played a central role while conventional gradient-based nonlinear optimization techniques typically falter, leading to a discussion about the next generation of optimization methods for large-scale machine learning.
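As a point of reference for the central role this survey assigns to it, a stochastic gradient step touches only one of the n summands per update. The loop below is a generic textbook sketch, not the survey's pseudocode, and the decaying step size is one common choice among many.

```python
import numpy as np

def sgd(grad_i, x0, n, steps=1000, lr=0.1, rng=None):
    """Minimize (1/n) * sum_i f_i(x) using one randomly chosen
    summand's gradient per iteration."""
    rng = np.random.default_rng() if rng is None else rng
    x = x0.copy()
    for t in range(steps):
        i = rng.integers(n)                         # pick a random summand
        x = x - lr / np.sqrt(t + 1) * grad_i(i, x)  # decaying step size
    return x
```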
Journal Article
Randomized Sketches of Convex Programs With Sharp Guarantees
TL;DR: This work analyzes random projection (RP)-based approximations of convex programs, in which the original optimization problem is approximated by solving a lower-dimensional problem, and proves that the approximation ratio of this procedure can be bounded in terms of the geometry of the constraint set.
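The simplest instance of this "sketch-and-solve" scheme is unconstrained least squares: project the data into m dimensions and solve the smaller problem. This toy version, assuming a Gaussian sketch, is ours for illustration; the paper's analysis covers general convex constraint sets.

```python
import numpy as np

def sketch_and_solve(A, b, m, rng=None):
    """Replace min ||Ax - b||^2 by the lower-dimensional problem
    min ||S(Ax - b)||^2 and solve that instead."""
    rng = np.random.default_rng() if rng is None else rng
    S = rng.standard_normal((m, A.shape[0])) / np.sqrt(m)
    x_hat, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
    return x_hat
```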
Journal Article
Exact and inexact subsampled Newton methods for optimization
TL;DR: This paper analyzes an inexact Newton method that solves linear systems approximately using the conjugate gradient (CG) method, and that samples the Hessian and not the gradient (the gradient is assumed to be exact).
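A rough sketch of that recipe, assuming a logistic-regression finite sum and a subsample large enough for the estimated Hessian to be positive definite; the helper below is illustrative, not the paper's algorithm.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

def subsampled_newton_direction(A, y, x, sample_size, rng=None):
    """Exact full gradient, Hessian estimated from a random subsample,
    direction computed inexactly by conjugate gradients."""
    rng = np.random.default_rng() if rng is None else rng
    n, d = A.shape
    p = 1.0 / (1.0 + np.exp(-(A @ x)))
    grad = A.T @ (p - y) / n                     # exact full gradient
    idx = rng.choice(n, size=sample_size, replace=False)
    As = A[idx]
    w = (p * (1.0 - p))[idx] / sample_size       # subsampled Hessian weights

    def hess_vec(v):
        # Subsampled Hessian-vector product, never forming the d x d matrix
        return As.T @ (w * (As @ v))

    H = LinearOperator((d, d), matvec=hess_vec)
    direction, _ = cg(H, grad)                   # inexact CG solve
    return direction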
Journal Article
Iterative hessian sketch: fast and accurate solution approximation for constrained least-squares
TL;DR: In this paper, the authors study randomized sketching methods for approximately solving least-squares problems with a general convex constraint, and provide a general lower bound on any randomized method that sketches both the data matrix and vector.
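For the unconstrained case the iteration has a closed form, shown below under the assumption of a fresh Gaussian sketch per pass. Note that only the data matrix A is sketched, never the vector b; sketching both is exactly the setting the paper's lower bound applies to.

```python
import numpy as np

def iterative_hessian_sketch(A, b, m, iters=10, rng=None):
    """Each pass solves a sketched quadratic model around the current
    iterate, driven by the exact residual A^T (b - A x)."""
    rng = np.random.default_rng() if rng is None else rng
    n, d = A.shape
    x = np.zeros(d)
    for _ in range(iters):
        S = rng.standard_normal((m, n)) / np.sqrt(m)  # fresh sketch
        SA = S @ A
        # x <- x + argmin_u 0.5 * ||S A u||^2 - <A^T (b - A x), u>
        x = x + np.linalg.solve(SA.T @ SA, A.T @ (b - A @ x))
    return x
```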
Posted Content
Newton-Type Methods for Non-Convex Optimization Under Inexact Hessian Information
TL;DR: The canonical problem of finite-sum minimization is considered; appropriate uniform and non-uniform sub-sampling strategies are provided to construct such Hessian approximations, and optimal iteration complexity is obtained for the corresponding sub-sampled trust-region and adaptive cubic regularization methods.
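For orientation, the adaptive cubic regularization step referred to here minimizes a cubic model built from the inexact Hessian. The following is the generic form of that subproblem in our notation (with H_t the sub-sampled Hessian approximation and sigma_t an adaptive regularization weight), not the paper's exact statement.

```latex
m_t(s) = f(x_t) + \nabla f(x_t)^\top s + \tfrac{1}{2}\, s^\top H_t\, s
         + \tfrac{\sigma_t}{3}\, \|s\|^{3},
\qquad x_{t+1} = x_t + \arg\min_{s} m_t(s)
```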
References
Journal Article
Regression Shrinkage and Selection via the Lasso
TL;DR: A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute values of the coefficients being less than a constant, is proposed.
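In symbols, the constrained form described in this TL;DR is:

```latex
\hat{\beta}^{\text{lasso}} = \arg\min_{\beta}\;
  \sum_{i=1}^{n} \Bigl( y_i - \sum_{j} x_{ij}\,\beta_j \Bigr)^{2}
\quad \text{subject to} \quad \sum_{j} \lvert \beta_j \rvert \le t
```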
Book
Convex Optimization
Stephen Boyd, Lieven Vandenberghe +1 more
TL;DR: A comprehensive introduction to convex optimization, with the focus on recognizing convex optimization problems and then finding the most appropriate technique for solving them.
Journal Article
Least angle regression
Bradley Efron, Trevor Hastie, Iain M. Johnstone, Robert Tibshirani, Hemant Ishwaran, Keith Knight, Jean-Michel Loubes, Pascal Massart, David Madigan, Greg Ridgeway, Saharon Rosset, Ji Zhu, Robert A. Stine, Berwin A. Turlach, Sanford Weisberg +19 more
TL;DR: A publicly available algorithm that requires only the same order of magnitude of computational effort as ordinary least squares applied to the full set of covariates is described.
Book Chapter
Introduction to the non-asymptotic analysis of random matrices
TL;DR: This is a tutorial on some basic non-asymptotic methods and concepts in random matrix theory, particularly for the problem of estimating covariance matrices in statistics and for validating probabilistic constructions of measurement matrices in compressed sensing.
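A representative bound of the kind this tutorial develops, stated here from memory with unspecified constants C, c depending on the sub-gaussian norm: for an n x d matrix A with independent sub-gaussian isotropic rows, the extreme singular values concentrate as

```latex
\sqrt{n} - C\sqrt{d} - t \;\le\; s_{\min}(A) \;\le\; s_{\max}(A) \;\le\; \sqrt{n} + C\sqrt{d} + t
\quad \text{with probability at least } 1 - 2e^{-ct^{2}}
```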