Open Access Journal Article

On the steplength selection in gradient methods for unconstrained optimization

TLDR
This work investigates the relationships between the steplengths of a variety of gradient methods and the spectrum of the Hessian of the objective function, providing insight into the computational effectiveness of the methods, for both quadratic and general unconstrained optimization problems.
About
This article is published in Applied Mathematics and Computation. The article was published on 2018-02-01 and is currently open access. It has received 98 citations to date. The article focuses on the topics: Nonlinear programming & Hessian matrix.


Citations
Journal Article

A family of spectral gradient methods for optimization

TL;DR: A family of spectral gradient methods, whose stepsize is determined by a convex combination of the long Barzilai–Borwein (BB) stepsize and the short BB stepsize, is proposed, and it is proved that the family of methods is R-superlinearly convergent for two-dimensional strictly convex quadratics.
Journal Article

Steplength selection in gradient projection methods for box-constrained quadratic programs

TL;DR: This work investigates how the presence of the box constraints affects the spectral properties of the Barzilai–Borwein rules in quadratic programming problems and suggests the introduction of new steplength selection strategies specifically designed for taking account of the active constraints at each iteration.
Journal Article

A two-phase gradient method for quadratic programming problems with a single linear constraint and bounds on the variables

TL;DR: Toraldo et al. propose a gradient-based method for quadratic programming problems with a single linear constraint and bounds on the variables, which alternates between two phases until convergence; in the identification phase, gradient projection iterations are performed until either a candidate active set is identified or no reasonable progress is made.
Journal Article

Gradient methods exploiting spectral properties

TL;DR: In this article, a new stepsize for the gradient method is proposed that converges to the reciprocal of the largest eigenvalue of the Hessian in cases where the Dai–Yang asymptotically optimal gradient method fails.
Journal Article

Adaptive l_1-regularization for short-selling control in portfolio selection

TL;DR: This work proposes an updating rule for the regularization parameter in Bregman iteration to control both the sparsity and the number of short positions in the solution.
References
Book

Matrix computations

Gene H. Golub
Journal Article

A Rapidly Convergent Descent Method for Minimization

TL;DR: A number of theorems are proved to show that it always converges and that it converges rapidly, and this method has been used to solve a system of one hundred non-linear simultaneous equations.
Journal Article

Gradient Projection for Sparse Reconstruction: Application to Compressed Sensing and Other Inverse Problems

TL;DR: This paper proposes gradient projection algorithms for the bound-constrained quadratic programming (BCQP) formulation of these problems and test variants of this approach that select the line search parameters in different ways, including techniques based on the Barzilai-Borwein method.
Book

Introductory Lectures on Convex Optimization: A Basic Course

TL;DR: A polynomial-time interior-point method for linear optimization was proposed whose importance lay not only in its complexity bound, but also in the fact that the theoretical prediction of its high efficiency was supported by excellent computational results.
Journal Article

Distribution of eigenvalues for some sets of random matrices

TL;DR: In this article, the authors study the distribution of eigenvalues for two sets of random Hermitian matrices and one set of random unitary matrices, motivated by the study of the energy spectra of disordered systems.
Frequently Asked Questions (13)
Q1. What contributions have the authors mentioned in the paper "On the steplength selection in gradient methods for unconstrained optimization" ?

Their aim is to investigate the relationship between the steplengths of some gradient methods and the spectrum of the Hessian of the objective function, in order to provide insight into the computational effectiveness of these methods. Their study suggests that, in the quadratic case, the methods that tend to use groups of small steplengths followed by some large steplengths, attempting to approximate the inverses of some eigenvalues of the Hessian matrix, exhibit better numerical behaviour. The methods considered in the general case seem to preserve the behaviour of their quadratic counterparts, in the sense that they appear to follow somehow the spectrum of the Hessian of the objective function during their progress toward a stationary point. 

A suitable alternation of small and large steplengths appears to be a key issue for reducing the gradient eigencomponents in a more balanced way.

The inverses of the steplengths must be chosen as symmetric pairs, in the sense that 1/α_{2k+1} = λ_1 + λ_n − 1/α_{2k} for sufficiently large k.
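As a minimal sketch of this pairing rule, assuming the extreme eigenvalues λ_1 and λ_n are known (the numerical values below are illustrative, not taken from the paper):

```python
# Illustrative sketch: steplengths chosen in symmetric pairs with respect to
# the spectrum [lambda_1, lambda_n] of the Hessian (values assumed for demo).
lambda_1, lambda_n = 1.0, 100.0        # assumed extreme eigenvalues
alpha_even = 1.0 / 80.0                # assumed steplength at iteration 2k

# Companion steplength at iteration 2k+1: 1/alpha_odd = lambda_1 + lambda_n - 1/alpha_even
alpha_odd = 1.0 / (lambda_1 + lambda_n - 1.0 / alpha_even)
print(1.0 / alpha_even + 1.0 / alpha_odd)   # equals lambda_1 + lambda_n
```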

A possibility for avoiding the zigzagging pattern of the gradient is to force the sequence {1/α_k} to sweep the whole spectrum of the Hessian matrix.

Among the gradient methods analysed in the previous section, BB1, LMSD and ABBmin can be extended in a natural way to the general minimization problem (1), using line search strategies to ensure convergence to a stationary point [30, 46, 24]. 
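The sketch below illustrates one common way of doing this: a BB1 gradient method globalized with a nonmonotone Armijo line search and with steplengths clamped to [α_min, α_max]. It is only a schematic reading of the references above; the function names, parameter values and the specific line search are assumptions, not the exact implementations of [30, 46, 24].

```python
import numpy as np

def bb1_gradient_method(f, grad, x0, alpha_min=1e-10, alpha_max=1e10,
                        M=10, gamma=1e-4, tol=1e-6, max_iter=1000):
    """Sketch of a BB1 gradient method with a nonmonotone Armijo line search."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    alpha = 1.0 / max(np.linalg.norm(g), 1.0)      # simple initial steplength
    f_hist = [f(x)]
    for _ in range(max_iter):
        if np.linalg.norm(g) <= tol:
            break
        # Nonmonotone backtracking: require decrease w.r.t. the max of the
        # last M function values rather than the last value only.
        f_ref = max(f_hist[-M:])
        lam = 1.0
        while lam > 1e-12 and f(x - lam * alpha * g) > f_ref - gamma * lam * alpha * g.dot(g):
            lam *= 0.5
        x_new = x - lam * alpha * g
        g_new = grad(x_new)
        s, y = x_new - x, g_new - g
        # BB1 steplength, kept in [alpha_min, alpha_max] as a safeguard.
        sy = s.dot(y)
        alpha = s.dot(s) / sy if sy > 0 else alpha_max
        alpha = min(max(alpha, alpha_min), alpha_max)
        x, g = x_new, g_new
        f_hist.append(f(x))
    return x

# Illustrative use on a simple convex quadratic.
A = np.diag([1.0, 10.0, 100.0])
x_min = bb1_gradient_method(lambda x: 0.5 * x @ A @ x, lambda x: A @ x, np.ones(3))
```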

Note that α^{BB1}_k is equal to the Cauchy steplength at iteration k−1, i.e., α^{SD}_{k−1}, while α^{BB2}_k is equal to the steplength of the Minimal Gradient method at iteration k−1, i.e., α^{MG}_{k−1} = argmin_{α>0} ‖∇f(x_{k−1} − α g_{k−1})‖.
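For concreteness, a small sketch of the two BB steplengths computed from s_{k−1} = x_k − x_{k−1} and y_{k−1} = g_k − g_{k−1}; the matrix and vector used for the check are made up for illustration.

```python
import numpy as np

def bb_steplengths(s, y):
    """BB1 and BB2 steplengths from s = x_k - x_{k-1} and y = g_k - g_{k-1}."""
    bb1 = s.dot(s) / s.dot(y)   # on quadratics, equals the Cauchy (SD) steplength at k-1
    bb2 = s.dot(y) / y.dot(y)   # on quadratics, equals the Minimal Gradient steplength at k-1
    return bb1, bb2

# Check on a quadratic f(x) = 0.5 x^T A x, where g(x) = A x and hence y = A s.
A = np.diag([2.0, 5.0, 9.0])
s = np.array([1.0, -0.5, 0.25])
print(bb_steplengths(s, A @ s))
```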

Another technique for building steplengths such that the corresponding gradient method approaches the optimal complexity is based on the use of the Chebyshev nodes, i.e., the roots of the Chebyshev polynomial of the first kind.
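A hedged sketch of this idea, assuming the spectral interval [λ_1, λ_n] is known (the method in [29] works with estimates instead): take the inverse steplengths at the Chebyshev nodes of that interval.

```python
import numpy as np

def chebyshev_steplengths(lam_min, lam_max, m):
    """Steplengths whose inverses are the m Chebyshev nodes on [lam_min, lam_max]."""
    j = np.arange(1, m + 1)
    # Roots of the degree-m Chebyshev polynomial of the first kind,
    # mapped from [-1, 1] onto the spectral interval.
    nodes = 0.5 * (lam_min + lam_max) \
          + 0.5 * (lam_max - lam_min) * np.cos((2 * j - 1) * np.pi / (2 * m))
    return 1.0 / nodes

print(chebyshev_steplengths(1.0, 100.0, 5))   # inverses spread over [1, 100]
```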

Regardless of the steplength rule, all the methods keep the sequence of tentative steplengths {α_k} bounded below and above by the positive constants α_min and α_max.

As shown in Figure 8, when x_k is far from x*, the LMSD method with m_s = 5 generates some very small steplengths whose inverses fall outside the spectra of the Hessian matrices; the choice m_s = 3 mitigates this drawback, thanks to the smaller number of previous gradients taken into account.

It is worth noting that the author of [29] points out that the gradient method described there is not proposed as a practical algorithm, but only to prove that a complexity bound is achievable. 

The convergence rate of these BB-related methods is generally R-linear, but their practical convergence behaviour is superior to that of SD, as with the original BB methods.

The values of 1/ν_k generated by LMSD during a sweep attempt to travel across the spectra of the Hessian matrices corresponding to that sweep; in particular, the extreme Ritz values obtained in a sweep can be regarded as an attempt to approximate the extreme eigenvalues of the Hessians in that sweep.
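To make the role of the Ritz values concrete, here is an illustrative computation on a quadratic with known Hessian A: the Ritz values are the eigenvalues of A projected onto the span of the last m_s gradients. LMSD itself recovers them from the stored gradients and steplengths without ever forming A; accessing A below is purely for illustration.

```python
import numpy as np

def ritz_values(A, gradients):
    """Eigenvalues of A projected onto the subspace spanned by the given gradients."""
    G = np.column_stack(gradients)     # n x m_s matrix of back gradients
    Q, _ = np.linalg.qr(G)             # orthonormal basis of their span
    return np.sort(np.linalg.eigvalsh(Q.T @ A @ Q))

# Illustrative data: m_s = 3 steepest-descent gradients on f(x) = 0.5 x^T A x.
A = np.diag([1.0, 10.0, 50.0, 100.0])
x = np.ones(4)
grads = []
for _ in range(3):
    g = A @ x
    grads.append(g)
    x = x - (g @ g) / (g @ A @ g) * g  # Cauchy step
print(ritz_values(A, grads))           # values lie within [1, 100], the spectrum of A
```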

The number of iterations of ABBmin ranges between 27% and 69% of the number of iterations of BB1; on NQP1, the latter method is not able to achieve the required accuracy within 5000 iterations.