Q2. What is the key issue in reducing the gradient eigencomponents in a more balanced way?
A suitable alternation of small and large steplengths appears to be a key issue in reducing the gradient eigencomponents in a more balanced way.
Q3. How should the inverses of the steplengths be chosen?
The inverses of the steplengths must be chosen as symmetric pairs, in the sense that $1/\alpha_{2k+1} = \lambda_1 + \lambda_n - 1/\alpha_{2k}$ for sufficiently large $k$.
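To make the pairing concrete, here is a minimal numerical sketch; the eigenvalue bounds and the even-indexed steplength are hypothetical values chosen only for illustration.

```python
# Hypothetical extreme eigenvalues of the Hessian (illustrative values).
lam_1, lam_n = 1.0, 100.0

def symmetric_partner(alpha_even):
    """Return alpha_{2k+1} such that 1/alpha_{2k+1} and 1/alpha_{2k}
    form a symmetric pair: 1/alpha_{2k+1} = lam_1 + lam_n - 1/alpha_{2k}."""
    return 1.0 / (lam_1 + lam_n - 1.0 / alpha_even)

alpha_even = 1.0 / 90.0                    # small steplength (large inverse)
alpha_odd = symmetric_partner(alpha_even)  # large steplength (small inverse)
print(1.0 / alpha_even + 1.0 / alpha_odd)  # the two inverses sum to lam_1 + lam_n
```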
Q4. What is the way to avoid the zigzagging pattern of the gradient?
A possibility for avoiding the zigzagging pattern of the gradient is to foster the sequence {1/αk} to sweep all the spectrum of the Hessian matrix.
Q5. How can the gradient methods be extended to the general minimization problem?
Among the gradient methods analysed in the previous section, BB1, LMSD and ABBmin can be extended in a natural way to the general minimization problem (1), using line search strategies to ensure convergence to a stationary point [30, 46, 24].
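As a rough illustration of what such a globalization might look like, the sketch below combines the BB1 steplength with a simple monotone Armijo backtracking; the cited works use more sophisticated (e.g. nonmonotone) line searches, so this is only a hedged stand-in, and all parameter values are assumed.

```python
import numpy as np

def bb1_linesearch(f, grad, x0, alpha_min=1e-10, alpha_max=1e6,
                   max_iter=1000, tol=1e-6):
    """Sketch of a globalized BB1 gradient method for general smooth f."""
    x = x0
    g = grad(x)
    alpha = 1.0 / np.linalg.norm(g)        # assumed initial steplength
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        t = alpha
        # Monotone Armijo backtracking on the trial steplength.
        while f(x - t * g) > f(x) - 1e-4 * t * (g @ g):
            t *= 0.5
        x_new = x - t * g
        g_new = grad(x_new)
        s, y = x_new - x, g_new - g
        sy = s @ y
        # BB1 trial steplength, safeguarded in [alpha_min, alpha_max];
        # fall back to alpha_max when the curvature estimate is not positive.
        alpha = np.clip((s @ s) / sy, alpha_min, alpha_max) if sy > 0 else alpha_max
        x, g = x_new, g_new
    return x
```

For example, `bb1_linesearch(lambda x: 0.5 * x @ x, lambda x: x, np.ones(3))` converges to the origin.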
Q6. What is the steplength of the Minimal Gradient method?
Note that $\alpha_k^{BB1}$ is equal to the Cauchy steplength at iteration $k-1$, i.e., $\alpha_{k-1}^{SD}$, while $\alpha_k^{BB2}$ is equal to the steplength of the Minimal Gradient method at iteration $k-1$, i.e., $\alpha_{k-1}^{MG} = \arg\min_{\alpha>0} \|\nabla f(x_{k-1} - \alpha g_{k-1})\|$.
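For a strictly convex quadratic these identities are easy to verify numerically, using the standard formulas $\alpha_k^{BB1} = s^T s / (s^T y)$ and $\alpha_k^{BB2} = s^T y / (y^T y)$; the Hessian and the previous steplength in the sketch below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.diag(rng.uniform(1.0, 100.0, 5))   # assumed SPD Hessian, f(x) = 0.5 x'Ax
x = rng.standard_normal(5)
g = A @ x                                  # gradient g_{k-1}

alpha_sd = (g @ g) / (g @ (A @ g))              # Cauchy (SD) steplength at k-1
alpha_mg = (g @ (A @ g)) / ((A @ g) @ (A @ g))  # MG steplength at k-1

x_new = x - 0.01 * g                       # any previous steplength works here
s, y = x_new - x, A @ x_new - g            # s_{k-1}, y_{k-1} = A s_{k-1}
alpha_bb1 = (s @ s) / (s @ y)
alpha_bb2 = (s @ y) / (y @ y)
print(np.isclose(alpha_bb1, alpha_sd), np.isclose(alpha_bb2, alpha_mg))  # True True
```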
Q7. What is another technique for building steplengths?
Another technique to build steplengths such that the corresponding gradient method approaches the optimal complexity is based on the use of the Chebyshev nodes, i.e., the roots of the Chebyshev polynomial of the first kind.
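A minimal sketch of this construction, assuming the extreme eigenvalues $\lambda_1$ and $\lambda_n$ are known (the values below are hypothetical): the roots of the degree-$m$ Chebyshev polynomial of the first kind on $[-1, 1]$ are mapped onto $[\lambda_1, \lambda_n]$, and their reciprocals are taken as steplengths.

```python
import numpy as np

lam_1, lam_n, m = 1.0, 100.0, 8            # assumed spectrum bounds and degree
j = np.arange(1, m + 1)
nodes = np.cos((2 * j - 1) * np.pi / (2 * m))   # Chebyshev nodes in (-1, 1)
# Affine map of the nodes onto [lam_1, lam_n]; their inverses are the steps.
inv_steps = 0.5 * (lam_1 + lam_n) + 0.5 * (lam_n - lam_1) * nodes
steplengths = 1.0 / inv_steps
```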
Q8. What safeguard is applied regardless of the steplength rule?
Regardless of the steplength rule, all the methods keep the sequence of tentative steplengths $\{\alpha_k\}$ bounded below and above by the positive constants $\alpha_{\min}$ and $\alpha_{\max}$.
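In code, this safeguard is just a projection onto the interval; a one-line sketch with illustrative bounds:

```python
alpha_min, alpha_max = 1e-10, 1e6          # illustrative safeguard bounds

def safeguard(alpha_trial):
    """Project a tentative steplength onto [alpha_min, alpha_max]."""
    return min(max(alpha_trial, alpha_min), alpha_max)
```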
Q9. What is the drawback of the LMSD method?
As shown in Figure 8, when $x_k$ is far from $x^*$, the LMSD method with $m_s = 5$ generates some very small steplengths whose inverses fall outside the spectra of the Hessian matrices; the choice $m_s = 3$ mitigates this drawback, thanks to the smaller number of previous gradients taken into account.
Q10. Why is the gradient method not proposed as a practical algorithm?
It is worth noting that the author of [29] points out that the gradient method described there is not proposed as a practical algorithm, but only to prove that a complexity bound is achievable.
Q11. What is the convergence rate of the BB methods?
The convergence rate of these BB-related methods is generally R-linear, but, like the original BB methods, their practical convergence behaviour is superior to that of SD.
Q12. How do the values $1/\nu_k$ generated by LMSD relate to the spectra of the Hessian matrices?
The values of $1/\nu_k$ generated by LMSD during a sweep attempt to travel in the spectra of the Hessian matrices corresponding to that sweep; in particular, the extreme Ritz values obtained in a sweep can be considered as an attempt to approximate the extreme eigenvalues of the Hessians in that sweep.
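To illustrate the idea on a quadratic with known Hessian $A$: the Ritz values can be viewed as the eigenvalues of the projection of $A$ onto the span of the $m_s$ most recent gradients. LMSD recovers them from the stored gradients without forming $A$; the explicit projection below is only a conceptual sketch, with all problem data hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n, ms = 50, 5
A = np.diag(rng.uniform(1.0, 100.0, n))    # assumed SPD Hessian
x = rng.standard_normal(n)

grads = []
for _ in range(ms):                        # collect m_s back gradients
    g = A @ x
    grads.append(g)
    x = x - g / np.linalg.norm(g)          # placeholder steplengths

Q, _ = np.linalg.qr(np.column_stack(grads))
ritz = np.linalg.eigvalsh(Q.T @ A @ Q)     # Ritz values lie in the spectrum of A
steplengths = 1.0 / ritz                   # inverses used as the next sweep's steps
```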
Q13. How many iterations does ABBmin require?
The number of iterations of ABBmin ranges between 27% and 69% of the number of iterations of BB1; on NQP1, the latter method is not able to achieve the required accuracy within 5000 iterations.