DIPARTIMENTO DI MATEMATICA
Complesso universitario, via Vivaldi - 81100 Caserta

ON MUTUAL IMPACT OF NUMERICAL LINEAR ALGEBRA AND LARGE-SCALE OPTIMIZATION WITH FOCUS ON INTERIOR POINT METHODS

Marco D’Apuzzo¹, Valentina De Simone¹ and Daniela di Serafino¹

PREPRINT n. 1, typeset in March 2008
AMS CLASSIFICATION: 65F10, 65K05, 90C30, 90C06

¹ Dipartimento di Matematica, Seconda Università di Napoli, via Vivaldi 43, I-81100 Caserta, Italy

On mutual impact of numerical linear algebra
and large-scale optimization
with focus on interior point methods
Marco D’Apuzzo, Valentina De Simone, Daniela di Serafino
Department of Mathematics, Second University of Naples, Caserta, Italy
E-mail: {marco.dapuzzo,valentina.desimone,daniela.diserafino}@unina2.it
Abstract
The solution of KKT systems is ubiquitous in optimization methods and often dominates the computation time, especially when large-scale problems are considered. Thus, the effective implementation of such methods is highly dependent on the availability of effective linear algebra algorithms and software that are able, in turn, to take into account the specific needs of optimization. In this paper we discuss the mutual impact of linear algebra and optimization, focusing on interior point methods and on the iterative solution of the KKT system. Three critical issues are addressed: preconditioning, termination control for the inner iterations, and inertia control.
Keywords: large-scale optimization, interior point methods, KKT system, constraint
preconditioners, adaptive stopping criteria, inertia control.
1 Introduction

The strong interplay between numerical linear algebra and optimization has been evident for a long time. Much progress in numerical linear algebra has been spurred by the need of solving linear systems with special features in the context of optimization, and many optimization codes have benefited, in terms of both efficiency and robustness, from advances in numerical linear algebra, coming also from needs in other fields of scientific computing. This interplay is clearly recognized in the textbook [41] by Gill et al., where a presentation of numerical optimization and numerical linear algebra techniques is provided, highlighting the relations between the two fields in the broader context of scientific computing. A general discussion of the role of numerical linear algebra in optimization in the 20th century is in the essay by O’Leary [65]. She points out that in any optimization algorithm the work involved “in generating points approximating an optimal point” is often “dominated by linear algebra, usually in the form of solution of a linear system or least squares problem and updating of matrix information”. By looking at the connections between the advances in numerical linear algebra and in optimization, O’Leary comes to the conclusion that there is a symbiosis between the two fields and foresees that it will continue in the current century.

In our opinion, this symbiosis is getting stronger and stronger, especially in the context of large-scale optimization problems, where the solution of linear algebra problems often dominates the computation time. A clear signal of this trend is also the organization of events gathering people working in the two fields (see, e.g., [82, 83, 84]). The aim of this paper is precisely to discuss the mutual impact of recent developments in numerical linear algebra and optimization, focusing on the solution of large-scale nonlinear optimization problems and on iterative linear algebra techniques, where much progress has been made in recent years (see [2, 5, 48, 74] and the references therein).
Thus, we consider the following general nonlinear optimization problem:

$$\begin{array}{rl}
\displaystyle\min_{x} & f(x) \\
\text{s.t.} & c_I(x) \ge 0, \\
& c_E(x) = 0,
\end{array} \qquad (1)$$

where $f : \mathbb{R}^n \rightarrow \mathbb{R}$ is the objective function, $c_I : \mathbb{R}^n \rightarrow \mathbb{R}^{m_I}$ and $c_E : \mathbb{R}^n \rightarrow \mathbb{R}^{m_E}$ are the inequality and equality constraints, respectively, and $m_I + m_E \le n$. We assume that $f$, $c_I$ and $c_E$ are twice continuously differentiable and that some constraint qualification holds, such as the Linear Independence or the Mangasarian-Fromovitz one, so that a solution of problem (1) satisfies the Karush-Kuhn-Tucker (KKT) conditions (see, e.g., [64, Chapter 12]).
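By way of illustration (not from the paper), the following minimal SciPy sketch sets up a toy instance of form (1); the objective and constraints are invented here, and the problem is handed to SLSQP, SciPy's SQP-type solver, whose 'ineq' convention agrees with $c_I(x) \ge 0$.

```python
import numpy as np
from scipy.optimize import minimize

# Toy instance of problem (1): n = 2, m_I = 1, m_E = 1 (data invented for illustration).
f   = lambda x: (x[0] - 1.0)**2 + (x[1] - 2.0)**2                 # objective f(x)
c_I = {'type': 'ineq', 'fun': lambda x: 4.0 - x[0]**2 - x[1]**2}  # c_I(x) >= 0
c_E = {'type': 'eq',   'fun': lambda x: x[0] + x[1] - 1.0}        # c_E(x) = 0

# SLSQP is an SQP-type method, of the family sketched in Section 2.
res = minimize(f, x0=np.zeros(2), method='SLSQP', constraints=[c_I, c_E])
print(res.x, res.fun)
```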
General problems of type (1) are often hard to solve, and in recent years many research efforts have been devoted to improving optimization algorithms, with the twofold goal of reliability and high performance over a wide range of problems. However, no single approach has proved uniformly robust and efficient in tackling nonlinear optimization problems. Among the various methods developed for such problems, two approaches have emerged: Sequential Quadratic Programming (SQP) and Interior Point (IP). Both approaches are based on the idea of moving toward a (local) solution of problem (1) by approximately solving a sequence of “simpler” problems; of course, they strongly differ in the characteristics of these problems and in the strategies used to solve them.
As explained in the next section, SQP and IP methods have a common linear algebra kernel; at each iteration they require the solution of the so-called KKT linear system

$$\begin{bmatrix} H & J^T \\ J & -D \end{bmatrix} \begin{bmatrix} v \\ w \end{bmatrix} = \begin{bmatrix} c \\ d \end{bmatrix}, \qquad (2)$$
where $H \in \mathbb{R}^{n \times n}$, $D \in \mathbb{R}^{m \times m}$ and $J \in \mathbb{R}^{m \times n}$, with $m \le m_I + m_E$. The matrix $H$ is usually (an approximation of) the Hessian of the Lagrangian of problem (1) at the current iteration, and hence it is symmetric and possibly indefinite, $J$ is the Jacobian of some or all the constraints, and $D$ is diagonal and positive semidefinite, possibly null. Note that, in large-scale problems, system (2) is usually sparse. In the following, the matrix of this system is denoted by $K$ and is called the KKT matrix.

The solution of KKT systems is often the most computationally expensive task in SQP and IP methods. Thus, the effective implementation of such methods is highly dependent on the availability of effective linear algebra algorithms and software that are able, in turn, to take into account the specific needs of the optimization solvers. Note that KKT systems arise also in the solution of other optimization problems, such as least squares ones, and in the more general context of saddle-point problems [7]; this subject therefore attracts very large interest. For these reasons, our discussion on linear algebra and optimization is centred on the KKT system.
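To make the block structure of $K$ concrete, here is a minimal sketch (all data randomly generated, purely illustrative) that assembles the sparse KKT matrix in the form of (2) with SciPy and inspects the inertia of a small dense copy:

```python
import numpy as np
import scipy.sparse as sp

rng = np.random.default_rng(0)
n, m = 6, 3

# Symmetric (possibly indefinite) H, sparse J, diagonal D >= 0 -- toy data.
A = rng.standard_normal((n, n))
H = sp.csr_matrix((A + A.T) / 2)
J = sp.random(m, n, density=0.5, random_state=0, format='csr')
D = sp.diags(rng.uniform(0.0, 1.0, m))

# K = [ H  J^T ]
#     [ J  -D  ]
K = sp.bmat([[H, J.T], [J, -D]], format='csr')

# K is symmetric indefinite; its inertia (numbers of positive and negative
# eigenvalues) is what the inertia-control strategies of Section 4 monitor.
eigs = np.linalg.eigvalsh(K.toarray())
print((eigs > 0).sum(), (eigs < 0).sum())
```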
The remainder of the paper is organized as follows. In Section 2 we show how the KKT system arises in SQP and IP methods for solving problem (1). In Section 3 we report the main properties of the KKT matrix, which must be taken into account in solving the related system in the context of optimization. In Section 4 fundamental issues in solving the KKT system are discussed, focusing on interior point methods and iterative linear algebra solvers; preconditioning, adaptive termination of the inner iterations and inertia control are addressed. Finally, in Section 5 we report some experiences in the application of iterative linear algebra techniques in the context of a potential reduction method for quadratic programming. Concluding remarks are given in Section 6.
2 Linear algebra in SQP and IP methods

In order to show how KKT systems arise in SQP and IP methods, we give a sketch of both, presenting only their basic ideas applied to the general problem (1). A deeper discussion of these methods is outside the scope of the paper; many details can be found in the surveys [33, 48, 50] and in the references therein.
Henceforth we use the following notation: $g(x) = \nabla f(x)$ (gradient of the objective function); $L(x, y) = f(x) - y_I^T c_I(x) - y_E^T c_E(x)$ (Lagrangian function of the problem); $H(x, y) \approx \nabla^2_{xx} L(x, y)$ (an approximation to the Hessian of the Lagrangian function with respect to $x$); $J_I(x) = \nabla c_I(x)$ and $J_E(x) = \nabla c_E(x)$ (Jacobian matrices of the inequality and equality constraints, respectively). Furthermore, the identity matrix is denoted by $I$ and the vector of all 1's by $e$; for any vector $v$ the diagonal matrix $\mathrm{diag}(v)$ is denoted by the corresponding uppercase letter $V$, and, for any vectors $v$ and $w$, $(v, w)$ is a shorthand for $(v^T, w^T)^T$.
The basic idea of an SQP method is to generate a sequence of approximate (local) solutions of problem (1) by solving, at each iteration, a Quadratic Programming (QP) problem, such as

$$\begin{array}{rl}
\displaystyle\min_{\delta x} & q(\delta x) \equiv \delta x^T g(x) + \frac{1}{2}\, \delta x^T H(x, y)\, \delta x \\
\text{s.t.} & c_I(x) + J_I(x)\, \delta x \ge 0, \\
& c_E(x) + J_E(x)\, \delta x = 0,
\end{array} \qquad (3)$$
where $q(\delta x)$ is a quadratic (e.g., quasi-Newton) approximation of $L(x, y)$ and $\delta x$ is a search direction. A commonly used strategy to solve the SQP subproblem is based on the active-set approach. It tries to predict the inequality constraints that are active at the solution and solves an equality-constrained optimization problem; hence it is called Sequential Equality-constrained Quadratic Programming (SEQP). The quadratic problem (3) reduces to
$$\begin{array}{rl}
\displaystyle\min_{\delta x} & q(\delta x) \equiv \delta x^T g(x) + \frac{1}{2}\, \delta x^T H(x, y)\, \delta x \\
\text{s.t.} & c_A(x) + J_A(x)\, \delta x = 0,
\end{array} \qquad (4)$$
where $A$ is an estimate of the active set at $x$, $c_A$ are the constraints corresponding to $A$ and $J_A$ is the related Jacobian. An optimal solution of problem (4) satisfies the first-order optimality conditions for this problem, i.e., it is a solution of the linear system

$$\begin{bmatrix} H(x, y) & J_A(x)^T \\ J_A(x) & 0 \end{bmatrix} \begin{bmatrix} \delta x \\ -y_A \end{bmatrix} = -\begin{bmatrix} g(x) \\ c_A(x) \end{bmatrix}, \qquad (5)$$

where $y_A$ is the vector of Lagrange multipliers for the quadratic problem (4), which provides an approximation of the Lagrange multipliers corresponding to $c_A$ in the original problem (1). This system has the form (2), with $H = H(x, y)$ and $J = J_A(x)$. Systems of this type are obtained also in the case of Sequential Inequality-constrained Quadratic Programming methods, where no a priori prediction of the active set is made [48]. The step $\delta x$ resulting from the solution of (5) is used to update the current approximation of the solution; in practice, a linesearch or trust-region approach must be applied to obtain useful updates.
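A minimal dense sketch of forming and solving system (5) for a toy equality-constrained QP follows; all data are invented, and a real solver would use a sparse symmetric indefinite factorization (e.g., $LDL^T$) instead of a dense solve.

```python
import numpy as np

rng = np.random.default_rng(1)
n, mA = 4, 2

H  = np.eye(n)                      # simplest possible Hessian approximation
JA = rng.standard_normal((mA, n))   # J_A(x); full row rank with probability 1
g  = rng.standard_normal(n)         # g(x)
cA = rng.standard_normal(mA)        # c_A(x)

# KKT matrix of (5): the form (2) with J = J_A and D = 0.
K   = np.block([[H, JA.T], [JA, np.zeros((mA, mA))]])
rhs = -np.concatenate([g, cA])
sol = np.linalg.solve(K, rhs)
dx, yA = sol[:n], -sol[n:]          # unknowns are (delta x, -y_A)

# First-order optimality of (4): H dx + g = J_A^T y_A and J_A dx + c_A = 0.
assert np.allclose(H @ dx + g, JA.T @ yA)
assert np.allclose(JA @ dx + cA, 0.0)
```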
A fundamental aspect for the effectiveness of SQP methods is the choice of the Hessian approximation $H(x, y)$ at each iteration. SQP methods also have several critical shortcomings: the subproblem may not be convex, the linearized constraints may be inconsistent, and the iterates may fail to converge. For a discussion of these issues and of the strategies to deal with them the reader is referred to [48] and the references therein. We only note that any variant of the SQP method outlined here requires the solution of KKT systems.
The key idea of IP methods is to approach a solution of problem (1) by approximately solving a sequence of barrier problems (BPs), depending on a parameter $\mu > 0$.
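As a hedged illustration, the sketch below evaluates the classical logarithmic barrier function $f(x) - \mu \sum_i \ln\, [c_I(x)]_i$ for the toy instance used earlier; this standard formulation is assumed here, and the paper's own barrier problem may differ in its details.

```python
import numpy as np

def barrier_objective(f, c_I, x, mu):
    """Classical log-barrier function f(x) - mu * sum(log(c_I(x))),
    defined only at strictly feasible points (c_I(x) > 0)."""
    ci = c_I(x)
    if np.any(ci <= 0):
        return np.inf          # outside the interior of the feasible region
    return f(x) - mu * np.sum(np.log(ci))

# Toy usage with the instance from the sketch after problem (1):
f   = lambda x: (x[0] - 1.0)**2 + (x[1] - 2.0)**2
c_I = lambda x: np.array([4.0 - x[0]**2 - x[1]**2])
print(barrier_objective(f, c_I, np.zeros(2), mu=0.1))
```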
