
Iterative Solution of Augmented Systems Arising in Interior Methods

01 May 2007-Siam Journal on Optimization (Society for Industrial and Applied Mathematics)-Vol. 18, Iss: 2, pp 666-690
TL;DR: A family of constraint preconditioners is proposed that provably eliminates the inherent ill-conditioning in the augmented system of linear equations that arise in interior methods for general nonlinear optimization.
Abstract: Iterative methods are proposed for certain augmented systems of linear equations that arise in interior methods for general nonlinear optimization. Interior methods define a sequence of KKT equations that represent the symmetrized (but indefinite) equations associated with Newton's method for a point satisfying the perturbed optimality conditions. These equations involve both the primal and dual variables and become increasingly ill-conditioned as the optimization proceeds. In this context, an iterative linear solver must not only handle the ill-conditioning but also detect the occurrence of KKT matrices with the wrong matrix inertia. A one-parameter family of equivalent linear equations is formulated that includes the KKT system as a special case. The discussion focuses on a particular system from this family, known as the “doubly augmented system,” that is positive definite with respect to both the primal and dual variables. This property means that a standard preconditioned conjugate-gradient method involving both primal and dual variables will either terminate successfully or detect if the KKT matrix has the wrong inertia. Constraint preconditioning is a well-known technique for preconditioning the conjugate-gradient method on augmented systems. A family of constraint preconditioners is proposed that provably eliminates the inherent ill-conditioning in the augmented system. A considerable benefit of combining constraint preconditioning with the doubly augmented system is that the preconditioner need not be applied exactly. Two particular “active-set” constraint preconditioners are formulated that involve only a subset of the rows of the augmented system and thereby may be applied with considerably less work. Finally, some numerical experiments illustrate the numerical performance of the proposed preconditioners and highlight some theoretical properties of the preconditioned matrices.

Summary (1 min read)

1.3. Notation and assumptions.

  • It is often the case in practice that the equations and variables corresponding to unit rows of A are eliminated directly from the KKT system.
  • This elimination creates no additional nonzero elements and provides a smaller "partially condensed" system with an Ω(1/μ) diagonal term added to H.
  • It will be shown that preconditioners for both the full and partially condensed KKT systems depend on the eigenvalues of the same matrix (see Lemmas 3.4 and 3.5).
  • It follows that their analysis also applies to preconditioners defined for the partially condensed system.


  • Fixing μ defines a regularization of the problem, which allows the formulation of methods that do not require an assumption on the rank of the equality constraint Jacobian.
  • With an appropriate choice of constraints, this feature can be used to guarantee that the nonlinear functions and their derivatives are well defined at all points generated by the interior method.
  • Note that the authors cannot compute the condensed or doubly augmented system for these equations because of the zero block.
  • Then, provided that the constraint preconditioner is applied exactly at every PCG step, the right-hand side of (5.3) will remain zero for all subsequent iterations.

6. Some numerical examples.

  • To illustrate the numerical performance of the proposed preconditioners, a PCG method was applied to a collection of illustrative large sparse KKT systems.
  • The test matrices were generated from a number of realistic KKT systems arising in the context of primal-dual interior methods.
  • The authors conclude with some randomly generated problems that illustrate some of the properties of the preconditioned matrices.

6.2. Results from randomly generated problems.

  • The authors expect that the proposed preconditioners would asymptotically give a cluster of 700 unit eigenvalues.
  • Table 6.3 was generated with the same data used for Table 6.2, with the one exception that strict complementarity was assumed not to hold.

ITERATIVE SOLUTION OF AUGMENTED SYSTEMS ARISING IN INTERIOR METHODS

ANDERS FORSGREN†, PHILIP E. GILL‡, AND JOSHUA D. GRIFFIN§
Key words. large-scale nonlinear programming, nonconvex optimization, interior methods, augmented systems, KKT systems, iterative methods, conjugate-gradient method, constraint preconditioning

AMS subject classifications. 49J20, 49J15, 49M37, 49D37, 65F05, 65K05, 90C30

DOI. 10.1137/060650210
1. Introduction. This paper concerns the formulation and analysis of preconditioned iterative methods for the solution of augmented systems of the form
\[
\begin{pmatrix} H & -A^T \\ A & G \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
=
\begin{pmatrix} b_1 \\ b_2 \end{pmatrix},
\tag{1.1}
\]
with A an m×n matrix, H symmetric, and G symmetric positive semidefinite. These equations arise in a wide variety of scientific and engineering applications, where they are known by a number of different names, including "augmented systems," "saddle-point systems," "KKT systems," and "equilibrium systems." (The bibliography of the survey by Benzi, Golub, and Liesen [3] contains 513 related articles.)
∗Received by the editors January 17, 2006; accepted for publication (in revised form) March 5, 2007; published electronically August 22, 2007. The research of the second and third authors was supported by National Science Foundation grants DMS-9973276, CCF-0082100, and DMS-0511766. http://www.siam.org/journals/siopt/18-2/65021.html
†Optimization and Systems Theory, Department of Mathematics, Royal Institute of Technology, SE-100 44 Stockholm, Sweden (andersf@kth.se). The research of this author was supported by the Swedish Research Council (VR).
‡Department of Mathematics, University of California, San Diego, La Jolla, CA 92093-0112 (pgill@ucsd.edu).
§Sandia National Laboratories, Livermore, CA 94551-9217 (jgriffi@sandia.gov). Part of this work was carried out during the Spring of 2003 while this author was visiting KTH with financial support from the Göran Gustafsson Foundation.

The main focus of this paper will be on the solution of augmented systems arising in interior methods for general constrained optimization, in which case (1.1) is the system associated with Newton's method for finding values of the primal and dual variables that satisfy the perturbed KKT optimality conditions (see, e.g., Wright [49] and Forsgren, Gill, and Wright [15]). In this context H is the Hessian of the Lagrangian, A is the constraint Jacobian, and G is diagonal.

Many of the benefits associated with the methods discussed in this paper derive from formulating the interior method so that the diagonal G is positive definite. We begin by presenting results for G positive definite and consider the treatment of systems with positive semidefinite and singular G in section 5. Throughout, for the case where G is positive definite, we denote G by D and rewrite (1.1) as an equivalent symmetric system Bx = b, where
\[
B = \begin{pmatrix} H & -A^T \\ -A & -D \end{pmatrix}
\quad\text{and}\quad
b = \begin{pmatrix} b_1 \\ -b_2 \end{pmatrix},
\tag{1.2}
\]
with D positive definite and diagonal. We will refer to this symmetric system as the
KKT system. (It is possible to symmetrize (1.1) in a number of different ways. The
format (1.2) will simplify the linear algebra in later sections.) When D is nonsingular,
it is well known that the augmented system is equivalent to the two smaller systems
\[
(H + A^T D^{-1} A)\, x_1 = b_1 + A^T D^{-1} b_2
\quad\text{and}\quad
x_2 = D^{-1}(b_2 - A x_1),
\tag{1.3}
\]
where the system for $x_1$ is known as the condensed system. It is less well known that another equivalent system is the doubly augmented system
\[
\begin{pmatrix} H + 2A^T D^{-1} A & A^T \\ A & D \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
=
\begin{pmatrix} b_1 + 2A^T D^{-1} b_2 \\ b_2 \end{pmatrix},
\tag{1.4}
\]
which has been proposed for use with direct factorization methods by Forsgren and
Gill [16]. In this paper we investigate the properties of preconditioned iterative meth-
ods applied to system (1.2) directly or to the equivalent systems (1.3) and (1.4).
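
The equivalence of (1.2)–(1.4) is easy to verify numerically. The following minimal sketch, using the sign conventions reconstructed above and a small random instance (all sizes and data purely illustrative), solves the three systems and checks that they share a solution:

```python
import numpy as np

# Build a small random instance with D positive definite and diagonal, then
# check that the KKT, condensed, and doubly augmented systems agree.
rng = np.random.default_rng(0)
n, m = 8, 5
Hs = rng.standard_normal((n, n))
H = (Hs + Hs.T) / 2 + 3 * np.eye(n)      # shifted so the systems are nonsingular
A = rng.standard_normal((m, n))
d = rng.uniform(0.1, 10.0, m)            # diag(D) > 0
D, Dinv = np.diag(d), np.diag(1.0 / d)
b1, b2 = rng.standard_normal(n), rng.standard_normal(m)

# KKT system (1.2).
B = np.block([[H, -A.T], [-A, -D]])
x = np.linalg.solve(B, np.concatenate([b1, -b2]))

# Condensed system (1.3).
x1 = np.linalg.solve(H + A.T @ Dinv @ A, b1 + A.T @ Dinv @ b2)
x2 = (1.0 / d) * (b2 - A @ x1)

# Doubly augmented system (1.4).
Bda = np.block([[H + 2 * A.T @ Dinv @ A, A.T], [A, D]])
xda = np.linalg.solve(Bda, np.concatenate([b1 + 2 * A.T @ Dinv @ b2, b2]))

assert np.allclose(x, np.concatenate([x1, x2]))
assert np.allclose(x, xda)
```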
If the underlying optimization problem is not convex, the matrix H may be indefinite. The KKT matrix B of (1.2) is said to have correct inertia if the matrix $H + A^T D^{-1} A$ is positive definite. This definition is based on the properties of the underlying optimization problem. Broadly speaking, the KKT system has correct inertia if the problem is locally convex (for further details see, e.g., Forsgren and Gill [16], Forsgren [18], and Griffin [32]). If the KKT matrix has correct inertia, then systems (1.2)–(1.4) have a common unique solution (see section 2).
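
This characterization follows from inertia additivity for the Schur complement: $\mathrm{In}(B) = \mathrm{In}(-D) + \mathrm{In}(H + A^T D^{-1} A)$, so $\mathrm{In}(B) = (n, m, 0)$ exactly when $H + A^T D^{-1} A$ is positive definite. A small numerical sanity check of this equivalence (random illustrative data, not from the paper):

```python
import numpy as np

def inertia(S, tol=1e-10):
    """(number of positive, negative, zero eigenvalues) of a symmetric matrix."""
    w = np.linalg.eigvalsh(S)
    return (int((w > tol).sum()), int((w < -tol).sum()), int((np.abs(w) <= tol).sum()))

rng = np.random.default_rng(1)
n, m = 6, 4
Hs = rng.standard_normal((n, n))
H = (Hs + Hs.T) / 2                       # possibly indefinite Hessian
A = rng.standard_normal((m, n))
d = rng.uniform(0.5, 2.0, m)

B = np.block([[H, -A.T], [-A, -np.diag(d)]])
correct_inertia = inertia(B) == (n, m, 0)
schur_pd = np.linalg.eigvalsh(H + A.T @ ((1.0 / d)[:, None] * A)).min() > 0
assert correct_inertia == schur_pd        # the two tests always agree
```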
1.1. Properties of the KKT system. The main issues associated with using
iterative methods to solve KKT systems are (i) termination control, (ii) inertia con-
trol, and (iii) inherent ill-conditioning. The first of these issues is common to other
applications where the linear system represents a linearization of some underlying non-
linear system of equations. Issues (ii) and (iii), however, are unique to optimization
and will be the principal topics of this paper.
In the context of interior methods, the KKT system (1.2) is solved as part of
a two-level iterative scheme. At the outer level, nonlinear equations that define the
first-order optimality conditions are parameterized by a small positive quantity μ.
The idea is that the solution of the parameterized equations should approach the
solution of the optimization problem as μ → 0. At the inner level, equations (1.2)
represent the symmetrized Newton equations associated with finding a zero of the perturbed optimality conditions for a given value of μ. Although systems (1.2)–(1.4)
have identical solutions, an iterative method will generally produce a different sequence of iterates in each case (see section 3 for a discussion of the equivalence of iterative solvers in this context). An iterative method applied to the augmented system (1.2) or the doubly augmented system (1.4) treats $x_1$ and $x_2$ as independent variables, which is appropriate in the optimization context because $x_1$ and $x_2$ are associated with independent quantities in the perturbed optimality conditions (i.e., the primal and dual variables). In contrast, an iterative solver for the condensed system (1.3) will generate approximations to $x_1$ only, with the variables $x_2$ being defined as $x_2 = D^{-1}(b_2 - Ax_1)$. This becomes an important issue when an approximate solution is obtained by truncating the iterations of the linear solver. During the early outer iterations, it is usually inefficient to solve the KKT system accurately, and it is better to accept an inexact solution that gives a residual norm that is less than some factor of the norm of the right-hand side (see, e.g., Dembo, Eisenstat, and Steihaug [7]). For the condensed system, the residual for the second block of equations will be zero regardless of the accuracy of $x_1$, which implies that termination must be based on the accuracy of $x_1$ alone. It is particularly important for the solver to place equal weight on $x_1$ and $x_2$ when system (1.2) is being solved in conjunction with a primal-dual trust-region method (see Gertz and Gill [20] and Griffin [32]). The conjugate-gradient version of this method exploits the property that the norms of the $(x_1, x_2)$ iterates increase monotonically (see Steihaug [44]). This property does not hold for $(x_1, x_2)$ iterates generated for the condensed system.
If the KKT matrix does not have the correct inertia, the solution of (1.2) is not useful, and the optimization continues with an alternative technique based on either implicitly or explicitly modifying the matrix H (see, e.g., Toint [45], Steihaug [44], Gould et al. [30], Hager [33], and Griffin [32]). It is therefore important that the iterative solver is able to detect if B does not have correct inertia.

As the perturbation parameter μ is reduced, the KKT systems become increasingly ill-conditioned. The precise form of this ill-conditioning depends on the formulation of the interior method, but a common feature is that some diagonal elements of D are big and some are small. (It is almost always possible to formulate an interior method that requires the solution of an unsymmetric system that does not exhibit inevitable ill-conditioning as μ → 0. This unsymmetric system could be solved using an unsymmetric solver such as GMRES or QMR. Unfortunately, this approach is unsuitable for general KKT systems because an unsymmetric solver is unable to determine if the KKT matrix has correct inertia.) In section 3 we consider a preconditioned conjugate-gradient (PCG) method that provably removes the inherent ill-conditioning. In particular, we define a one-parameter family of preconditioners related to the class of so-called constraint preconditioners proposed by Keller, Gould, and Wathen [34]. Several authors have used constraint preconditioners in conjunction with the conjugate-gradient method to solve the indefinite KKT system (1.2) with $b_2 = 0$ and $D = 0$ (see, e.g., Lukšan and Vlček [36], Gould, Hribar, and Nocedal [29], Perugia and Simoncini [40], and Bergamaschi, Gondzio, and Zilli [4]). Recently, Dollar [12] and Dollar et al. [11] have proposed constraint preconditioners for system (1.2) with no explicit inertial or diagonal condition on D, but a full row-rank requirement on A and the assumption that $b_2 = 0$.
Methods that require $b_2 = 0$ must perform an initial projection step that effectively shifts the right-hand side to zero. The constraint preconditioner then forces the $x_1$ iterates to lie in the null space of A. A disadvantage with this approach is that the
constraint preconditioner must be applied exactly if subsequent iterates are to lie in
the null space. This limits the ability to perform approximate solves with the precon-
ditioner, as is often required when the matrix A has a PDE-like structure that also
must be handled using an iterative solver (see, e.g., Saad [41], Notay [37], Simoncini
and Szyld [43], and Elman et al. [14]). In section 3 we consider preconditioners that
do not require the assumption that $b_2 = 0$, and hence do not require an accurate solve with the preconditioner.
1.2. A PCG method for the KKT system. The goal of this paper is to formulate iterative methods that not only provide termination control and inertia control, but also eliminate the inevitable ill-conditioning associated with interior methods. All these features are present in an algorithm based on applying a PCG method to the doubly augmented system (1.4). This system is positive definite if the KKT matrix has correct inertia, and gives equal weight to $x_1$ and $x_2$ for early terminations. As preconditioner we use the constraint preconditioner
\[
P = \begin{pmatrix} M + 2A^T D^{-1} A & A^T \\ A & D \end{pmatrix},
\tag{1.5}
\]
where M is an approximation of H such that $M + A^T D^{-1} A$ is positive definite. The equations $Pv = r$ used to apply the preconditioner are solved by exploiting the equivalence of the systems
\[
\begin{pmatrix} M + 2A^T D^{-1} A & A^T \\ A & D \end{pmatrix}
\begin{pmatrix} v_1 \\ v_2 \end{pmatrix}
=
\begin{pmatrix} r_1 \\ r_2 \end{pmatrix},
\tag{1.6a}
\]
\[
\begin{pmatrix} M & -A^T \\ -A & -D \end{pmatrix}
\begin{pmatrix} v_1 \\ v_2 \end{pmatrix}
=
\begin{pmatrix} r_1 - 2A^T D^{-1} r_2 \\ -r_2 \end{pmatrix},
\quad\text{and}
\tag{1.6b}
\]
\[
(M + A^T D^{-1} A)\, v_1 = r_1 - A^T D^{-1} r_2,
\qquad
v_2 = D^{-1}(r_2 - A v_1)
\tag{1.6c}
\]
(see section 3). This allows us to compute the solution of (1.6a) by solving either
(1.6b) or (1.6c). (The particular choice will depend on the relative efficiency of the
methods available to solve the condensed and augmented systems.)
We emphasize that the doubly augmented systems are never formed or factored
explicitly. The matrix associated with the doubly augmented equations (1.4) is used
only as an operator to define products of the form v = Bu. As mentioned above, the
equations (1.6a) that apply the preconditioner are solved using either (1.6b) or (1.6c).
An important property of the method is that these equations also may be solved using
an iterative method. (It is safe to use the augmented or condensed system for the
preconditioner equations Pv = r because the inertia of P is guaranteed by the choice
of M (see section 3).)
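
As a concrete illustration, the sketch below applies the preconditioner by solving (1.6c). The diagonal M is a hypothetical choice that keeps $M + A^T D^{-1} A$ positive definite, and the dense solve stands in for whatever factorization or inner iteration would be used in practice:

```python
import numpy as np

def apply_preconditioner(M, A, d, r1, r2):
    """Solve P v = r for the constraint preconditioner (1.5) via (1.6c)."""
    dinv = 1.0 / d
    C = M + A.T @ (dinv[:, None] * A)         # M + A^T D^{-1} A
    v1 = np.linalg.solve(C, r1 - A.T @ (dinv * r2))
    v2 = dinv * (r2 - A @ v1)                 # v2 = D^{-1}(r2 - A v1)
    return v1, v2

# Check against a direct solve with P formed explicitly (illustration only).
rng = np.random.default_rng(2)
n, m = 7, 4
A = rng.standard_normal((m, n))
d = rng.uniform(0.1, 5.0, m)
M = np.diag(rng.uniform(1.0, 2.0, n))         # hypothetical diagonal M ~ H
r1, r2 = rng.standard_normal(n), rng.standard_normal(m)

v1, v2 = apply_preconditioner(M, A, d, r1, r2)
P = np.block([[M + 2 * A.T @ np.diag(1.0 / d) @ A, A.T], [A, np.diag(d)]])
assert np.allclose(P @ np.concatenate([v1, v2]), np.concatenate([r1, r2]))
```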
In section 4 we formulate and analyze two variants of the preconditioner (1.5)
that exploit the asymptotic behavior of the elements of D. The use of these so-called
active-set preconditioners may require significantly less work when the underlying
optimization problem has more constraints than variables. In section 5, we consider
the case where G is positive semidefinite and singular. Finally, in section 6, we present
some numerical examples illustrating the properties of the proposed preconditioners.
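
Putting the pieces of this section together, here is a sketch of the overall scheme under the sign conventions above: the doubly augmented matrix (1.4) acts only as an operator, and the constraint preconditioner (1.5) is applied through (1.6c). SciPy's cg and the particular choice M = I are illustrative stand-ins, not the authors' implementation:

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

rng = np.random.default_rng(3)
n, m = 50, 20
Hs = rng.standard_normal((n, n))
H = Hs @ Hs.T / n + np.eye(n)            # positive definite, so B has correct inertia
A = rng.standard_normal((m, n))
d = rng.uniform(1e-3, 1e3, m)            # small, medium, and big diagonal elements
dinv = 1.0 / d
b1, b2 = rng.standard_normal(n), rng.standard_normal(m)

def matvec(u):
    # Product with the doubly augmented matrix (1.4); it is never formed.
    u1, u2 = u[:n], u[n:]
    return np.concatenate([H @ u1 + 2 * A.T @ (dinv * (A @ u1)) + A.T @ u2,
                           A @ u1 + d * u2])

# Constraint preconditioner (1.5) with the hypothetical choice M = I,
# applied through the condensed form (1.6c).
C = np.eye(n) + A.T @ (dinv[:, None] * A)
def psolve(r):
    r1, r2 = r[:n], r[n:]
    v1 = np.linalg.solve(C, r1 - A.T @ (dinv * r2))
    return np.concatenate([v1, dinv * (r2 - A @ v1)])

Bop = LinearOperator((n + m, n + m), matvec=matvec)
Pop = LinearOperator((n + m, n + m), matvec=psolve)
rhs = np.concatenate([b1 + 2 * A.T @ (dinv * b2), b2])
x, info = cg(Bop, rhs, M=Pop)
assert info == 0                         # converged
```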
1.3. Notation and assumptions. Unless explicitly indicated otherwise, $\|\cdot\|$ denotes the vector two-norm or its subordinate matrix norm. The inertia of a real symmetric matrix A, denoted by In(A), is the integer triple $(a_+, a_-, a_0)$ giving the number of positive, negative, and zero eigenvalues of A. The spectrum of a (possibly unsymmetric) matrix A is denoted by eig(A). As the analysis concerns matrices
with only real eigenvalues, eig(A) is regarded as an ordered set, with the least (i.e., "leftmost") eigenvalue, denoted by $\mathrm{eig}_{\min}(A)$, appearing first. The quantity $\sigma_k(A)$ denotes the kth largest singular value of A. Given a positive-definite A, the unique positive-definite X such that $X^2 = A$ is denoted by $A^{1/2}$. Given vectors $x_1$ and $x_2$, the column vector consisting of the elements of $x_1$ augmented by the elements of $x_2$ is denoted by $(x_1, x_2)$.
When μ is a positive scalar such that μ → 0, the notation $p = O(\mu)$ means that there exists a constant K such that $|p| \le K\mu$ for all μ sufficiently small. For a positive p, $p = \Omega(1)$ implies that there exists a constant K such that $1/p \le K$ for all μ sufficiently small. In particular, $p = O(1)$ means that $|p|$ is bounded, and, for a positive p, $p = \Omega(1)$ means that p is bounded away from zero. For a positive p, the notation $p = \Theta(1)$ is used for the case where both $p = O(1)$ and $p = \Omega(1)$, so that p remains bounded and is bounded away from zero as μ → 0.
As discussed in section 1.1, we are concerned with solving a sequence of systems of the form (1.2), where the matrices A, H, and D depend implicitly on μ. In particular, A and H are first and second derivatives evaluated at a point depending on μ, and D is an explicit function of μ. The notation defined above allows us to characterize the properties of H, A, and D in terms of their behavior as μ → 0. Throughout the analysis, it is assumed that the following properties hold:
(A1) H and A are both $O(1)$.
(A2) The row indices of A may be partitioned into disjoint subsets S, M, and B such that $d_{ii} = O(\mu)$ for $i \in S$, $d_{ii} = \Theta(1)$ for $i \in M$, and $d_{ii} = \Omega(1/\mu)$ for $i \in B$.
(A3) If $A_S$ is the matrix of rows of A with indices in S and $r = \mathrm{rank}(A_S)$, then r remains constant as μ → 0 and $\sigma_r(A_S) = \Theta(1)$.
The second assumption reflects the fact that for μ sufficiently small, some diagonal
elements of D are “small,” some are “medium,” and some are “big.”
It is often the case in practice that the equations and variables corresponding to unit rows of A are eliminated directly from the KKT system. This elimination creates no additional nonzero elements and provides a smaller "partially condensed" system with an Ω(1/μ) diagonal term added to H. It will be shown that preconditioners for both the full and partially condensed KKT systems depend on the eigenvalues of the same matrix (see Lemmas 3.4 and 3.5). It follows that our analysis also applies to preconditioners defined for the partially condensed system.
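
For concreteness, a sketch of this elimination on a hypothetical instance: when a row of A is a unit row $e_j^T$ with diagonal entry $d_{ii}$, condensing that row adds $1/d_{ii}$ to the (j, j) entry of H and nothing else, which is why no new nonzeros appear. The index set and data below are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 6
unit_cols = np.array([0, 2, 5])               # unit row i of A_U is e_{unit_cols[i]}^T
A_U = np.eye(n)[unit_cols]
d_U = rng.uniform(0.01, 0.1, len(unit_cols))  # O(mu) diagonal entries being eliminated
Hs = rng.standard_normal((n, n))
H = (Hs + Hs.T) / 2

# Partially condensed Hessian: only the matching diagonal entries of H change,
# and the added terms 1/d_ii are large when d_ii is small.
H_pc = H.copy()
H_pc[unit_cols, unit_cols] += 1.0 / d_U
assert np.allclose(H_pc, H + A_U.T @ np.diag(1.0 / d_U) @ A_U)
```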
2. A parameterized system of linear equations. In this section, it is shown
how the indefinite KKT system (1.2) may be embedded in a family of equivalent
linear systems, parameterized by a scalar ν. This parameterization facilitates the
simultaneous analysis of the three systems (1.2)–(1.4).
Definition 2.1 (the parameterized system). Let ν denote a scalar. Associ-
ated with the KKT equations Bx = b of (1.2), we define the parameterized equations
B(ν)x = b(ν), with
\[
B(\nu) = \begin{pmatrix} H + (1+\nu) A^T D^{-1} A & \nu A^T \\ \nu A & \nu D \end{pmatrix}
\quad\text{and}\quad
b(\nu) = \begin{pmatrix} b_1 + (1+\nu) A^T D^{-1} b_2 \\ \nu b_2 \end{pmatrix},
\]
where H is symmetric and D is positive definite and diagonal.
The following proposition states the equivalence of the KKT system (1.2) and the parameterized system of Definition 2.1.

Proposition 2.2 (equivalence of the parameterized systems). Let ν denote a scalar parameter. If ν ≠ 0, then the system Bx = b of (1.2) and the system B(ν)x = b(ν) of Definition 2.1 have the same solution.
Citations
Journal ArticleDOI
TL;DR: This work presents combinatorial methods to preprocess these matrices to establish more favorable numerical properties for the subsequent factorization in a sparse direct LDL^T factorization method where the pivoting is restricted to static supernode data structures.
Abstract: Interior-point methods are among the most efficient approaches for solving large-scale nonlinear programming problems. At the core of these methods, highly ill-conditioned symmetric saddle-point problems have to be solved. We present combinatorial methods to preprocess these matrices in order to establish more favorable numerical properties for the subsequent factorization. Our approach is based on symmetric weighted matchings and is used in a sparse direct LDL^T factorization method where the pivoting is restricted to static supernode data structures. In addition, we will dynamically expand the supernode data structure in cases where additional fill-in helps to select better numerical pivot elements. This technique can be seen as an alternative to the more traditional threshold pivoting techniques. We demonstrate the competitiveness of this approach within an interior-point method on a large set of test problems from the CUTE and COPS sets, as well as large optimal control problems based on partial differential equations. The largest nonlinear optimization problem solved has more than 12 million variables and 6 million constraints.

186 citations

Journal ArticleDOI
TL;DR: The mutual impact of linear algebra and optimization is discussed, focusing on interior point methods and on the iterative solution of the KKT system, with a focus on preconditioning, termination control for the inner iterations, and inertia control.
Abstract: The solution of KKT systems is ubiquitous in optimization methods and often dominates the computation time, especially when large-scale problems are considered. Thus, the effective implementation of such methods is highly dependent on the availability of effective linear algebra algorithms and software, that are able, in turn, to take into account specific needs of optimization. In this paper we discuss the mutual impact of linear algebra and optimization, focusing on interior point methods and on the iterative solution of the KKT system. Three critical issues are addressed: preconditioning, termination control for the inner iterations, and inertia control.

55 citations

Journal ArticleDOI
TL;DR: A preconditioning technique applied to the problem of solving linear systems arising from primal-dual interior point algorithms in linear and quadratic programming has the attractive property of improved eigenvalue clustering with increased ill-conditioning of the (1,1) block of the saddle point matrix.
Abstract: We explore a preconditioning technique applied to the problem of solving linear systems arising from primal-dual interior point algorithms in linear and quadratic programming. The preconditioner has the attractive property of improved eigenvalue clustering with increased ill-conditioning of the (1,1) block of the saddle point matrix. It fits well into the optimization framework since the interior point iterates yield increasingly ill-conditioned linear systems as the solution is approached. We analyze the spectral characteristics of the preconditioner, utilizing projections onto the null space of the constraint matrix, and demonstrate performance on problems from the NETLIB and CUTEr test suites. The numerical experiments include results based on inexact inner iterations.

54 citations


Cites background from "Iterative Solution of Augmented Sys..."

  • ...Forsgren, Gill and Griffin [10] extend constraint-based preconditioners to deal with regularized saddle point systems using an approximation of the (1, 1) block coupled with an augmenting term (related to a product with the constraint matrix and regularized (2, 2) block)....


Journal ArticleDOI
TL;DR: This paper shows that this bottleneck can be overcome by solving the Schur-complement equations implicitly, using a quasi-Newton preconditioned conjugate gradient method and dramatically reduces the computational cost for problems with many coupling variables.

48 citations

Journal ArticleDOI
TL;DR: A method is proposed that allows the trust-region norm to be defined independently of the preconditioner over a sequence of evolving low-dimensional subspaces and shows that the method can require significantly fewer function evaluations than other methods.
Abstract: We consider methods for large-scale unconstrained minimization based on finding an approximate minimizer of a quadratic function subject to a two-norm trust-region constraint. The Steihaug-Toint method uses the conjugate-gradient method to minimize the quadratic over a sequence of expanding subspaces until the iterates either converge to an interior point or cross the constraint boundary. However, if the conjugate-gradient method is used with a preconditioner, the Steihaug-Toint method requires that the trust-region norm be defined in terms of the preconditioning matrix. If a different preconditioner is used for each subproblem, the shape of the trust-region can change substantially from one subproblem to the next, which invalidates many of the assumptions on which standard methods for adjusting the trust-region radius are based. In this paper we propose a method that allows the trust-region norm to be defined independently of the preconditioner. The method solves the inequality constrained trust-region subproblem over a sequence of evolving low-dimensional subspaces. Each subspace includes an accelerator direction defined by a regularized Newton method for satisfying the optimality conditions of a primal-dual interior method. A crucial property of this direction is that it can be computed by applying the preconditioned conjugate-gradient method to a positive-definite system in both the primal and dual variables of the trust-region subproblem. Numerical experiments on problems from the CUTEr test collection indicate that the method can require significantly fewer function evaluations than other methods. In addition, experiments with general-purpose preconditioners show that it is possible to significantly reduce the number of matrix-vector products relative to those required without preconditioning.

48 citations

References
Book
01 Jan 1987
TL;DR: This chapter discusses Primal Method Primal-Dual Methods, Path-Following Algorithm, and Infeasible-Interior-Point Algorithms, and their applications to Linear Programming and Interior-Point Methods.

2,277 citations


"Iterative Solution of Augmented Sys..." refers background in this paper

  • ...The main focus ∗Received by the editors January 17, 2006; accepted for publication (in revised form) March 5, 2007; published electronically August 22, 2007....

    [...]

Journal ArticleDOI
TL;DR: A large selection of solution methods for linear systems in saddle point form are presented, with an emphasis on iterative methods for large and sparse problems.
Abstract: Large linear systems of saddle point type arise in a wide variety of applications throughout computational science and engineering. Due to their indefiniteness and often poor spectral properties, such linear systems represent a significant challenge for solver developers. In recent years there has been a surge of interest in saddle point problems, and numerous solution techniques have been proposed for this type of system. The aim of this paper is to present and discuss a large selection of solution methods for linear systems in saddle point form, with an emphasis on iterative methods for large and sparse problems.

2,253 citations


"Iterative Solution of Augmented Sys..." refers methods in this paper

  • ...This paper concerns the formulation and analysis of preconditioned iterative methods for the solution of augmented systems of the form (1.1) $\left(\begin{smallmatrix} H & -A^T \\ A & G \end{smallmatrix}\right)\left(\begin{smallmatrix} x_1 \\ x_2 \end{smallmatrix}\right) = \left(\begin{smallmatrix} b_1 \\ b_2 \end{smallmatrix}\right)$, with A an m×n matrix, H symmetric, and G symmetric positive semidefinite....


Journal ArticleDOI
TL;DR: An SQP algorithm that uses a smooth augmented Lagrangian merit function and makes explicit provision for infeasibility in the original problem and the QP subproblems is discussed and a reduced-Hessian semidefinite QP solver (SQOPT) is discussed.
Abstract: Sequential quadratic programming (SQP) methods have proved highly effective for solving constrained optimization problems with smooth nonlinear functions in the objective and constraints. Here we consider problems with general inequality constraints (linear and nonlinear). We assume that first derivatives are available and that the constraint gradients are sparse. Second derivatives are assumed to be unavailable or too expensive to calculate. We discuss an SQP algorithm that uses a smooth augmented Lagrangian merit function and makes explicit provision for infeasibility in the original problem and the QP subproblems. The Hessian of the Lagrangian is approximated using a limited-memory quasi-Newton method. SNOPT is a particular implementation that uses a reduced-Hessian semidefinite QP solver (SQOPT) for the QP subproblems. It is designed for problems with many thousands of constraints and variables but is best suited for problems with a moderate number of degrees of freedom (say, up to 2000). Numerical results are given for most of the CUTEr and COPS test collections (about 1020 examples of all sizes up to 40000 constraints and variables, and up to 20000 degrees of freedom).

2,205 citations

Journal ArticleDOI
TL;DR: The method of conjugate gradients for solving systems of linear equations with a symmetric positive definite matrix A is given as a logical development of the Lanczos algorithm for tridiagonalizing...
Abstract: The method of conjugate gradients for solving systems of linear equations with a symmetric positive definite matrix A is given as a logical development of the Lanczos algorithm for tridiagonalizing...

1,644 citations

Journal ArticleDOI
TL;DR: A classical algorithm for solving the system of nonlinear equations is Newton's method as mentioned in this paper, which is known as Newton's algorithm for nonlinear systems of equations (see Fig. 1 ).
Abstract: A classical algorithm for solving the system of nonlinear equations $F(x) = 0$ is Newton’s method \[ x_{k + 1} = x_k + s_k ,\quad {\text{where }}F'(x_k )s_k = - F(x_k ),\quad x_0 {\text{ given}}.\]...

1,551 citations

Frequently Asked Questions (16)
Q1. What have the authors contributed in "Iterative Solution of Augmented Systems Arising in Interior Methods"?

In this paper, a family of constraint preconditioners is proposed to eliminate the inherent ill-conditioning in the augmented system. 

zero elements of G are associated with linearized equality constraints, where the corresponding subset of equations (5.1) are the Newton equations for a zero of the constraint residual. 

The interior-point method requires the solution of systems with a KKT matrix of the form
\[
\begin{pmatrix} H & -J^T \\ -J & -\Gamma \end{pmatrix},
\tag{6.1}
\]
where H is the n × n Hessian of the Lagrangian, J is the m × n Jacobian matrix of constraint gradients, and Γ is a positive-definite diagonal with some large and small elements.
If the zero elements of G are associated with linear constraints, and the system (5.3) is solved exactly, it suffices to compute the special step y only once, when solving the first system. 

The authors prefer to do the analysis in terms of the doubly augmented system because it provides the parameterization based on the scalar parameter ν.

provided that the constraint preconditioner is applied exactly at every PCG step, the right-hand side of (5.3) will remain zero for all subsequent iterations. 

It should be emphasized that the choice of C and B affects only the efficiency of the active-set constraint preconditioners and not the definition of the linear equations that need to be solved. 

An advantage of using preconditioning in conjunction with the doubly augmented system is that the linear equations used to apply the preconditioner need not be solved exactly. 

The preconditioner (4.5) has the factorization $P_P^1(\nu) = R_P\, P_P^2(\nu)\, R_P^T$, where $R_P$ is the upper-triangular matrix (4.2a) and $P_P^2(\nu)$ is given by
\[
P_P^2(\nu) = \begin{pmatrix} M + (1+\nu) A_C^T D_C^{-1} A_C & \nu A_C^T & 0 \\ \nu A_C & \nu D_C & 0 \\ 0 & 0 & \nu D_B \end{pmatrix}.
\tag{4.6}
\]

In this strict-complementarity case, the authors expect that the proposed preconditioners would asymptotically give a cluster of 700 unit eigenvalues. 

Consider the PCG method applied to a generic symmetric system Ax = b with symmetric positive-definite preconditioner P and initial iterate x0 = 0. 
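
For reference, a minimal textbook PCG iteration matching that setting ($x_0 = 0$, symmetric A applied by matvec, SPD preconditioner applied by psolve). The nonpositive-curvature exit reflects the inertia-detection role discussed in the paper; this is a sketch, not the authors' code:

```python
import numpy as np

def pcg(matvec, psolve, b, tol=1e-10, maxiter=200):
    """Minimal PCG for symmetric A x = b with SPD preconditioner and x0 = 0."""
    x = np.zeros_like(b)
    r = b.copy()                  # r = b - A x0 with x0 = 0
    z = psolve(r)
    p = z.copy()
    rz = r @ z
    for _ in range(maxiter):
        Ap = matvec(p)
        pAp = p @ Ap
        if pAp <= 0:              # direction of nonpositive curvature:
            break                 # signals wrong inertia in this setting
        alpha = rz / pAp
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        z = psolve(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x
```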

The data for the test matrices was generated using a primal-dual trust-region method (see, e.g., [16, 20, 32]) applied to eight problems, Camshape, Channel, Gasoil, Marine, Methanol, Pinene, Polygon, and Tetra, from the COPS 3.0 test collection [6, 8, 9, 10].

Theorems 3.6 and 4.1 predict that for the preconditioners $P(1)$ and $P_P^1(1)$, 700 (= m + rank($A_S$)) eigenvalues of the preconditioned matrix will cluster close to unity, with 600 of these eigenvalues exactly equal to one.

Since $S_C$ is independent of ν, the spectrum of $P_P^1(\nu)^{-1} B(\nu)$ is also independent of ν. Lemma 3.5 implies that $S_C$ has at least rank($A_S$) eigenvalues that are $1 + O(\mu^{1/2})$, which establishes part (b). To establish part (c), the authors need to estimate the difference between the eigenvalues of $S_C$ and $S$, where $S$ is given by (3.3a).

For their example with 100 variables and 100,000 inequality constraints, a matrix of dimension 150 would need to be factored instead of a matrix of dimension 100,100. 

Without loss of generality it may be assumed that the rows of A are ordered so that the $m_1$ row indices in S corresponding to linearly