TL;DR: A family of constraint preconditioners is proposed that provably eliminates the inherent ill-conditioning in the augmented systems of linear equations that arise in interior methods for general nonlinear optimization.
Abstract: Iterative methods are proposed for certain augmented systems of linear equations that arise in interior methods for general nonlinear optimization. Interior methods define a sequence of KKT equations that represent the symmetrized (but indefinite) equations associated with Newton's method for a point satisfying the perturbed optimality conditions. These equations involve both the primal and dual variables and become increasingly ill-conditioned as the optimization proceeds. In this context, an iterative linear solver must not only handle the ill-conditioning but also detect the occurrence of KKT matrices with the wrong matrix inertia. A one-parameter family of equivalent linear equations is formulated that includes the KKT system as a special case. The discussion focuses on a particular system from this family, known as the “doubly augmented system,” that is positive definite with respect to both the primal and dual variables. This property means that a standard preconditioned conjugate-gradient method involving both primal and dual variables will either terminate successfully or detect if the KKT matrix has the wrong inertia. Constraint preconditioning is a well-known technique for preconditioning the conjugate-gradient method on augmented systems. A family of constraint preconditioners is proposed that provably eliminates the inherent ill-conditioning in the augmented system. A considerable benefit of combining constraint preconditioning with the doubly augmented system is that the preconditioner need not be applied exactly. Two particular “active-set” constraint preconditioners are formulated that involve only a subset of the rows of the augmented system and thereby may be applied with considerably less work. Finally, some numerical experiments illustrate the numerical performance of the proposed preconditioners and highlight some theoretical properties of the preconditioned matrices.
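The relationship between the KKT system and the doubly augmented system can be illustrated on a tiny example. The sketch below is not taken from the paper: it assumes a KKT matrix of the form [[H, -J^T], [-J, -D]] with D a positive diagonal, and derives a one-parameter family by row operations, so the paper's notation and sign conventions may differ. The helper names (`solve`, `family_system`, `is_positive_definite`) and the data are hypothetical. Under these assumptions, nu = -1 reproduces the KKT system and nu = 1 gives a matrix that is positive definite whenever H + J^T D^-1 J is.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def family_system(H, J, d, b1, b2, nu):
    """Assemble [[H + (1+nu) J^T D^-1 J, nu J^T], [nu J, nu D]] and its
    right-hand side [b1 - (1+nu) J^T D^-1 b2, -nu b2] (assumed convention)."""
    n, m = len(H), len(J)
    A = [[0.0] * (n + m) for _ in range(n + m)]
    rhs = [0.0] * (n + m)
    for i in range(n):
        for j in range(n):
            jdj = sum(J[k][i] * J[k][j] / d[k] for k in range(m))
            A[i][j] = H[i][j] + (1 + nu) * jdj
        for k in range(m):
            A[i][n + k] = nu * J[k][i]
            A[n + k][i] = nu * J[k][i]
        rhs[i] = b1[i] - (1 + nu) * sum(J[k][i] * b2[k] / d[k] for k in range(m))
    for k in range(m):
        A[n + k][n + k] = nu * d[k]
        rhs[n + k] = -nu * b2[k]
    return A, rhs


def is_positive_definite(A):
    """Attempt a Cholesky factorization; success means A is positive definite."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return False
                L[i][i] = s ** 0.5
            else:
                L[i][j] = s / L[j][j]
    return True


# Hypothetical data: H is indefinite, D is small (as late in an interior method).
H = [[2.0, 0.0], [0.0, -1.0]]
J = [[1.0, 1.0]]
d = [0.01]
b1, b2 = [1.0, 2.0], [3.0]

A_kkt, rhs_kkt = family_system(H, J, d, b1, b2, nu=-1.0)  # KKT system
A_da, rhs_da = family_system(H, J, d, b1, b2, nu=1.0)     # doubly augmented

x_kkt = solve(A_kkt, rhs_kkt)
x_da = solve(A_da, rhs_da)
print(x_kkt, x_da)                       # the two solutions agree
print(is_positive_definite(A_kkt))       # False: the KKT matrix is indefinite
print(is_positive_definite(A_da))        # True for this data
```

Because both systems share the same solution, a conjugate-gradient solver could in principle be applied to the positive-definite form; a failed positive-definiteness test corresponds to the "wrong inertia" case mentioned in the abstract.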
TL;DR: This work presents combinatorial methods to preprocess the ill-conditioned symmetric saddle-point matrices that arise in interior-point methods, establishing more favorable numerical properties for the subsequent factorization in a sparse direct LDL^T factorization method where the pivoting is restricted to static supernode data structures.
Abstract: Interior-point methods are among the most efficient approaches for solving large-scale nonlinear programming problems. At the core of these methods, highly ill-conditioned symmetric saddle-point problems have to be solved. We present combinatorial methods to preprocess these matrices in order to establish more favorable numerical properties for the subsequent factorization. Our approach is based on symmetric weighted matchings and is used in a sparse direct LDL^T factorization method where the pivoting is restricted to static supernode data structures. In addition, we will dynamically expand the supernode data structure in cases where additional fill-in helps to select better numerical pivot elements. This technique can be seen as an alternative to the more traditional threshold pivoting techniques. We demonstrate the competitiveness of this approach within an interior-point method on a large set of test problems from the CUTE and COPS sets, as well as large optimal control problems based on partial differential equations. The largest nonlinear optimization problem solved has more than 12 million variables and 6 million constraints.
TL;DR: A preconditioning technique applied to the problem of solving linear systems arising from primal-dual interior point algorithms in linear and quadratic programming has the attractive property of improved eigenvalue clustering with increased ill-conditioning of the (1,1) block of the saddle point matrix.
Abstract: We explore a preconditioning technique applied to the problem of solving linear systems arising from primal-dual interior point algorithms in linear and quadratic programming. The preconditioner has the attractive property of improved eigenvalue clustering with increased ill-conditioning of the (1,1) block of the saddle point matrix. It fits well into the optimization framework since the interior point iterates yield increasingly ill-conditioned linear systems as the solution is approached. We analyze the spectral characteristics of the preconditioner, utilizing projections onto the null space of the constraint matrix, and demonstrate performance on problems from the NETLIB and CUTEr test suites. The numerical experiments include results based on inexact inner iterations.
Cites background from "Iterative Solution of Augmented Sys..."
...Forsgren, Gill and Griffin extend constraint-based preconditioners to deal with regularized saddle point systems using an approximation of the (1, 1) block coupled with an augmenting term (related to a product with the constraint matrix and regularized (2, 2) block)....
TL;DR: A method is proposed that allows the trust-region norm to be defined independently of the preconditioner by solving the subproblem over a sequence of evolving low-dimensional subspaces; numerical experiments indicate that it can require significantly fewer function evaluations than other methods.
Abstract: We consider methods for large-scale unconstrained minimization based on finding an approximate minimizer of a quadratic function subject to a two-norm trust-region constraint. The Steihaug-Toint method uses the conjugate-gradient method to minimize the quadratic over a sequence of expanding subspaces until the iterates either converge to an interior point or cross the constraint boundary. However, if the conjugate-gradient method is used with a preconditioner, the Steihaug-Toint method requires that the trust-region norm be defined in terms of the preconditioning matrix. If a different preconditioner is used for each subproblem, the shape of the trust-region can change substantially from one subproblem to the next, which invalidates many of the assumptions on which standard methods for adjusting the trust-region radius are based. In this paper we propose a method that allows the trust-region norm to be defined independently of the preconditioner. The method solves the inequality constrained trust-region subproblem over a sequence of evolving low-dimensional subspaces. Each subspace includes an accelerator direction defined by a regularized Newton method for satisfying the optimality conditions of a primal-dual interior method. A crucial property of this direction is that it can be computed by applying the preconditioned conjugate-gradient method to a positive-definite system in both the primal and dual variables of the trust-region subproblem. Numerical experiments on problems from the CUTEr test collection indicate that the method can require significantly fewer function evaluations than other methods. In addition, experiments with general-purpose preconditioners show that it is possible to significantly reduce the number of matrix-vector products relative to those required without preconditioning.
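For orientation, the baseline Steihaug-Toint iteration described in the abstract can be sketched as follows. This is the standard unpreconditioned truncated conjugate-gradient scheme with a two-norm trust region, not the method proposed in the paper; the function names and the small test data are hypothetical, and the Hessian is accessed only through a matrix-vector product.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    return math.sqrt(dot(u, u))


def boundary_step(p, d, radius):
    """Return tau >= 0 such that ||p + tau * d|| == radius (p inside the region)."""
    a, b, c = dot(d, d), 2.0 * dot(p, d), dot(p, p) - radius ** 2
    return (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)


def steihaug_toint(hess_vec, g, radius, tol=1e-10, max_iter=None):
    """Approximately minimize g^T p + 0.5 p^T B p subject to ||p|| <= radius."""
    n = len(g)
    p = [0.0] * n
    r = g[:]                      # residual B p + g at p = 0
    d = [-ri for ri in r]         # first search direction: steepest descent
    if norm(r) < tol:
        return p
    for _ in range(max_iter or n):
        Bd = hess_vec(d)
        dBd = dot(d, Bd)
        if dBd <= 0.0:
            # Non-positive curvature: follow d to the trust-region boundary.
            tau = boundary_step(p, d, radius)
            return [pi + tau * di for pi, di in zip(p, d)]
        alpha = dot(r, r) / dBd
        p_next = [pi + alpha * di for pi, di in zip(p, d)]
        if norm(p_next) >= radius:
            # The CG iterate crosses the boundary: stop on the boundary.
            tau = boundary_step(p, d, radius)
            return [pi + tau * di for pi, di in zip(p, d)]
        r_next = [ri + alpha * bi for ri, bi in zip(r, Bd)]
        if norm(r_next) < tol:
            return p_next         # converged to an interior point
        beta = dot(r_next, r_next) / dot(r, r)
        d = [-ri + beta * di for ri, di in zip(r_next, d)]
        p, r = p_next, r_next
    return p


# Hypothetical 2x2 example: B = diag(2, 1), g = (-2, -1); unconstrained minimizer (1, 1).
B = [[2.0, 0.0], [0.0, 1.0]]
g = [-2.0, -1.0]


def hess_vec(v):
    return [dot(row, v) for row in B]


p_interior = steihaug_toint(hess_vec, g, radius=10.0)  # approximately [1.0, 1.0]
p_boundary = steihaug_toint(hess_vec, g, radius=1.0)   # lies on the boundary, norm 1.0
print(p_interior, p_boundary, norm(p_boundary))
```

With a preconditioner, the norms in this sketch would be replaced by norms defined by the preconditioning matrix, which is the coupling between trust-region shape and preconditioner that the abstract identifies as the difficulty.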
TL;DR: This paper shows that the cost of explicitly forming and factorizing the Schur complement can be avoided by solving the Schur-complement equations implicitly with a quasi-Newton preconditioned conjugate gradient method, which dramatically reduces the computational cost for problems with many coupling variables.
Abstract: In this work, we address optimization of large-scale, nonlinear, block-structured problems with a significant number of coupling variables. Solving these problems using interior-point methods requires the solution of a linear system that has a block-angular structure at each iteration. Parallel solution is possible using a Schur-complement decomposition. In an explicit Schur-complement decomposition, the computational cost of forming and factorizing the Schur-complement is prohibitive for problems with many coupling variables. In this paper, we show that this bottleneck can be overcome by solving the Schur-complement equations implicitly, using a quasi-Newton preconditioned conjugate gradient method. This new algorithm avoids explicit formation and factorization of the Schur-complement. The computational efficiency of this algorithm is compared with the serial full-space approach, and the serial and parallel explicit Schur-complement approaches. These results show that the PCG Schur-complement approach dramatically reduces the computational cost for problems with many coupling variables.
TL;DR: The mutual impact of linear algebra and optimization is discussed for interior point methods and the iterative solution of the KKT system, addressing three issues: preconditioning, termination control for the inner iterations, and inertia control.
Abstract: The solution of KKT systems is ubiquitous in optimization methods and often dominates the computation time, especially when large-scale problems are considered. Thus, the effective implementation of such methods is highly dependent on the availability of effective linear algebra algorithms and software that can, in turn, take into account the specific needs of optimization. In this paper we discuss the mutual impact of linear algebra and optimization, focusing on interior point methods and on the iterative solution of the KKT system. Three critical issues are addressed: preconditioning, termination control for the inner iterations, and inertia control.
TL;DR: An efficient translator is implemented that takes as input a linear AMPL model and associated data, and produces output suitable for standard linear programming optimizers.
Abstract: Practical large-scale mathematical programming involves more than just the application of an algorithm to minimize or maximize an objective function. Before any optimizing routine can be invoked, considerable effort must be expended to formulate the underlying model and to generate the requisite computational data structures. AMPL is a new language designed to make these steps easier and less error-prone. AMPL closely resembles the symbolic algebraic notation that many modelers use to describe mathematical programs, yet it is regular and formal enough to be processed by a computer system; it is particularly notable for the generality of its syntax and for the variety of its indexing operations. We have implemented an efficient translator that takes as input a linear AMPL model and associated data, and produces output suitable for standard linear programming optimizers. Both the language and the translator admit straightforward extensions to more general mathematical programs that incorporate nonlinear expressions or discrete variables.
TL;DR: It is shown that performance profiles combine the best features of other tools for performance evaluation to create a single tool for benchmarking and comparing optimization software.
Abstract: We propose performance profiles — distribution functions for a performance metric — as a tool for benchmarking and comparing optimization software. We show that performance profiles combine the best features of other tools for performance evaluation.
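A performance profile can be computed in a few lines. The sketch below assumes the usual ratio-to-best construction: each solver's metric on a problem is divided by the best value any solver achieved on that problem, and the profile at tau is the fraction of problems whose ratio is at most tau. The function name, the solver labels, and the timing data are hypothetical; failures are encoded as infinity, and every problem is assumed to be solved by at least one solver.

```python
def performance_profile(metric, taus):
    """Return {solver: [fraction of problems with ratio <= tau for tau in taus]}.

    metric[solver] is a list with one value per problem (e.g. CPU time);
    float('inf') marks a failure.
    """
    solvers = list(metric)
    n_prob = len(metric[solvers[0]])
    # Best (smallest) metric value achieved on each problem by any solver.
    best = [min(metric[s][p] for s in solvers) for p in range(n_prob)]
    profile = {}
    for s in solvers:
        ratios = [metric[s][p] / best[p] for p in range(n_prob)]
        profile[s] = [sum(r <= tau for r in ratios) / n_prob for tau in taus]
    return profile


# Hypothetical timings for two solvers on four problems.
inf = float("inf")
timings = {
    "A": [1.0, 4.0, 2.0, inf],   # solver A fails on the last problem
    "B": [2.0, 2.0, 2.0, 5.0],
}
profile = performance_profile(timings, taus=[1.0, 2.0, 4.0])
print(profile)
# {'A': [0.5, 0.75, 0.75], 'B': [0.75, 1.0, 1.0]}
```

The value at tau = 1 is the fraction of problems on which a solver was (joint) best, and the limiting value for large tau is the fraction of problems it solved at all, which is why a single curve summarizes both efficiency and robustness.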
TL;DR: An SQP algorithm that uses a smooth augmented Lagrangian merit function and makes explicit provision for infeasibility in the original problem and the QP subproblems is discussed.
Abstract: Sequential quadratic programming (SQP) methods have proved highly effective for solving constrained optimization problems with smooth nonlinear functions in the objective and constraints. Here we consider problems with general inequality constraints (linear and nonlinear). We assume that first derivatives are available and that the constraint gradients are sparse.
We discuss an SQP algorithm that uses a smooth augmented Lagrangian merit function and makes explicit provision for infeasibility in the original problem and the QP subproblems. SNOPT is a particular implementation that makes use of a semidefinite QP solver. It is based on a limited-memory quasi-Newton approximation to the Hessian of the Lagrangian and uses a reduced-Hessian algorithm (SQOPT) for solving the QP subproblems. It is designed for problems with many thousands of constraints and variables but a moderate number of degrees of freedom (say, up to 2000). An important application is to trajectory optimization in the aerospace industry. Numerical results are given for most problems in the CUTE and COPS test collections (about 900 examples).