SIAM REVIEW
© 2002 Society for Industrial and Applied Mathematics
Vol. 44, No. 4, pp. 525–597

Interior Methods for Nonlinear Optimization∗

Anders Forsgren†
Philip E. Gill‡
Margaret H. Wright§
Abstract. Interior methods are an omnipresent, conspicuous feature of the constrained optimization landscape today, but it was not always so. Primarily in the form of barrier methods, interior-point techniques were popular during the 1960s for solving nonlinearly constrained problems. However, their use for linear programming was not even contemplated because of the total dominance of the simplex method. Vague but continuing anxiety about barrier methods eventually led to their abandonment in favor of newly emerging, apparently more efficient alternatives such as augmented Lagrangian and sequential quadratic programming methods. By the early 1980s, barrier methods were almost without exception regarded as a closed chapter in the history of optimization.
This picture changed dramatically with Karmarkar’s widely publicized announcement in 1984 of a fast polynomial-time interior method for linear programming; in 1985, a formal connection was established between his method and classical barrier methods. Since then, interior methods have advanced so far, so fast, that their influence has transformed both the theory and practice of constrained optimization. This article provides a condensed, selective look at classical material and recent research about interior methods for nonlinearly constrained optimization.
Key words. nonlinear programming, constrained minimization, nonlinear constraints, primal-dual
methods, interior methods, penalty methods, barrier methods
AMS subject classifications. 49J20, 49J15, 49M37, 49D37, 65F05, 65K05, 90C30
PII. S0036144502414942
1. Introduction. It is a truth universally acknowledged that the field of continu-
ous optimization has undergone a dramatic change since 1984. This change, sometimes
described as the “interior-point revolution,” has featured a continual blending of old
and new, with effects far beyond optimization. An especially appealing aspect of the
interior-point revolution is its spirit of unification, which has brought together areas
of optimization that for many years were treated as firmly disjoint. Prior to 1984,
linear and nonlinear programming, one a subset of the other, had evolved for the
∗Received by the editors September 22, 2002; accepted for publication (in revised form) October 1, 2002; published electronically October 30, 2002.
http://www.siam.org/journals/sirev/44-4/41494.html
†Optimization and Systems Theory, Department of Mathematics, Royal Institute of Technology, SE-100 44 Stockholm, Sweden (anders.forsgren@math.kth.se). The research of this author was supported by the Swedish Research Council (VR).
‡Department of Mathematics, University of California, San Diego, La Jolla, CA 92093-0112 (pgill@ucsd.edu). The research of this author was supported by National Science Foundation grants DMS-0208449 and ACI-0082100.
§Department of Computer Science, Courant Institute, New York University, New York, NY 10012 (mhw@cs.nyu.edu).
most part along unconnected paths, without even a common terminology. (The use
of “programming” to mean “optimization” serves as a persistent reminder of these
differences.) Today this separation seems, as it indeed was, artificial, yet it was a fully
accepted part of the culture of optimization not so many years ago.
1.1. Roots in Linear and Nonlinear Programming. Although the focus of this
article is on nonlinearly constrained problems, understanding the context of the
interior-point revolution requires a short digression on linear programming (mini-
mization of a linear function subject to linear constraints). A fundamental property
of well-behaved n-variable linear programs with m inequality constraints is that a ver-
tex minimizer must exist, i.e., a point where n constraints with linearly independent
gradients hold with equality. (See, e.g., [20, 92] for details about linear program-
ming.) The simplex method, invented by Dantzig in 1947, is an iterative procedure
that solves linear programs by exploiting this property. A simplex iteration moves
from vertex to vertex, changing (one at a time) the set of constraints that hold ex-
actly, decreasing the objective as it goes, until an optimal vertex is found. From the
very start, the simplex method dominated the field of linear programming. Although
“nonsimplex” strategies for linear programming were suggested and tried from time
to time, they could not consistently match the simplex method in overall speed and
reliability. Furthermore, a simplex-centric world view had the effect that even “new”
techniques mimicked the motivation of the simplex method by always staying on a
subset of exactly satisfied constraints.
The preeminence of the simplex method was challenged not because of failures
in practice—the simplex method was, and is, used routinely to solve enormous linear
programs—but by worries about its computational complexity. One can argue that
the simplex method and its progeny are inherently combinatorial, in that their per-
formance seems to be bound in the worst case to the maximum number of ways in
which n out of m constraints can hold with equality. In fact, with standard pivoting
rules specifying the constraint to be dropped and added at each iteration, the simplex
method can visit every vertex of the feasible region [64]; thus its worst-case complex-
ity is exponential in the problem dimension. As a result, there was great interest in
finding a polynomial-time linear programming algorithm.¹
The first success in this direction was achieved in 1979 by Khachian, whose el-
lipsoid method was derived from approaches proposed originally for nonlinear opti-
mization. (See [92] for details about Khachian’s method.) Despite its polynomial
complexity bound, however, the ellipsoid method performed poorly in practice com-
pared to the simplex method, and the search continued for a polynomial-time linear
programming method that was genuinely fast in running time.
¹Assuming various distributions of random inputs, [11, 94] showed that the simplex method converges in expected polynomial time. The recent development of “smoothed” complexity analysis [95] has led to new insights about the average behavior of the simplex method.
The start of the interior-point revolution was Karmarkar’s announcement [63]
in 1984 of a polynomial-time linear programming method that was 50 times faster
than the simplex method. Amid the frenzy of interest in Karmarkar’s method, it
was shown in 1985 [51] that there was a formal equivalence between Karmarkar’s
method and the classical logarithmic barrier method (see sections 1.2 and 3) applied
to linear programming, and long-discarded barrier methods were soon rejuvenated
as polynomial-time algorithms for linear programming. Furthermore, barrier meth-
ods (unlike the simplex method) could be applied not only to linear programming
but also to other optimization problems, such as quadratic programming, linear and
nonlinear complementarity, and nonlinear programming. Although the ties between
Karmarkar’s method and barrier methods were controversial for a few years, these
disagreements have (mostly) faded away. Pedagogical and philosophical issues remain
about the best way to motivate interior-point methods—perturbing optimality condi-
tions? minimizing a barrier function?—and the multiplicity of viewpoints continues
to create new insights and new algorithms.
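To make the contrast concrete, the following sketch (ours, not from the paper) solves one small linear program twice with SciPy, once with a simplex-type solver and once with an interior-point solver; it assumes SciPy 1.6 or later, where method="highs-ds" selects HiGHS's dual simplex code and method="highs-ipm" its interior-point code.

    # Minimal sketch (illustrative only): the same LP solved by a simplex-type
    # method and by an interior-point method.  Assumes SciPy >= 1.6.
    import numpy as np
    from scipy.optimize import linprog

    # minimize  -x1 - 2*x2  subject to  x1 + x2 <= 4,  x1 + 3*x2 <= 6,  x >= 0.
    c = np.array([-1.0, -2.0])
    A_ub = np.array([[1.0, 1.0],
                     [1.0, 3.0]])
    b_ub = np.array([4.0, 6.0])

    for method in ("highs-ds", "highs-ipm"):   # dual simplex vs. interior point
        res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                      bounds=[(0, None), (0, None)], method=method)
        print(method, res.x, res.fun)

Both runs report the vertex minimizer x = (3, 1) with objective value −5: the simplex solver reaches it by moving from vertex to vertex, whereas the interior-point solver approaches it through the interior of the feasible region.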
The interior-point revolution has led to a fundamental shift in thinking about
continuous optimization. Linear and nonlinear programming are seen as related parts
of a unified whole, and no one would seriously claim today that linear programming
is completely different from nonlinear optimization. (Of course, methods for solving
linear programs and nonlinear problems vary significantly in detail.)
As we shall see, the signature of interior methods is the existence of continu-
ously parameterized families of approximate solutions that asymptotically converge
to the exact solution. These paths trace smooth trajectories with algebraic and geo-
metric properties (such as being “centered” in a precisely defined sense) that can be
analyzed and exploited algorithmically. Many interior methods are characterized as
“path-following” to signal their dependence on properties of these paths, which pro-
vide the foundation for all complexity analyses of interior-point algorithms for linear,
quadratic, and convex programming.
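As a preview of the primal-dual equations of section 5, such a path for a problem with inequality constraints $c(x) \ge 0$ can be written as a family of perturbed optimality conditions: for each barrier parameter $\mu > 0$, the point $(x(\mu), \lambda(\mu))$ satisfies
$$\nabla f(x) - J(x)^T \lambda = 0, \qquad c_i(x)\,\lambda_i = \mu, \quad i = 1, \dots, m, \qquad c(x) > 0, \quad \lambda > 0,$$
where $J(x)$ denotes the constraint Jacobian (defined in section 2). As $\mu \to 0$, these points converge, under assumptions made precise later, to a point satisfying the optimality conditions of section 2.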
The monumental work [79] of Nesterov and Nemirovskii proposed new families
of barrier methods and extended polynomial-time complexity results to new convex
optimization problems. Semidefinite programming—minimization of a convex func-
tion in the space of symmetric matrices subject to semidefiniteness constraints—is
arguably the most notable of these problems to receive widespread attention as a di-
rect result of the development of interior methods (see, e.g., the surveys [65, 99, 101]).
The evident similarity of interior methods to longstanding continuation approaches
(see, e.g., [1, 2]) has been recognized since the early days of modern interior methods
(see, e.g., [71]), but numerous aspects remain to be explored.
As a remarkable bonus, interior methods are playing a growing role in the study
of hard combinatorial problems. Many of the most important problems in discrete
optimization (where the variables are required to be integers) are NP-hard, i.e., they
cannot be solved in polynomial time unless someone favorably resolves the still-open
question of whether P = NP. In the meantime, good approximate solutions are being
found by approximation algorithms—polynomial-time algorithms whose solution is
provably within a certain factor of the optimal solution for the hard problem. A
main ingredient in a successful approximation algorithm is formulation of a convex
relaxation (often a semidefinite program) in which integrality constraints are replaced
by definiteness constraints on associated matrices. An exceptionally clear introduction
to this subject is given in [108].
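A classical illustration of such a relaxation (ours, not drawn from this article) is the semidefinite relaxation of MAX CUT due to Goemans and Williamson: the integrality condition $x_i \in \{-1, +1\}$, i.e., $X = xx^T$ with $X_{ii} = 1$, is relaxed to a semidefiniteness constraint, giving
$$\max_{X}\ \tfrac{1}{2} \sum_{i<j} w_{ij}\,(1 - X_{ij}) \quad \text{subject to} \quad X_{ii} = 1, \ i = 1, \dots, n, \qquad X \succeq 0,$$
a semidefinite program solvable by interior methods; randomized rounding of its solution yields a cut whose expected weight is at least 0.878 times the maximum.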
Almost twenty years after the beginning of the interior-point revolution, there
seems to be no end in sight to new applications of interior methods and new interpre-
tations of the interior-point perspective.
1.2. Classical Barrier Methods. As we have just sketched, classical barrier meth-
ods are closely related to modern interior methods, and we briefly summarize their
history. During the 1960s, the accepted way to solve constrained problems was to
transform them into parameterized unconstrained problems via penalty or barrier
terms. For inequality constraints, a barrier method is motivated by unconstrained

528 ANDERS FORSGREN, PHILIP E. GILL, AND MARGARET H. WRIGHT
minimization of a function combining f and a positively weighted “barrier” that pre-
vents iterates from leaving the feasible region. Penalty methods, in contrast, are based
on minimizing a function that includes f and a positive penalty if evaluated at any
infeasible point.
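In the notation used later in the paper (constraints $c_i(x) \ge 0$ and a barrier parameter $\mu > 0$), the classical logarithmic barrier function analyzed in section 3 is
$$B(x, \mu) = f(x) - \mu \sum_{i=1}^{m} \ln c_i(x),$$
whose unconstrained minimizers lie strictly inside the feasible region and approach a solution of the constrained problem as $\mu \to 0$. A representative penalty counterpart (the quadratic penalty, one standard choice among several) for equality constraints $c_i(x) = 0$ is
$$P(x, \rho) = f(x) + \frac{\rho}{2} \sum_{i} c_i(x)^2,$$
whose minimizers are in general infeasible and approach a solution as $\rho \to \infty$.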
A large body of beautiful mathematical theory about barrier and penalty func-
tions was developed during the 1960s by Fiacco and McCormick. They also seem to
have introduced the term “interior-point methods” in their seminal book [33, p. 41],
which describes in detail the relationships between minimizers of barrier and penalty
functions and solutions of the original constrained problem.
Despite the good features of barrier methods, they were dogged by several con-
cerns. The worry expressed most often in print involved ill-conditioning, after Lootsma
[66] and Murray [73] showed independently in the late 1960s that in general the Hes-
sian of a barrier function becomes increasingly ill-conditioned as the solution is ap-
proached and is singular in the limit. Increasing awareness of this property led to
serious anxiety about the reliability of barrier methods just as other methods were
coming along that seemed to be more efficient in practice without being plagued
by unavoidable ill-conditioning. In particular, augmented Lagrangian and sequen-
tial quadratic programming (SQP) methods (see, for example, [6, 34, 52, 77, 80])
are based directly on the optimality conditions for constrained optimization. Barrier
methods appeared distinctly unappealing by comparison, and almost all researchers
in mainstream optimization lost interest in them.
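The source of the feared ill-conditioning can be seen directly: differentiating the logarithmic barrier function $B(x, \mu)$ displayed above twice gives
$$\nabla^2 B(x, \mu) = \nabla^2 f(x) - \sum_{i=1}^{m} \frac{\mu}{c_i(x)} \nabla^2 c_i(x) + \sum_{i=1}^{m} \frac{\mu}{c_i(x)^2}\, \nabla c_i(x) \nabla c_i(x)^T.$$
Near a solution, $c_i(x)$ is of order $\mu$ for each active constraint, so the corresponding rank-one terms grow like $1/\mu$ while the remaining terms stay bounded; in general, therefore, the condition number of $\nabla^2 B$ is unbounded as $\mu \to 0$, which is the behavior established by Lootsma and Murray.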
As described in section 1.1, the dormancy of barrier methods ended in high drama
near the start of the interior-point revolution. An obvious question then needed to
be answered: Are classical barrier methods fundamentally flawed, as once feared?
The answer turns out to be “yes,” but, surprisingly, not because of ill-conditioning.
Classical barrier methods are indeed inefficient—but, by a strange twist of fate, ill-
conditioning, their longtime bugbear, has recently been shown not to be harmful under
circumstances that almost always hold in practice. We explore several interesting
properties, good and bad, of the classical Newton barrier method in section 4.3. An
obvious strategy has been to create interior methods that retain the good properties
of classical barrier methods, yet do not suffer from their defects. The general opinion
today is that primal-dual methods, to be discussed in section 5, offer the greatest
promise for achieving these ends.
It is impossible to cover interior methods for nonlinear optimization thoroughly
in anything less than a large volume. A major goal of this article is thus to show con-
nections between classical and modern ideas and to cover highlights of both theory
and practice; readers interested in learning more about interior-point methods will
find an abundance of papers and books on the subject. Since linear algebra is a spe-
cial interest of the authors, we have devoted extra attention to linear algebraic issues
associated with interior methods. The linear algebra needs of interior methods are
interesting for several reasons. Certain key matrices display increasing ill-conditioning
as the solution is approached, but the ill-conditioning is highly structured. In con-
trast to active-set methods like the simplex method that continually update a set of
constraints temporarily treated as equalities, interior methods typically include all
constraints at every iteration. Hence the matrices arising in interior methods must
somehow reveal, without omitting any constraints, that some constraints are more
important than others. Similarly, two subspaces—the range space of the transposed
Jacobian of the active constraints and the associated null space—strongly affect all
calculations near the solution, but these subspaces are not known explicitly.

INTERIOR METHODS 529
1.3. Statement of the Problem. We concentrate on interior methods for con-
tinuous nonlinear optimization problems of the following form:
(1.1)   $\min_{x \in \mathbb{R}^n} \; f(x) \quad \text{subject to} \quad c_i(x) = 0, \ i \in \mathcal{E}, \qquad c_i(x) \ge 0, \ i \in \mathcal{I},$

where $c(x)$ is an $m$-vector of nonlinear constraint functions with $i$th component $c_i(x)$, $i = 1, \dots, m$, and $\mathcal{E}$ and $\mathcal{I}$ are nonintersecting index sets. It is assumed throughout that $f$ and $c$ are twice-continuously differentiable. Any point $x$ satisfying the constraints of (1.1) is called a feasible point, and the set of all such points is the feasible region. We first consider problems containing only inequality constraints (sections 2 through 5) and then turn in sections 6 and 7 to the general form (1.1).
1.4. A Few Words on Coverage and Notation. Since thousands of scientific
papers have been written about interior methods, as already noted we cannot cover
more than a tiny fraction of the field, and it would be equally impractical to cite
all relevant references. We apologize in advance to all those whose favorite topics or
works have not been mentioned here.
Because this is a survey intended for nonexperts, we have included a substantial
amount of background material on optimality conditions in section 2. Readers already
familiar with optimization should skip directly to section 3. Various useful definitions,
lemmas, and miscellaneous results are collected in the appendix.
Finally, because there are not enough letters in the alphabet, especially letters that
are free from previous connotations, we confess to straining at times to find notation
that is clear and precise without being cluttered. To alleviate this dilemma, we
sometimes introduce local abbreviations for the sake of short formulas. For example,
when considering a particular point, say $x^*$, we will sometimes abbreviate quantities evaluated at $x^*$ by adding a superscript $*$ and omitting the argument, e.g., we denote $c(x^*)$ by $c^*$. Following common usage in the interior-point literature, if a vector is denoted by a lowercase letter, the same uppercase letter denotes the diagonal matrix whose elements are those of the vector, so that $V = \mathrm{diag}(v)$. Finally, $e$ denotes the vector of all ones whose dimension is determined by the context.
2. Inequality-Constrained Optimization. We begin with problems containing only inequality constraints:

(2.1)   $\min_{x \in \mathbb{R}^n} \; f(x) \quad \text{subject to} \quad c(x) \ge 0,$

where $c(x)$ is an $m$-vector of functions $\{c_i(x)\}$, $i = 1, \dots, m$, and we assume throughout that $f$ and $\{c_i\}$ are twice-continuously differentiable. The gradient of $f$ is denoted by either $\nabla f(x)$ or $g(x)$, and $\nabla^2 f(x)$ denotes the Hessian matrix of second partial derivatives of $f$. The gradient and Hessian of $c_i(x)$ are denoted by $\nabla c_i(x)$ and $\nabla^2 c_i(x)$. The $m \times n$ Jacobian matrix $c'(x)$ of first derivatives of $c(x)$ has rows $\{\nabla c_i(x)^T\}$, and we sometimes (to avoid clutter) use $J(x)$ to denote this Jacobian.
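To fix these conventions, here is a minimal Python sketch (the functions are hypothetical examples of ours, not from the paper) that builds the objects just defined for a tiny instance of (2.1) and combines them into the gradient of the logarithmic barrier function of section 3, $\nabla B(x, \mu) = g(x) - \mu J(x)^T r(x)$ with $r_i(x) = 1/c_i(x)$.

    # Minimal sketch (illustrative only): objective, gradient, constraints, and
    # Jacobian for a tiny instance of (2.1), combined into the gradient of the
    # logarithmic barrier function B(x, mu) = f(x) - mu * sum_i ln c_i(x).
    import numpy as np

    def f(x):                                  # objective f(x)
        return x[0]**2 + x[1]**2

    def g(x):                                  # gradient of f
        return 2.0 * x

    def c(x):                                  # constraints, c(x) >= 0
        return np.array([x[0] + x[1] - 1.0,            # c_1(x)
                         4.0 - x[0]**2 - x[1]**2])     # c_2(x)

    def J(x):                                  # m-by-n Jacobian of c; rows are
        return np.array([[1.0, 1.0],           # the transposed gradients of c_i
                         [-2.0 * x[0], -2.0 * x[1]]])

    x, mu = np.array([1.0, 0.75]), 0.1         # a strictly feasible point: c(x) > 0
    grad_B = g(x) - mu * J(x).T @ (1.0 / c(x)) # gradient of B(x, mu)
    print(c(x), grad_B)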
The topic of optimality conditions for nonlinearly constrained optimization can
be complicated and confusing. We present only aspects that will be needed later;
detailed discussions may be found in, for example, [6, 88].
2.1. The KKT Conditions. The terms “KKT point” (standing for “Karush–
Kuhn–Tucker point”) and “KKT conditions” will be used often. In defining these
