A Class of Smoothing Functions
for Nonlinear and Mixed Complementarity Problems

Dedicated to Richard W. Cottle, friend and colleague,
on the occasion of his sixtieth birthday

Chunhui Chen & O. L. Mangasarian

Mathematical Programming Technical Report 94-11
August 1994 / Revised February 1995
Abstract

We propose a class of parametric smooth functions that approximate the fundamental plus function, $(x)_+ = \max\{0, x\}$, by twice integrating a probability density function. This leads to classes of smooth parametric nonlinear equation approximations of nonlinear and mixed complementarity problems (NCPs and MCPs). For any solvable NCP or MCP, existence of an arbitrarily accurate solution to the smooth nonlinear equation, as well as to the NCP or MCP, is established for a sufficiently large value of a smoothing parameter. Newton-based algorithms are proposed for the smooth problem. For strongly monotone NCPs, global convergence and local quadratic convergence are established. For solvable monotone NCPs, each accumulation point of the proposed algorithms solves the smooth problem. Exact solutions of our smooth nonlinear equation for various values of the smoothing parameter generate an interior path, which is different from the central path of interior point methods. Computational results for 52 test problems compare favorably with those for another Newton-based method. The smooth technique is capable of solving efficiently the test problems solved by Dirkse & Ferris [8], Harker & Xiao [13] and Pang & Gabriel [30].
1 Introduction
The complementarity condition
$$0 \le x \perp y \ge 0,$$
where $x$ and $y$ are vectors in $R^n$ and the symbol $\perp$ denotes orthogonality, plays a fundamental role in mathematical programming. Many problems can be formulated by using this complementarity condition. For example, most optimality conditions of mathematical programming [26], as well as variational inequalities [6] and extended complementarity problems [23, 11, 40], can be so formulated. It is obvious that the vectors $x$ and $y$ satisfy the complementarity condition if and only if
$$x = (x - y)_+,$$
Computer Sciences Department, University of Wisconsin, 1210 West Dayton Street, Madison, WI 53706, email: chunhui@cs.wisc.edu, olvi@cs.wisc.edu. This material is based on research supported by Air Force Office of Scientific Research Grant F49620-94-1-0036 and National Science Foundation Grant CCR-9322479.
where the plus function $(\cdot)_+$ is defined as
$$(\xi)_+ = \max\{\xi, 0\},$$
for a real number $\xi$. For a vector $x$, the vector $(x)_+$ denotes the plus function applied to each component of $x$. In this sense, the plus function plays an important role in mathematical programming. But one big disadvantage of the plus function is that it is not smooth: it is not differentiable at the origin. Thus numerical methods that use gradients cannot be directly applied to solve a problem involving a plus function. The basic idea of this paper is to use a smooth function approximation to the plus function. With this approximation, many efficient algorithms, such as the Newton method, can be easily employed.
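To illustrate, one admissible smooth approximation, obtained by twice integrating a density as in Section 2 below (the logistic density $d(x) = e^{-x}/(1+e^{-x})^2$ used here is an illustrative choice, not one fixed by the text), is $\hat p(x, \beta) = x + \beta \log(1 + e^{-x/\beta})$:

```python
import math

def plus(x):
    """The nonsmooth plus function (x)_+ = max{x, 0}."""
    return max(x, 0.0)

def smooth_plus(x, beta):
    """Smooth approximation of (x)_+ derived from the logistic density
    d(x) = e^{-x} / (1 + e^{-x})^2 (an illustrative choice of density).
    Algebraically equal to x + beta*log(1 + exp(-x/beta)), but written
    to avoid overflow of exp for large |x| / beta."""
    z = x / beta
    if z > 0:
        return x + beta * math.log1p(math.exp(-z))
    else:
        return beta * math.log1p(math.exp(z))
```

Its derivative $1/(1 + e^{-x/\beta})$ exists everywhere, so gradient-based methods apply, while the deviation from $(x)_+$ never exceeds $\beta \log 2$ and vanishes as $\beta \to 0$.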
There are many Newton-based algorithms for solving nonlinear complementarity problems, variational inequalities and mixed complementarity problems. A good summary and references up to 1988 are given in [12]. Generalizations of the Newton method to nonsmooth equations can be found in [34], [35] and [36]. Since then, several approaches based on B-differentiable equations have been investigated in [13], [28] and [29]. In addition, an algorithm based on nonsmooth equations and successive quadratic programming was given in [30], as well as a Newton method with a path following technique [32, 8], and a trust region Newton method for solving a nonlinear least squares reformulation of the NCP [24]. With the exception of [24], a feature common to all these methods is that the subproblem at each Newton iteration is still a combinatorial problem. In contrast, by using the smoothing technique proposed here, we avoid this combinatorial difficulty by approximately reformulating the nonlinear or mixed complementarity problem as a smooth nonlinear equation. Consequently, at each Newton step, we only need to solve a linear equation. This is much simpler than solving a mixed linear complementarity problem or a quadratic program.
Smoothing techniques have already been applied to different problems, such as $l_1$ minimization problems [21], multi-commodity flow problems [31], nonsmooth programming [41, 20], linear and convex inequalities [5], and linear complementarity problems [4], [5] and [17]. These successful techniques motivate a systematic study of the smoothing approach. Questions we wish to address include the following. How can new smoothing functions be generated? What is a common property of smoothing functions?
In Section 2, we relate the plus function, through a parametric smoothing procedure, to a probability density function with a parameter $\beta$. As the parameter $\beta$ approaches zero, the smooth plus function approaches the nonsmooth plus function $(\cdot)_+$. This gives us a tool for generating a class of smooth plus functions and a systematic way to develop properties of these functions. In Section 3, we approximate the NCP by a smooth parametric nonlinear equation. For the strongly monotone case, we establish existence of a solution for the nonlinear equation and estimate the distance between its solution and the solution of the original NCP. For a general solvable NCP, existence of an arbitrarily accurate solution to the nonlinear equation, and hence to the NCP, is established. For a fixed value of the smoothing parameter, we give a Newton-Armijo type algorithm and establish its convergence. In Section 4, we treat the MCP, the mixed complementarity problem (21). For the case of a solvable monotone MCP with finite bounds $l, u \in R^n$, we prove that if the smoothing parameter is sufficiently small, then the smooth system has a solution. An efficient smooth algorithm based on the Newton-Armijo approach with an adjusted smoothing parameter is also given and convergence is established. In Section 5 we show that exact solutions of our smooth nonlinear equation, for various values of the smoothing parameter, generate an interior path to the feasible region, different from the central path of the interior point method [19]. We compare the two paths on a simple example and show that our path gives a smaller error for the same value of the smoothing parameter. In Section 6, encouraging numerical testing results are given for 52 problems from the MCPLIB [9], which includes all the problems attempted in [13], [30] and [8]. These problems range in size up to 8192 variables. These examples include the difficult von Thünen NCP model [30, 39], which is solved here to an accuracy of 1.0e-7.
A few words about our notation. For $f : R \to R$ and $x \in R^n$, the vector $f(x)$ in $R^n$ is defined by the components $(f(x))_i = f(x_i),\ i = 1, \ldots, n$. The support set of $f(x)$, which is the set of points such that $f(x) \neq 0$, will be denoted by $\mathrm{supp}\{f(x)\}$. The set of $m$-by-$n$ real matrices will be denoted by $R^{m \times n}$. The notation $0$ and $1$ will represent vectors with all components 0 and 1 respectively, of appropriate dimension. The infinity, $l_1$ and $l_2$ norms will be denoted by $\|\cdot\|_\infty$, $\|\cdot\|_1$ and $\|\cdot\|_2$ respectively. The identity matrix of arbitrary dimension will be denoted by $I$. For a differentiable function $f : R^n \to R^m$, $\nabla f$ will denote the $m \times n$ Jacobian matrix of partial derivatives. If $F(x)$ has Lipschitz continuous first partial derivatives on $R^n$ with constant $K > 0$, that is
$$\|\nabla F(x) - \nabla F(y)\| \le K \|x - y\|, \quad \forall\, x, y \in R^n,$$
we write $F(x) \in LC^1_K(R^n)$.
2 A Class of Smoothing Functions
We consider a class of smooth approximations to the fundamental function $(x)_+ = \max\{x, 0\}$. Notice first that $(x)_+ = \int_{-\infty}^{x} \sigma(y)\,dy$, where $\sigma(x)$ is the step function:
$$\sigma(x) = \begin{cases} 1 & \text{if } x > 0, \\ 0 & \text{if } x \le 0. \end{cases}$$
The step function $\sigma(x)$ can in turn be written as $\sigma(x) = \int_{-\infty}^{x} \delta(y)\,dy$, where $\delta(x)$ is the Dirac delta function, which satisfies the following properties:
$$\delta(x) \ge 0, \qquad \int_{-\infty}^{+\infty} \delta(y)\,dy = 1.$$
Figures 1 to 3 depict the above functions. The fact that the plus function is obtained by twice integrating the Dirac delta function prompts us to propose probability density functions as a means of smoothing the Dirac delta function and its integrals. Hence we consider a piecewise continuous function $d(x)$ with a finite number of pieces which is a density function, that is, it satisfies
$$d(x) \ge 0 \quad \text{and} \quad \int_{-\infty}^{\infty} d(x)\,dx = 1. \tag{1}$$
To parametrize the density function we define
$$\hat t(x, \beta) = \frac{1}{\beta}\, d\!\left(\frac{x}{\beta}\right), \tag{2}$$
where $\beta$ is a positive parameter. When $\beta$ goes to 0, the limit of $\hat t(x, \beta)$ is the Dirac delta function $\delta(x)$. This motivates a class of smooth approximations as follows:
$$\hat s(x, \beta) = \int_{-\infty}^{x} \hat t(t, \beta)\,dt \approx \sigma(x) \quad \text{and} \quad \hat p(x, \beta) = \int_{-\infty}^{x} \hat s(t, \beta)\,dt \approx (x)_+. \tag{3}$$
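As a concrete instance of (3), the sketch below (assuming, purely for illustration, the uniform density $d(x) = 1$ on $[-1/2, 1/2]$ and $0$ elsewhere) recovers $\hat s$ and $\hat p$ by numerically integrating $\hat t$ twice on a grid, and confirms that $\hat p$ stays within a distance of order $\beta$ from $(x)_+$:

```python
import numpy as np

beta = 0.2
grid = np.linspace(-2.0, 2.0, 40001)   # fine grid; t_hat vanishes outside [-0.1, 0.1]
h = grid[1] - grid[0]

def d(x):
    """Uniform density on [-1/2, 1/2]: one admissible piecewise continuous choice."""
    return np.where(np.abs(x) <= 0.5, 1.0, 0.0)

t_hat = d(grid / beta) / beta          # (2): the parametrized density
s_hat = np.cumsum(t_hat) * h           # first integration: approximates the step function
p_hat = np.cumsum(s_hat) * h           # second integration: approximates the plus function

plus = np.maximum(grid, 0.0)
max_err = np.max(np.abs(p_hat - plus))
print(max_err)                         # small, of order beta
```

For this density the maximal deviation occurs at $x = 0$ and equals $\beta/8$, in line with the error constants developed below.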
Therefore, we can get an approximate plus function by twice integrating a density function. In fact, this is the same as defining
$$\hat p(x, \beta) = \int_{-\infty}^{+\infty} (x - t)_+\, \hat t(t, \beta)\,dt = \int_{-\infty}^{x} (x - t)\, \hat t(t, \beta)\,dt. \tag{4}$$
This formulation was given in [20] and [15, p. 12] for a density (kernel) function with finite support. We will give our results in terms of a density function with arbitrary support. This includes the finite support density function as a special case.
Proposition 2.1 Let $d(x)$ be a probability density function and $\hat t(x, \beta) = \frac{1}{\beta} d(\frac{x}{\beta})$, where $\beta$ is a positive parameter. Let $d(x)$ satisfy the following assumptions:

(A1) $d(x)$ is piecewise continuous with a finite number of pieces and satisfies (1).

(A2) $E[\,|x|\,] = \int_{-\infty}^{+\infty} |x|\, d(x)\,dx < +\infty$.

Then the definitions of $\hat p(x, \beta)$ given by (3) and (4) are consistent.
Proof. By the definition (2) and assumption (A2), we have that $\hat p(x, \beta)$ defined by (4) satisfies
$$\hat p(x, \beta) = x \int_{-\infty}^{x/\beta} d(t)\,dt - \beta \int_{-\infty}^{x/\beta} t\, d(t)\,dt. \tag{5}$$
By direct computation,
$$\hat p'(x, \beta) = \int_{-\infty}^{x/\beta} d(t)\,dt = \int_{-\infty}^{x} \hat t(t, \beta)\,dt = \hat s(x, \beta). \tag{6}$$
Hence the derivatives of $\hat p(x, \beta)$ defined by (3) and (4) are the same, and the difference between the two representations of $\hat p(x, \beta)$ is a constant, say $c$. If we let $x$ approach $-\infty$ in both (3) and (4), then $\hat p(x, \beta)$ approaches 0 in both, and hence $c = 0$. Therefore the definitions of $\hat p(x, \beta)$ given by (3) and (4) are consistent.
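The consistency asserted in Proposition 2.1 can also be checked numerically. The sketch below (again using the illustrative uniform density on $[-1/2, 1/2]$) evaluates $\hat p$ both from the convolution formula (4) and from the closed form obtained by twice integrating this particular density as in (3):

```python
import numpy as np

beta = 0.25
t = np.linspace(-1.0, 1.0, 20001)      # t_hat is supported on [-beta/2, beta/2]
h = t[1] - t[0]

def t_hat(x, beta=beta):
    """Parametrized density (1/beta) d(x/beta), with d uniform on [-1/2, 1/2]."""
    return np.where(np.abs(x) <= beta / 2, 1.0 / beta, 0.0)

def p_via_convolution(x):
    """Definition (4): integral of (x - s)_+ * t_hat(s) ds, by Riemann sum."""
    return np.sum(np.maximum(x - t, 0.0) * t_hat(t)) * h

def p_via_double_integration(x):
    """Definition (3): closed form of the twice-integrated uniform density."""
    if x <= -beta / 2:
        return 0.0
    if x >= beta / 2:
        return x
    return (x + beta / 2) ** 2 / (2 * beta)

# The two definitions agree to quadrature accuracy at every test point.
for x in [-1.0, -0.1, 0.0, 0.1, 1.0]:
    assert abs(p_via_convolution(x) - p_via_double_integration(x)) < 1e-3
```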
Now we give properties of $\hat p(x, \beta)$ that show that it is an accurate approximation of the plus function $(x)_+$ as $\beta$ approaches zero.
Proposition 2.2 (Properties of $\hat p(x, \beta)$, $\beta > 0$) Let $d(x)$ and $\hat t(x, \beta)$ be as in Proposition 2.1, and let $d(x)$ satisfy (A1) and (A2). Then $\hat p(x, \beta)$ has the following properties:

(1) $\hat p(x, \beta)$ is continuously differentiable. If, in addition, $d(x)$ is $k$-times continuously differentiable, then $\hat p(x, \beta)$ is $(k+2)$-times continuously differentiable.

(2) $-D_2 \beta \le \hat p(x, \beta) - (x)_+ \le D_1 \beta$, where
$$D_1 = \int_{-\infty}^{0} |x|\, d(x)\,dx \tag{7}$$
and
$$D_2 = \max\left\{\int_{-\infty}^{+\infty} x\, d(x)\,dx,\ 0\right\}. \tag{8}$$
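As a numerical sanity check of the two-sided bound in property (2), the sketch below assumes, for illustration, the uniform density on $[0, 1]$, chosen because its nonzero mean makes the lower constant active: here $D_1 = 0$ and $D_2 = 1/2$, so $-\beta/2 \le \hat p(x, \beta) - (x)_+ \le 0$:

```python
import numpy as np

beta = 0.3
D1 = 0.0   # (7): integral of |x| d(x) over (-inf, 0] vanishes for d supported on [0, 1]
D2 = 0.5   # (8): max{ mean of d, 0 } = 1/2 for d uniform on [0, 1]

def p_hat(x, beta=beta):
    """(x)_+ smoothed with the uniform density on [0, 1]: closed form of (4),
    with t_hat(t, beta) = 1/beta on [0, beta]."""
    if x <= 0.0:
        return 0.0
    if x >= beta:
        return x - beta / 2
    return x * x / (2 * beta)

# Property (2): -D2*beta <= p_hat(x, beta) - (x)_+ <= D1*beta for all x.
for x in np.linspace(-2.0, 2.0, 2001):
    diff = p_hat(x) - max(x, 0.0)
    assert -D2 * beta - 1e-12 <= diff <= D1 * beta + 1e-12
```

Both bounds are attained: the lower bound for every $x \ge \beta$ and the upper bound for every $x \le 0$, so neither constant can be improved for this density.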
Figure 1: The plus function $(x)_+ = \max\{x, 0\} = \int_{-\infty}^{x} \sigma(y)\,dy$

Figure 2: The step function $\sigma(x) = 1$ if $x > 0$, $= 0$ if $x \le 0$; $\sigma(x) = \int_{-\infty}^{x} \delta(y)\,dy$

Figure 3: The Dirac delta function $\delta(x)$