Mathematical Programming 55 (1992) 293-318
North-Holland
On the Douglas-Rachford splitting method and the proximal point algorithm for maximal monotone operators
Jonathan Eckstein
Mathematical Sciences Research Group, Thinking Machines Corporation, Cambridge, MA 02142, USA
Dimitri P. Bertsekas
Laboratory for Information and Decision Systems, Massachusetts Institute of Technology,
Cambridge, MA 02139, USA
Received 20 November 1989
Revised manuscript received 9 July 1990
This paper shows, by means of an operator called a splitting operator, that the Douglas-Rachford splitting method for finding a zero of the sum of two monotone operators is a special case of the proximal point algorithm. Therefore, applications of Douglas-Rachford splitting, such as the alternating direction method of multipliers for convex programming decomposition, are also special cases of the proximal point algorithm. This observation allows the unification and generalization of a variety of convex programming algorithms. By introducing a modified version of the proximal point algorithm, we derive a new, generalized alternating direction method of multipliers for convex programming. Advances of this sort illustrate the power and generality gained by adopting monotone operator theory as a conceptual framework.

Key words: Monotone operators, proximal point algorithm, decomposition.
1. Introduction
The theory of maximal set-valued monotone operators (see, for example, [4])
provides a powerful general framework for the study of convex programming and
variational inequalities. A fundamental algorithm for finding a root of a monotone
operator is the proximal point algorithm [48]. The well-known method of multipliers
[23, 41] for constrained convex programming is known to be a special case of the
proximal point algorithm [49]. This paper will reemphasize the power and generality
of the monotone operator framework in the analysis and derivation of convex
optimization algorithms, with an emphasis on decomposition algorithms.
(This paper is drawn largely from the dissertation research of the first author. The dissertation was performed at M.I.T. under the supervision of the second author, and was supported in part by the Army Research Office under grant number DAAL03-86-K-01710 and by the National Science Foundation under grant number ECS-8519058.)

The proximal point algorithm requires evaluation of resolvent operators of the form $(I+\lambda T)^{-1}$, where $T$ is monotone and set-valued, $\lambda$ is a positive scalar, and $I$ denotes the identity mapping. The main difficulty with the method is that $I+\lambda T$ may be hard to invert, depending on the nature of $T$. One alternative is to find maximal monotone operators $A$ and $B$ such that $A+B=T$, but such that $I+\lambda A$ and $I+\lambda B$ are easier to invert than $I+\lambda T$. One can then devise an algorithm that uses only operators of the form $(I+\lambda A)^{-1}$ and $(I+\lambda B)^{-1}$, rather than $(I+\lambda(A+B))^{-1} = (I+\lambda T)^{-1}$. Such an approach is called a splitting method, and is inspired by well-established techniques from numerical linear algebra (for example, see [33]).
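To make the appeal of splitting concrete, consider a case of our own choosing (not the paper's): $T = A + B$ with $A = \nabla f$ for a convex quadratic $f$ and $B = \partial g$ for $g = \|\cdot\|_1$. Inverting $I + \lambda(A+B)$ directly couples the two terms, but each resolvent alone is cheap: $(I+\lambda A)^{-1}$ is a single linear solve, and $(I+\lambda B)^{-1}$ is componentwise soft-thresholding. A minimal Python sketch:

```python
import numpy as np

def resolvent_A(z, lam, Q, q):
    """Resolvent (I + lam*A)^{-1} for A = grad f, f(x) = 0.5*x'Qx + q'x:
    evaluating it amounts to one linear solve with I + lam*Q."""
    n = len(z)
    return np.linalg.solve(np.eye(n) + lam * Q, z - lam * q)

def resolvent_B(z, lam):
    """Resolvent (I + lam*B)^{-1} for B = subdifferential of the l1 norm:
    the componentwise soft-thresholding map."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)
```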
A number of authors, mainly in the French mathematical community, have
extensively studied monotone operator splitting methods, which fall into four
principal classes: forward-backward [40, 13, 56], double-backward [30, 40], Peaceman-Rachford [31], and Douglas-Rachford [31]. For a survey, readers may wish to refer to [11, Chapter 3]. We will focus on the "Douglas-Rachford" class, which
appears to have the most general convergence properties. Gabay [13] has shown
that the alternating direction method of multipliers, a variation on the method of multipliers designed to be more conducive to decomposition, is a special case of Douglas-Rachford splitting. The alternating direction method of multipliers was first introduced in [16] and [14]; additional contributions appear in [12]. An interesting presentation can be found in [15], and [3] provides a relatively accessible exposition. Despite Gabay's result, most developments of the alternating direction method of multipliers rely on a lengthy analysis from first principles. Here, we seek to demonstrate the benefit of using the operator-theoretic approach.
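Gabay's reduction is developed formally in Section 4; as a preview, the following sketch (our notation, with the resolvent convention $J_{\lambda A} = (I+\lambda A)^{-1}$, not a statement from the paper itself) shows one common form of the Douglas-Rachford recursion for finding a zero of $A + B$, which touches $A$ and $B$ only through their resolvents:

```python
def douglas_rachford(z, J_A, J_B, iters=500):
    """One common statement of Douglas-Rachford splitting: iterate
    z <- z + J_B(2*J_A(z) - z) - J_A(z). At a fixed point z*, the
    point x* = J_A(z*) satisfies 0 in A(x*) + B(x*)."""
    for _ in range(iters):
        x = J_A(z)
        z = z + J_B(2 * x - z) - x
    return J_A(z)
```

A fixed point $z^*$ gives $z^* - x^* \in \lambda A(x^*)$ and $x^* - z^* \in \lambda B(x^*)$ for $x^* = J_{\lambda A}(z^*)$, so the two inclusions sum to $0 \in A(x^*) + B(x^*)$; this is the sense in which the method "splits" the problem.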
This paper hinges on a demonstration that Douglas-Rachford splitting is an
application of the proximal point algorithm. As a consequence, much of the theory
of the proximal point and related algorithms may be carried over to the context of
Douglas-Rachford splitting and its special cases, including the alternating direction
method of multipliers. As one example of this carryover, we present a generalized form of the proximal point algorithm, created by synthesizing the work of Rockafellar [48] with that of Gol'shtein and Tret'yakov [22], and show how it gives rise to a new method, generalized Douglas-Rachford splitting. This in turn allows the derivation of a new augmented Lagrangian method for convex programming, the generalized alternating direction method of multipliers. This result illustrates the benefits of adopting the monotone operator analytic approach. Because
it allows over-relaxation factors, which are often found to accelerate proximal
point-based methods in practice, the generalized alternating direction method of
multipliers may prove to be faster than the alternating direction method of multipliers
in some applications. Because it permits approximate computation, it may also be
more widely applicable.
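As a concrete picture of what over-relaxation buys (Section 5 gives the general statement), here is a sketch of an over-relaxed ADMM step for the simple consensus problem $\min f(x) + g(z)$ subject to $x = z$, written in the scaled-dual form in which such methods are commonly presented today. The prox callables, penalty $c$, and relaxation factor $\rho \in (0, 2)$ are our illustrative assumptions, not the paper's general formulation:

```python
def relaxed_admm(prox_f, prox_g, x, z, u, rho=1.5, c=1.0, iters=200):
    """Over-relaxed ADMM for min f(x) + g(z) s.t. x = z, with scaled dual u.
    Here prox_f(v, c) = argmin_x f(x) + (c/2)*||x - v||^2."""
    for _ in range(iters):
        x = prox_f(z - u, c)
        x_hat = rho * x + (1.0 - rho) * z   # over-relaxation, rho in (0, 2)
        z = prox_g(x_hat + u, c)
        u = u + x_hat - z
    return x, z, u
```

With $\rho = 1$ this reduces to the ordinary alternating direction method of multipliers; values $\rho > 1$ implement the over-relaxation discussed above.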
While the current paper was under review, [28] was brought to our attention.
There, Lawrence and Spingarn briefly draw the connection between the proximal
point algorithm and Douglas-Rachford splitting in a somewhat different -- and
very elegant -- manner. However, the implications for extensions to the Douglas-
Rachford splitting methodology and for convex programming decomposition theory
were not pursued.

Most of the results presented here are refinements of those in the recent thesis
by Eckstein [11], which contains more detailed development, and also relates the
theory to the work of Gol'shtein [17, 18, 19, 20, 21, 22]. Some preliminary versions
of our results have also appeared in [10]. Subsequent papers will introduce applications of the development given here to parallel optimization algorithms, again capitalizing on the underpinnings provided by monotone operator theory.
This paper is organized as follows: Section 2 introduces the basic theory of
monotone operators in Hilbert space, while Section 3 proves the convergence of a
generalized form of the proximal point algorithm. Section 4 discusses Douglas-Rachford splitting, showing it to be a special case of the proximal point algorithm by means of a specially-constructed splitting operator. This notion is combined with the result of Section 3 to yield generalized Douglas-Rachford splitting. Section 5 applies this theory, generalizing the alternating direction method of multipliers. It also discusses Spingarn's [52, 54] method of partial inverses, with a minor extension. Section 6 briefly presents a negative result concerning finite termination of Douglas-Rachford splitting methods.
2. Monotone operators
An operator $T$ on a Hilbert space $\mathcal{H}$ is a (possibly null-valued) point-to-set map $T: \mathcal{H} \to 2^{\mathcal{H}}$. We will make no distinction between an operator $T$ and its graph, that is, the set $\{(x, y) \mid y \in T(x)\}$. Thus, we may simply say that an operator is any subset $T$ of $\mathcal{H} \times \mathcal{H}$, and define $T(x) = Tx = \{y \mid (x, y) \in T\}$.
If $T$ is single-valued, that is, the cardinality of $Tx$ is at most 1 for all $x \in \mathcal{H}$, we will by slight abuse of notation allow $Tx$ and $T(x)$ to stand for the unique $y \in \mathcal{H}$ such that $(x, y) \in T$, rather than the singleton set $\{y\}$. The intended meaning should be clear from the context.
The domain of a mapping $T$ is its "projection" onto the first coordinate,
$$\operatorname{dom} T = \{x \in \mathcal{H} \mid \exists y \in \mathcal{H}: (x, y) \in T\} = \{x \in \mathcal{H} \mid Tx \neq \emptyset\}.$$
We say that $T$ has full domain if $\operatorname{dom} T = \mathcal{H}$. The range or image of $T$ is similarly defined as its projection onto the second coordinate,
$$\operatorname{im} T = \{y \in \mathcal{H} \mid \exists x \in \mathcal{H}: (x, y) \in T\}.$$
The inverse $T^{-1}$ of $T$ is $\{(y, x) \mid (x, y) \in T\}$.
For any real number $c$ and operator $T$, we let $cT$ be the operator $\{(x, cy) \mid (x, y) \in T\}$, and if $A$ and $B$ are any operators, we let
$$A + B = \{(x, y + z) \mid (x, y) \in A,\ (x, z) \in B\}.$$
We will use the symbol $I$ to denote the identity operator $\{(x, x) \mid x \in \mathcal{H}\}$. Let $\langle \cdot, \cdot \rangle$ denote the inner product on $\mathcal{H}$. Then an operator $T$ is monotone if
$$\langle x' - x,\ y' - y \rangle \geq 0 \quad \forall (x, y), (x', y') \in T.$$
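Because an operator is identified with its graph, monotonicity can be tested mechanically on any finite graph; on the real line the inner product reduces to ordinary multiplication. A small sketch of our own (not from the paper) of that check:

```python
def is_monotone(T):
    """T is a set of pairs (x, y) of floats, viewed as a finite operator
    graph on the real line; monotone means (x'-x)*(y'-y) >= 0 for all
    pairs of elements of the graph."""
    return all((x2 - x1) * (y2 - y1) >= 0
               for (x1, y1) in T for (x2, y2) in T)

T = {(0.0, -1.0), (0.0, 1.0), (1.0, 1.0)}   # set-valued at x = 0
print(is_monotone(T))                        # True
```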

A monotone operator is maximal if (considered as a graph) it is not strictly contained in any other monotone operator on $\mathcal{H}$. Note that an operator is (maximal) monotone if and only if its inverse is (maximal) monotone. The best-known example of a maximal monotone operator is the subgradient mapping $\partial f$ of a closed proper convex function $f: \mathcal{H} \to \mathbb{R} \cup \{+\infty\}$ [42, 44, 45]. The following theorem, originally due to Minty [36, 37], provides a crucial characterization of maximal monotone operators:
Theorem 1. A monotone operator $T$ on $\mathcal{H}$ is maximal if and only if $\operatorname{im}(I + T) = \mathcal{H}$. $\square$
For alternative proofs of Theorem 1, or stronger related theorems, see [45, 4, 6,
or 24]. All proofs of the theorem require Zorn's lemma, or, equivalently, the axiom
of choice.
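Theorem 1 can be seen concretely on the real line (our illustration, not the paper's): $T = \partial|\cdot|$, with $T(0) = [-1, 1]$, is maximal monotone, so $z \in x + T(x)$ is solvable for every $z$, whereas deleting the values at $0$ leaves a monotone but non-maximal operator whose image under $I + T$ has a gap:

```python
def solve_minty(z):
    """Solve z in x + T(x) for the maximal monotone T = subdiff of |x|:
    the unique solution is the soft-threshold of z at level 1."""
    return max(abs(z) - 1.0, 0.0) * (1.0 if z > 0 else -1.0)

# For the monotone but non-maximal restriction T' (sign(x) for x != 0 only),
# im(I + T') = (-inf, -1) U (1, inf), so e.g. z = 0.5 has no solution there;
# the maximal extension supplies one: 0.5 in 0 + T(0) = [-1, 1].
print(solve_minty(0.5))   # 0.0
```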
Given any operator $A$, let $J_A$ denote the operator $(I + A)^{-1}$. Given any positive scalar $c$ and operator $T$, $J_{cT} = (I + cT)^{-1}$ is called a resolvent of $T$. An operator $C$ on $\mathcal{H}$ is said to be nonexpansive if
$$\|y' - y\| \leq \|x' - x\| \quad \forall (x, y), (x', y') \in C.$$
Note that nonexpansive operators are necessarily single-valued and Lipschitz continuous. An operator $J$ on $\mathcal{H}$ is said to be firmly nonexpansive if
$$\|y' - y\|^2 \leq \langle x' - x,\ y' - y \rangle \quad \forall (x, y), (x', y') \in J.$$
The following lemma summarizes some well-known properties of firmly nonexpansive operators. The proof is straightforward and is omitted (or see, for example, [48] or [11, Section 3.2.4]). Figure 1 illustrates the lemma.
Lemma 1. (i) All firmly nonexpansive operators are nonexpansive. (ii) An operator $J$ is firmly nonexpansive if and only if $2J - I$ is nonexpansive. (iii) An operator is firmly nonexpansive if and only if it is of the form $\frac{1}{2}(C + I)$, where $C$ is nonexpansive. (iv) An operator $J$ is firmly nonexpansive if and only if $I - J$ is firmly nonexpansive. $\square$
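Lemma 1 (and, anticipating Theorem 2 below, the firm nonexpansiveness of resolvents) is easy to probe numerically. Taking $J$ to be the soft-thresholding map, the resolvent of $\partial|\cdot|$, the following sketch of ours spot-checks the defining inequality together with parts (ii) and (iv) on random pairs:

```python
import random

def J(x):
    """Soft-thresholding at level 1: the resolvent (I + T)^{-1} of
    T = subdifferential of |x|."""
    return max(abs(x) - 1.0, 0.0) * (1.0 if x > 0 else -1.0)

for _ in range(1000):
    x, y = random.uniform(-5, 5), random.uniform(-5, 5)
    dx, dJ = x - y, J(x) - J(y)
    assert dJ ** 2 <= dJ * dx + 1e-12                          # firmly nonexpansive
    assert abs((2*J(x) - x) - (2*J(y) - y)) <= abs(dx) + 1e-9  # Lemma 1(ii)
    assert (dx - dJ) ** 2 <= (dx - dJ) * dx + 1e-12            # Lemma 1(iv)
print("all checks passed")
```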
We now give a critical theorem. The "only if" part of the following theorem has
been well known for some time [48], but the "if" part, just as easily obtained,
appears to have been obscure. The purpose here is to stress the complete symmetry
that exists between (maximal) monotone operators and (full-domained) firmly
nonexpansive operators over any Hilbert space.
Theorem 2. Let $c$ be any positive scalar. An operator $T$ on $\mathcal{H}$ is monotone if and only if its resolvent $J_{cT} = (I + cT)^{-1}$ is firmly nonexpansive. Furthermore, $T$ is maximal monotone if and only if $J_{cT}$ is firmly nonexpansive and $\operatorname{dom}(J_{cT}) = \mathcal{H}$.

Proof. By the definition of the scaling, addition, and inversion operations,
$$(x, y) \in T \iff (x + cy,\ x) \in (I + cT)^{-1}.$$

[Figure 1. Illustration of the action of firmly nonexpansive operators in Hilbert space. If $J$ is nonexpansive, then $J(x') - J(x)$ must lie in the larger sphere, which has radius $\|x' - x\|$ and is centered at $0$. If $J$ is firmly nonexpansive, then $J(x') - J(x)$ must lie in the smaller sphere, which has radius $\frac{1}{2}\|x' - x\|$ and is centered at $\frac{1}{2}(x' - x)$. This characterization follows directly from $J$ being of the form $\frac{1}{2}(I + C)$, where $C$ is nonexpansive. Note that if $J(x') - J(x)$ lies in the smaller sphere, so must $(I - J)(x') - (I - J)(x)$, illustrating Lemma 1(iv).]
Therefore,
$$T \text{ monotone} \iff \langle x' - x,\ y' - y \rangle \geq 0 \quad \forall (x, y), (x', y') \in T,$$
$$\iff \langle x' - x,\ cy' - cy \rangle \geq 0 \quad \forall (x, y), (x', y') \in T,$$
$$\iff \langle x' - x + cy' - cy,\ x' - x \rangle \geq \|x' - x\|^2 \quad \forall (x, y), (x', y') \in T,$$
$$\iff (I + cT)^{-1} \text{ firmly nonexpansive}.$$
The first claim is established. Clearly, $T$ is maximal if and only if $cT$ is maximal. So, by Theorem 1, $T$ is maximal if and only if $\operatorname{im}(I + cT) = \mathcal{H}$. This is in turn true if and only if $(I + cT)^{-1}$ has domain $\mathcal{H}$, establishing the second statement. $\square$
Corollary 2.1. An operator $K$ is firmly nonexpansive if and only if $K^{-1} - I$ is monotone. $K$ is firmly nonexpansive with full domain if and only if $K^{-1} - I$ is maximal monotone. $\square$
Corollary 2.2. For any $c > 0$, the resolvent $J_{cT}$ of a monotone operator $T$ is single-valued. If $T$ is also maximal, then $J_{cT}$ has full domain. $\square$
Corollary 2.3 (The Representation Lemma). Let $c > 0$ and let $T$ be monotone on $\mathcal{H}$. Then every element $z$ of $\mathcal{H}$ can be written in at most one way as $x + cy$, where $y \in Tx$. $\square$
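Corollary 2.3 is constructive whenever the resolvent is available: given $z$, take $x = J_{cT}(z)$ and $y = (z - x)/c$; then $y \in Tx$ and $z = x + cy$, and the corollary guarantees that no other such decomposition exists. A sketch of ours with $T = \partial|\cdot|$ and $c = 2$:

```python
def represent(z, c=2.0):
    """Write z = x + c*y with y in T(x), for T = subdiff of |x|:
    x = J_{cT}(z), the soft-threshold of z at level c, and y = (z - x)/c."""
    x = max(abs(z) - c, 0.0) * (1.0 if z > 0 else -1.0)
    return x, (z - x) / c

x, y = represent(5.0)      # x = 3.0, y = 1.0; indeed 1.0 in T(3.0) = {1}
assert x + 2.0 * y == 5.0
```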

References
Bertsekas, D.P. and Tsitsiklis, J.N., Parallel and Distributed Computation: Numerical Methods (Prentice-Hall, Englewood Cliffs, NJ, 1989).
Brezis, H., Opérateurs Maximaux Monotones et Semi-Groupes de Contractions dans les Espaces de Hilbert (North-Holland, Amsterdam, 1973).
Rockafellar, R.T., "Monotone operators and the proximal point algorithm," SIAM Journal on Control and Optimization 14 (1976) 877-898.