On Projection Algorithms for Solving Convex Feasibility Problems
Heinz H. Bauschke and Jonathan M. Borwein
SIAM Review, Vol. 38, Iss. 3, pp. 367-426, 01 Sep 1996

SIAM REVIEW Vol. 38, No. 3, pp. 367-426, September 1996
© 1996 Society for Industrial and Applied Mathematics

ON PROJECTION ALGORITHMS FOR SOLVING CONVEX FEASIBILITY PROBLEMS*

HEINZ H. BAUSCHKE† AND JONATHAN M. BORWEIN†
Abstract. Due to their extraordinary utility and broad applicability in many areas of classical mathematics and modern physical sciences (most notably, computerized tomography), algorithms for solving convex feasibility problems continue to receive great attention. To unify, generalize, and review some of these algorithms, a very broad and flexible framework is investigated. Several crucial new concepts which allow a systematic discussion of questions on behaviour in general Hilbert spaces and on the quality of convergence are brought out. Numerous examples are given.
Key words. angle between two subspaces, averaged mapping, Cimmino's method, computerized tomography, convex feasibility problem, convex function, convex inequalities, convex programming, convex set, Fejér monotone sequence, firmly nonexpansive mapping, Hilbert space, image recovery, iterative method, Kaczmarz's method, linear convergence, linear feasibility problem, linear inequalities, nonexpansive mapping, orthogonal projection, projection algorithm, projection method, Slater point, subdifferential, subgradient, subgradient algorithm, successive projections
AMS subject classifications. 47H09, 49M45, 65-02, 65J05, 90C25
1. Introduction, preliminaries, and notation. A very common problem in diverse areas of mathematics and physical sciences consists of trying to find a point in the intersection of convex sets. This problem is referred to as the convex feasibility problem; its precise mathematical formulation is as follows. Suppose X is a Hilbert space and C_1, ..., C_N are closed convex subsets with nonempty intersection C:

    C := C_1 ∩ ··· ∩ C_N ≠ ∅.

Convex feasibility problem: Find some point x in C.

We distinguish two major types.
1. The set C_i is "simple" in the sense that the projection (i.e., the nearest point mapping) onto C_i can be calculated explicitly; C_i might be a hyperplane or a halfspace.
2. It is not possible to obtain the projection onto C_i; however, it is at least possible to describe the projection onto some approximating superset of C_i. (There is always a trivial approximating superset of C_i, namely, X.) Typically, C_i is a lower level set of some convex function.
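For concreteness, both types can be illustrated with short closed-form computations. The sketch below is not from the paper; the function names and the finite-dimensional setting are illustrative. For type 1, projections onto a hyperplane or halfspace are explicit; for type 2, when C = {y : g(y) ≤ 0} with g convex, one projects onto the approximating halfspace cut out by a subgradient of g at the current point.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def project_hyperplane(x, a, b):
    """Project x onto the hyperplane {y : <a, y> = b}; a 'simple' (type 1) set."""
    t = (dot(a, x) - b) / dot(a, a)
    return [xi - t * ai for xi, ai in zip(x, a)]

def project_halfspace(x, a, b):
    """Project x onto the halfspace {y : <a, y> <= b}; also type 1."""
    t = max(0.0, dot(a, x) - b) / dot(a, a)
    return [xi - t * ai for xi, ai in zip(x, a)]

def project_level_superset(x, gx, sub_gx):
    """Type 2: for C = {y : g(y) <= 0} with g convex, project x onto the
    halfspace {y : g(x) + <sub_gx, y - x> <= 0}, a superset of C.
    Here gx = g(x) and sub_gx is a subgradient of g at x (nonzero if gx > 0)."""
    if gx <= 0:
        return list(x)  # x already lies in the superset (indeed in C's cut)
    t = gx / dot(sub_gx, sub_gx)
    return [xi - t * si for xi, si in zip(x, sub_gx)]
```

For example, with g(y) = ‖y‖² − 1 and x = (2, 0), one has g(x) = 3 and subgradient (4, 0), so the superset projection moves x to (1.25, 0): not yet in C, but strictly closer, which is all the algorithmic schemes below require.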
One frequently employed approach in solving the convex feasibility problem is algorithmic. The idea is to involve the projections onto each set C_i (resp., onto a superset of C_i) to generate a sequence of points that is supposed to converge to a solution of the convex feasibility problem. This is the approach we will investigate. We are aware of four distinct (although intertwining) branches, which we classify by their applications.
I. Best approximation theory.
Properties: Each set C_i is a closed subspace. The algorithmic scheme is simple ("cyclic" control).
Basic results: von Neumann [103, Thm. 13.7], Halperin [61].
Comments: The generated sequence converges in norm to the point in C that is closest to the starting point. Quality of convergence is well understood.
References: Deutsch [44].
*Received by the editors July 7, 1993; accepted for publication (in revised form) June 19, 1995. This research was supported by NSERC and by the Shrum Endowment.
†Department of Mathematics and Statistics, Simon Fraser University, Burnaby, British Columbia, Canada V5A 1S6 (bauschke@cecm.sfu.ca and jborwein@cecm.sfu.ca).
Downloaded 06/06/13 to 134.148.10.12. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php

Areas of application: Diverse. Statistics (linear prediction theory), partial differential equations (Dirichlet problem), and complex analysis (Bergman kernels, conformal mappings), to name only a few.
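A minimal numerical sketch of branch I (illustrative, not from the paper): alternating projections onto two one-dimensional subspaces of R² drive the iterate toward their intersection {0}, which is the point of C closest to any starting point. The per-cycle error contracts by cos²θ, where θ is the angle between the two subspaces (cf. the key words).

```python
import math

def project_line(x, d):
    """Project x onto the subspace R*d, where d is a unit vector."""
    t = x[0] * d[0] + x[1] * d[1]
    return [t * d[0], t * d[1]]

d1 = [1.0, 0.0]                        # the x-axis
d2 = [math.sqrt(0.5), math.sqrt(0.5)]  # the line at 45 degrees
x = [3.0, 1.0]
for _ in range(50):
    x = project_line(project_line(x, d2), d1)  # one von Neumann cycle
# the intersection of the two subspaces is {0}; the iterates converge
# in norm to 0, here with contraction factor cos^2(45 deg) = 1/2 per cycle
```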
II. Image reconstruction: Discrete models.
Properties: Each set C_i is a halfspace or a hyperplane. X is a Euclidean space (i.e., a finite-dimensional Hilbert space). Very flexible algorithmic schemes.
Basic results: Kaczmarz [71], Cimmino [29], Agmon [1], Motzkin and Schoenberg [83].
Comments: Behaviour in general Hilbert space and quality of convergence only partially understood.
References: Censor [21, 23, 24], Censor and Herman [27], Viergever [102], Sezan [91].
Areas of application: Medical imaging and radiation therapy treatment planning (computerized tomography), electron microscopy.
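Kaczmarz's method from branch II can be sketched as cyclic projections onto the hyperplanes given by the rows of a linear system (an illustrative sketch, not the paper's notation). In the example the two rows happen to be orthogonal, so a single cycle already reaches the solution.

```python
def kaczmarz_cycle(x, rows, rhs):
    """One cycle of Kaczmarz's method: project successively onto each
    hyperplane {y : <a_i, y> = b_i} of a consistent system A y = b."""
    for a, b in zip(rows, rhs):
        t = (sum(ai * xi for ai, xi in zip(a, x)) - b) / sum(ai * ai for ai in a)
        x = [xi - t * ai for xi, ai in zip(x, a)]
    return x

# consistent 2x2 system with solution (1, 2)
rows, rhs = [[1.0, 1.0], [1.0, -1.0]], [3.0, -1.0]
x = [0.0, 0.0]
for _ in range(10):
    x = kaczmarz_cycle(x, rows, rhs)
```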
III. Image reconstruction: Continuous models.
Properties: X is usually an infinite-dimensional Hilbert space. Fairly simple algorithmic schemes.
Basic results: Gubin, Polyak, and Raik [60].
Comments: Quality of convergence is fairly well understood.
References: Herman [63], Youla and Webb [108], Stark [95].
Areas of application: Computerized tomography, signal processing.
IV. Subgradient algorithms.
Properties: Some sets C_i are of type 2. Fairly simple algorithmic schemes ("cyclic" or "weighted" control).
Basic results: Eremin [52], Polyak [86], Censor and Lent [28].
Comments: Quality of convergence is fairly well understood.
References: Censor [22], Shor [92].
Areas of application: Solution of convex inequalities, minimization of convex nonsmooth functions.
To improve, unify, and review algorithms for these branches, we must study a flexible algorithmic scheme in general Hilbert space and be able to draw conclusions on the quality of convergence. This is our objective in this paper. We will analyze algorithms in a very broad and adaptive framework that is essentially due to Flåm and Zowe [53]. (Related frameworks with somewhat different ambitions were investigated by Browder [17] and Schott [89].)
The algorithmic scheme is as follows. Given the current iterate x^(n), the next iterate x^(n+1) is obtained by

    (*)    x^(n+1) := Σ_{i=1}^{N} λ_i^(n) ((1 − α_i^(n)) x^(n) + α_i^(n) P_i^(n) x^(n)),

where every P_i^(n) is the projection onto some approximating superset C_i^(n) of C_i, every α_i^(n) is a relaxation parameter between 0 and 2, and the λ_i^(n)'s are nonnegative weights summing up to 1. In short, x^(n+1) is a weighted average of relaxed projections of x^(n).
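One step of a scheme of this kind can be sketched as follows (illustrative code, not the paper's implementation, with the supersets held constant at C_i^(n) = C_i). With N = 2 halfspaces, unit relaxation, and equal weights, the step reduces to a Cimmino-type average of projections.

```python
def step(x, projections, relax, weights):
    """One weighted-average-of-relaxed-projections step:
    x_new = sum_i w_i * ((1 - a_i) * x + a_i * P_i(x)),
    where each P_i projects onto (a superset of) C_i, each relaxation
    parameter a_i lies between 0 and 2, and the nonnegative weights
    w_i sum to 1."""
    out = [0.0] * len(x)
    for P, a, w in zip(projections, relax, weights):
        Px = P(x)
        for j in range(len(x)):
            out[j] += w * ((1 - a) * x[j] + a * Px[j])
    return out

# C1 = {y : y_1 >= 1}, C2 = {y : y_2 >= 1}; both projections are explicit
P1 = lambda v: [max(v[0], 1.0), v[1]]
P2 = lambda v: [v[0], max(v[1], 1.0)]
x = [0.0, 0.0]
for _ in range(60):
    x = step(x, [P1, P2], [1.0, 1.0], [0.5, 0.5])
# x approaches (1, 1), a point of C = C1 ∩ C2
```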
Censor and Herman [27] expressly suggested the study of a (slightly) restricted version of (*) in the context of computerized tomography. It is worthwhile to point out that the scheme (*) can be thought of as a combination of the schemes investigated by Aharoni, Berman, and Censor [2] and Aharoni and Censor [3]. In Euclidean spaces, norm convergence results were obtained by Flåm and Zowe for (*) and by Aharoni and Censor [3] for the restricted version. However, neither behaviour in general Hilbert spaces nor quality of convergence has been much discussed so far. To do this comprehensively and clearly, it is important to bring out

some underlying recurring concepts. We feel these concepts lie at the heart of many algorithms and will be useful for other researchers as well.

The paper is organized as follows.
In §2, the two important concepts of attracting mappings and Fejér monotone sequences are investigated. The former concept captures essential properties of the operator A^(n), whereas the latter deals with inherent qualities of the sequence (x^(n)). The idea of a focusing algorithm is introduced in §3. The very broad class of focusing algorithms admits results on convergence. In addition, the well-known ideas of cyclic and weighted control are subsumed under the notion of intermittent control. Weak topology results on intermittent focusing algorithms are given.
We actually study a more general form of the iteration (*) without extra work; as a by-product, we obtain a recent result by Tseng [100] and make connections with work by Browder [17] and Baillon [7].
At the start of §4, we exclusively consider algorithms such as (*), which we name projection algorithms. Prototypes of focusing and linearly focusing (a stronger, more quantitative version) projection algorithms are presented. When specialized to Euclidean spaces, our analysis yields basic results by Flåm and Zowe [53] and Aharoni and Censor [3].
The fifth section discusses norm and particularly linear convergence. Many known sufficient, sometimes ostensibly different looking, conditions for linear convergence can be thought of as special instances of a single new geometric concept: regularity. Here the N-tuple (C_1, ..., C_N) is called regular if "closeness to all sets C_i implies closeness to their intersection C." Four quantitative versions of (bounded) (linear) regularity are described. Having gotten all the crucial concepts together, we deduce our main results, one of which states in short that

    linearly focusing projection algorithm
    + intermittent control
    + "nice" relaxation parameters and weights
    + (C_1, ..., C_N) boundedly linearly regular

imply linear convergence.
This section ends with results on (bounded) (linear) regularity, including a characterization of regular N-tuples of closed subspaces.
Section 6 contains a multitude of examples of algorithms from branches I, II, and III. The final section examines the subgradient algorithms of branch IV, to which our previous results also apply. Thus, a well-known Slater point condition emerges as a sufficient condition for a subgradient algorithm to be linearly focusing, thus yielding a conceptually simple proof of an important result by De Pierro and Iusem [40]. It is very satisfactory that analogous results are obtained for algorithms suggested by Dos Santos [47] and Yang and Murty [105]. For the reader's convenience, an index is included.
We conclude this section with a collection of frequently used facts, definitions, and notation. The "stage" throughout this paper is a real Hilbert space X; its unit ball {x ∈ X : ‖x‖ ≤ 1} is denoted B_X.
FACTS 1.1.
(i) (parallelogram law) If x, y ∈ X, then ‖x + y‖² + ‖x − y‖² = 2(‖x‖² + ‖y‖²).
(ii) (strict convexity) If x, y ∈ X, then ‖x + y‖ = ‖x‖ + ‖y‖ implies ‖y‖·x = ‖x‖·y.
(iii) Every bounded sequence in X possesses a weakly convergent subsequence.

Proof. (i) is easy to verify and implies (ii). (iii) follows from the Eberlein–Šmulian theorem (see, for example, Holmes [67, §18]).
All "actors" turn out to be members of the distinguished class of nonexpansive mappings. A mapping T : D → X, where the domain D is a closed convex nonempty subset of X, is called nonexpansive if

    ‖Tx − Ty‖ ≤ ‖x − y‖  for all x, y ∈ D.

If ‖Tx − Ty‖ = ‖x − y‖ for all x, y ∈ D, then we say T is an isometry. In contrast, if ‖Tx − Ty‖ < ‖x − y‖ for all distinct x, y ∈ D, then we speak of a strictly nonexpansive mapping. If T is a nonexpansive mapping, then the set of all fixed points Fix T, which is defined by

    Fix T := {x ∈ D : x = Tx},

is always closed and convex [58, Lem. 3.4].
FACT 1.2 (demiclosedness principle). If D is a closed convex subset of X, T : D → X is nonexpansive, (x_n) is a sequence in D, and x ∈ D, then

    x_n ⇀ x and x_n − Tx_n → 0

imply x ∈ Fix T, where, by convention, "→" (resp., "⇀") stands for norm (resp., weak) convergence.
Proof. This is a special case of Opial's [84, Lem. 2]. □
It is obvious that the identity Id is nonexpansive and easy to see that convex combinations of nonexpansive mappings are also nonexpansive. In particular, if N is a nonexpansive mapping, then so is (1 − α)Id + αN for all α ∈ [0, 1[. These mappings are called averaged mappings. A firmly nonexpansive mapping is a nonexpansive mapping that can be written as ½Id + ½N for some nonexpansive mapping N.
FACT 1.3. If D is a closed convex subset of X and T : D → X is a mapping, then the following conditions are equivalent.
(i) T is firmly nonexpansive.
(ii) ‖Tx − Ty‖² ≤ ⟨Tx − Ty, x − y⟩ for all x, y ∈ D.
(iii) 2T − Id is nonexpansive.
Proof. See, for example, Zarantonello's [109, §1] or Goebel and Kirk's [56, Thm. 12.1]. □
A mapping is called relaxed firmly nonexpansive if it can be expressed as (1 − α)Id + αF for some firmly nonexpansive mapping F.
COROLLARY 1.4. Suppose D is a closed convex subset of X and T : D → X is a mapping. Then T is averaged if and only if it is relaxed firmly nonexpansive.
The "principal actor" is the projection operator. Given a closed convex nonempty subset C of X, the mapping that sends every point to its nearest point in C (in the norm induced by the inner product of X) is called the projection onto C and denoted P_C.

FACTS 1.5. Suppose C is a closed convex nonempty subset of X with projection P_C. Then
(i) P_C is firmly nonexpansive.
(ii) If x ∈ X, then P_C x is characterized by P_C x ∈ C and ⟨c − P_C x, x − P_C x⟩ ≤ 0 for all c ∈ C.
Proof. See, for example, [109, Lem. 1.2] for (i) and [109, Lem. 1.1] for (ii).
Therefore,

    projection ⟹ firmly nonexpansive ⟹ relaxed firmly nonexpansive = averaged ⟹ nonexpansive,

and likewise isometry ⟹ nonexpansive ⟸ strictly nonexpansive.
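Facts 1.3(ii) and 1.5(i) lend themselves to a quick numerical spot check (an illustrative sketch, not from the paper): sample random pairs of points and verify the firm-nonexpansiveness inequality ‖Px − Py‖² ≤ ⟨Px − Py, x − y⟩ for the projection onto the closed unit ball of R².

```python
import math
import random

def project_ball(x):
    """Projection onto the closed unit ball of R^2."""
    n = math.hypot(x[0], x[1])
    return list(x) if n <= 1.0 else [x[0] / n, x[1] / n]

random.seed(0)
for _ in range(1000):
    x = [random.uniform(-5, 5), random.uniform(-5, 5)]
    y = [random.uniform(-5, 5), random.uniform(-5, 5)]
    px, py = project_ball(x), project_ball(y)
    d = [px[0] - py[0], px[1] - py[1]]
    lhs = d[0] ** 2 + d[1] ** 2                        # ||Px - Py||^2
    rhs = d[0] * (x[0] - y[0]) + d[1] * (x[1] - y[1])  # <Px - Py, x - y>
    assert lhs <= rhs + 1e-12
```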
The associated function

    d(·, C) : X → ℝ : x ↦ inf_{c ∈ C} ‖x − c‖ = ‖x − P_C x‖

is called the distance function to C; it is easy to see that d(·, C) is convex and continuous (hence weakly lower semicontinuous). A good reference on nonexpansive mappings is Goebel and Kirk's recent book [58]. Many results on projections are in Zarantonello's [109].
The algorithms' quality of convergence will be discussed in terms of linear convergence: a sequence (x_n) in X is said to converge linearly to its limit x (with rate β) if β ∈ [0, 1[ and there is some α ≥ 0 such that (s.t.)

    ‖x_n − x‖ ≤ αβⁿ  for all n.
PROPOSITION 1.6. Suppose (x_n) is a sequence in X, p is some positive integer, and x is a point in X. If (x_{pn})_n converges linearly to x and (‖x_n − x‖)_n is decreasing, then the entire sequence (x_n) converges linearly to x.
Proof. There is some α > 0 and β ∈ [0, 1[ s.t. ‖x_{pn} − x‖ ≤ αβⁿ for all n. Now fix an arbitrary positive integer m and divide by p with remainder; i.e., write m = p·n + r, where r ∈ {0, 1, ..., p − 1}. We estimate

    ‖x_m − x‖ ≤ ‖x_{pn} − x‖ ≤ αβⁿ = α(β^{1/p})^{np} ≤ (α/β^{(p−1)/p})(β^{1/p})^{np+r} = (α/β^{(p−1)/p})(β^{1/p})^m,

and the result follows. □
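The estimate in the proof can be sanity-checked numerically (an illustrative sketch with p = 3 and a hypothetical error sequence, not part of the paper): the errors are constant within each block of length p, hence decreasing, their p-th terms satisfy e_{pn} = α·βⁿ, and every error stays below the derived bound (α/β^((p−1)/p))·(β^(1/p))^m.

```python
# hypothetical error sequence for Proposition 1.6 with p = 3
p, alpha, beta = 3, 2.0, 0.5
errors = [alpha * beta ** (m // p) for m in range(60)]  # e_{pn} = alpha*beta**n

# the proof's bound: ||x_m - x|| <= (alpha / beta**((p-1)/p)) * (beta**(1/p))**m
c, g = alpha / beta ** ((p - 1) / p), beta ** (1 / p)
assert all(e <= c * g ** m + 1e-12 for m, e in enumerate(errors))
```

So the whole sequence is linearly convergent with the (worse but still sub-1) rate β^(1/p), exactly as the proposition asserts.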
Finally, we recall the meaning of the following. If S and Y are any subsets of X, then span S, conv̄ S, S̄, int_Y S, icr S, and int S denote, respectively, the span of S, the closed convex hull of S, the closure of S, the interior of S with
