How does Megiddo solve the minimum ratio cycle problem?

The partitioning problem for a tree is solved recursively, using the solution to the path partitioning problem to put the pieces together; Megiddo obtains a running time of $O(n \log^3 n)$; their improvement of the path partitioning algorithm reduces the running time to $O(n \log^2 n)$.

How does Megiddo solve the partitioning problem for a path?

Megiddo solves the partitioning problem for a path in time $O(n \log^2 n)$; his solution requires n binary searches to be done in parallel on a set of n items, where each comparison takes time O(n).

How long does Megiddo take to find the continuous p-center?

Megiddo obtains a running time of $O(n \log^3 n)$ for finding the continuous p-center; their improvement to the algorithm for the searching problem reduces this to $O(n \log^2 n)$.

How is the continuous p-center problem solved?

As in (7), the continuous p-center problem is solved recursively, using the solution to the searching problem to put the pieces together.

(Open Access) Slowing down sorting networks to obtain faster sorting algorithms (1987) | Richard Cole

Q: How long does Megiddo run the parallel algorithm?

So the authors achieve a running time of O(T(n)C(n) log n) plus overheads for running the parallel computation and finding medians of sets of comparisons.

Q: What is the running time of the AKS network?

So the authors have a running time of O(C(n) log n + n log n); for C(n) = O(n) this is O(n log n), and for C(n) = O(n log n) it is $O(n \log^2 n)$.

Slowing Down Sorting Networks to Obtain Faster Sorting Algorithms

RICHARD COLE

New York University, New York, New York

Abstract. Megiddo introduced a technique for using a parallel algorithm for one problem to construct an efficient serial algorithm for a second problem. This paper provides a general method that trims a factor of O(log n) time (or more) for many applications of this technique.

Categories and Subject Descriptors: F.1.2 [Computation by Abstract Devices]: Modes of Computation-parallelism; relations among modes; F.2.2 [Analysis of Algorithms and Problem Complexity]: Nonnumerical Algorithms and Problems-computations on discrete structures; geometrical problems and computations; sequencing and scheduling, sorting and searching; G.2.1 [Discrete Mathematics]: Combinatorics-combinatorial algorithms; G.2.2 [Discrete Mathematics]: Graph Theory-network problems; path and circuit problems; trees

General Terms: Algorithms, Theory

Additional Key Words and Phrases: Algorithms on trees, ham sandwich theorem, min-ratio cycle, parallel algorithms, scheduling, sorting network, spanning tree

1. Introduction

We solve a somewhat unusual sorting problem optimally; it is described in Section 2. The sorting problem was motivated by and derived from its application. The application is an improvement of Megiddo’s ingenious technique [9], which uses an efficient parallel algorithm for one problem to produce an efficient serial algorithm for a second problem. However, Megiddo’s technique is more general than our improvement. That is, there are problems to which Megiddo’s technique can be applied, but to which our improvement cannot be applied. Some of the ideas involved in Megiddo’s technique were first presented in [8]. The general idea of Megiddo’s technique is as follows: Suppose a certain problem A is solved in $T_p$ time units on a P-processor machine. Also suppose a serial algorithm for problem A running in time $T_s$ could be applied to designing a serial algorithm for problem B with a running time of $O(CT_s)$, where essentially the algorithm for problem B carries out each step of the algorithm for A, taking time C per step. By using the parallel algorithm for A as a serial algorithm, we would obtain a serial algorithm for B running in time $O(CT_pP)$. Megiddo showed that in many cases problem B can actually be solved in time $O(CT_p \log P)$, if P is not too large. Our contribution is to reduce this time to $O(CT_p)$ in many cases. We describe Megiddo’s technique and the improvement in Section 3. Using the improvement, we trim the running times of several algorithms, which we describe in Section 4.

This work was supported in part by the National Science Foundation under grant DCR 84-01633 and by an IBM Faculty Development Award.

Author’s address: Courant Institute of Mathematical Sciences, New York University, 251 Mercer Street, New York, New York 10012.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.

© 1987 ACM 0004-5411/87/0100-0200 $00.75

Journal of the Association for Computing Machinery, Vol. 34, No. 1, January 1987, pp. 200-208.

Remark. Although our technique trims the running times of several algorithms, it is at the expense of using the Ajtai-Komlós-Szemerédi (AKS) sorting network [2], which involves enormous constants.

2. The Sorting Problem

We describe the sorting problem as a game. There are two players: the sorter and the adversary. The game is played on a sorting network for sorting n items; the network has width n/2 and depth f(n). The idea of the game is to carry out a sort on the network. The game is played in turns. In a turn, first the sorter requests the adversary to resolve certain comparisons (i.e., to determine which of two inputs is the larger); then the adversary resolves some of these comparisons (we are more specific below). By “sorting on the network” we mean that the sorter must obtain the result of exactly those comparisons that arise when the network is used, and no others (except those that can be deduced by transitivity); also, the sorter must follow the ordering of the comparisons created by the network (that is, if an output of comparator D is an input to comparator C, then the result of the comparison at D must be known before the comparison at C is attempted).

Before giving the precise rules for the game, we need a few definitions. We define an input to a comparator C to be known, either if the input is a circuit input (i.e., the input wire for the input to comparator C is an input wire for the circuit), or if the input was the output of some other comparator D and the comparison at D has been resolved. A comparator is defined to be active if both its inputs are known and the order of the inputs has not yet been determined; a comparator is inactive if it is not active.

Now, we define the game precisely. The two players alternate their moves, which

are as follows.

(1) The sorter assigns weights to all the active comparators. Let the sum of the weights assigned be W (the active weight). Let C be the comparison corresponding to comparator c. If c is assigned weight w, we consider C to have been assigned weight w also.

(2) The adversary is obliged to resolve sufficiently many of the weighted comparisons so that the sum of the weights of the resolved comparisons is at least W/2.

The game ends when the sort is complete. The aim of the sorter is to end the game quickly. A turn consists of one move of each player; we show the sorter can end the game in O(log n + f(n)) turns. The sorter’s strategy is to assign weights to the comparators according to the following rule. An active comparator at depth j in the network is given weight $4^{-j}$. We prove the following invariant.

LEMMA 1. At the start of the (k + 1)st turn the active weight is bounded by $(3/4)^k \cdot n/2$, for $k \geq 0$.

PROOF. We prove the result by induction on k. At the start of the first turn there are n/2 active comparators at depth 0, and all other comparators are inactive. So for k = 0 the result holds. To prove the inductive step, it is sufficient to show that at each turn the active weight is reduced by at least one quarter. We now show this.

Consider an active comparator c of weight w, and suppose the corresponding comparison C is resolved. Then c ceases to be active, and up to two comparators, each of weight w/4, may become active. So the resolution of C reduces the active weight by at least w/2. Let the active weight be W. In one turn, we are guaranteed that the comparisons resolved by the adversary have combined weight at least W/2. Thus, in one turn, the active weight is reduced from W to at most 3W/4.

We now show that this process terminates reasonably quickly.

LEMMA 2. For $k \geq 5(j + \frac{1}{2} \log n)$, during the (k + 1)st turn there are no active comparators at depth j.

PROOF. At the start of the (k + 1)st turn the total active weight W is bounded by $(3/4)^{5(j + \frac{1}{2} \log n)} \cdot n/2$ (by Lemma 1). We note $(3/4)^5 < 1/4$. So $W < (1/4)^{j + \frac{1}{2} \log n} \cdot n/2 = (1/4)^j \cdot 1/2$. But an active comparator at depth j has weight $(1/4)^j$. So there is no such comparator. □
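The arithmetic behind the proof of Lemma 2 can be spelled out as follows. This is only a worked restatement; it assumes, as the equality in the proof requires, that log denotes the base-2 logarithm, and that $j + \frac{1}{2}\log n > 0$ so that the strict inequality applies.

```latex
\begin{align*}
\left(\tfrac{3}{4}\right)^{5} &= \tfrac{243}{1024} < \tfrac{256}{1024} = \tfrac{1}{4}, \\
\left(\tfrac{1}{4}\right)^{\frac{1}{2}\log_2 n} &= 2^{-\log_2 n} = \tfrac{1}{n}, \\
W \le \left(\tfrac{3}{4}\right)^{5\left(j + \frac{1}{2}\log_2 n\right)} \cdot \tfrac{n}{2}
  &< \left(\tfrac{1}{4}\right)^{j} \cdot \tfrac{1}{n} \cdot \tfrac{n}{2}
   = \tfrac{1}{2}\left(\tfrac{1}{4}\right)^{j}
   < \left(\tfrac{1}{4}\right)^{j}.
\end{align*}
```

Since a single active comparator at depth j would already contribute weight $(1/4)^j$, none can be active.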

COROLLARY.

The game ends after O(f(n) + log n) turns.

PROOF. After $5(f(n) + \frac{1}{2} \log n)$ turns there are no active comparators (by Lemma 2). That is, all the comparisons are resolved, so the sort is completed. Hence the game ends after O(f(n) + log n) turns.
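The game and the two lemmas can be checked empirically with a small simulation. The sketch below is illustrative only and rests on several assumptions that are not in the paper: it uses a bitonic sorting network (chosen because every level has exactly n/2 comparators and it is easy to generate) in place of the AKS network, the adversary resolves the lightest comparisons first (one arbitrary legal strategy), and the names `bitonic_levels` and `play_game` are hypothetical.

```python
from fractions import Fraction
from math import log2


def bitonic_levels(n):
    """Comparator levels of a bitonic sorting network on n = 2^t wires.

    Only the wiring matters for the game, so comparator directions are
    omitted. Every level holds exactly n/2 comparators.
    """
    levels = []
    k = 2
    while k <= n:
        j = k // 2
        while j >= 1:
            levels.append([(i, i ^ j) for i in range(n) if (i & j) == 0])
            j //= 2
        k *= 2
    return levels


def play_game(levels, n):
    """Play the weighted game; return the active weight at the start of each turn."""
    # Build the comparator DAG: each comparator records its depth and the
    # comparators feeding its two input wires (none for circuit inputs).
    last_on_wire = [None] * n
    depth_of, preds = [], []
    for depth, level in enumerate(levels):
        for a, b in level:
            cid = len(depth_of)
            depth_of.append(depth)
            preds.append({p for p in (last_on_wire[a], last_on_wire[b]) if p is not None})
            last_on_wire[a] = last_on_wire[b] = cid

    resolved, history = set(), []
    while len(resolved) < len(depth_of):
        # Active: both inputs known, comparison not yet resolved.
        active = [c for c in range(len(depth_of))
                  if c not in resolved and preds[c] <= resolved]
        # Sorter's rule: weight 4^(-depth).
        weight = {c: Fraction(1, 4 ** depth_of[c]) for c in active}
        total = sum(weight.values())
        history.append(total)
        # Adversary (assumed strategy): lightest first, until weight >= total / 2.
        done = Fraction(0)
        for c in sorted(active, key=weight.get):
            if done >= total / 2:
                break
            resolved.add(c)
            done += weight[c]
    return history


n = 16
levels = bitonic_levels(n)
history = play_game(levels, n)
turn_bound = 5 * (len(levels) + 0.5 * log2(n))  # bound from the corollary's proof
print(len(history), "turns; bound", turn_bound)
```

Exact rational weights (`Fraction`) are used so that the comparison against the bound of Lemma 1 is not affected by floating-point rounding.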

Remark. It is of interest to play this game on arbitrary directed acyclic graphs. Lemma 1 and the corollary hold even in the case of unbounded fanin (respectively, fanout) as long as the fanout (respectively, fanin) is bounded. It suffices to redefine the weights as follows: Give each output a weight of 1, and let the weight of each internal node be twice the sum of the weights of its immediate descendants (then scale to make the initial weight equal to n/2). Using the same weight assignment, it is easy to obtain a bound of $O(\log w + d \log \delta)$ turns for playing this game on the class of graphs of width w, depth d, and $\min(\text{max fanin}, \text{max fanout}) \leq \delta$.

3. Improving Megiddo’s Technique

In general terms, Megiddo’s technique provides a way to search a partially ordered

space, possibly of superpolynomial size, without actually constructing the space.

Instead an implicit binary search of the space is made. We use a sorting algorithm

to guide this search (Megiddo’s technique is not restricted to sorting algorithms).

Typically, a sorted order corresponds to a region of the space being searched. So

resolving a comparison in the sort corresponds to reducing the size of the space in

which a solution is known to lie. As might be expected, a single comparison is

expensive; but surprisingly, for some problems, several comparisons can be batched

relatively cheaply. This leads Megiddo to use a parallel sorting algorithm to guide

the search, for in the searching algorithm he can batch the comparisons that are

performed simultaneously in the parallel algorithm.

More precisely, suppose that we have a fast parallel algorithm for one problem,

problem A say, which is used to construct a fast serial algorithm for a second

problem, problem B say. Further, suppose that problem A is sorting and problem

B has the following unusual features.

(1) Problem B can be solved by “sorting” but each “comparison” is expensive,

that is, it takes time C(n) rather than time O(1). Typically C(n) is O(n) or

O(n log n).

(2) (The batching rule.) If we consider m of these “comparisons,” $C_1, \ldots, C_m$, we can order the comparisons, $C_{\pi(1)} \leq \cdots \leq C_{\pi(m)}$, in the following sense. (We think of a comparison C as being the question “Is $c_1 < c_2$?,” which has answer either “yes” or “no.”) An answer of “yes” to $C_{\pi(j)}$ forces $C_{\pi(1)}, \ldots, C_{\pi(j-1)}$ to have answer “yes,” while an answer of “no” to $C_{\pi(j)}$ forces $C_{\pi(j+1)}, \ldots, C_{\pi(m)}$ to have answer “no.” Also we can determine the relative order of two comparisons fairly quickly, typically in time O(1).

We give an example to illustrate this definition (this example was previously given in [9]). This example is not intended to be solved by the methods described below; it is being given solely to illustrate the method. Suppose we are given n increasing, linear functions of x: $f_i(x) = a_i x + b_i$, $a_i > 0$, $1 \leq i \leq n$. We define the median function $f(x) = \mathrm{median}[f_1(x), \ldots, f_n(x)]$. We note f(x) is an increasing function of x. The problem is to find the unique $x^*$ such that $f(x^*) = 0$. One (unorthodox) way of solving this problem is to find an i such that $f_i(x^*) = f(x^*)$ (without knowing $x^*$). Having found i it is then easy to deduce $x^*$, namely $x^* = -b_i / a_i$.

How do we find i? To do this we could compute an index i such that $f_i(x^*) = \mathrm{median}[f_1(x^*), \ldots, f_n(x^*)]$. One way of finding this index is to sort the (symbolic) values $f_1(x^*), \ldots, f_n(x^*)$, without knowing $x^*$, and then pick out the median: Its index is the i we seek.

So the question becomes: How do we sort the values $f_1(x^*), \ldots, f_n(x^*)$, without knowing $x^*$? We explain how to compare an arbitrary pair of these values, $f_1(x^*)$ and $f_2(x^*)$, without knowing $x^*$; then we can insert this comparison method as a subroutine into any standard sorting algorithm.

If $f_1(x)$ and $f_2(x)$ represent parallel lines, then either $f_1(x) > f_2(x)$, or $f_1(x) = f_2(x)$, or $f_1(x) < f_2(x)$, for all x, and in particular for $x = x^*$. So suppose $f_1$ and $f_2$ are not parallel; in fact, without loss of generality suppose $a_1 > a_2$. Then $f_1(x) = f_2(x)$ for some unique value $x = x_0$; so for $x > x_0$, $f_1(x) > f_2(x)$, while for $x < x_0$, $f_1(x) < f_2(x)$. We conclude that, if $x^* > x_0$, $f_1(x^*) > f_2(x^*)$, if $x^* = x_0$, $f_1(x^*) = f_2(x^*)$, and if $x^* < x_0$, $f_1(x^*) < f_2(x^*)$. Thus resolving a comparison reduces to determining whether $x^* > x_0$, $x^* < x_0$, or $x^* = x_0$. To do this, we simply evaluate $f(x_0)$ (in O(n) time). Since f(x) is an increasing function, we deduce that if $f(x_0) < 0$, $x^* > x_0$, if $f(x_0) = 0$, $x^* = x_0$, and if $f(x_0) > 0$, $x^* < x_0$. Thus in O(n) time (= O(C(n)) time) we can resolve a comparison.
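The comparison procedure just described can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's algorithm as such: the function names (`median_value`, `compare_at_root`, `find_root`) are hypothetical, the lines are given as integer pairs (a, b) with a > 0, n is assumed odd so the median is unambiguous, and the median is evaluated by sorting for brevity where the text assumes an O(n) evaluation.

```python
from fractions import Fraction
from functools import cmp_to_key


def sign(v):
    return (v > 0) - (v < 0)


def median_value(lines, x):
    """f(x) = median of a_i * x + b_i (sorting for brevity, not O(n) selection)."""
    values = sorted(a * x + b for a, b in lines)
    return values[len(values) // 2]


def compare_at_root(lines, i, j):
    """Sign of f_i(x*) - f_j(x*), computed without knowing x*."""
    (a1, b1), (a2, b2) = lines[i], lines[j]
    if a1 == a2:                          # parallel lines: same order for every x
        return sign(b1 - b2)
    x0 = Fraction(b2 - b1, a1 - a2)       # unique crossing point of the two lines
    # f is increasing, so the sign of f(x0) tells on which side of x0 the root lies.
    side = -sign(median_value(lines, x0))  # +1 if x* > x0, -1 if x* < x0, 0 if equal
    return side * sign(a1 - a2)


def find_root(lines):
    """Sort the symbolic values f_i(x*), take the median index i, return -b_i / a_i."""
    order = sorted(range(len(lines)),
                   key=cmp_to_key(lambda i, j: compare_at_root(lines, i, j)))
    a, b = lines[order[len(lines) // 2]]
    return Fraction(-b, a)


lines = [(1, -3), (2, 1), (1, 0)]   # f_1 = x - 3, f_2 = 2x + 1, f_3 = x
x_star = find_root(lines)
print(x_star, median_value(lines, x_star))
```

Each call to `compare_at_root` costs one evaluation of the median function, which is the expensive "comparison" of cost C(n) in the text; a standard sort therefore makes O(n log n) such evaluations.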

In addition, we show that comparisons can be ordered. Suppose we have comparisons $C_i$ of the form “Is $f_{i_1}(x^*) < f_{i_2}(x^*)$?”, where $a_{i_1} < a_{i_2}$, $1 \leq i \leq m$, so that the answer to $C_i$ is “yes” exactly when $x^*$ lies above the crossing point of the two lines. Each comparison determines a value $x_i$, for which determining whether $x^* > x_i$, $x^* = x_i$, or $x^* < x_i$ resolves the comparison. We can order these values: $x_{\pi(1)} \leq \cdots \leq x_{\pi(m)}$. (We give the comparisons the corresponding order $C_{\pi(1)} \leq \cdots \leq C_{\pi(m)}$.) We note that if $x^* > x_{\pi(j)}$, then $x^* > x_{\pi(i)}$, for $i \leq j$, whereas if $x^* < x_{\pi(j)}$, then $x^* < x_{\pi(i)}$, for $i \geq j$. Hence by obtaining the answer to $C_{\pi(j)}$, we also resolve either $C_{\pi(1)}, \ldots, C_{\pi(j-1)}$ or $C_{\pi(j+1)}, \ldots, C_{\pi(m)}$. More precisely, if $f_{\pi(j)_1}(x^*) < f_{\pi(j)_2}(x^*)$ (answer “yes”), then $f_{\pi(i)_1}(x^*) < f_{\pi(i)_2}(x^*)$ for $i < j$ (answer “yes”), whereas if $f_{\pi(j)_1}(x^*) > f_{\pi(j)_2}(x^*)$ (answer “no”), then $f_{\pi(i)_1}(x^*) > f_{\pi(i)_2}(x^*)$ for $i > j$ (answer “no”). Finally, we observe that the relative order of two comparisons can be determined in O(1) time.
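The ordering property is what allows a batch of comparisons to be resolved with few expensive evaluations. The sketch below illustrates this by repeatedly querying the median of the unresolved critical values. It is an assumption-laden illustration: `resolve_batch` and `side_of` are hypothetical names, `side_of(x0)` stands for the expensive test returning the sign of x* - x0, and the median is located by sorting rather than by linear-time selection.

```python
def resolve_batch(critical, side_of):
    """Resolve every comparison in a batch.

    critical[i] is the critical value x_i of comparison i; side_of(x0) is the
    expensive oracle returning +1, 0 or -1 for x* > x0, x* = x0, x* < x0.
    Returns (sides, calls) where sides[i] = sign(x* - x_i).
    """
    sides = [None] * len(critical)
    unresolved = list(range(len(critical)))
    calls = 0
    while unresolved:
        unresolved.sort(key=lambda i: critical[i])
        pivot = critical[unresolved[len(unresolved) // 2]]   # median critical value
        s = side_of(pivot)
        calls += 1
        remaining = []
        for i in unresolved:
            x = critical[i]
            if x == pivot:
                sides[i] = s
            elif x < pivot and s >= 0:    # x* >= pivot > x_i
                sides[i] = 1
            elif x > pivot and s <= 0:    # x* <= pivot < x_i
                sides[i] = -1
            else:                         # still undetermined
                remaining.append(i)
        unresolved = remaining
    return sides, calls


x_star = 3.5                              # hidden from resolve_batch
critical = list(range(8))
sides, calls = resolve_batch(critical, lambda x0: (x_star > x0) - (x_star < x0))
print(sides, calls)
```

Each oracle call settles at least half of the unresolved comparisons, so a batch of m comparisons needs at most about log m + 1 calls.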

Megiddo does not describe his technique as applying to the type of problem formalized above, for his technique is more general, and in fact so is our improvement. Nonetheless, we use this formulation both for simplicity and because many of the problems we consider have this form. Next we describe how Megiddo solves problem B and then we explain our improvement.

Suppose we were to use a standard efficient sorting algorithm (running time O(n log n)) to solve B. Then we would obtain an algorithm with running time O(n log n · C(n)). Megiddo’s idea is to use a parallel sorting algorithm using P(n) processors and parallel time T(n). At each time step of the parallel algorithm, we have up to P(n) comparisons to resolve. Instead of evaluating them one by one, we solve the median one. This immediately resolves half the comparisons. We repeat log(P(n)) times (= O(log n) typically) and we thereby resolve all P(n) comparisons. So we achieve a running time of O(T(n)C(n) log n) plus overheads for running the parallel computation and finding medians of sets of comparisons. The overheads for running the parallel algorithm are O(P(n)T(n)). In fact, as Megiddo observed, we can use a parallel algorithm which runs in time T(n) in Valiant’s model [16] so long as the overheads can be performed efficiently in a serial simulation. To find the median comparison, we use the fast median algorithm [1, pp. 97-99], running in time O(P(n)), assuming ordering two comparisons takes time O(1). Since each time we halve the size of the set of unresolved comparisons, the time taken to find all log(P(n)) medians that we need is also O(P(n)). So over T(n) parallel time steps we take time O(P(n)T(n)) to find medians.

Thus the total running time of the algorithm for problem B is O(P(n)T(n) + T(n)C(n) log n). Typically, P(n) = O(n log n), T(n) = O(log n) [13], or P(n) = O(n), T(n) = O(log n) [2]. (In the latter case the constant is very large.) When C(n) = O(n) (O(n log n), respectively) each of these sorting algorithms gives an algorithm for problem B running in time $O(n \log^2 n)$ ($O(n \log^3 n)$, respectively).

Our improvement is to trim a factor of log n from these running times. We use the network of [2]; instead of performing the comparisons as described above, we play the game described in Section 2. We have to provide an adversary, which we do as follows. When the adversary is required to resolve a weighted half of the comparisons, we resolve the weighted median comparison, which by (2) above immediately resolves a weighted half of the comparisons. (Finding a weighted median of n items takes O(n) time [15].) It is easy to see that the time taken to play the game, apart from the comparisons, is O(n log n). (The AKS network can be built in deterministic time O(n log n) [2, 6]; the constant is rather large, however.) Since the depth of the AKS network [2] is O(log n), we need perform only O(log n) comparisons. So we have a running time of O(C(n) log n + n log n); for C(n) = O(n) this is O(n log n), and for C(n) = O(n log n) it is $O(n \log^2 n)$.
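The adversary's move can be illustrated with a short sketch of the weighted-median step. The names `weighted_median` and `resolved_weight` are hypothetical, and the median is found by sorting for brevity, whereas the text relies on a linear-time weighted-median algorithm [15]. Each item pairs a comparison's critical value with the weight the sorter assigned to it.

```python
from fractions import Fraction


def weighted_median(items):
    """items: list of (critical_value, weight) with positive weights.

    Returns a value v such that the items with value <= v and the items with
    value >= v each carry at least half of the total weight.
    """
    total = sum(w for _, w in items)
    running = 0
    for value, w in sorted(items):
        running += w
        if 2 * running >= total:
            return value


def resolved_weight(items, v, side):
    """Weight settled by learning side = sign(x* - v) at the value v."""
    if side > 0:      # x* > v settles every comparison with critical value <= v
        return sum(w for x, w in items if x <= v)
    if side < 0:      # x* < v settles every comparison with critical value >= v
        return sum(w for x, w in items if x >= v)
    return sum(w for _, w in items)   # x* = v settles everything


items = [(5, Fraction(1)), (2, Fraction(1, 4)), (9, Fraction(1, 4)),
         (7, Fraction(1, 16)), (1, Fraction(1, 16))]
v = weighted_median(items)
total = sum(w for _, w in items)
print(v, [resolved_weight(items, v, s) for s in (-1, 0, 1)])
```

Whichever way the single expensive evaluation at v comes out, at least half of the active weight is resolved, which is exactly what move (2) of the game demands of the adversary.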

In several of the applications problem A is not sorting; instead it is the problem of finding the minimum or of finding the median. Here Megiddo would use algorithms having O(log log n) [15a, 16] and $O((\log \log n)^2)$ [3] parallel steps, respectively, and using O(n/log log n) and O(n) processors, respectively. These yield running times of O(C(n) log log n log n + n) and $O(C(n)(\log \log n)^2 \log n + n(\log \log n)^2)$, respectively, for problem B. Instead, by using a sorting algorithm we achieve a running time of O(C(n) log n + n log n) for problem B in both cases. (Note: We are making no assumptions about the size of C(n), though for all problems considered so far $C(n) \geq O(n)$.)

In practice another approach can be taken. There are probabilistic parallel algorithms, running in constant time on O(n) processors, for finding the minimum and the median [14]. Using these, and applying Megiddo’s technique, we would solve problem B in O(C(n) log n) probabilistic time. The constant is much smaller than the one for the algorithm described in the previous paragraph, and in addition the algorithm is considerably simpler. (I am indebted to Megiddo for drawing this to my attention; I shall refer to this as Megiddo’s probabilistic improvement.)

Remark.

At this point we can explain the title. By contrast with Megiddo’s

technique, our solution initially slows down the network, in that some comparisons


References

The Design and Analysis of Computer Algorithms

Linear-Time Algorithms for Linear Programming in $R^3$ and Related Problems

Applying Parallel Computation Algorithms in the Design of Serial Algorithms

An O(n log n) Sorting Network
