
Slowing down sorting networks to obtain faster sorting algorithms

Richard Cole
01 Jan 1987, Vol. 34, Iss. 1, pp. 200-208
Slowing Down Sorting Networks to Obtain
Faster Sorting Algorithms
RICHARD COLE
New York University, New York, New York
Abstract. Megiddo introduced a technique for using a parallel algorithm for one problem to construct
an efficient serial algorithm for a second problem. This paper provides a general method that trims a
factor of O(log n) time (or more) for many applications of this technique.
Categories and Subject Descriptors: F.1.2 [Computation by Abstract Devices]: Modes of Computation-
parallelism; relations among modes; F.2.2 [Analysis of Algorithms and Problem Complexity]:
Nonnumerical Algorithms and Problems-computations on discrete structures; geometrical problems and
computations; sequencing and scheduling; sorting and searching; G.2.1 [Discrete Mathematics]:
Combinatorics-combinatorial algorithms; G.2.2 [Discrete Mathematics]: Graph Theory-network
problems; path and circuit problems; trees
General Terms: Algorithms, Theory
Additional Key Words and Phrases: Algorithms on trees, ham sandwich theorem, min-ratio cycle,
parallel algorithms, scheduling, sorting network, spanning tree
1. Introduction
We solve a somewhat unusual sorting problem optimally; it is described in Section
2. The sorting problem was motivated by and derived from its application. The
application is an improvement of Megiddo’s ingenious technique [9], which uses
an efficient parallel algorithm for one problem to produce an efficient serial
algorithm for a second problem. However, Megiddo’s technique is more general
than our improvement. That is, there are problems to which Megiddo’s technique
can be applied, but to which our improvement cannot be applied. Some of the
ideas involved in Megiddo’s technique were first presented in [8]. The general idea
of Megiddo’s technique is as follows: Suppose a certain problem A is solved in T_p
time units on a P-processor machine. Also suppose a serial algorithm for problem
A running in time T_s could be applied to designing a serial algorithm for problem
B with a running time of O(C T_s), where essentially the algorithm for problem B
carries out each step of the algorithm for A, taking time C per step. By using the
parallel algorithm for A as a serial algorithm, we would obtain a serial algorithm
for B running in time O(C T_p P). Megiddo showed that in many cases problem B
can actually be solved in time O(C T_p log P), if P is not too large. Our contribution
is to reduce this time to O(C T_p) in many cases. We describe Megiddo’s technique
and the improvement in Section 3. Using the improvement, we trim the running
times of several algorithms, which we describe in Section 4.
Remark. Although our technique trims the running times of several algorithms,
it is at the expense of using the Ajtai-Komlós-Szemerédi (AKS) sorting network
[2], which involves enormous constants.
This work was supported in part by the National Science Foundation under grant DCR 84-01633 and
by an IBM Faculty Development Award.
Author’s address: Courant Institute of Mathematical Sciences, New York University, 251 Mercer Street,
New York, New York 10012.
Permission to copy without fee all or part of this material is granted provided that the copies are not
made or distributed for direct commercial advantage, the ACM copyright notice and the title of the
publication and its date appear, and notice is given that copying is by permission of the Association for
Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.
© 1987 ACM 0004-5411/87/0100-0200 $00.75
Journal of the Association for Computing Machinery, Vol. 34, No. 1, January 1987, pp. 200-208.
2. The Sorting Problem
We describe the sorting problem as a game. There are two players: the sorter and
the adversary. The game is played on a sorting network for sorting n items; the
network has width n/2 and depth f(n). The idea of the game is to carry out a sort
on the network. The game is played in turns. In a turn, first the sorter requests the
adversary to resolve certain comparisons (i.e., to determine which of two inputs is
the larger); then the adversary resolves some of these comparisons (we are more
specific below). By “sorting on the network” we mean that the sorter must obtain
the result of exactly those comparisons that arise when the network is used, and
no others (except those that can be deduced by transitivity); also, the sorter must
follow the ordering of the comparisons created by the network (that is, if an output
of comparator D is an input to comparator C, then the result of the comparison at
D must be known before the comparison at C is attempted).
Before giving the precise rules for the game, we need a few definitions. We define
an input to a comparator C to be known, either if the input is a circuit input (i.e.,
the input wire for the input to comparator C is an input wire for the circuit), or if
the input was the output of some other comparator D and the comparison at D
has been resolved. A comparator is defined to be active if both its inputs are known
and the order of the inputs has not yet been determined; a comparator is inactive
if it is not active.
Now, we define the game precisely. The two players alternate their moves, which
are as follows.
(1) The sorter assigns weights to all the active comparators. Let the sum of the
weights assigned be W (the active weight). Let C be the comparison corresponding
to comparator c. If c is assigned weight w, we consider C to have been
assigned weight w also.
(2) The adversary is obliged to resolve sufficiently many of the weighted comparisons
so that the sum of the weights of the resolved comparisons is at least W/2.
The game ends when the sort is complete. The aim of the sorter is to end the
game quickly. A turn consists of one move of each player; we show the sorter can
end the game in O(log n + f(n)) turns. The sorter’s strategy is to assign weights to
the comparators according to the following rule. An active comparator at depth j
in the network is given weight 4^(-j). We prove the following invariant.
LEMMA 1. At the start of the (k + 1)st turn the active weight is bounded by
(3/4)^k · n/2, for k ≥ 0.
PROOF. We prove the result by induction on k. At the start of the first turn
there are n/2 active comparators at depth 0, and all other comparators are inactive.
So for k = 0 the result holds. To prove the inductive step, it is sufficient to show
that at each turn the active weight is reduced by at least one quarter. We now show
this.

Consider an active comparator c of weight w, and suppose the corresponding
comparison C is resolved. Then c ceases to be active, and up to two comparators,
each of weight w/4, may become active. So the resolution of C reduces the active
weight by at least w/2. Let the active weight be W. In one turn, we are guaranteed
that the comparisons resolved by the adversary have combined weight at least
W/2. Thus, in one turn, the active weight is reduced from W to at most 3W/4. □
We now show that this process terminates reasonably quickly.
LEMMA 2. For k ≥ 5(j + (1/2) log n), during the (k + 1)st turn there are no active
comparators at depth j.
PROOF. At the start of the (k + 1)st turn the total active weight W is bounded by
(3/4)^(5(j + (1/2) log n)) · n/2 (by Lemma 1). We note (3/4)^5 < 1/4. So
W < (1/4)^(j + (1/2) log n) · n/2 = (1/4)^j · 1/2. But an active comparator at depth j
has weight (1/4)^j. So there is no such comparator. □
COROLLARY. The game ends after O(f(n) + log n) turns.
PROOF. After 5(f(n) + (1/2) log n) turns there are no active comparators (by
Lemma 2). That is, all the comparisons are resolved, so the sort is completed.
Hence the game ends after O(f(n) + log n) turns. □
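The game and the two lemmas can be checked experimentally. The following is a minimal Python sketch, not the construction used in the paper: it assumes a bitonic sorting network (chosen only because every layer has exactly n/2 comparators, so depth and layer index coincide) instead of the AKS network, and it assumes one particular adversary that resolves the heaviest active comparisons until their weight reaches W/2. The names `bitonic_layers` and `play_game` are illustrative.

```python
from fractions import Fraction
from math import log2


def bitonic_layers(n):
    """Layers of a bitonic sorting network on n = 2^m wires (n/2 comparators each)."""
    layers, k = [], 2
    while k <= n:
        j = k // 2
        while j >= 1:
            layers.append([(i, i ^ j) for i in range(n) if i ^ j > i])
            j //= 2
        k *= 2
    return layers


def play_game(n):
    """Play the sorter/adversary game; return (number of turns, network depth)."""
    layers = bitonic_layers(n)
    depth = len(layers)
    # partner[d][w] = the other wire of the depth-d comparator that touches wire w
    partner = []
    for layer in layers:
        p = {}
        for a, b in layer:
            p[a], p[b] = b, a
        partner.append(p)
    progress = [0] * n  # number of resolved comparators on each wire
    active = {(0, a, b) for a, b in layers[0]}
    turns = 0
    while active:
        # Sorter's move: an active comparator at depth j gets weight 4^(-j).
        weight = {c: Fraction(1, 4 ** c[0]) for c in active}
        W = sum(weight.values())
        # Lemma 1: the active weight at the start of turn k+1 is at most (3/4)^k * n/2.
        assert W <= Fraction(3, 4) ** turns * Fraction(n, 2)
        # Adversary's move (an assumed strategy): heaviest first, until weight >= W/2.
        resolved, got = [], Fraction(0)
        for c in sorted(active, key=lambda c: weight[c], reverse=True):
            if 2 * got >= W:
                break
            resolved.append(c)
            got += weight[c]
        for d, a, b in resolved:
            active.remove((d, a, b))
            progress[a] += 1
            progress[b] += 1
        # A comparator becomes active once both of its inputs are known.
        for d, a, b in resolved:
            if d + 1 < depth:
                for w in (a, b):
                    v = partner[d + 1][w]
                    if progress[v] == d + 1:
                        active.add((d + 1, min(w, v), max(w, v)))
        turns += 1
    assert all(p == depth for p in progress)  # every comparison was resolved
    return turns, depth


turns, depth = play_game(16)
print(turns, depth, 5 * (depth + 0.5 * log2(16)))
```

Exact rational weights (`Fraction`) keep the invariant check free of rounding error. Any other adversary that resolves at least half of the active weight could be substituted without affecting the bound.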
Remark. It is of interest to play this game on arbitrary directed acyclic graphs.
Lemma 1 and the corollary hold even in the case of unbounded fanin (respectively,
fanout) as long as the fanout (respectively, fanin) is bounded. It suffices to redefine
the weights as follows: Give each output a weight of 1, and let the weight of each
internal node be twice the sum of the weights of its immediate descendants (then
scale to make the initial weight equal to n/2). Using the same weight assignment,
it is easy to obtain a bound of O(log w + d log δ) turns for playing this game on
the class of graphs of width w, depth d, and min(max fanin, max fanout) ≤ δ.
3. Improving Megiddo’s Technique
In general terms, Megiddo’s technique provides a way to search a partially ordered
space, possibly of superpolynomial size, without actually constructing the space.
Instead an implicit binary search of the space is made. We use a sorting algorithm
to guide this search (Megiddo’s technique is not restricted to sorting algorithms).
Typically, a sorted order corresponds to a region of the space being searched. So
resolving a comparison in the sort corresponds to reducing the size of the space in
which a solution is known to lie. As might be expected, a single comparison is
expensive; but surprisingly, for some problems, several comparisons can be batched
relatively cheaply. This leads Megiddo to use a parallel sorting algorithm to guide
the search, for in the searching algorithm he can batch the comparisons that are
performed simultaneously in the parallel algorithm.
More precisely, suppose that we have a fast parallel algorithm for one problem,
problem A say, which is used to construct a fast serial algorithm for a second
problem, problem B say. Further, suppose that problem A is sorting and problem
B has the following unusual features.
(1) Problem B can be solved by “sorting” but each “comparison” is expensive,
that is, it takes time C(n) rather than time O(1). Typically C(n) is O(n) or
O(n log n).
(2) (The batching rule.) If we consider m of these “comparisons,” C_1, ..., C_m, we
can order the comparisons, C_π(1) ≤ ... ≤ C_π(m), in the following sense. (We
think of a comparison C as being the question “Is c_1 < c_2?,” which has answer
either “yes” or “no.”) An answer of “yes” to C_π(j) forces C_π(1), ..., C_π(j-1) to
have answer “yes,” while an answer of “no” to C_π(j) forces C_π(j+1), ..., C_π(m)
to have answer “no.” Also we can determine the relative order of two comparisons
fairly quickly, typically in time O(1).
We give an example to illustrate this definition (this example was previously
given in [9]). This example is not intended to be solved by the methods described
below; it is being given solely to illustrate the method. Suppose we are given n
increasing, linear functions of x: f_i(x) = a_i x + b_i, a_i > 0, 1 ≤ i ≤ n. We define the
median function f(x) = median[f_1(x), ..., f_n(x)]. We note f(x) is an increasing
function of x. The problem is to find the unique x* such that f(x*) = 0. One
(unorthodox) way of solving this problem is to find an i such that f_i(x*) = f(x*)
(without knowing x*). Having found i, it is then easy to deduce x*.
How do we find i? To do this we could compute an index i such that f_i(x*) =
median[f_1(x*), ..., f_n(x*)]. One way of finding this index is to sort the (symbolic)
values f_1(x*), ..., f_n(x*), without knowing x*, and then pick out the median: Its
index is the i we seek.
So the question becomes: How do we sort the values f_1(x*), ..., f_n(x*), without
knowing x*? We explain how to compare an arbitrary pair of these values, f_1(x*)
and f_2(x*), without knowing x*; then we can insert this comparison method as a
subroutine into any standard sorting algorithm.
If f_1(x) and f_2(x) represent parallel lines, then either f_1(x) > f_2(x), or f_1(x) = f_2(x),
or f_1(x) < f_2(x), for all x, and in particular for x = x*. So suppose f_1 and f_2 are not
parallel; in fact, without loss of generality suppose a_1 > a_2. Then f_1(x) = f_2(x) for
some unique value x = x_0; so for x > x_0, f_1(x) > f_2(x), while for x < x_0, f_1(x) <
f_2(x). We conclude that, if x* > x_0, f_1(x*) > f_2(x*); if x* = x_0, f_1(x*) = f_2(x*); and
if x* < x_0, f_1(x*) < f_2(x*). Thus resolving a comparison reduces to determining
whether x* > x_0, x* < x_0, or x* = x_0. To do this, we simply evaluate f(x_0) (in O(n)
time). Since f(x) is an increasing function, we deduce that if f(x_0) < 0, x* > x_0; if
f(x_0) = 0, x* = x_0; and if f(x_0) > 0, x* < x_0. Thus in O(n) time (= O(C(n)) time)
we can resolve a comparison.
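The comparison just described can be plugged into an ordinary serial sort. The sketch below is only an illustration under stated assumptions: n is odd (so the median is the middle element), the coefficients are integers or `Fraction` values, Python's built-in sort stands in for "any standard sorting algorithm", and f(x_0) is evaluated by sorting rather than by a linear-time selection. The function name `find_root_of_median` is hypothetical.

```python
from fractions import Fraction
from functools import cmp_to_key


def find_root_of_median(lines):
    """lines: list of (a_i, b_i) with a_i > 0 and an odd number of entries.

    Returns the unique x* with median_i(a_i * x* + b_i) = 0, found by sorting
    the symbolic values f_i(x*) without knowing x* in advance.
    """
    n = len(lines)

    def f(x):
        # Median function f(x); sorting is used here for brevity.
        return sorted(a * x + b for a, b in lines)[n // 2]

    def compare(i, j):
        """Sign of f_i(x*) - f_j(x*)."""
        (a1, b1), (a2, b2) = lines[i], lines[j]
        if a1 == a2:  # parallel lines: the order is the same for every x
            return (b1 > b2) - (b1 < b2)
        x0 = Fraction(b2 - b1) / (a1 - a2)  # the point where f_i(x0) = f_j(x0)
        v = f(x0)
        # f is increasing, so f(x0) < 0 means x* > x0 and f(x0) > 0 means x* < x0.
        side = (v < 0) - (v > 0)  # sign of x* - x0
        # f_i(x) - f_j(x) = (a1 - a2) * (x - x0)
        return side if a1 > a2 else -side

    order = sorted(range(n), key=cmp_to_key(compare))
    a, b = lines[order[n // 2]]  # the line that attains the median at x*
    return Fraction(-b) / a      # solve a * x + b = 0


# f_1 = x - 3, f_2 = 2x - 2, f_3 = x + 1
print(find_root_of_median([(1, -3), (2, -2), (1, 1)]))
```

Each call to `compare` costs one evaluation of f, which is the expensive step that the batching rule is designed to economize.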
In addition, we show that comparisons can be ordered. Suppose we have m
comparisons C_i of the form “Is f_{i_1}(x*) < f_{i_2}(x*)?”, where a_{i_1} < a_{i_2}, 1 ≤ i ≤ m. Each
comparison determines a value x_i, for which determining whether x* > x_i, x* =
x_i, or x* < x_i resolves the comparison. We can order these values: x_π(1) ≤ ... ≤
x_π(m). (We give the comparisons the corresponding order C_π(1) ≤ ... ≤ C_π(m).) We
note that if x* > x_π(j), then x* > x_π(i), for i ≤ j, whereas if x* < x_π(j), then x* <
x_π(i), for i ≥ j. Hence by obtaining the answer to C_π(j), we also resolve either
C_π(1), ..., C_π(j-1) or C_π(j+1), ..., C_π(m). More precisely, if the answer to C_π(j)
is “yes” (that is, x* > x_π(j)), then the answer to C_π(i) is “yes” for i < j, whereas if
the answer to C_π(j) is “no” (that is, x* < x_π(j)), then the answer to C_π(i) is “no”
for i > j. Finally, we observe that the relative order of two comparisons can be
determined in O(1) time.
Megiddo does not describe his technique as applying to the type of problem
formalized above, for his technique is more general, and in fact so is our improvement.
Nonetheless, we use this formulation both for simplicity and because many
of the problems we consider have this form. Next we describe how Megiddo solves
problem B, and then we explain our improvement.
Suppose we were to use a standard efficient sorting algorithm (running time
O(n log n)) to solve B. Then we would obtain an algorithm with running time
O(n log n · C(n)). Megiddo’s idea is to use a parallel sorting algorithm using P(n)
processors and parallel time T(n). At each time step of the parallel algorithm, we
have up to P(n) comparisons to resolve. Instead of evaluating them one by one, we
solve the median one. This immediately resolves half the comparisons. We repeat
log(P(n)) times (= O(log n) typically) and we thereby resolve all P(n) comparisons.
So we achieve a running time of O(T(n)C(n) log n) plus overheads for running the
parallel computation and finding medians of sets of comparisons. The overheads
for running the parallel algorithm are O(P(n)T(n)). In fact, as Megiddo observed,
we can use a parallel algorithm which runs in time T(n) in Valiant’s model [16] so
long as the overheads can be performed efficiently in a serial simulation. To find
the median comparison, we use the fast median algorithm [1, pp. 97-99], running
in time O(P(n)), assuming ordering two comparisons takes time O(1). Since each
time we halve the size of the set of unresolved comparisons, the time taken to find
all log(P(n)) medians that we need is also O(P(n)). So over T(n) parallel time steps
we take time O(P(n)T(n)) to find medians.
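The repeated-median step for one parallel time step can be sketched as follows. This is a simplified illustration, not Megiddo's implementation: each comparison is represented only by its critical value x_i, the expensive test is modeled by a caller-supplied `oracle(x0)` (an assumed interface returning the sign of x* - x0), and a single initial sort replaces the linear-time median algorithm of [1]. The name `resolve_batch` is hypothetical.

```python
def resolve_batch(xs, oracle):
    """Resolve every comparison "x* versus xs[i]" with O(log m) oracle calls.

    oracle(x0) returns +1 if x* > x0, -1 if x* < x0, and 0 if x* == x0.
    Returns (signs, calls), where signs[i] is the sign of x* - xs[i].
    """
    signs = [None] * len(xs)
    pending = sorted(range(len(xs)), key=lambda i: xs[i])  # indices by critical value
    calls = 0
    while pending:
        mid = len(pending) // 2
        x0 = xs[pending[mid]]
        s = oracle(x0)  # the one expensive evaluation of this round
        calls += 1
        if s > 0:    # x* > x0, hence x* > every critical value up to the median
            for i in pending[:mid + 1]:
                signs[i] = 1
            pending = pending[mid + 1:]
        elif s < 0:  # x* < x0, hence x* < every critical value from the median on
            for i in pending[mid:]:
                signs[i] = -1
            pending = pending[:mid]
        else:        # x* == x0: all remaining comparisons follow directly
            for i in pending:
                signs[i] = (x0 > xs[i]) - (x0 < xs[i])
            pending = []
    return signs, calls


x_star = 4.5  # hidden from resolve_batch; only the oracle consults it
signs, calls = resolve_batch([5, 1, 9, 3, 7, 2, 8],
                             lambda x0: (x_star > x0) - (x_star < x0))
print(signs, calls)
```

Each round at least halves the pending set, so m comparisons need at most about log m + 1 oracle calls instead of m.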
Thus the total running time of the algorithm for problem B is O(P(n)T(n) +
T(n)C(n) log n). Typically, P(n) = O(n log n), T(n) = O(log n) [13], or
P(n) = O(n), T(n) = O(log n) [2]. (In the latter case the constant is very large.) When
C(n) = O(n) (O(n log n), respectively) each of these sorting algorithms gives an algorithm
for problem B running in time O(n log^2 n) (O(n log^3 n), respectively).
Our improvement is to trim a factor of log n from these running times. We use
the network of [2]; instead of performing the comparisons as described above, we
play the game described in Section 2. We have to provide an adversary, which we
do as follows. When the adversary is required to resolve a weighted half of the
comparisons, we resolve the weighted median comparison, which by (2) above
immediately resolves a weighted half of the comparisons. (Finding a weighted
median of n items takes O(n) time [15].) It is easy to see that the time taken to
play the game, apart from the comparisons, is O(n log n). (The AKS network can
be built in deterministic time O(n log n) [2, 6]; the constant is rather large,
however.) Since the depth of the AKS network [2] is O(log n), we need perform
only O(log n) comparisons. So we have a running time of O(C(n) log n + n log n);
for C(n) = O(n) this is O(n log n), and for C(n) = O(n log n) it is O(n log^2 n).
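The adversary's choice of the weighted median comparison can be illustrated with a short sketch. It is a simplification: items are (critical value, weight) pairs, and the weighted median is found by sorting in O(m log m) time rather than by the linear-time algorithm cited as [15]. The name `weighted_median` is an illustrative choice.

```python
from fractions import Fraction


def weighted_median(items):
    """items: list of (value, weight) pairs with positive weights.

    Returns a value x such that the items with value <= x and the items with
    value >= x each carry at least half of the total weight. Whichever way the
    comparison at x is answered, at least half of the weight is resolved.
    """
    ordered = sorted(items)
    total = sum(w for _, w in ordered)
    acc = 0
    for value, w in ordered:
        acc += w
        if 2 * acc >= total:
            return value


# Active comparisons at depths 0, 1, 1, 1 carry weights 1, 1/4, 1/4, 1/4.
items = [(1, Fraction(1)), (2, Fraction(1, 4)), (3, Fraction(1, 4)), (10, Fraction(1, 4))]
print(weighted_median(items))
```

With the weights 4^(-j) of Section 2, a single heavy comparison near the top of the network can outweigh many deep ones, which is why the median must be taken by weight and not by count.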
In several of the applications problem A is not sorting; instead it is the problem
of finding the minimum or of finding the median. Here Megiddo would use
algorithms having O(log log n) [15a, 16] and O((log log n)^2) [3] parallel steps,
respectively, and using O(n/log log n) and O(n) processors, respectively. These
yield running times of O(C(n) log log n log n + n) and
O(C(n)(log log n)^2 log n + n(log log n)^2), respectively, for problem B. Instead, by
using a sorting algorithm we achieve a running time of O(C(n) log n + n log n) for
problem B, in both cases. (Note: We are making no assumptions about the size of
C(n), though for all problems considered so far C(n) ≥ O(n).)
In practice another approach can be taken. There are probabilistic parallel
algorithms, running in constant time on O(n) processors, for finding the minimum
and the median [14]. Using these, and applying Megiddo’s technique, we would
solve problem B in O(C(n) log n + n) probabilistic time. The constant is much
smaller than the one for the algorithm described in the previous paragraph, and in
addition the algorithm is considerably simpler. (I am indebted to Megiddo for
drawing this to my attention; I shall refer to this as Megiddo’s probabilistic
improvement.)
Remark. At this point we can explain the title. By contrast with Megiddo’s
technique, our solution initially slows down the network, in that some comparisons

References
The Design and Analysis of Computer Algorithms. Book.
Linear-Time Algorithms for Linear Programming in R^3 and Related Problems. Journal article.
Applying Parallel Computation Algorithms in the Design of Serial Algorithms. Journal article.
An O(n log n) sorting network. Proceedings article.
Linear-time algorithms for linear programming in R^3 and related problems. Proceedings article.