
Approximate parallel scheduling. Part I: the basic technique with applications to optimal parallel list ranking in logarithmic time

Richard Cole, Uzi Vishkin
- 01 Feb 1988 - 
- Vol. 17, Iss: 1, pp 128-142
TLDR
This work defines a novel scheduling problem, which leads to the first optimal logarithmic time PRAM algorithm for list ranking, and shows how to apply these results to obtain improved PRAM upper bounds for a variety of problems on graphs.
Abstract
We define a novel scheduling problem; it is solved in parallel by repeated, rapid, approximate reschedulings. This leads to the first optimal logarithmic time PRAM algorithm for list ranking. Companion papers show how to apply these results to obtain improved PRAM upper bounds for a variety of problems on graphs, including the following: connectivity, biconnectivity, Euler tour and $st$-numbering, and a number of problems on trees.


Approximate parallel scheduling. Part I: The basic technique with applications to optimal parallel list ranking in logarithmic time

by
Richard Cole†
Uzi Vishkin‡

Ultracomputer Note #110
Computer Science Department Technical Report #244
October, 1986

Ultracomputer Research Laboratory
New York University
Courant Institute of Mathematical Sciences
Division of Computer Science
251 Mercer Street, New York, NY 10012


† This research was supported in part by NSF grant DCR-84-01633 and by an IBM Faculty Development Award.

‡ This research was supported in part by NSF grants NSF-DCR-8318874 and NSF-DCR-8413359, ONR grant N00014-85-K-0046, and by the Applied Mathematical Science subprogram of the Office of Energy Research, U.S. Department of Energy, under contract number DE-AC02-76ER03077.


ABSTRACT

We define a novel scheduling problem; it is solved in parallel by repeated, rapid, approximate reschedulings. This leads to the first optimal logarithmic time PRAM algorithm for list ranking. Companion papers show how to apply these results to obtain improved PRAM upper bounds for a variety of problems on graphs, including: connectivity, biconnectivity, minimum spanning tree, Euler tour and st-numbering, and a number of problems on trees.
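For context, the classical non-optimal list ranking algorithm that this paper improves upon is pointer jumping (Wyllie's algorithm), which runs in O(log n) time but performs O(n log n) total work. The sketch below is a sequential Python emulation of that baseline idea, not the paper's optimal algorithm:

```python
def list_rank(succ):
    """Rank the nodes of a linked list by pointer jumping (Wyllie's
    algorithm).  succ[i] is the successor of node i; the tail points
    to itself.  Returns dist[i] = number of links from i to the tail."""
    n = len(succ)
    dist = [0 if succ[i] == i else 1 for i in range(n)]
    nxt = list(succ)
    # O(log n) rounds; every node doubles its jump each round, so the
    # total work is O(n log n) -- a log factor short of optimal.
    for _ in range(n.bit_length()):
        # On an EREW PRAM all nodes update simultaneously; we emulate
        # that by reading old arrays and writing fresh ones.
        dist = [dist[i] + dist[nxt[i]] for i in range(n)]
        nxt = [nxt[nxt[i]] for i in range(n)]
    return dist

# list 0 -> 1 -> 2 -> 3 (node 3 is the tail)
print(list_rank([1, 2, 3, 3]))  # [3, 2, 1, 0]
```

The paper's contribution is to match this O(log n) running time while using only n/log n processors, i.e. optimal O(n) work.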
1. Introduction

The model of parallel computation used in this paper is a member of the parallel random access machine (PRAM) family. A PRAM employs p synchronous processors, all having access to a common memory. In this paper we use an exclusive-read exclusive-write (EREW) PRAM. The EREW PRAM does not allow simultaneous access by more than one processor to the same memory location for read or write purposes. See [Vi-83] for a survey of results concerning PRAMs.
Let Seq(n) be the fastest known worst-case running time of a sequential algorithm, where n is the length of the input for the problem at hand. Obviously, the best upper bound on the parallel time achievable using p processors, without improving the sequential result, is of the form O(Seq(n)/p). A parallel algorithm that achieves this running time is said to have optimal speed-up or, more simply, to be optimal. A primary goal in parallel computation is to design optimal algorithms that also run as fast as possible.
Most of the problems we consider can be solved by parallel algorithms that obey the following framework. Given an input of size n, the parallel algorithm employs a reducing procedure to produce a smaller instance of the same problem (of size ≤ n/2, say). The smaller problem is solved recursively, until this brings us below some threshold for the size of the problem. An alternative procedure is then used to complete the parallel algorithm. We refer the reader to [CV-86d], where this algorithmic technique, which is called accelerating cascades, is discussed.
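As a toy illustration of this framework (not taken from the paper), the sketch below sums an array: a reducing procedure halves the instance by pairing elements, recursion continues until the size drops below a threshold, and an alternative procedure finishes the job. All names and the threshold value are illustrative:

```python
def cascade_sum(xs, threshold=4):
    """Toy instance of the reduce-until-threshold framework: each call
    produces a smaller instance of the same summation problem."""
    if len(xs) <= threshold:
        return sum(xs)                      # alternative procedure
    # reducing procedure: pair up elements -> instance of size <= n/2
    smaller = [xs[i] + xs[i + 1] for i in range(0, len(xs) - 1, 2)]
    if len(xs) % 2:                         # odd leftover element
        smaller.append(xs[-1])
    return cascade_sum(smaller, threshold)

print(cascade_sum(list(range(10))))  # 45
```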
Typically, we need to reschedule the processors in order to apply the reducing procedure efficiently to the smaller sized problem. Suppose the input for a problem of size n is given in an array of size n. A natural approach is to compress the smaller problem instance into a smaller array, of size ≤ n/2. This is often done using a prefix sum algorithm (it takes O(log n) time on n/log n processors to compute the prefix sums for n inputs stored in an array). Thus if we need to reschedule the

Citations
Book ChapterDOI

Parallel algorithms for shared-memory machines

TL;DR: In this paper, the authors discuss parallel algorithms for shared-memory machines, covering the theoretical foundations of parallel algorithms and parallel architectures, and present a theoretical analysis of the appropriate logical organization of a massively parallel computer.
Proceedings ArticleDOI

Sorting in linear time

TL;DR: In this paper, it was shown that a unit-cost RAM with a word length of w bits can sort n integers in the range 0..2^w − 1 in O(n log log n) time, for arbitrary w ≥ log n, a significant improvement over the O(n √(log n)) bound achieved by the fusion trees of Fredman and Willard; provided that w ≥ (log n)^(2+ε) for some fixed ε > 0, the sorting can even be accomplished in linear expected time with a randomized algorithm.
Journal ArticleDOI

Models of machines and computation for mapping in multicomputers

TL;DR: In this article, the authors classified the mapping strategies for distributed computation across the computation resources of multiprocessor systems and assessed the relevance of a new result to a particular problem.
Proceedings ArticleDOI

Towards a theory of nearly constant time parallel algorithms

TL;DR: It is demonstrated that randomization is an extremely powerful tool for designing very fast and efficient parallel algorithms: a running time of O(lg* n) (nearly constant), with high probability, is achieved using n/lg* n (optimal speedup) processors for a wide range of fundamental problems.
Journal ArticleDOI

Planar separators and parallel polygon triangulation

TL;DR: The utility of such a separator decomposition is demonstrated by showing how it can be used in the design of a parallel algorithm for triangulating a simple polygon deterministically in O(log n) time using O(n/log n) processors on a CRCW PRAM.
Frequently Asked Questions (16)
Q1. What are the contributions in "Approximate parallel scheduling. part i: the basic technique with applications to optimal parallel list ranking in logarithmic time" ?

In this paper, a novel scheduling problem is defined and solved in parallel by repeated, rapid, approximate reschedulings; this leads to the first optimal logarithmic time PRAM algorithm for list ranking.

The main contributions of [CV-86a] were the deterministic coin tossing technique and a methodology for scheduling that used as few reschedulings as possible. 

One of the main contributions of this paper is to provide an algorithm for performing approximate rescheduling deterministically in O(1) time. 

The authors remark that the task scheduling problem will be solved by redistributing the tasks, while the processor scheduling problem appears to require the redistribution of processors. 

To ensure all the trees are complete and distinct, the authors need to partition T into complete subtrees (recall that T is the smallest tree associated with the collection; thus they are guaranteed that if they create distinct sized complete binary trees from T, then no two trees in the collection will have the same size, and they get a proper set of complete binary trees). 

Given an input of size n, the parallel algorithm employs a reducing procedure to produce a smaller instance of the same problem (of size ≤ n/2, say). 

Any synchronous parallel algorithm of time t that consists of a total of x elementary operations can be implemented by p processors in time ⌈x/p⌉ + t (Brent's theorem). 
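This bound can be checked on a small example. The sketch below (illustrative step counts, not from the paper) simulates the scheduling argument: at each parallel step of the ideal algorithm, the p available processors perform its x_i elementary operations in ⌈x_i/p⌉ rounds:

```python
from math import ceil

def brent_time(ops_per_step, p):
    # A step with x_i operations costs ceil(x_i / p) rounds on p processors;
    # the total is at most ceil(x/p) + t, where x = sum(x_i) and t = #steps.
    return sum(ceil(x / p) for x in ops_per_step)

ops = [8, 4, 2, 1]     # x = 15 operations spread over t = 4 steps
p = 3
print(brent_time(ops, p))            # 7 rounds
print(ceil(sum(ops) / p) + len(ops)) # the theorem's bound: 9
```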

The best upper bound on the parallel time achievable using p processors, without improving the sequential result, is of the form O(Seq(n)/p). 

The Euler tour technique on trees, which is given in [TV-85, Vi-85], consists of reducing a variety of tree functions into list ranking. 

Part 2 of this research (the paper [CV-86c]) shows how to apply the approximate scheduling method together with the new list ranking algorithm in order to derive improved PRAM upper bounds for a variety of problems on graphs, including: connectivity, biconnectivity, minimum spanning tree, Euler tour and st-numbering. 

Definition: A bipartite graph G = (V1, V2, E), with |V1| = |V2|, is a (d, ε, λ)-expander graph if for any subset U ⊆ V1 with |U| ≤ ε|V1|, the set N(U) of neighbors of vertices in U has size |N(U)| ≥ λ|U|, and G has vertex degree d. 

Since the size of the collections in each useful large set S_i lies in the range [2^i, 2^(i+1)), the authors conclude that the giving collections in S_i, counting multiplicities, have weight at least (1/4) wt(S_i). 

More precisely, let β be the number of leaves in the largest tree of objects in C_i (β = 2^i); if β ≥ 8d then the number of objects transferred is β/8d; otherwise, no objects are moved. 

The problem is to schedule the n tasks on an EREW PRAM of n/log n processors so that the tasks are completed in O(log n) time; it is solved in Section 4. 

Further, such an expander graph can be built in O(log |V1|) time using |V1|/log |V1| processors, for each fixed ε, as the authors show in the following remark. 

There must be a chain of length x ≤ log n + 1 nodes which ends at (and includes) u and satisfies the following: 1. The last x − 1 nodes of this chain are marked for removal simultaneously.