The power of parallel prefix

doi:10.1109/TC.1985.6312202

Journal Article•DOI•

Clyde P. Kruskal¹, Larry Rudolph², Marc Snir³•Institutions (3)

University of Illinois at Urbana–Champaign¹, Carnegie Mellon University², Hebrew University of Jerusalem³

01 Oct 1985-IEEE Transactions on Computers (IEEE)-Vol. 34, Iss: 10, pp 965-968

TL;DR: A deterministic parallel algorithm is presented for the prefix computation problem when the order of the elements is specified by a linked list. The study assumes the weakest PRAM model, where shared memory locations can only be exclusively read or written (the EREW model).

Abstract: The prefix computation problem is to compute all $n$ initial products $a_1 * \cdots * a_i$, $i = 1, \ldots, n$, of a set of $n$ elements, where $*$ is an associative operation. An $O\left(\frac{\log n}{\log(2n/p)} \cdot \frac{n}{p}\right)$ time deterministic parallel algorithm using $p \le n$ processors is presented to solve the prefix computation problem, when the order of the elements is specified by a linked list. For $p \le O(n^{1-\epsilon})$ ($\epsilon > 0$ any constant), this algorithm achieves linear speedup. Such optimal speedup was previously achieved only by probabilistic algorithms. This study assumes the weakest PRAM model, where shared memory locations can only be exclusively read or written (the EREW model).
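To make the problem statement concrete, the sketch below computes prefixes over a linked list in two ways: a sequential walk, and a round-by-round simulation of pointer jumping. Pointer jumping is a well-known textbook technique, not the algorithm of this paper; it performs O(n log n) operations in total, so it does not by itself give the linear speedup the abstract describes. The function names and the array-based list encoding (`succ[i]` is the index of the element after `i`, or `None` at the tail) are assumptions made for illustration.

```python
from typing import Callable, List, Optional, TypeVar

T = TypeVar("T")


def list_prefix_sequential(
    values: List[T], succ: List[Optional[int]], head: int, op: Callable[[T, T], T]
) -> List[T]:
    """Walk the list from head; out[i] is the product of all elements up to i."""
    out: List[Optional[T]] = [None] * len(values)
    node: Optional[int] = head
    acc = values[head]
    out[head] = acc
    node = succ[head]
    while node is not None:
        acc = op(acc, values[node])  # keep list order: op need not commute
        out[node] = acc
        node = succ[node]
    return out  # type: ignore[return-value]


def list_prefix_pointer_jumping(
    values: List[T], succ: List[Optional[int]], op: Callable[[T, T], T]
) -> List[T]:
    """Simulate synchronous pointer-jumping rounds (one 'processor' per element)."""
    n = len(values)
    # Build predecessor pointers from the successor pointers.
    pred: List[Optional[int]] = [None] * n
    for i, s in enumerate(succ):
        if s is not None:
            pred[s] = i
    val = list(values)
    while any(p is not None for p in pred):
        # Read from the old arrays, write to new ones, to mimic a parallel step.
        new_val, new_pred = list(val), list(pred)
        for i in range(n):
            p = pred[i]
            if p is not None:
                new_val[i] = op(val[p], val[i])  # predecessor's segment comes first
                new_pred[i] = pred[p]            # jump over the predecessor
        val, pred = new_val, new_pred
    return val


# Example: list order is 2 -> 0 -> 4 -> 1 -> 3, with string concatenation as the operation.
values = ["a", "b", "c", "d", "e"]
succ = [4, 3, 0, None, 1]
seq = list_prefix_sequential(values, succ, head=2, op=lambda x, y: x + y)
par = list_prefix_pointer_jumping(values, succ, op=lambda x, y: x + y)
```

String concatenation is used in the example because it is associative but not commutative, so any mistake in the order of operands would show up in the result.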

Citations

Journal Article•DOI•

Techniques for parallel manipulation of sparse matrices

Clyde P. Kruskal¹, Larry Rudolph², Marc Snir³•Institutions (3)

University of Maryland, College Park¹, Hebrew University of Jerusalem², IBM³

01 May 1989-Theoretical Computer Science

TL;DR: New techniques are presented for the manipulation of sparse matrices on parallel MIMD computers that consider the following problems: matrix addition, matrix multiplication, row and column permutation, matrix transpose, matrix vector multiplication, and Gaussian elimination.

26 citations

Cites background from "The power of parallel prefix"

...Conversely, a canonical representation can be converted into a full matricial form in time $O(m/p)$ if the matrix is already initialized to zero, and time $O(qr/p)$ otherwise....
[...]

Journal Article•DOI•

Towards a single model of efficient computation in real parallel machines

Pilar de la Torre¹, Clyde P. Kruskal²•Institutions (2)

University of New Hampshire¹, University of Maryland, College Park²

01 Sep 1992-Future Generation Computer Systems

TL;DR: This work proposes a model of parallel computation, the YPRAM, that allows general parallel algorithms to be designed for a wide class of parallel models, and shows that this model predicts, reasonably accurately, the actual known performances of several basic parallel models when solving these problems.

26 citations

Journal Article•DOI•

Optimal computation of prefix sums on a binary tree of processors

Henk Meijer¹, Selim G. Akl¹•Institutions (1)

Queen's University¹

01 Apr 1987-International Journal of Parallel Programming

TL;DR: This paper shows how the problem of computing all sums of the form $a_0 + a_1 + \cdots + a_i$, for $i = 0, 1, \ldots, n-1$, can be solved on a simple network, namely a binary tree of processors, and shows how to extend the solution to obtain an optimal-cost algorithm.

Abstract: Given $n$ numbers $a_0, a_1, \ldots, a_{n-1}$, it is required to compute all sums of the form $a_0 + a_1 + \cdots + a_i$, for $i = 0, 1, \ldots, n-1$. This problem arises in many applications and is trivial to solve sequentially in $O(n)$ time. Besides its practical importance, the problem gains an additional theoretical interest in parallel computation. A technique known as recursive doubling allows all sums to be computed in $O(\log n)$ time on a model of computation where $n$ processors communicate through an inverse perfect shuffle interconnection network. In this paper we show how the problem can be solved on a simple network, namely a binary tree of processors. In addition, we show how to extend our solution to obtain an optimal-cost algorithm. The algorithm uses $p$ processors and runs in $O((n/p) + \log p)$ time, for a cost of $O(n + p \log p)$. This cost is optimal when $p \log p = O(n)$. Finally, two applications of our results are illustrated, namely job scheduling with deadlines and the knapsack problem.
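The O((n/p) + log p) bound can be illustrated with the standard three-phase blocked scheme sketched below. This is an assumption about how such a bound is commonly reached, not a transcription of the paper's binary-tree algorithm: the input is split into p blocks, each block is summed locally, the p block totals are scanned, and each block then finishes its own prefix sums from its offset. Here everything runs sequentially in one process, and the scan over block totals (phase 2) is a plain loop standing in for the O(log p) tree step. The name `blocked_prefix_sums` is hypothetical.

```python
from itertools import accumulate
from typing import List


def blocked_prefix_sums(a: List[int], p: int) -> List[int]:
    """Inclusive prefix sums of a, organised as p blocks (simulated sequentially)."""
    n = len(a)
    if n == 0:
        return []
    p = max(1, min(p, n))          # never use more blocks than elements
    size = -(-n // p)              # ceil(n / p) elements per block
    blocks = [a[i:i + size] for i in range(0, n, size)]

    # Phase 1: each "processor" sums its own block, O(n/p) time each.
    totals = [sum(block) for block in blocks]

    # Phase 2: exclusive scan over the block totals.
    # On a tree of processors this step would take O(log p) time.
    offsets = [0] + list(accumulate(totals))[:-1]

    # Phase 3: each "processor" computes local prefix sums shifted by its offset.
    out: List[int] = []
    for offset, block in zip(offsets, blocks):
        out.extend(offset + s for s in accumulate(block))
    return out


result = blocked_prefix_sums([3, 1, 4, 1, 5, 9, 2, 6], p=3)
```

Phases 1 and 3 touch each element a constant number of times, which is where the n/p term comes from; only phase 2 depends on the number of processors.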

26 citations

Journal Article•DOI•

Optimal and Efficient Algorithms for Summing and Prefix Summing on Parallel Machines

Eunice E. Santos¹•Institutions (1)

Virginia Tech¹

01 Apr 2002-Journal of Parallel and Distributed Computing

TL;DR: The problem of designing efficient parallel algorithms for summing and prefix summing for certain classes of the LogP model is studied and it is shown that any optimal summing algorithm must have a certain inherent structure.

26 citations

Journal Article•DOI•

An optimal parallel algorithm for the visibility of a simple polygon from a point

Mikhail J. Atallah¹, Hubert Wagener², Danny Z. Chen¹•Institutions (2)

Purdue University¹, Technical University of Berlin²

01 Jul 1991-Journal of the ACM

TL;DR: In this article, the authors presented a parallel algorithm for computing the visible portion of a simple polygonal chain with $n$ vertices from a point in the plane in $O(\log n)$ time using $O(n / \log n)$ processors in the CREW-PRAM computational model.

Abstract: We present a parallel algorithm for computing the visible portion of a simple polygonal chain with $n$ vertices from a point in the plane. The algorithm runs in $O(\log n)$ time using $O(n / \log n)$ processors in the CREW-PRAM computational model, and hence is asymptotically optimal.

26 citations