Journal ArticleDOI

The power of parallel prefix

TL;DR: This study assumes the weakest PRAM model, in which shared memory locations can only be exclusively read or written (the EREW model), to solve the prefix computation problem when the order of the elements is specified by a linked list.
Abstract: The prefix computation problem is to compute all n initial products a1 * . . . * ai, i = 1, . . ., n, of a set of n elements, where * is an associative operation. An O(((log n)/log(2n/p)) × (n/p)) time deterministic parallel algorithm using p ≤ n processors is presented to solve the prefix computation problem when the order of the elements is specified by a linked list. For p ≤ O(n^(1-ε)) (ε > 0 any constant), this algorithm achieves linear speedup. Such optimal speedup was previously achieved only by probabilistic algorithms. This study assumes the weakest PRAM model, where shared memory locations can only be exclusively read or written (the EREW model).
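
As an illustration of the setting, here is a minimal C sketch of prefix computation over a linked list using classic pointer jumping, with the parallel rounds simulated sequentially. This is only the textbook O(n log n)-work scheme, not the paper's work-optimal deterministic EREW algorithm; the array names are illustrative and the operation * is taken to be addition.

```c
/* Hedged sketch: prefix computation over a linked list by pointer jumping.
 * Each node repeatedly folds in its successor's value and then jumps its
 * pointer; after ceil(log2 N) rounds every node holds the sum from itself
 * to the end of the list (prefixes from the head are obtained symmetrically).
 * The inner loop is conceptually parallel but simulated sequentially here. */
#include <stdio.h>

#define N 8

int main(void) {
    int a[N]    = {1, 2, 3, 4, 5, 6, 7, 8};   /* element values             */
    int next[N] = {1, 2, 3, 4, 5, 6, 7, -1};  /* successor index, -1 = end  */
    int val[N], nxt[N], val2[N], nxt2[N];

    for (int i = 0; i < N; i++) { val[i] = a[i]; nxt[i] = next[i]; }

    for (int round = 0; (1 << round) < N; round++) {
        for (int i = 0; i < N; i++) {          /* conceptually parallel step */
            if (nxt[i] != -1) {
                val2[i] = val[i] + val[nxt[i]];/* '+' stands in for '*'      */
                nxt2[i] = nxt[nxt[i]];         /* jump the pointer           */
            } else {
                val2[i] = val[i];
                nxt2[i] = -1;
            }
        }
        for (int i = 0; i < N; i++) { val[i] = val2[i]; nxt[i] = nxt2[i]; }
    }

    for (int i = 0; i < N; i++)
        printf("sum from node %d to end of list: %d\n", i, val[i]);
    return 0;
}
```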
Citations
Journal ArticleDOI
TL;DR: A unified parallel algorithm for grid adaptation by local refinement/coarsening is presented, employed for dynamic adaptation of three-dimensional unstructured tetrahedral grids on a partitioned memory multiple-instruction multiple-data architecture.
Abstract: A unified parallel algorithm for grid adaptation by local refinement/coarsening is presented. It is designed to be independent of the type of the grid. This is achieved by employing a generic data template that can be configured to capture the data structures for any computational grid regardless of structure and dimensionality. Furthermore, the algorithm itself is specified in terms of generic parallel primitives that are completely independent of the underlying parallel architecture. The unified parallel algorithm is employed for dynamic adaptation of three-dimensional unstructured tetrahedral grids on a partitioned memory multiple-instruction multiple-data architecture. Performance results are presented for the Intel iPSC/860.
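
As a rough illustration of the "generic data template" idea, here is a hypothetical C sketch; all type names and fields are illustrative assumptions, not the paper's actual data layout.

```c
/* Hypothetical sketch: one record type is configured at run time
 * (dimensionality, vertices per cell, neighbours per cell) so the same
 * adaptation code can walk structured or unstructured, 2-D or 3-D grids.
 * Names and fields are assumptions made for illustration only. */
#include <stdio.h>
#include <stdlib.h>

typedef struct {
    int dim;             /* spatial dimensionality (2 or 3)                  */
    int verts_per_cell;  /* e.g. 4 for tetrahedra, 8 for hexahedra           */
    int faces_per_cell;  /* neighbours reachable across a face               */
} GridTemplate;

typedef struct {
    int *vertex_ids;     /* verts_per_cell entries                           */
    int *neighbour_ids;  /* faces_per_cell entries, -1 on the boundary       */
    int  refine_flag;    /* set by an error indicator: +1 refine, -1 coarsen */
} GridCell;

static GridCell make_cell(const GridTemplate *t) {
    GridCell c;
    c.vertex_ids    = calloc((size_t)t->verts_per_cell, sizeof(int));
    c.neighbour_ids = calloc((size_t)t->faces_per_cell, sizeof(int));
    c.refine_flag   = 0;
    return c;
}

int main(void) {
    GridTemplate tet = { 3, 4, 4 };   /* configure for tetrahedral grids */
    GridCell cell = make_cell(&tet);
    printf("tetrahedral cell: %d vertices, %d face neighbours\n",
           tet.verts_per_cell, tet.faces_per_cell);
    free(cell.vertex_ids);
    free(cell.neighbour_ids);
    return 0;
}
```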

8 citations

Journal ArticleDOI
01 Feb 1999
TL;DR: Fork95 as discussed by the authors is a parallel programming language for the Parallel Random Access Machine (PRAM) model of parallel computation, which is used in the SB-PRAM project at the University of Saarbrücken.
Abstract: We investigate the well-known Parallel Random Access Machine (PRAM) model of parallel computation as a practical parallel programming model. The two components of this project are a general-purpose PRAM programming language, called Fork95, and a library, called PAD, of fundamental, efficiently implemented parallel algorithms and data structures. We outline the main features of Fork95 as they apply to the implementation of PAD, and describe the implementation of library procedures for prefix-sums and sorting. The Fork95 compiler generates code for the SB-PRAM, a hardware emulation of the PRAM, which is currently being completed at the University of Saarbrücken. Both language and library can immediately be used with this machine. The project is, however, of independent interest. The programming environment can help the algorithm designer to evaluate the practicality of new parallel algorithms, and can furthermore be used as a tool for teaching and communication of parallel algorithms.
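
For context on the prefix-sums routine mentioned above, here is a minimal sequential C sketch of the standard two-phase (up-sweep / down-sweep) exclusive prefix-sums scheme on which work-efficient PRAM scan routines are commonly based. It is not the Fork95/PAD source, which this summary does not show, and it assumes the input length is a power of two.

```c
/* Hedged sketch: two-phase exclusive prefix sums (up-sweep builds partial
 * sums over a virtual binary tree, down-sweep pushes them back down).
 * Plain sequential C; both inner loops are conceptually parallel. */
#include <stdio.h>

#define N 8   /* assumed power of two for simplicity */

int main(void) {
    int x[N] = {3, 1, 7, 0, 4, 1, 6, 3};

    /* Up-sweep: accumulate partial sums at the internal tree nodes. */
    for (int d = 1; d < N; d <<= 1)
        for (int i = 2 * d - 1; i < N; i += 2 * d)
            x[i] += x[i - d];

    /* Down-sweep: clear the root, then swap-and-add on the way down,
     * leaving each slot with the sum of all elements strictly to its left. */
    x[N - 1] = 0;
    for (int d = N / 2; d >= 1; d >>= 1)
        for (int i = 2 * d - 1; i < N; i += 2 * d) {
            int t = x[i - d];
            x[i - d] = x[i];
            x[i] += t;
        }

    for (int i = 0; i < N; i++)
        printf("%d ", x[i]);      /* expected output: 0 3 4 11 11 15 16 22 */
    printf("\n");
    return 0;
}
```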

8 citations

Journal ArticleDOI
TL;DR: New geometric observations are presented that lead to extremely simple and optimal algorithms for solving the problem of computing the shortest weakly visible subedge of an n-vertex simple polygon, both sequentially and in parallel.

8 citations


Cites background or methods from "The power of parallel prefix"

  • ...Steps (4) and (5) are easily handled by using parallel prefix [16, 17]....


  • ...The parallel algorithm is also very straightforward and makes use of only simple EREW PRAM operations such as parallel prefix [16, 17]....


  • ...Steps (2) and (3) are performed by using [5] and parallel prefix [16, 17]....


Proceedings ArticleDOI
01 May 1990
TL;DR: It is shown that U can be represented among the n nodes of a variant of the mesh-of-trees using O((m/n) polylog(m/n)) storage per node such that any n-tuple of variables can be accessed in O(log n (log log n)^2) time in the worst case.
Abstract: The problem of representing a set U = {u_1, . . ., u_m} of read-write variables on an n-node distributed-memory parallel computer is considered. It is shown that U can be represented among the n nodes of a variant of the mesh-of-trees using O((m/n) polylog(m/n)) storage per node such that any n-tuple of variables may be accessed in O(log n (log log n)^2) time in the worst case, for m polynomial in n.

8 citations

Journal ArticleDOI
TL;DR: This family of depth-size optimal, parallel prefix circuits with fan-out 2 is presented, which is easier to construct and more amenable to automatic synthesis than two other families of the same type, although the three families have the same minimum depth.
Abstract: Prefix computation is used in various areas and is considered a primitive operation. Parallel prefix circuits are parallel prefix algorithms on the combinational circuit model. The depth of a prefix circuit is a measure of its processing time; smaller depth implies faster computation. The size of a prefix circuit is the number of operation nodes in it. Smaller size implies less power consumption, less VLSI area, and lower cost. A prefix circuit with n inputs is depth-size optimal if its depth plus size equals 2n − 2. A circuit with a smaller fan-out is in general faster and occupies less VLSI area. To be of practical use, the depth and fan-out of a prefix circuit should be small. In this paper, a family of depth-size optimal, parallel prefix circuits with fan-out 2 is presented. This family of prefix circuits is easier to construct and more amenable to automatic synthesis than two other families of the same type, although the three families have the same minimum depth among all depth-size optimal prefix circuits with fan-out 2. The balanced structure of the new family is also a merit.
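
To make the depth-size measure concrete, here is a small C sketch (not from the paper) showing that the plain serial prefix circuit already attains depth + size = 2n − 2, the optimality bound, albeit with maximal depth; the paper's contribution is a family that meets this bound with minimum depth and fan-out 2.

```c
/* Hedged illustration of depth/size bookkeeping: the serial prefix circuit
 * y_i = y_{i-1} * x_i uses n-1 operation nodes (size) arranged in a chain of
 * length n-1 (depth), so depth + size = 2n - 2.  '+' stands in for '*'. */
#include <stdio.h>

#define N 16

int main(void) {
    int x[N], y[N];
    for (int i = 0; i < N; i++) x[i] = i + 1;

    int size = 0;                 /* number of operation nodes used */
    y[0] = x[0];
    for (int i = 1; i < N; i++) { /* one operation node per step    */
        y[i] = y[i - 1] + x[i];
        size++;
    }
    int depth = N - 1;            /* longest chain of operations    */

    printf("n = %d, size = %d, depth = %d, depth + size = %d (2n-2 = %d)\n",
           N, size, depth, depth + size, 2 * N - 2);
    return 0;
}
```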

7 citations