Journal Article

The power of parallel prefix

TL;DR: A deterministic parallel algorithm is presented that solves the prefix computation problem when the order of the elements is specified by a linked list. The study assumes the weakest PRAM model, where shared memory locations can only be exclusively read or written (the EREW model).
Abstract: The prefix computation problem is to compute all n initial products \(a_1 * \cdots * a_i\), \(i = 1, \ldots, n\), of a set of n elements, where * is an associative operation. An \(O\bigl(\frac{\log n}{\log(2n/p)} \cdot \frac{n}{p}\bigr)\) time deterministic parallel algorithm using p ≤ n processors is presented to solve the prefix computation problem when the order of the elements is specified by a linked list. For \(p \le O(n^{1-\varepsilon})\) (\(\varepsilon > 0\) any constant), this algorithm achieves linear speedup. Such optimal speedup was previously achieved only by probabilistic algorithms. This study assumes the weakest PRAM model, where shared memory locations can only be exclusively read or written (the EREW model).
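To make the problem statement concrete, here is a minimal Python sketch of prefix computation over a linked list. It is not the algorithm of the paper: the first function is a plain sequential reference, and the second simulates the textbook pointer-jumping technique, which does O(n log n) work rather than achieving the linear speedup described above. The array-based list encoding (`values`, `nxt`, `head`) and both function names are assumptions made for this illustration.

```python
from typing import Callable, List, Optional, TypeVar

T = TypeVar("T")


def list_prefix_sequential(
    values: List[T],
    nxt: List[Optional[int]],
    head: int,
    op: Callable[[T, T], T],
) -> List[T]:
    """Walk the list from head; out[i] is the product of all elements up to node i."""
    out = list(values)
    prev: Optional[int] = None
    node: Optional[int] = head
    while node is not None:
        if prev is not None:
            # Earlier elements go on the left: op need not be commutative.
            out[node] = op(out[prev], values[node])
        prev = node
        node = nxt[node]
    return out


def list_prefix_pointer_jumping(
    values: List[T],
    nxt: List[Optional[int]],
    op: Callable[[T, T], T],
) -> List[T]:
    """Simulate synchronous pointer jumping (illustrative, not the paper's method).

    Each round, every node combines its value with its current predecessor's
    value and then jumps its predecessor pointer two steps back, so about
    log2(n) rounds are needed.
    """
    n = len(values)
    pred: List[Optional[int]] = [None] * n
    for i, j in enumerate(nxt):
        if j is not None:
            pred[j] = i  # reverse the successor links
    val = list(values)
    while any(p is not None for p in pred):
        # Read only the old arrays so the round behaves like one parallel step.
        new_val, new_pred = list(val), list(pred)
        for i in range(n):
            p = pred[i]
            if p is not None:
                new_val[i] = op(val[p], val[i])
                new_pred[i] = pred[p]
        val, pred = new_val, new_pred
    return val
```

For example, with `values = ["b", "d", "a", "c"]`, `nxt = [3, None, 0, 1]` and `head = 2`, the list order is a, b, c, d, and both functions return the prefixes under string concatenation stored at each node's array index.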
Citations
Book Chapter
06 Jul 1994
TL;DR: Modern computer systems usually have a complex memory system consisting of levels of increasingly larger and slower memory. Traditional computer models have no concept of memory hierarchy, which makes them inappropriate for an accurate complexity analysis of algorithms on these types of architectures.
Abstract: Modern computer systems usually have a complex memory system consisting of levels of increasingly larger and slower memory. Traditional computer models like the Random Access Machine (RAM) have no concept of memory hierarchy, making them inappropriate for an accurate complexity analysis of algorithms on these types of architectures.

15 citations


Cites methods from "The power of parallel prefix"

  • ...A solution to this problem has been presented by Kruskal et al. [9]....

    [...]

Journal Article
Ed Merks
TL;DR: An optimal parallel algorithm is presented for triangulating an arbitrary set of n points in the plane on a Concurrent-Read, Exclusive-Write Parallel RAM model, using a parallel divide-and-conquer technique that subdivides a problem into subproblems.
Abstract: This paper presents an optimal parallel algorithm for triangulating an arbitrary set of n points in the plane. The algorithm runs in O(log n) time using O(n) space and O(n) processors on a Concurrent-Read, Exclusive-Write Parallel RAM model (CREW PRAM). The parallel lower bound on triangulation is Ω(log n) time, so the best possible linear speedup has been achieved. A parallel divide-and-conquer technique of subdividing a problem into \(\sqrt{n}\) subproblems is employed.

15 citations


Cites background from "The power of parallel prefix"

  • ...(11) In step 3 the calculation of the angles takes O(1) time with O(n) processors and the sorting in step 4 takes O(log n) time with O(n) processors....

    [...]

Journal Article
TL;DR: These algorithms can be used for constructing and evaluating polynomials that interpolate function values and derivatives of arbitrary order (Hermite interpolation), and they improve the parallel time complexity of existing algorithms.

15 citations

Journal Article
TL;DR: It is shown that the prefixes of n items can be computed in time 2τ√n + O(log n) on a square mesh with n processors, and that both algorithms are asymptotically optimal.
Abstract: Algorithms for the efficient computation of prefix products on mesh-connected processor arrays are presented. Assuming that an arithmetic operation takes unit time and that the communication/computation ratio for a single input item is τ, we show that the prefixes of n items can be computed in time 2τ√n + O(log n) on a square mesh with n processors. If n processors are configured as a disc with respect to the Manhattan metric, then the parallel time for the problem becomes \(\sqrt{2}\,\tau\sqrt{n} + O(\sqrt{\tau}\,\sqrt[4]{n})\). We show that both of these algorithms are asymptotically optimal.

15 citations

Journal Article
Danny Z. Chen
TL;DR: This work considers the problem of determining, for each element A(j), j = 1, 2, ..., n, the unique element B(i), 0 ≤ i ≤ m, such that B(i) ≤ A(j) [...]
Abstract: Let A be a sorted array of n numbers and B a sorted array of m numbers, both in nondecreasing order, with n ≤ m. We consider the problem of determining, for each element A(j), j = 1, 2, ..., n, the unique element B(i), 0 ≤ i ≤ m, such that B(i) ≤ A(j) [...]

14 citations