The power of parallel prefix
TL;DR: This study solves the prefix computation problem when the order of the elements is specified by a linked list, assuming the weakest PRAM model (EREW), in which each shared memory location can only be read or written exclusively.
Abstract: The prefix computation problem is to compute all n initial products \(a_1 * \cdots * a_i\), \(i = 1, \ldots, n\), of a set of n elements, where * is an associative operation. An \(O\left(\frac{\log n}{\log(2n/p)} \cdot \frac{n}{p}\right)\) time deterministic parallel algorithm using p ≤ n processors is presented to solve the prefix computation problem, when the order of the elements is specified by a linked list. For \(p \le O(n^{1-\epsilon})\) (ε > 0 any constant), this algorithm achieves linear speedup. Such optimal speedup was previously achieved only by probabilistic algorithms. This study assumes the weakest PRAM model, where shared memory locations can only be exclusively read or written (the EREW model).
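To make the problem statement concrete, the following is a minimal Python sketch of prefix computation over a linked list stored as index arrays. It uses pointer jumping (often attributed to Wyllie), simulated sequentially in lockstep rounds. This is a simpler stand-in that performs O(n log n) operations in total; it is not the algorithm presented in the paper. The names `list_prefix`, `values`, and `succ` are illustrative assumptions.

```python
from operator import add


def list_prefix(values, succ, op=add):
    """Return prefix[i] = a_head * ... * a_i for a list given by successor links.

    values[i] is the element stored at node i; succ[i] is the index of the
    next node, or None at the tail. `op` must be associative.
    """
    n = len(values)
    # Invert the successor links: pred[i] is the node just before i (None at the head).
    pred = [None] * n
    for i, s in enumerate(succ):
        if s is not None:
            pred[s] = i

    acc = list(values)  # acc[i] covers a segment of the list ending at node i
    # Pointer jumping: every round doubles the length of the segment acc[i] covers.
    while any(p is not None for p in pred):
        new_acc, new_pred = list(acc), list(pred)
        for i in range(n):  # conceptually one processor per node, all in lockstep
            p = pred[i]
            if p is not None:
                new_acc[i] = op(acc[p], acc[i])  # the earlier segment goes on the left
                new_pred[i] = pred[p]
        acc, pred = new_acc, new_pred
    return acc


# List order is node 2 -> node 0 -> node 3 -> node 1.
print(list_prefix([10, 20, 30, 40], [3, None, 0, 1]))  # [40, 100, 30, 80]
```

Each round reads only the values from the previous round, which mirrors a synchronous PRAM step. The operation is applied with the earlier segment on the left, so the sketch also works for associative operations that are not commutative, such as string concatenation.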
Citations
••
06 Jul 1994
TL;DR: Traditional computer models such as the Random Access Machine (RAM) have no concept of memory hierarchy, which makes them inappropriate for an accurate complexity analysis of algorithms on modern systems with increasingly larger and slower memories.
Abstract: Modern computer systems usually have a complex memory system consisting of increasingly larger and slower memories. Traditional computer models like the Random Access Machine (RAM) have no concept of memory hierarchy, making them inappropriate for an accurate complexity analysis of algorithms on these types of architectures.
15 citations
Cites methods from "The power of parallel prefix"
...A solution to this problem has been presented by Kruskal et al. [9]....
••
TL;DR: This paper presents an optimal parallel algorithm for triangulating an arbitrary set of n points in the plane on a Concurrent-Read, Exclusive-Write Parallel RAM (CREW PRAM), using a parallel divide-and-conquer technique that subdivides a problem into \(\sqrt{n}\) subproblems.
Abstract: This paper presents an optimal parallel algorithm for triangulating an arbitrary set of n points in the plane. The algorithm runs in O(log n) time using O(n) space and O(n) processors on a Concurrent-Read, Exclusive-Write Parallel RAM model (CREW PRAM). The parallel lower bound on triangulation is Ω(log n) time, so the best possible linear speedup has been achieved. A parallel divide-and-conquer technique of subdividing a problem into \(\sqrt{n}\) subproblems is employed.
15 citations
Cites background from "The power of parallel prefix"
...(11) In step 3 the calculation of the angles takes O(1) time with O(n) processors and the sorting in step 4 takes O(log n) time with O(n) processors....
••
TL;DR: These algorithms can be used for constructing and evaluating polynomials that interpolate function values and derivatives of arbitrary order (Hermite interpolation), and they improve the parallel time complexity of existing algorithms.
15 citations
••
TL;DR: It is shown that the prefixes of n items can be computed in time 2τ√n + O(log n) on a square mesh with n processors, and that both algorithms are asymptotically optimal.
Abstract: Algorithms for the efficient computation of prefix products on mesh-connected processor arrays are presented. Assuming that an arithmetic operation takes unit time and that the communication/computation ratio for a single input item is τ, we show that the prefixes of n items can be computed in time 2τ√n + O(log n) on a square mesh with n processors. If the n processors are configured as a disc with respect to the Manhattan metric, then the parallel time for the problem becomes √2·τ√n + O(√τ·∜n). We show that both of these algorithms are asymptotically optimal.
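As a rough illustration of how prefixes can be organized on a square mesh, the sketch below follows a common three-phase scheme: scan each row, scan the row totals down one column, then combine each row with the total of the rows above it. It is simulated sequentially, does not model the communication ratio τ, and is not claimed to be the algorithm of the cited paper. The name `mesh_prefix` and the row-major layout are assumptions made for the example.

```python
from itertools import accumulate
from operator import add


def mesh_prefix(grid, op=add):
    """Prefixes of a grid read in row-major order, organized as three mesh phases."""
    if not grid:
        return []
    # Phase 1: every row computes its own prefixes, left to right.
    rows = [list(accumulate(row, op)) for row in grid]
    # Phase 2: the last column holds the row totals; scan them top to bottom.
    totals = list(accumulate((row[-1] for row in rows), op))
    # Phase 3: each row below the first combines the total of all rows above it,
    # placed on the left so that non-commutative operations stay in order.
    out = [rows[0]]
    for r in range(1, len(rows)):
        out.append([op(totals[r - 1], x) for x in rows[r]])
    return out


print(mesh_prefix([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
# [[1, 3, 6], [10, 15, 21], [28, 36, 45]]
```

On a √n × √n mesh each phase moves data along a single row or column, so the data travels a distance proportional to √n, which is where a √n term in the running time comes from.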
15 citations
••
TL;DR: This work considers the problem of determining, for each element A(j), j = 1, 2, ..., n, the unique element B(i), 0 ≤ i ≤ m, such that B(i) ≤ A(j) ...
Abstract: Let A be a sorted array of n numbers and B a sorted array of m numbers, both in nondecreasing order, with n ≤ m. We consider the problem of determining, for each element A(j), j = 1, 2, ..., n, the unique element B(i), 0 ≤ i ≤ m, such that B(i) ≤ A(j) ...
14 citations