Journal Article
The power of parallel prefix
TL;DR: This study solves the prefix computation problem, when the order of the elements is specified by a linked list, under the weakest PRAM model, where shared memory locations can only be exclusively read or written (the EREW model).
Abstract:
The prefix computation problem is to compute all $n$ initial products $a_1 * \cdots * a_i$, $i = 1, \ldots, n$, of a set of $n$ elements, where $*$ is an associative operation. An $O\left(\frac{\log n}{\log(2n/p)} \cdot \frac{n}{p}\right)$ time deterministic parallel algorithm using $p \le n$ processors is presented to solve the prefix computation problem when the order of the elements is specified by a linked list. For $p \le O(n^{1-\epsilon})$ ($\epsilon > 0$ any constant), this algorithm achieves linear speedup. Such optimal speedup was previously achieved only by probabilistic algorithms. This study assumes the weakest PRAM model, where shared memory locations can only be exclusively read or written (the EREW model).
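To make the problem statement concrete, the following is a minimal sketch of prefix computation over a linked list using plain pointer jumping, simulated sequentially in Python. It is not the algorithm of the paper: pointer jumping takes about log2(n) rounds but performs on the order of n log n operations in total, so it does not by itself give the linear speedup described above. The function name `list_prefix` and the predecessor-array representation `pred` are assumptions made for illustration only.

```python
from operator import add


def list_prefix(values, pred, op=add):
    """Prefix 'products' along a linked list by pointer jumping.

    values[i] is the element stored at node i.
    pred[i] is the index of the node preceding node i in list order,
    or None if node i is the head of the list.
    op must be associative; it does not need to be commutative.
    """
    val = list(values)
    ptr = list(pred)
    # Each pass of the while loop simulates one synchronous parallel round.
    while any(p is not None for p in ptr):
        new_val, new_ptr = list(val), list(ptr)
        for i, p in enumerate(ptr):  # conceptually, all nodes act at once
            if p is not None:
                # The earlier segment goes on the left to preserve list order.
                new_val[i] = op(val[p], val[i])
                # Jump: skip over the predecessor.
                new_ptr[i] = ptr[p]
        val, ptr = new_val, new_ptr
    return val


# List order: node 2 -> node 0 -> node 3 -> node 1
values = ["b", "d", "a", "c"]
pred = [2, 3, None, 0]
prefixes = list_prefix(values, pred)  # string concatenation as the operation
print(prefixes)  # ['ab', 'abcd', 'a', 'abc']
```

String concatenation is used in the example because it is associative but not commutative, which makes it easy to see that each node ends up with the product of all elements from the head of the list up to and including itself.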
Citations
Journal Article
Sorting with linear speedup on a pipelined hypercube
Peter Varman, K. Doshi +1 more
TL;DR: A coarse-grained parallel sorting algorithm that can be mapped efficiently onto a pipelined hypercube of P PEs, thereby achieving linear speedup when P is O(N/log N).
Journal Article
A parallel method for fast and practical high-order Newton interpolation
TL;DR: The algorithm for the computation of the divided differences is shown to be numerically stable and does not require equidistant points, precomputation, or the fast Fourier transform, and can be very useful for very high-order interpolation.
Journal Article
Z4: a new depth-size optimal parallel prefix circuit with small depth
Yen-Chun Lin, Jian-Nan Chen +1 more
TL;DR: This paper presents a new depth-size optimal parallel prefix circuit, named Z4, with small depth and fan-out 4, and shows that for most values of n, there are no earlier depth-size optimal prefix circuits with bounded fan-out that have a depth less than that of Z4.
Journal Article
An optimal PRAM algorithm for a spanning tree on trapezoid graphs
TL;DR: An O(log n) time parallel algorithm with O(n/log n) processors on an EREW PRAM is presented for constructing a spanning tree on trapezoid graphs.
Journal Article
Parallel prefix computation on extended multi-mesh network
TL;DR: A parallel algorithm for prefix computation of $N = n^4$ elements on an $n \times n$ extended multi-mesh network is presented, which takes $O(N^{1/4})$ time on $N$ processors.