Journal Article

The power of parallel prefix

TLDR
This study presents a deterministic parallel algorithm for the prefix computation problem when the order of the elements is specified by a linked list. It assumes the weakest PRAM model, in which shared memory locations can only be exclusively read or written (the EREW model).
Abstract
The prefix computation problem is to compute all $n$ initial products $a_1 * \cdots * a_i$, $i = 1, \ldots, n$, of a set of $n$ elements, where $*$ is an associative operation. An $O\left(\frac{\log n}{\log(2n/p)} \cdot \frac{n}{p}\right)$ time deterministic parallel algorithm using $p \le n$ processors is presented to solve the prefix computation problem when the order of the elements is specified by a linked list. For $p \le O(n^{1-\epsilon})$ ($\epsilon > 0$ any constant), this algorithm achieves linear speedup. Such optimal speedup was previously achieved only by probabilistic algorithms. This study assumes the weakest PRAM model, where shared memory locations can only be exclusively read or written (the EREW model).
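
To make the problem statement concrete, the following is a minimal Python sketch of prefix computation over a linked list using pointer jumping, with the parallel rounds simulated sequentially. It is not the algorithm of the paper: pointer jumping uses one (virtual) processor per node and performs O(n log n) operations in total, so it only illustrates the problem and the O(log n) round count at p = n, not the linear speedup described in the abstract. The function name `list_prefix`, the array-based list representation (`values`, `succ`), and the `op` parameter are assumptions made for illustration.

```python
from typing import Callable, List, Optional, TypeVar

T = TypeVar("T")


def list_prefix(values: List[T],
                succ: List[Optional[int]],
                op: Callable[[T, T], T]) -> List[T]:
    """Prefix products over a linked list by pointer jumping (illustrative).

    values[i] is the element stored at node i; succ[i] is the index of the
    next node in list order, or None at the tail. Returns acc, where acc[i]
    is the product (under the associative op) of all elements from the head
    of the list up to and including node i.
    """
    n = len(values)

    # Derive predecessor pointers from the successor pointers.
    pred: List[Optional[int]] = [None] * n
    for i, s in enumerate(succ):
        if s is not None:
            pred[s] = i

    acc = list(values)
    # Each loop iteration simulates one synchronous parallel round:
    # all reads use the old arrays, all writes go to fresh copies.
    while any(p is not None for p in pred):
        new_acc = list(acc)
        new_pred = list(pred)
        for i in range(n):  # conceptually, one processor per node
            p = pred[i]
            if p is not None:
                # acc[p] covers the segment immediately before acc[i],
                # so the operand order is preserved (op need not commute).
                new_acc[i] = op(acc[p], acc[i])
                new_pred[i] = pred[p]  # jump: double the distance covered
        acc, pred = new_acc, new_pred
    return acc
```

For example, with the list order node 2, node 0, node 3, node 1 and string concatenation as the operation, `list_prefix(["b", "d", "a", "c"], [3, None, 0, 1], lambda x, y: x + y)` returns `["ab", "abcd", "a", "abc"]`. The number of rounds is the ceiling of log2 of the list length, because the distance covered by each pointer doubles per round.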


Citations
Journal Article

Sorting with linear speedup on a pipelined hypercube

TL;DR: A coarse-grained parallel sorting algorithm that can be mapped efficiently onto a pipelined hypercube of P PEs, thereby achieving linear speedup when P is O(N/log N).
Journal Article

A parallel method for fast and practical high-order Newton interpolation

TL;DR: The algorithm for the computation of the divided differences is shown to be numerically stable and does not require equidistant points, precomputation, or the fast Fourier transform, and can be very useful for very high-order interpolation.
Journal Article

Z4: a new depth-size optimal parallel prefix circuit with small depth

TL;DR: This paper presents a new depth-size optimal parallel prefix circuit, named Z4, with small depth and fan-out 4, and shows that for most values of n, there are no earlier depth-size optimal prefix circuits with bounded fan-out that have a depth less than that of Z4.
Journal Article

An optimal PRAM algorithm for a spanning tree on trapezoid graphs

TL;DR: An O(log n) time parallel algorithm with O(n/log n) processors on an EREW PRAM for constructing a spanning tree on trapezoid graphs.
Journal Article

Parallel prefix computation on extended multi-mesh network

TL;DR: A parallel algorithm for prefix computation of $N = n^4$ elements on an $n \times n$ extended multi-mesh network is presented, which takes $O(N^{1/4})$ time on $N$ processors.