The power of parallel prefix

doi:10.1109/TC.1985.6312202

Journal Article•DOI•


Clyde P. Kruskal¹, Larry Rudolph², Marc Snir³•Institutions (3)

University of Illinois at Urbana–Champaign¹, Carnegie Mellon University², Hebrew University of Jerusalem³

01 Oct 1985-IEEE Transactions on Computers (IEEE)-Vol. 34, Iss: 10, pp 965-968

TL;DR: A deterministic parallel algorithm is presented for the prefix computation problem when the order of the elements is specified by a linked list. The study assumes the weakest PRAM model, where shared memory locations can only be exclusively read or written (the EREW model).


Abstract: The prefix computation problem is to compute all n initial products $a_1 * \cdots * a_i$, $i = 1, \ldots, n$, of a set of n elements, where * is an associative operation. An $O(((\log n)/\log(2n/p)) \cdot (n/p))$ time deterministic parallel algorithm using $p \le n$ processors is presented to solve the prefix computation problem, when the order of the elements is specified by a linked list. For $p \le O(n^{1-\epsilon})$ ($\epsilon > 0$ any constant), this algorithm achieves linear speedup. Such optimal speedup was previously achieved only by probabilistic algorithms. This study assumes the weakest PRAM model, where shared memory locations can only be exclusively read or written (the EREW model).
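To make the problem statement concrete, here is a minimal Python sketch of prefix computation over a linked list using classic pointer jumping, simulated round by round on one machine. This is not the algorithm of the paper: pointer jumping uses one (simulated) processor per node and O(n log n) total work, so it does not achieve the linear speedup described in the abstract. The function name `list_prefix` and the array-based list encoding (`values`, `succ`) are assumptions made for illustration.

```python
from operator import add


def list_prefix(values, succ, op=add):
    """Prefix 'products' over a linked list via simulated pointer jumping.

    values[i] is the element stored at node i (hypothetical encoding).
    succ[i] is the index of the node after i, or None at the tail.
    Returns out with out[i] = product of all elements from the head up to
    and including node i, in list order. op must be associative.
    """
    n = len(values)
    # Work with predecessor links so each node accumulates what precedes it.
    pred = [None] * n
    for i, s in enumerate(succ):
        if s is not None:
            pred[s] = i
    acc = list(values)
    # Each loop iteration is one synchronous parallel round: all nodes read
    # the old state, then all nodes write the new state.
    while any(p is not None for p in pred):
        new_acc, new_pred = list(acc), list(pred)
        for i in range(n):
            p = pred[i]
            if p is not None:
                new_acc[i] = op(acc[p], acc[i])  # earlier elements on the left
                new_pred[i] = pred[p]            # jump over the predecessor
        acc, pred = new_acc, new_pred
    return acc


# List order is node 2 -> node 0 -> node 3 -> node 1, holding a, b, c, d.
print(list_prefix(["b", "d", "a", "c"], [3, None, 0, 1], lambda x, y: x + y))
```

The number of rounds is about log2(n) because the distance each pointer spans doubles every round. String concatenation is used in the example because it is associative but not commutative, which exercises the ordering.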


Citations


Proceedings Article•DOI•

Linking simulation model specification and parallel execution through UNITY


Marc D. Abrams¹, Ernest H. Page¹, Richard E. Nance¹•Institutions (1)

Virginia Tech¹

01 Dec 1991

TL;DR: The UNITY system is found to offer (or be amenable to) many of the features required to bridge the gap between the requirements and philosophy of parallel programming and a simulation model specification.


Abstract: Presents the results of an attempt to bridge the gap between the requirements and philosophy of parallel programming (attention being focused on crucial efficiency-related implementation details) and the requirements and philosophy of a simulation model specification (attention being focused on correctly and simply describing the model behavior). K.M. Chandy and J. Misra's (1988) UNITY system is found to offer (or be amenable to) many of the features required to bridge this gap. UNITY can handle both a state-transition-based specification, which is a conventional parallel simulation program, and a data-flow-based specification. For the G/G/1 problem, the data-flow view leads to a more efficient solution.


9 citations

Cites methods from "The power of parallel prefix"

...The refinement, which is GLM4, uses the parallel prefix algorithm of Kruskal, Rudolph, and Snir (1985) to reduce the running time to O(log N) using O(N) processors....
[...]

Journal Article•DOI•

Optimal Sequential And Parallel Algorithms To Compute A Steiner Tree On Permutation Graphs


Sukumar Mondal¹, Madhumangal Pal¹, Tapan Kumar Pal¹•Institutions (1)

Vidyasagar University¹

01 Aug 2003-International Journal of Computer Mathematics

TL;DR: This paper presents an optimal sequential and an optimal parallel algorithm to compute a minimum cardinality Steiner set and a Steiner tree on an EREW PRAM model.


Abstract: This paper presents an optimal sequential and an optimal parallel algorithm to compute a minimum cardinality Steiner set and a Steiner tree. The sequential algorithm takes O(n) time, and the parallel algorithm takes O(log n) time with O(n/log n) processors on an EREW PRAM model.


9 citations

Cites methods from "The power of parallel prefix"

...We know that the parallel prefix computation can be done for n items in O(log n) time using O(n/log n) processors on an EREW PRAM [9, 10]....
[...]
...From the BFS tree, the shortest path between s'_1 and t'_1 can be computed in O(log n) time using O(n/log n) processors on an EREW PRAM....
[...]
...Therefore, all the steps of Algorithm PSST can be performed in O(log n) time using O(n/log n) processors on an EREW PRAM....
[...]
...A spanning tree of a permutation graph can be computed in parallel using the algorithm of Wang et al. in O(log n) time using O(n/log n) processors on an EREW PRAM [14]....
[...]
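The O(log n) time, O(n/log n) processor bound quoted in these excerpts is commonly obtained by a blocking technique: give each processor a block of about log n items, reduce each block sequentially, scan the block totals, then sweep each block again. The Python sketch below simulates that standard technique sequentially. It is an illustration under stated assumptions, not necessarily the construction used in the cited references [9, 10], and the name `blocked_prefix` is hypothetical.

```python
import math
from operator import add


def blocked_prefix(items, op=add):
    """Inclusive prefix of items under an associative op, in three phases.

    Simulates about n / log2(n) hypothetical processors, one per block.
    """
    n = len(items)
    if n == 0:
        return []
    block = max(1, math.ceil(math.log2(n)))  # block length of about log2(n)
    starts = range(0, n, block)

    # Phase 1: each processor reduces its own block sequentially, O(log n).
    totals = []
    for s in starts:
        t = items[s]
        for x in items[s + 1 : s + block]:
            t = op(t, x)
        totals.append(t)

    # Phase 2: doubling scan over the block totals, O(log n) rounds.
    scan = list(totals)
    d = 1
    while d < len(scan):
        scan = [scan[j] if j < d else op(scan[j - d], scan[j])
                for j in range(len(scan))]
        d *= 2

    # Phase 3: each processor sweeps its block again, seeded with the
    # combined total of all preceding blocks, O(log n).
    out = []
    for b, s in enumerate(starts):
        chunk = items[s : s + block]
        run = chunk[0] if b == 0 else op(scan[b - 1], chunk[0])
        out.append(run)
        for x in chunk[1:]:
            run = op(run, x)
            out.append(run)
    return out


print(blocked_prefix([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

Phases 1 and 3 each do O(n) total work. Phase 2 here uses the simple doubling scan, which costs O(m log m) operations on the m = n/log n block totals, so the overall work stays O(n).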

Journal Article•DOI•

The p-maxian problem on interval graphs


Yukun Cheng¹, Liying Kang²•Institutions (2)

Zhejiang University of Finance and Economics¹, Shanghai University²

01 Nov 2010-Discrete Applied Mathematics

TL;DR: A linear-time algorithm, similar to Bespamyatnikh's algorithm for the unweighted 1-median problem on interval graphs, is proposed for the weighted 1-maxian problem, and it is shown that two intervals with the minimum right endpoint and the maximum left endpoint are an optimal solution.


8 citations

Book Chapter•DOI•

Parallel Algorithms for Partitioning Sorted Sets and Related Problems


Danny Z. Chen¹, Wei Chen², Koichi Wada², Kimio Kawaguchi²•Institutions (2)

University of Notre Dame¹, Nagoya Institute of Technology²

25 Sep 1996

TL;DR: Efficient parallel algorithms for solving the partition problem and its applications are presented and the complexity bounds of these algorithms match those of the optimal EREW PRAM algorithms for merging, sorting, and finding an approximate median.


Abstract: We consider the following partition problem: Given a set S of n elements that is organized as k sorted subsets of size n/k each and given a parameter h with $1/k \le h \le n/k$, partition S into $g = O(n/(hk))$ subsets $D_1, D_2, \ldots, D_g$ of size $\Theta(hk)$ each, such that for any two indices i and j with $1 \le i < j \le g$, no element in $D_i$ is bigger than any element in $D_j$. Note that with various combinations of the values of parameters h and k, several fundamental problems, such as merging, sorting, and finding an approximate median, can be formulated as or be reduced to this partition problem. The partition problem also finds applications in solving problems of parallel computing and computational geometry. In this paper, we present efficient parallel algorithms for solving the partition problem and its applications. Our parallel partition algorithm runs in O(log n) time using $O(\min\{(n/h) \cdot \max\{\log h, 1\},\ n \cdot \max\{\log(1/h), 1\}\}/\log n)$ processors in the EREW PRAM model. The complexity bounds of our parallel partition algorithm on the respective special cases match those of the optimal EREW PRAM algorithms for merging, sorting, and finding an approximate median. Using our parallel partition algorithm, we are also able to obtain better complexity bounds (even possibly on a weaker parallel model) than the previously best known parallel algorithms for several important problems, including parallel multi-selection, parallel multi-ranking, and parallel sorting of k sorted subsets.
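The partition condition in this abstract can be illustrated with a small sequential reference in Python. This sketch is not the parallel algorithm of the paper: it simply merges the k sorted subsets and cuts the result into consecutive chunks, which trivially satisfies the ordering requirement. The names `partition_reference` and `is_valid_partition`, and the integer `chunk` parameter standing in for hk, are assumptions made for illustration.

```python
import heapq


def partition_reference(sorted_subsets, chunk):
    """Sequential baseline: merge the sorted subsets, then cut into chunks.

    chunk plays the role of h*k and is assumed to be a positive integer.
    The last chunk may be shorter than the others.
    """
    merged = list(heapq.merge(*sorted_subsets))
    return [merged[i : i + chunk] for i in range(0, len(merged), chunk)]


def is_valid_partition(parts):
    """Check that no element of D_i is bigger than any element of D_j, i < j.

    For non-empty parts it is enough to compare neighbouring parts, because
    max(D_i) <= min(D_{i+1}) <= max(D_{i+1}) chains across the sequence.
    """
    return all(max(parts[i]) <= min(parts[i + 1])
               for i in range(len(parts) - 1))


parts = partition_reference([[1, 4, 7], [2, 5, 8], [3, 6, 9]], chunk=3)
print(parts, is_valid_partition(parts))
```

Note that the parts need not be internally sorted for the condition to hold. The baseline produces sorted parts only because it merges everything first, which is more work than the problem strictly requires.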


8 citations

Cites background or methods from "The power of parallel prefix"

...[18, 19]). If pj < k log n, then let E' = E_i ∪ E_{i+1} ∪ ... ∪ E_j, and union E' with either the preceding or succeeding unmarked set of the block Rid in the sequence (e.g., let E_{i-1} = E_{i-1} ∪ E')....
[...]
...By using Chen's parallel binary search algorithm [6] and parallel prefix [18, 19], each column C_i can be partitioned with the....
[...]

Proceedings Article•DOI•

Improved parallel prefix computation on optical multi-trees


Prasanta K. Jana¹•Institutions (1)

Indian Institute of Technology Dhanbad¹

20 Dec 2004

TL;DR: A new and improved parallel algorithm for prefix computation on the same network is presented. Although the algorithm requires O(log n) electronic moves + 4 optical moves using the same number of processors, the number of data points involved in the algorithm is $n^3$ in contrast to $n^2$.


Abstract: A parallel algorithm for prefix computation was reported on a recently proposed interconnection network called optical multi-trees (OMULT). Using $2n^3 - n^2$ processors, the algorithm was shown to run in O(log n) electronic moves + 5 optical moves for $n^2$ data points. In this paper we present a new and improved parallel algorithm for prefix computation on the same network. Although the algorithm requires O(log n) electronic moves + 4 optical moves using the same number of processors, the number of data points involved in our algorithm is $n^3$ in contrast to $n^2$.


8 citations
