Journal Article

The power of parallel prefix

TL;DR: This study assumes the weakest PRAM model, where shared memory locations can only be exclusively read or written (the EREW model) to solve the prefix computation problem, when the order of the elements is specified by a linked list.
Abstract: The prefix computation problem is to compute all n initial products a_1 * ... * a_i, i = 1, ..., n, of a set of n elements, where * is an associative operation. An O(((log n)/log(2n/p)) × (n/p)) time deterministic parallel algorithm using p ≤ n processors is presented to solve the prefix computation problem when the order of the elements is specified by a linked list. For p ≤ O(n^(1-ε)) (ε > 0 any constant), this algorithm achieves linear speedup. Such optimal speedup was previously achieved only by probabilistic algorithms. This study assumes the weakest PRAM model, where shared memory locations can only be exclusively read or written (the EREW model).
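To make the problem statement concrete, the following is a minimal Python sketch of prefix computation over a linked list. It is not the algorithm of the paper: it uses the classic pointer-jumping technique, simulated sequentially round by round, which takes O(log n) rounds but performs O(n log n) total work. The function name `list_prefix` and the array-of-successors representation are assumptions made for illustration only.

```python
from typing import Callable, List, Optional, TypeVar

T = TypeVar("T")


def list_prefix(
    values: List[T],
    succ: List[Optional[int]],
    op: Callable[[T, T], T],
) -> List[T]:
    """Prefix products along a linked list, via simulated pointer jumping.

    values[i] is the element stored at node i; succ[i] is the index of the
    next node in list order (None for the tail). The result at index i is
    the product of all elements from the head of the list up to node i.
    Illustrative sketch only, not the algorithm from the paper.
    """
    n = len(values)

    # Derive predecessor pointers from the successor pointers.
    pred: List[Optional[int]] = [None] * n
    for i, s in enumerate(succ):
        if s is not None:
            pred[s] = i

    acc = list(values)
    # Each loop iteration models one synchronous parallel round: every node
    # reads the *old* state of its predecessor, then all nodes update at once.
    while any(p is not None for p in pred):
        new_acc = list(acc)
        new_pred = list(pred)
        for i in range(n):
            p = pred[i]
            if p is not None:
                # The predecessor's segment comes first, so order is kept
                # even when op is not commutative.
                new_acc[i] = op(acc[p], acc[i])
                new_pred[i] = pred[p]
        acc, pred = new_acc, new_pred
    return acc


# Nodes are stored out of list order: the list order is a, b, c, d.
values = ["c", "a", "d", "b"]
succ = [2, 3, None, 0]
print(list_prefix(values, succ, lambda x, y: x + y))
# → ['abc', 'a', 'abcd', 'ab']
```

String concatenation is used in the example because it is associative but not commutative, so the output shows that the list order, rather than the storage order, determines each prefix.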
Citations
Proceedings Article
01 Dec 1991
TL;DR: The UNITY system is found to offer (or be amenable to) many of the features required to bridge the gap between the requirements and philosophy of parallel programming and a simulation model specification.
Abstract: Presents the results of an attempt to bridge the gap between the requirements and philosophy of parallel programming (attention being focused on crucial efficiency-related implementation details) and the requirements and philosophy of a simulation model specification (attention being focused on correctly and simply describing the model behavior). K.M. Chandy and J. Misra's (1988) UNITY system is found to offer (or be amenable to) many of the features required to bridge this gap. UNITY can handle both a state-transition-based specification, which is a conventional parallel simulation program, and a data-flow-based specification. For the G/G/1 problem, the data-flow view leads to a more efficient solution.

9 citations


Cites methods from "The power of parallel prefix"

  • ...The refinement, which is GLM4, uses the parallel prefix algorithm of Kruskal, Rudolph, and Snir (1985) to reduce the running time to O(log N) using O(N) processors....

    [...]

Journal Article
TL;DR: This paper presents an optimal sequential and an optimal parallel algorithm to compute a minimum cardinality Steiner set and a Steiner tree on an EREW PRAM model.
Abstract: This paper presents an optimal sequential and an optimal parallel algorithm to compute a minimum cardinality Steiner set and a Steiner tree. The sequential algorithm takes O(n) time, and the parallel algorithm takes O(log n) time using O(n/log n) processors on an EREW PRAM model.

9 citations


Cites methods from "The power of parallel prefix"

  • ...We know that the parallel prefix computation can be done for n items in O(log n) time using O(n/log n) processors on an EREW PRAM [9, 10]....

    [...]

  • ...From the BFS tree, the shortest path between s'_1 and t'_1 can be computed in O(log n) time using O(n/log n) processors on an EREW PRAM....

    [...]

  • ...Therefore, all the steps of Algorithm PSST can be performed in O(log n) time using O(n/log n) processors on an EREW PRAM....

    [...]

  • ...A spanning tree of a permutation graph can be computed in parallel using the algorithm of Wang et al. in O(log n) time using O(n/log n) processors on an EREW PRAM [14]....

    [...]

Journal Article
TL;DR: A linear-time algorithm, similar to the algorithm of Bespamyatnikh for the unweighted 1-median problem on interval graphs, is proposed to solve the weighted 1-maxian problem, and it is shown that two intervals with the minimum right endpoint and the maximum left endpoint are an optimal solution.

8 citations

Book Chapter
25 Sep 1996
TL;DR: Efficient parallel algorithms for solving the partition problem and its applications are presented and the complexity bounds of these algorithms match those of the optimal EREW PRAM algorithms for merging, sorting, and finding an approximate median.
Abstract: We consider the following partition problem: Given a set S of n elements that is organized as k sorted subsets of size n/k each and given a parameter h with 1/k ≤ h ≤ n/k, partition S into g = O(n/(hk)) subsets D_1, D_2, ..., D_g of size Θ(hk) each, such that for any two indices i and j with 1 ≤ i < j ≤ g, no element in D_i is bigger than any element in D_j. Note that with various combinations of the values of parameters h and k, several fundamental problems, such as merging, sorting, and finding an approximate median, can be formulated as or be reduced to this partition problem. The partition problem also finds applications in solving problems of parallel computing and computational geometry. In this paper, we present efficient parallel algorithms for solving the partition problem and its applications. Our parallel partition algorithm runs in O(log n) time using O(min{(n/h) * max{log h, 1}, n * max{log(1/h), 1}}/log n) processors in the EREW PRAM model. The complexity bounds of our parallel partition algorithm on the respective special cases match those of the optimal EREW PRAM algorithms for merging, sorting, and finding an approximate median. Using our parallel partition algorithm, we are also able to obtain better complexity bounds (even possibly on a weaker parallel model) than the previously best known parallel algorithms for several important problems, including parallel multi-selection, parallel multi-ranking, and parallel sorting of k sorted subsets.
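As a rough illustration of what the partition problem asks for, the sketch below produces a valid output sequentially by merging the k sorted subsets and cutting the result into consecutive chunks of about hk elements. This is only a reference for the output specification, not the parallel algorithm of the chapter; the function name `partition_sorted_subsets` and the rounding of the chunk size are assumptions.

```python
import heapq
from typing import List, Sequence


def partition_sorted_subsets(
    subsets: Sequence[Sequence[int]], h: float
) -> List[List[int]]:
    """Split the union of k sorted subsets into ordered chunks of ~h*k items.

    Sequential reference sketch: every element of an earlier chunk is no
    bigger than every element of a later chunk. The last chunk may be
    shorter than the others.
    """
    k = len(subsets)
    chunk = max(1, int(h * k))  # assumed rounding of the target size h*k
    merged = list(heapq.merge(*subsets))  # k-way merge of sorted inputs
    return [merged[i:i + chunk] for i in range(0, len(merged), chunk)]


parts = partition_sorted_subsets([[1, 4, 7], [2, 5, 8], [3, 6, 9]], h=1)
print(parts)
# → [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
```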

8 citations


Cites background or methods from "The power of parallel prefix"

  • ...[18, 19]). If p_j < k log n, then let E' = E_i ∪ E_{i+1} ∪ ... ∪ E_j, and union E' with either the preceding or succeeding unmarked set of the block Rid in the sequence (e.g., let E_{i-1} = E_{i-1} ∪ E')....

    [...]

  • ...By using Chen's parallel binary search algorithm [6] and parallel prefix [18, 19], each column C_i can be partitioned with the....

    [...]

Proceedings Article
20 Dec 2004
TL;DR: A new and improved parallel algorithm for prefix computation on the same network is presented; although the algorithm requires O(log n) electronic moves + 4 optical moves using the same number of processors, the number of data points involved in the algorithm is n^3 in contrast to n^2.
Abstract: A parallel algorithm for prefix computation was reported on a recently proposed interconnection network called optical multi-trees (OMULT). Using 2n^3 - n^2 processors, the algorithm was shown to run in O(log n) electronic moves + 5 optical moves for n^2 data points. In this paper we present a new and improved parallel algorithm for prefix computation on the same network. Although the algorithm requires O(log n) electronic moves + 4 optical moves using the same number of processors, the number of data points involved in our algorithm is n^3 in contrast to n^2.

8 citations