Journal Article

The power of parallel prefix

TL;DR: This study presents a deterministic parallel algorithm for the prefix computation problem when the order of the elements is specified by a linked list, assuming the weakest PRAM model, where shared memory locations can only be exclusively read or written (the EREW model).
Abstract: The prefix computation problem is to compute all $n$ initial products $a_1 * \cdots * a_i$, $i = 1, \ldots, n$, of a set of $n$ elements, where $*$ is an associative operation. An $O\left(\frac{\log n}{\log(2n/p)} \cdot \frac{n}{p}\right)$ time deterministic parallel algorithm using $p \le n$ processors is presented to solve the prefix computation problem, when the order of the elements is specified by a linked list. For $p \le O(n^{1-\epsilon})$ ($\epsilon > 0$ any constant), this algorithm achieves linear speedup. Such optimal speedup was previously achieved only by probabilistic algorithms. This study assumes the weakest PRAM model, where shared memory locations can only be exclusively read or written (the EREW model).
Citations
Journal Article
TL;DR: This paper gives the first NC algorithm for recognizing the consecutive 1's property for rows of a (0, 1)-matrix, and shows that the maximum matching problem for arbitrary convex bipartite graphs can be solved within the same complexity bounds.

33 citations

Journal Article
TL;DR: In this paper, the authors presented new and strict time-optimal parallel schedules for prefix computation with resource constraints under the concurrent read-exclusive-write (CREW) parallel random access machine (PRAM) model.
Abstract: Prefix computation is a basic operation at the core of many important applications, e.g., some of the Grand Challenge problems, circuit design, digital signal processing, graph optimizations, and computational geometry. In this paper, we present new and strict time-optimal parallel schedules for prefix computation with resource constraints under the concurrent-read-exclusive-write (CREW) parallel random access machine (PRAM) model. For prefix of $N$ elements on $p$ processors ($p$ independent of $N$) when $N > p(p+1)/2$, we derive Harmonic Schedules that achieve the strict optimal time (steps), $\lceil 2(N-1)/(p+1) \rceil$. We also derive Pipelined Schedules that have better program-space efficiency than the Harmonic Schedule, yet only require a small constant number of steps more than the optimal time achieved by the Harmonic Schedule. Both the Harmonic Schedules and the Pipelined Schedules are simple and easy to implement. For prefix of $N$ elements on $p$ processors ($p$ independent of $N$) where $N \le p(p+1)/2$, the Harmonic Schedules are not time-optimal. For these cases, we establish an optimization method for determining key parameters of time-optimal schedules, based on connections between the structure of parallel prefix and Pascal's triangle. Using the derived parameters, we devise an algorithm to construct such schedules. For a restricted class of values of $N$ and $p$, we prove that the constructed schedules are strictly time-optimal. We also give strong empirical evidence that our algorithm constructs strict time-optimal schedules for all cases where $N \le p(p+1)/2$.
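To make the step bound concrete, the short Python sketch below evaluates it for given N and p. It assumes that the bracketed expression in the abstract denotes a ceiling, and it only computes the stated bound; it does not construct a Harmonic Schedule. The function names are hypothetical.

```python
def in_harmonic_regime(n, p):
    """True when N > p(p+1)/2, the case the abstract covers with Harmonic Schedules."""
    return n > p * (p + 1) // 2  # p(p+1) is always even, so // is exact


def strict_optimal_steps(n, p):
    """Return ceil(2(N-1)/(p+1)), assuming the abstract's brackets mean ceiling."""
    if not in_harmonic_regime(n, p):
        raise ValueError("bound is stated only for N > p(p+1)/2")
    # Integer ceiling division: ceil(a / b) == -(-a // b) for positive b.
    return -(-2 * (n - 1) // (p + 1))


print(strict_optimal_steps(100, 4))  # 2 * 99 / 5 = 39.6, rounded up
```

For comparison, a single processor needs N - 1 operations, so with N = 100 and p = 4 the bound corresponds to 40 steps instead of 99.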

31 citations

01 Jan 1996
TL;DR: This paper presents new and strict time-optimal parallel schedules for prefix computation with resource constraints under the concurrent-read-exclusive-write (CREW) parallel random access machine (PRAM) model and establishes an optimization method for determining key parameters of time-optimal schedules, based on connections between the structure of parallel prefix and Pascal's triangle.

31 citations

Journal Article
TL;DR: A fast computation of staircase separators, and a scheme for partitioning the obstacles' boundaries in a way that ensures that the resulting path length matrices have a monotonicity property that is apparently absent before applying the partitioning scheme.
Abstract: Atallah, M.J. and D.Z. Chen, Parallel rectilinear shortest paths with rectangular obstacles, Computational Geometry: Theory and Applications 1 (1991) 79-113. Given a rectilinear convex polygon $P$ having $O(n)$ vertices and which contains $n$ pairwise disjoint rectangular rectilinear obstacles, we compute, in parallel, a data structure that supports queries about shortest rectilinear obstacle-avoiding paths in $P$. That is, a query specifies a source and a destination, and the data structure enables efficient processing of the query. We construct the data structure in $O(\log^2 n)$ time, with $O(n^2/\log^2 n)$ processors in the CREW-PRAM model if all queries are such that the source and the destination are on the boundary of $P$, with $O(n^2/\log n)$ processors if the source is an obstacle vertex and the destination is on the boundary of $P$, and with $O(n^2)$ processors if both the source and destination are arbitrary points in the plane. The data structure we compute enables one processor to obtain the path length for any pair of query vertices (of obstacles or of $P$) in constant time, or $O(\lceil k/\log n \rceil)$ processors to retrieve the shortest path itself in logarithmic time, where $k$ is the number of segments of that path. If the two query points are arbitrary rather than vertices, then one processor takes $O(\log n)$ time (instead of constant time) for finding the path length, while the complexity bounds for reporting an actual shortest path remain unchanged. A number of other related shortest paths problems are solved. The techniques we use involve a fast computation of staircase separators, and a scheme for partitioning the obstacles' boundaries in a way that ensures that the resulting path length matrices have a monotonicity property that is apparently absent before applying our partitioning scheme. Sequentially, the data structure can be built in $O(n^2)$ time.

31 citations

Journal Article
TL;DR: These algorithms provide parallel analogues to well-known phenomena from sequential computational geometry, such as the fact that problems for polygons can oftentimes be solved more efficiently than point-set problems, and that nearest-neighbor problems can be solved without explicitly constructing a Voronoi diagram.
Abstract: In this paper we give parallel algorithms for a number of problems defined on point sets and polygons. All our algorithms have optimal $T(n) * P(n)$ products, where $T(n)$ is the time complexity and $P(n)$ is the number of processors used, and are for the EREW PRAM or CREW PRAM models. Our algorithms provide parallel analogues to well-known phenomena from sequential computational geometry, such as the fact that problems for polygons can oftentimes be solved more efficiently than point-set problems, and that nearest-neighbor problems can be solved without explicitly constructing a Voronoi diagram.

30 citations


Cites methods from "The power of parallel prefix"

  • ...All of these computations can be done in $O(\log n)$ time using $O(|Y(v)|/\log n)$ processors by simple parallel prefix and parallel broadcast computations [15], [16]....

    [...]

  • ...... four $N_i(p)$'s for any $i \in \{1, 2\}$. Since we assumed that the all-nearest-neighbor problem has already been solved for $S_1$ and $S_2$, we can construct, for $i \in \{1, 2\}$, the sorted list $S_i'$ that consists of all the points in $S_i$ whose $d_i(p)$-ball intersects $L$ by compressing out all the points whose $d_i(p)$-ball does not intersect $L$. This can be done in $O(\log n)$ time using $O(n/\log n)$ processors in the EREW PRAM model by a parallel prefix computation ......

    [...]
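
The "compressing out" step mentioned in the excerpt above is commonly realized as a prefix sum over keep/discard flags, which gives each surviving element its output position while preserving order. The Python sketch below shows that idea sequentially; `itertools.accumulate` stands in for the parallel prefix computation, and the function name `compact` and the predicate are assumptions for this example, not taken from the cited paper.

```python
from itertools import accumulate


def compact(items, keep):
    """Order-preserving compaction driven by prefix sums of 0/1 flags."""
    flags = [1 if keep(x) else 0 for x in items]
    # Inclusive prefix sums: for a kept item at index i,
    # sums[i] - 1 is its position in the output.
    sums = list(accumulate(flags))
    out = [None] * (sums[-1] if sums else 0)
    # Every kept item writes to a distinct slot, so on a PRAM these
    # writes could proceed independently (sequential loop here).
    for i, x in enumerate(items):
        if flags[i]:
            out[sums[i] - 1] = x
    return out


print(compact([5, 8, 2, 9, 4], lambda x: x % 2 == 0))
```

Because the prefix sums determine every destination in advance, the scatter step needs no coordination between writers, which is what makes the pattern a natural fit for exclusive-write models.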