
Showing papers in "Journal of Parallel and Distributed Computing in 1987"


Journal ArticleDOI
TL;DR: This paper shows how the image algebra suggests a general-purpose cellular pyramid array processor for real time image processing tasks and demonstrates how algebraic techniques can be used to develop systematic methods for deriving parallel algorithms.

227 citations


Journal ArticleDOI
TL;DR: A solution to the mapping problem when there are topological and cardinality variations for a commonly used class of parallel interconnection structures, which includes shuffle-exchange networks, hypercubes, square meshes, linear systolic arrays, cube-connected cycles, and complete binary trees is presented.

179 citations


Journal ArticleDOI
J. B. Sinclair
TL;DR: A branch-and-bound-with-underestimates algorithm is used to reduce the size of the search tree, and its average time and space complexity is evaluated through simulation for two underestimating functions; the results show that the minimum independent assignment cost underestimate (MIACU) performs extremely well over a wide range of values of program model parameters.

110 citations


Journal ArticleDOI
TL;DR: Some characteristics of divide-and-conquer algorithms are examined, along with some of their implications for the design of machines and languages which can support the efficient programming and execution of divide-and-conquer algorithms.

60 citations


Journal ArticleDOI
TL;DR: This paper presents worst-case and average-case resource requirements for storing and retrieving familiar families of patterned matrices: packed, symmetric, triangular, Toeplitz, and banded.

40 citations


Journal ArticleDOI
TL;DR: This paper develops a general model for hypercube machines and uses it to show how vision algorithms can be executed on hypercubes; the steps in the problem of thick-film inspection are used as a concrete example.

31 citations


Journal ArticleDOI
TL;DR: There is also an extensive list of key image analysis algorithms that are supported by P³E, thus making it a profound and versatile tool for projection-based computer vision.

30 citations


Journal ArticleDOI
TL;DR: It is shown that n²/log n processors suffice for achieving O(n log n) time, which improves the best previous result by a factor of log n processors, and is asymptotically optimal.

27 citations


Journal ArticleDOI
TL;DR: A distributed diagnostic and structuring algorithm for the RECBAR is presented that enables the architecture to detect faults and structure itself accordingly within 2 · log₂(L) + 1 time steps, thus making it a truly fault-tolerant architecture.

27 citations


Journal ArticleDOI
TL;DR: A dataflow graph model for the timing analysis of general (cyclic or acyclic), decision-free asynchronous architectures is introduced and it is shown how the results of this analysis can be used to synthesize optimal special-purpose hardware implementations of both general dataflow arrays and regular wavefront arrays.

24 citations


Journal ArticleDOI
TL;DR: This paper illustrates that this arbitration policy discriminates against remote or less frequent requests because it rejects them most of the time.


Journal ArticleDOI
TL;DR: This analysis shows that this simple algorithm, which is known to be average case optimal, compares very favorably with all the other known algorithms as it requires O(n log n) messages with probability tending to one.

Journal ArticleDOI
TL;DR: Novel algorithmic techniques are described, such as vertical pipelining, subproblem partitioning, associative matching, and data duplication, that effectively exploit the massive parallelism available in fine-grained SIMD tree machines while avoiding communication bottlenecks.

Journal ArticleDOI
TL;DR: A group theoretic representation of these networks is given and it is shown that all existing cellular permutation arrays are the network realizations of the recursive coset decompositions of symmetric groups.

Journal ArticleDOI
TL;DR: A systolic array is derived which requires (n + 1)²/4 processing cells and has a time complexity of O(n) for each sweep.

Journal ArticleDOI
TL;DR: Work on the implementation of a compiler for the ICL Distributed Array Processor (DAP), which has evolved from the Pascal-based parallel language Actus, is currently under way and some aspects of this implementation are described.

Journal ArticleDOI
TL;DR: Using a recently published parallel prefix sums algorithm, the list-ranking algorithm can be adapted to run on a concurrent-read concurrent-write parallel random access machine (CRCW PRAM) almost surely in time O(n/p + log n) using p processors.

Journal ArticleDOI
TL;DR: The result of this analysis suggests that the new single-modulus complex RNS may be significantly superior to the alternative FFT design choices.

Journal ArticleDOI
TL;DR: A parallel procedure is presented for solving first-order linear recurrence systems of size N when the number of processors p is small in relation to N; it achieves the lower bound 2(N − 1)/(p + 1) for solving the parallel prefix problem on a p-processor machine.

Journal ArticleDOI
TL;DR: It is shown that in this model only one kind of dependence relation is needed, and this fact leads to a smaller set of dependence relations and, therefore, results in better parallel performance.


Journal ArticleDOI
TL;DR: In this article, an interconnection scheme based on a bus network consisting of high-speed time-sliced buses and interbus links of matching bandwidths is described, and two contrasting approaches to simulating such a machine are discussed.

Journal ArticleDOI
TL;DR: In this article, the authors discuss basic theoretical issues concerning systolic programming methodology and demonstrate simple techniques which can be used to structure communication and show how the complexity of two simple algorithms is adversely affected by the cost of data movement in a parallel system.

Journal ArticleDOI
TL;DR: MP is a manager for systems of cooperating processes in a local area network of engineering workstations that exhibits real-time behaviors.

Journal ArticleDOI
TL;DR: A VLSI design of a multiprecision matrix multiplier and polynomial evaluator that addresses the issues of implementation is described, consisting of a two-dimensional array of bit-serial multiply and accumulate cells, each implemented as an accumulator and a pair of registers.

Journal ArticleDOI
TL;DR: Several algorithms for interval arithmetic block cyclic reduction for efficient application to vector computers under the condition that interval arithmetic inclusion properties be preserved are introduced.

Journal ArticleDOI
TL;DR: An optimal parallel algorithm is presented that decomposes prefix-coded messages and uniquely decipherable coded messages in O(n/P) time, using O(P) processors on the weakest version of parallel random access machines, in which concurrent read and concurrent write to a cell in the common memory are not allowed.

Journal ArticleDOI
TL;DR: A protocol for process termination handling that guarantees the consistency of distributed structures and aims at optimizing the rate of communications among virtual processors is presented.

Journal ArticleDOI
TL;DR: A parallel algorithm for the NP-hard test-and-treatment problem is presented for a machine whose number of connections is 3p(2 squared), where p is the number of processing elements (PEs), and where the PEs are simple enough that a machine with 2^20 PEs is currently implementable and a machine with 2^30 PEs is feasible.