Showing papers in "Journal of Parallel and Distributed Computing in 1991"

PDF

Open Access

Journal Article•DOI•

Tolerating latency through software-controlled prefetching in shared-memory multiprocessors

[...]

Todd C. Mowry¹, Anoop Gupta¹•Institutions (1)

01 Jun 1991-Journal of Parallel and Distributed Computing

TL;DR: The results show that for applications with regular data access patterns—the authors evaluate a particle-based simulator used in aeronautics and an LU-decomposition application—prefetching can be very effective, and the performance of a distributed-time logic simulation application that made extensive use of pointers and linked lists could be increased by only 30%.

...read moreread less

318 citations

Journal Article•DOI•

The power of reconfiguration

[...]

Yosi Ben-Asher¹, David Peleg², Rajiv Ramaswami³, Assaf Schuster¹•Institutions (3)

Hebrew University of Jerusalem¹, Weizmann Institute of Science², IBM³

01 Oct 1991-Journal of Parallel and Distributed Computing

TL;DR: It is shown that there are reconfigurable machines based on simple network topologies that are capable of solving large classes of problems in constant time, depending on the kinds of switches assumed for the network nodes.

...read moreread less

175 citations

Journal Article•DOI•

The data alignment phase in compiling programs for distributed-memory machines

[...]

Jingke Li¹, Marina Chen²•Institutions (2)

Portland State University¹, Yale University²

01 Oct 1991-Journal of Parallel and Distributed Computing

TL;DR: The alignment technique presented here focuses on minimizing the data movement between processors due to cross-references between multiple distributed arrays, and simplifies the task of data partition and communication generation in the context of a parallelizing compiler for distributed-memory machines.

...read moreread less

153 citations

Journal Article•DOI•

Optimal communication algorithms for hypercubes

[...]

Dimitri P. Bertsekas¹, C. Özveren¹, George D. Stamoulis¹, Paul Tseng¹, John N. Tsitsiklis¹ - Show less +1 more•Institutions (1)

Massachusetts Institute of Technology¹

02 Feb 1991-Journal of Parallel and Distributed Computing

TL;DR: The algorithms proposed for these basic communication problems in a hypercube network of processors are optimal in terms of execution time and communication resource requirements; that is, they require the minimum possible number of time steps and packet transmissions.

...read moreread less

146 citations

Journal Article•DOI•

A two-dimensional buddy systems for dynamic resource allocation in a partitionable mesh connected system

[...]

Keqin Li¹, Kam-Hoi Cheng¹•Institutions (1)

University of Houston¹

01 May 1991-Journal of Parallel and Distributed Computing

TL;DR: A two-dimensional buddy system (2DBS) is proposed as a partitioning scheme for dynamic resource allocation in a PMCS and internal fragmentation of the proposed 2DBS under various probability distributions of job sizes and processing times is analyzed.

...read moreread less

138 citations

Journal Article•DOI•

A methodology for solving Markov models of parallel systems

[...]

Brigitte Plateau, Jean-Michel Fourneau¹•Institutions (1)

University of Paris¹

01 Aug 1991-Journal of Parallel and Distributed Computing

TL;DR: It is shown that this formulation of the Markov chain matrix can be expressed in terms of generalized tensor product, using the modularity of the SAN models, and allows the matrix to be stored with considerable memory savings.

...read moreread less

135 citations

Journal Article•DOI•

The twisted cube topology for multiprocessors: a study in network asymmetry

[...]

Seth Abraham¹, Krishnan Padmanabhan²•Institutions (2)

Purdue University¹, Bell Labs²

01 Sep 1991-Journal of Parallel and Distributed Computing

TL;DR: It is found that the twisted cube delivers an improvement in performance over the hypercube, but not nearly as much as the reduction in diameter.

...read moreread less

133 citations

Journal Article•DOI•

Pipelined communications in optically interconnected arrays

[...]

Zicheng Guo¹, Rami Melhem¹, Richard W. Hall¹, Donald M. Chiarulli¹, Steven P. Levitan¹ - Show less +1 more•Institutions (1)

University of Pittsburgh¹

01 Jul 1991-Journal of Parallel and Distributed Computing

TL;DR: Two synchronous multiprocessor architectures based on pipelined optical bus interconnections based on a two-dimensional architecture and a linear pipeline with enhanced control strategies are presented, which appear to be good candidates for a new generation of hybrid optical-electronic parallel computers.

...read moreread less

114 citations

Journal Article•DOI•

Experimental analysis of a mixed-mode parallel architecture using bitonic sequence sorting

[...]

Samuel A. Fineberg¹, Thomas L. Casavant¹, Howard Jay Siegel²•Institutions (2)

University of Iowa¹, Purdue University²

01 Feb 1991-Journal of Parallel and Distributed Computing

TL;DR: Experimentation aimed at determining the potential benefit of mixed-mode SIMD/MIMD parallel architectures is reported, based on timing measurements made on the PASM system prototype at Purdue utilizing carefully coded synthetic variations of a well-known algorithm.

...read moreread less

105 citations

Journal Article•DOI•

The DINO parallel programming language

[...]

Matthew Rosing¹, Robert B. Schnabel¹, Robert P. Weaver¹•Institutions (1)

University of Colorado Boulder¹

01 Sep 1991-Journal of Parallel and Distributed Computing

TL;DR: The syntax and semantics of the DINO language is described, examples of DINO programs are given, a critique of theDINO language features are presented, and the performance of code generated by the Dino compiler is discussed.

...read moreread less

101 citations

Journal Article•DOI•

Performance of multicomputer networks under Pin-out constraints

[...]

Seth Abraham¹, Krishnan Padmanabhan²•Institutions (2)

Purdue University¹, Bell Labs²

01 Jul 1991-Journal of Parallel and Distributed Computing

TL;DR: This paper evaluates the performance of the family of multidimensional mesh topologies (which includes the hypercube) under the constant pin-out constraint and shows that higher dimensionality is more important than wider channel width under this constraint.

...read moreread less

Journal Article•DOI•

Designing fault-tolerant systems using automorphisms

[...]

Shantanu Dutt¹, John P. Hayes²•Institutions (2)

University of Minnesota¹, University of Michigan²

01 Jul 1991-Journal of Parallel and Distributed Computing

TL;DR: A general theory for modeling and designing fault-tolerant multiprocessor systems in a systematic and efficient manner is presented and the resulting designs are shown to be far superior to those proposed in previous work.

...read moreread less

Journal Article•DOI•

Load balancing on message passing architectures

[...]

Reinhard von Hanxleden¹, L. Ridgway Scott²•Institutions (2)

Rice University¹, University of Houston²

01 Nov 1991-Journal of Parallel and Distributed Computing

TL;DR: This paper describes the implementation of a testbed for load balancing techniques, used for different static and dynamic strategies for balancing the work load of an iPSC/2 Implementation of a simple simulation of population evolution.

...read moreread less

Journal Article•DOI•

Scalability of parallel algorithms for the all-pairs shortest-path problem

[...]

Vipin Kumar¹, Vineet Singh•Institutions (1)

University of Minnesota¹

01 Oct 1991-Journal of Parallel and Distributed Computing

TL;DR: In this article, the authors use the isoefficiency metric to analyze the scalability of parallel algorithms for finding shortest paths between all pairs of nodes in a densely connected graph, and find the classic trade-offs of hardware cost vs scalability and memory vs time to be represented here as tradeoffs of HPCs vs. scalability.

...read moreread less

Journal Article•DOI•

From control flow to dataflow

[...]

Micah Beck¹, Richard Johnson¹, Keshav Pingali¹•Institutions (1)

Cornell University¹

01 Jun 1991-Journal of Parallel and Distributed Computing

TL;DR: This paper shows how imperative language programs can be translated into dataflow graphs and executed on a dataflow machine like Monsoon, and suggests that data flow graphs can serve as an executable intermediate representation in parallelizing compilers.

...read moreread less

Journal Article•DOI•

PRA * : massively parallel heuristic search

[...]

Matthew P. Evett¹, James A. Hendler¹, Ambuj Mahanti¹, Dana S. Nau¹•Institutions (1)

University of Maryland, College Park¹

01 Jan 1991-Journal of Parallel and Distributed Computing

TL;DR: A variant of A* search designed to run on the massively parallel, SIMD Connection Machine (CM-2), called PRA* (for Parallel Retraction A*), is designed to maximize use of the Connection Machine′s memory and processors.

...read moreread less

Journal Article•DOI•

Chare kernel—a runtime support system for parallel computations

[...]

Wei Shu¹, Laxmikant V. Kale²•Institutions (2)

University at Buffalo¹, University of Illinois at Urbana–Champaign²

01 Feb 1991-Journal of Parallel and Distributed Computing

TL;DR: The chare kernel is a collection of primitive functions that manage chares, manipulate messages, invoke atomic computations, and coordinate concurrent activities that supports parallel computations with irregular structure.

...read moreread less

Journal Article•DOI•

Parallelism in graph-partitioning

[...]

John E. Savage¹, Markus G. Wloka¹•Institutions (1)

Brown University¹

01 Nov 1991-Journal of Parallel and Distributed Computing

TL;DR: A new parallel heuristic is described that on the 32K-processor CM-2 Connection Machine handles graphs with more than two million edges and gives in 9-min partitions that are within 2% of the best ever found.

...read moreread less

Journal Article•DOI•

A vertically layered allocation scheme for data flow systems

[...]

B. Lee¹, Ali R. Hurson¹, Tse-Yun Feng¹•Institutions (1)

Pennsylvania State University¹

01 Feb 1991-Journal of Parallel and Distributed Computing

TL;DR: This paper proposes a method called the vertically layered allocation scheme which utilizes heuristic rules in finding a compromise between computation and communication costs in a static data flow environment.

...read moreread less

Journal Article•DOI•

Optimal matrix transposition of bit reversal on hypercubes: all-to-personalized communication

[...]

Alan Edelman¹•Institutions (1)

University of California, Berkeley¹

02 Feb 1991-Journal of Parallel and Distributed Computing

TL;DR: An optimal algorithm for performing the communication described by exchanging the bits of the node address with that of the local address is described, typically in both matrix transposition and bit reversal for the fast Fourier transform.

...read moreread less

Journal Article•DOI•

Guaranteeing serializable results in synchronous parallel production systems

[...]

James G. Schmolze¹•Institutions (1)

Tufts University¹

01 Dec 1991-Journal of Parallel and Distributed Computing

TL;DR: This work presents a formal solution to the problem of guaranteeing serializable behavior in synchronous parallel production systems that execute many rules simultaneously, and presents a variety of algorithms that implement this solution.

...read moreread less

Journal Article•DOI•

Embedding mesh of trees in the hypercube

[...]

Kemal Efe¹•Institutions (1)

Sewanee: The University of the South¹

01 Feb 1991-Journal of Parallel and Distributed Computing

TL;DR: Methods for embedding one-, two-, and three-dimensional mesh of trees in the hypercube are described, which have significant practical importance in enhancing the capabilities of thehypercube.

...read moreread less

Journal Article•DOI•

Implementation and evaluation of Hough transform algorithms on a shared-memory multiprocessor

[...]

Alok Choudhary¹, Ravi Ponnusamy¹•Institutions (1)

Syracuse University¹

01 Jun 1991-Journal of Parallel and Distributed Computing

TL;DR: It is observed that for nonuniform images uniform partitioning does not perform well, whereas static and dynamic partitioning strategies perform well and comparably in most cases.

...read moreread less

Journal Article•DOI•

Merging multiple lists on hierarchical-memory multiprocessors

[...]

Peter Varman¹, Scott D. Scheufler¹, Balakrishna R. Iyer², Gary Ross Ricard²•Institutions (2)

Rice University¹, IBM²

01 Jun 1991-Journal of Parallel and Distributed Computing

TL;DR: An efficient multiprocessor algorithm to merge m, m ⩾ 2, sorted lists containing N elements is described, which substantially reduces the data access costs in comparison with traditional schemes that successively merge the lists two at a time.

...read moreread less

Journal Article•DOI•

Page placement policies for NUMA multiprocessors

[...]

Richard P. LaRowe¹, Carla Schlatter Ellis¹•Institutions (1)

Duke University¹

02 Jan 1991-Journal of Parallel and Distributed Computing

TL;DR: Under certain workload assumptions, results show that placement algorithms that are strongly biased toward local frame allocation but are able to borrow remote frames can reduce the number of page faults over strictly local allocation.

...read moreread less

Journal Article•DOI•

The design of a scalable, fixed-time computer benchmark

[...]

John L. Gustafson¹, Diane T. Rover¹, Stephen T. Elbert¹, Michael Carter¹•Institutions (1)

Iowa State University¹

01 Aug 1991-Journal of Parallel and Distributed Computing

TL;DR: The design of a benchmark is presented, SLALOM{trademark}, that scales automatically to the computing power available, and corrects several deficiencies in various existing benchmarks: it is highly scalable, it solves a real problem, it includes input and output times, and it can be run on parallel machines of all kinds, using any convenient language.

...read moreread less

Journal Article•DOI•

Optimal expression evaluation for data parallel architectures

[...]

John R. Gilbert¹, Robert Schreiber²•Institutions (2)

PARC¹, Research Institute for Advanced Computer Science²

01 Sep 1991-Journal of Parallel and Distributed Computing

TL;DR: This work gives an efficient algorithm to find the minimum-cost way to evaluate an expression, for several different data parallel architectures, and applies to any architecture in which the metric describing the cost of moving an array has a property the authors call “robustness".

...read moreread less

Journal Article•DOI•

Ordered fast Fourier transforms on a massively parallel hypercube multiprocessor

[...]

Charles H. Tong¹, Paul N. Swarztrauber²•Institutions (2)

Sandia National Laboratories¹, National Center for Atmospheric Research²

01 May 1991-Journal of Parallel and Distributed Computing

TL;DR: This work examines design alternatives for ordered radix-2 DIF (decimation-in-frequency) FFT algorithms on massively parallel hypercube multiprocessors such as the Connection Machine and combines the order and computational phases of the FFT and also uses sequence to processor maps that reduce communication.

...read moreread less

Journal Article•DOI•

An analytical approach to performance/cost modeling of parallel computers

[...]

John B. Andrews¹, Constantine D. Polychronopoulos¹•Institutions (1)

University of Illinois at Urbana–Champaign¹

01 Aug 1991-Journal of Parallel and Distributed Computing

TL;DR: It is argued that multiprocessors based on a fast global control unit capable of fast execution of serial code, and capable of managing an ensemble of slower processors, offer a performance/ cost ratio significantly better than any comparable homogeneous multipROcessor with distributed control.

...read moreread less

Journal Article•DOI•

Processor-efficient hypercube algorithms for the knapsack problem

[...]

Jianhua Lin¹, James A. Storer¹•Institutions (1)

Brandeis University¹

01 Nov 1991-Journal of Parallel and Distributed Computing

TL;DR: The proposed processor-efficient parallel algorithm for the 0/1 knapsack problem has optimal time speedup and processor efficiency over the best known sequential algorithm and performs very well for a wide range of input sizes.

...read moreread less