Showing papers in &quot;IEEE Transactions on Computers in 1982&quot;

Elections in a Distributed Computing System

TL;DR: An isomorphism between the behavior of Petri nets with exponentially distributed transition rates and Markov processes is presented and this work solves for the steady state average message delay and throughput on a communication link when the alternating bit protocol is used for error recovery.

...read moreread less

Abstract: An isomorphism between the behavior of Petri nets with exponentially distributed transition rates and Markov processes is presented. In particular, k-bounded Petri nets are isomorphic to finite Markov processes and can be solved by standard techniques if k is not too large. As a practical example, we solve for the steady state average message delay and throughput on a communication link when the alternating bit protocol is used for error recovery.

...read moreread less

1,090 citations

Journal Article•DOI•

[...]

Garcia-Molina¹•Institutions (1)

Princeton University¹

Design for Testability—A Survey

TL;DR: This paper discusses elections and reorganizations of active nodes in a distributed computing system after a failure, and two types of reasonable failure environments are studied.

...read moreread less

Abstract: After a failure occurs in a distributed computing system, it is often necessary to reorganize the active nodes so that they can continue to perform a useful task. The first step in such a reorganization or reconfiguration is to elect a coordinator node to manage the operation. This paper discusses such elections and reorganizations. Two types of reasonable failure environments are studied. For each environment assertions which define the meaning of an election are presented. An election algorithm which satisfies the assertions is presented for each environment.

...read moreread less

647 citations

Journal Article•DOI•

[...]

Williams¹, Parker•Institutions (1)

IBM¹

On k-Nearest Neighbor Voronoi Diagrams in the Plane

TL;DR: The different techniques of design for testability are discussed in detail, including techniques which can be applied to today's technologies and techniques which have been recently introduced and will soon appear in new designs.

...read moreread less

Abstract: This paper discusses the basics of design for testability. A short review of testing is given along with some reasons why one should test. The different techniques of design for testability are discussed in detail. These include techniques which can be applied to today's technologies and techniques which have been recently introduced and will soon appear in new designs.

...read moreread less

428 citations

Journal Article•DOI•

[...]

Der-Tsai Lee¹•Institutions (1)

Northwestern University¹

Concurrent Error Detection in ALU's by Recomputing with Shifted Operands

TL;DR: It is shown that the k-nearest neighbor problem and other seemingly unrelated problems can be solved efficiently with the Voronoi diagram.

...read moreread less

Abstract: The notion of Voronoi diagram for a set of N points in the Euclidean plane is generalized to the Voronoi diagram of order k and an iterative algorithm to construct the generalized diagram in 0(k2N log N) time using 0(k2(N − k)) space is presented. It is shown that the k-nearest neighbor problem and other seemingly unrelated problems can be solved efficiently with the diagram.

...read moreread less

361 citations

Journal Article•DOI•

[...]

Patel¹, Fung¹•Institutions (1)

University of Illinois at Urbana–Champaign¹

A Task Allocation Model for Distributed Computing Systems

TL;DR: It is shown that for most practical ALU implementations, including the carry-lookahead adders, the RESO technique will detect all errors caused by faults in a bit-slice or a specific subcircuit of the bit slice.

...read moreread less

Abstract: A new method of concurrent error detection in the Arithmetic and Logic Units (ALU's) is proposed. This method, called "Recomputing with Shifted Operands" (RESO), can detect errors in both the arithmetic and logic operations. RESO uses the principle of time redundancy in detecting the errors and achieves its error detection capability through the use of the already existing replicated hardware in the form of identical bit slices. It is shown that for most practical ALU implementations, including the carry-lookahead adders, the RESO technique will detect all errors caused by faults in a bit-slice or a specific subcircuit of the bit slice. The fault model used is more general than the commonly assumed stuck-at fault model. Our fault model assumes that the faults are confined to a small area of the circuit and that the precise nature of the faults is not known. This model is very appropriate for the VLSI circuits.

...read moreread less

344 citations

Journal Article•DOI•

[...]

Perng-Yi Richard Ma¹, Lee, Tsuchiya•Institutions (1)

TRW Inc.¹

The Extra Stage Cube: A Fault-Tolerant Interconnection Network for Supersystems

TL;DR: In this paper, a task allocation model that allocates application tasks among processors in distributed computing systems satisfying minimum interprocessor communication cost, balanced utilization of each processor, and all engineering application requirements is presented.

...read moreread less

Abstract: This paper presents a task allocation model that allocates application tasks among processors in distributed computing systems satisfying: 1) minimum interprocessor communication cost, 2) balanced utilization of each processor, and 3) all engineering application requirements.

...read moreread less

328 citations

Journal Article•DOI•

[...]

Adams¹, Siegel¹•Institutions (1)

Purdue University¹

Wavefront Array Processor: Language, Architecture, and Applications

TL;DR: It is shown that the ESC provides fault tolerance for any single failure, and the network can be controlled even when it has a failure, using a simple modification of a routing tag scheme proposed for the Generalized Cube.

...read moreread less

Abstract: The Extra Stage Cube (ESC) interconnection network, a fault-tolerant structure, is proposed for use in large-scale parallel and distributed supercomputer systems. It has all of the interconnecting capabilities of the multistage cube-type networks that have been proposed for many supersystems. The ESC is derived from the Generalized Cube network by the addition of one stage of interchange boxes and a bypass capability for two stages. It is shown that the ESC provides fault tolerance for any single failure. Further, the network can be controlled even when it has a failure, using a simple modification of a routing tag scheme proposed for the Generalized Cube. Both one-to-one and broadcast connections under routing tag control are performable by the faulted ESC. The ability of the ESC to operate with multiple faults is examined. The ways in which the ESC can be partitioned and permute data are described.

...read moreread less

328 citations

Journal Article•DOI•

[...]

Sun-Yuan Kung¹, Arun¹, Gal-Ezer¹, Bhaskar D. Rao¹•Institutions (1)

University of Southern California¹

Closed-Form Solutions of Performability

TL;DR: In this paper, the authors describe the development of a wavefront-based language and architecture for a programmable special-purpose multiprocessor array (POMP) based on the notion of computational wavefront.

...read moreread less

Abstract: This paper describes the development of a wavefront-based language and architecture for a programmable special-purpose multiprocessor array. Based on the notion of computational wavefront, the hardware of the processor array is designed to provide a computing medium that preserves the key properties of the wavefront. In conjunction, a wavefront language (MDFL) is introduced that drastically reduces the complexity of the description of parallel algorithms and simulates the wavefront propagation across the computing network. Together, the hardware and the language lead to a programmable wavefront array processor (WAP). The WAP blends the advantages of the dedicated systolic array and the general-purpose data-flow machine, and provides a powerful tool for the high-speed execution of a large class of matrix operations and related algorithms which have widespread applications.

...read moreread less

263 citations

Journal Article•DOI•

[...]

Meyer¹•Institutions (1)

University of Michigan¹

Exception Handling and Software Fault Tolerance

TL;DR: This paper considers the modeling of a degradable buffer/multiprocessor system whose performance Y is the (normalized) average throughput rate realized during a bounded interval of time and shows that a closed-form solution of performability can indeed be obtained.

...read moreread less

Abstract: If computing system performance is degradable, then as recognized in a number of recent studies, system evaluation must deal simultaneously with aspects of both performance and reliability. One approach is the evaluation of a system's "performability," which relative to a specified performance variable Y, generally requires solution of the probability distribution function of Y. In this paper we examine the feasibility of closed-form solutions of performability when Y is continuous. In particular, we consider the modeling of a degradable buffer/multiprocessor system whose performance Y is the (normalized) average throughput rate realized during a bounded interval of time. Employing an approximate decomposition of the model, we show that a closed-form solution can indeed be obtained.

...read moreread less

213 citations

Journal Article•DOI•

[...]

Cristian¹•Institutions (1)

Newcastle University¹

A Fault-Tolerant Communication Architecture for Distributed Systems

TL;DR: A unified point of view on programmed exception handling and default exception handling based on automatic backward recovery based onautomatic backward recovery is constructed, finding a class of faults for which default exception Handling can provide effective fault tolerance.

...read moreread less

Abstract: Some basic concepts underlying the issue of fault-tolerant software design are investigated. Relying on these concepts, a unified point of view on programmed exception handling and default exception handling based on automatic backward recovery is constructed. The cause–effect relationship between software design faults and failure occurrences is explored and a class of faults for which default exception handling can provide effective fault tolerance is characterized. It is also shown that there exists a second class of design faults which cannot be tolerated by using default exception handling. The role that software verification methods can play in avoiding the production of such faults is discussed.

...read moreread less

Journal Article•DOI•

[...]

Pradhan¹, Reddy•Institutions (1)

Oakland University¹

The Prime Memory System for Array Access

TL;DR: A communication architecture for distributed processors is presented, based on a new topolgy developed, one which interconnects n nodes by using rn links where the maximum internode distance is logrn, and where each node has, at most, 2r, I/O ports.

...read moreread less

Abstract: A communication architecture for distributed processors is presented here. This architecture is based on a new topolgy we have developed, one which interconnects n nodes by using rn links where the maximum internode distance is logrn, and where each node has, at most, 2r, I/O ports. It is also shown that this network is fault-tolerant, being able to tolerate up to (r − 1) node failures.

...read moreread less

Journal Article•DOI•

[...]

Lawrie¹, Vora•Institutions (1)

University of Illinois at Urbana–Champaign¹

Parallel Algorithms to Set Up the Benes Permutation Network

TL;DR: A memory system designed for parallel array access based on the use of a prime nwnber of memories and a powerful combination of indexing hardware and data alignment switches is described.

...read moreread less

Abstract: In this paper we describe a memory system designed for parallel array access. The system is based on the use of a prime nwnber of memories and a powerful combination of indexing hardware and data alignment switches. Particular emphasis is placed on the indexing equations and their implementation.

...read moreread less

Journal Article•DOI•

[...]

Nassimi¹, Sahni²•Institutions (2)

Northwestern University¹, University of Minnesota²

01 Feb 1982-IEEE Transactions on Computers

TL;DR: A parallel algorithm to determine the switch settings for a Benes permutation network is developed and runs in 0(N½) time on an N½-mesh-connected computer and 0(log⁴) time on both a cube connected and a perfect shuffle computer with N processing elements.

...read moreread less

Abstract: A parallel algorithm to determine the switch settings for a Benes permutation network is developed. This algorithm can determine the switch settings for an N input/output Benes network in 0(log2N) time when a fully interconnected parallel computer with N processing elements is used. The algorithm runs in 0(N½) time on an N½× N½mesh-connected computer and 0(log4N) time on both a cube connected and a perfect shuffle computer with N processing elements. It runs in 0(k log3N) time on cube connected and perfect shuffle computers with N1+1/kprocessing elements.

...read moreread less

Journal Article•DOI•

Watchdog Processors and Structural Integrity Checking

[...]

Lu¹•Institutions (1)

Stanford University¹

On the Analysis and Synthesis of VLSI Algorithms

TL;DR: The use of watchdog processors in the implementation of Structural Integrity Checking (SIC) is described and a model for ideal SIC is given in terms of formal languages and automata.

...read moreread less

Abstract: The use of watchdog processors in the implementation of Structural Integrity Checking (SIC) is described. A model for ideal SIC is given in terms of formal languages and automata. Techniques for use in implementing SIC are presented. The modification of a Pascal compiler into an SIC Pascal preprocessor is summarized.

...read moreread less

Journal Article•DOI•

[...]

D. I. Moldovan¹•Institutions (1)

University of Southern California¹

Derivation and Calibration of a Transient Error Reliability Model

TL;DR: This correspondence is concerned with the development of algorithms for special-purpose VLSI arrays and the approach used is to identify algorithm transformations which modify favorably the index set and the data dependences, but perserve the ordering imposed on theindex set by the data dependsences.

...read moreread less

Abstract: This correspondence is concerned with the development of algorithms for special-purpose VLSI arrays. The approach used in this correspondence is to identify algorithm transformations which modify favorably the index set and the data dependences, but perserve the ordering imposed on the index set by the data dependences. Conditions for the existance of such transformations are given for a class of algorithms. Also, a methodology is proposed for the synthesis of VLSI algorithms.

...read moreread less

Journal Article•DOI•

[...]

Castillo, McConnel¹, Siewiorek²•Institutions (2)

Georgia Institute of Technology¹, Carnegie Mellon University²

Optimal BPC Permutations on a Cube Connected SIMD Computer

TL;DR: A new modeling methodology to characterize failure processes in digital computers due to hardware transients is presented, and models of common fault-tolerant redundant structures are developed using decreasing hazard function distributions.

...read moreread less

Abstract: In this paper a new modeling methodology to characterize failure processes in digital computers due to hardware transients is presented. The basic assumption made is that system sensitivity to hardware transient errors is a function of critical resources usage. The failure rate of a given resource is approximated by a deterministic function of time, depending on the average workload of that resource, plus a Gaussian process. The probability density function of the time to failure obtained under this assumption has a decreasing hazard function, explaining why decreasing hazard function densities such as the Weibull fit experimental data so well. Data on transient errors obtained from several systems are analyzed. Statistical tests confirm the good fit between decreasing hazard distributions and actual data. Finally, models of common fault-tolerant redundant structures are developed using decreasing hazard function distributions. The analysis indicates significant differences between reliability predictions based on the exponential distribution and those based on decreasing hazard function distributions. Reliability differences of 0.2 and factors greater than 2 in Mission Time Improvement are seen in model results. System designers should be aware of these differences.

...read moreread less

Journal Article•DOI•

[...]

Nassimi¹, Sahni²•Institutions (2)

Northwestern University¹, University of Minnesota²

01 Apr 1982-IEEE Transactions on Computers

TL;DR: This correspondence develops an algorithm to perform BPC permutations on a cube connected SIMD computer that is shown to be optimal in the sense that it uses the fewest possible number of unit routes to accomplish any B PC permutation.

...read moreread less

Abstract: In this correspondence we develop an algorithm to perform BPC permutations on a cube connected SIMD computer. The class of BPC permutations includes many of the frequently occurring permutations such as matrix transpose, vector reversal, bit shuffle, and perfect shuffle. Our algorithm is shown to be optimal in the sense that it uses the fewest possible number of unit routes to accomplish any BPC permutation.

...read moreread less

Journal Article•DOI•

The Complexity of Fault Detection Problems for Combinational Logic Circuits

[...]

Fujiwara¹, Toida²•Institutions (2)

Osaka University¹, University of Waterloo²

Theory of Unidirectional Error Correcting/Detecting Codes

TL;DR: This correspondence analyzes the computational complexity of fault detection problems for combinational circuits and proposes an approach to design for testability, and shows that for k-level (k ≥ 3) monotone/unate circuits these problems are still NP-complete, but that these are solvable in polynomial time for 2-level monot one/ unate circuits.

...read moreread less

Abstract: In this correspondence we analyze the computational complexity of fault detection problems for combinational circuits and propose an approach to design for testability. Although major fault detection problems have been known to be in general NP-complete, they were proven for rather complex circuits. In this correspondence we show that these are still NP-complete even for monotone circuits, and thus for unate circuits. We show that for k-level (k ≥ 3) monotone/unate circuits these problems are still NP-complete, but that these are solvable in polynomial time for 2-level monotone/unate circuits. A class of circuits for which these fault detection problems are solvable in polynomial time is presented. Ripple-carry adders, decoder circuits, linear circuits, etc., belong to this class. A design approach is also presented in which an arbitrary given circuit is changed to such an easily testable circuit by inserting a few additional test-points.

...read moreread less

Journal Article•DOI•

[...]

Bose¹, Rao•Institutions (1)

Oregon State University¹

Connectivity of Random Logic

TL;DR: This paper defines symmetric, asymmetric, and unidirectional error classes and derives the necessary and sufficient conditions for a binary code to be uniddirectional error correcting/detecting.

...read moreread less

Abstract: In this paper we present some basic theory on unidirectional error correcting/detecting codes. We define symmetric, asymmetric, and unidirectional error classes and proceed to derive the necessary and sufficient conditions for a binary code to be unidirectional error correcting/detecting.

...read moreread less

Journal Article•DOI•

[...]

Feuer¹•Institutions (1)

IBM¹

Effects of Cache Coherency in Multiprocessors

TL;DR: This paper develops a relation between the partitioning properties of computer logic and the distribution of connection lengths and finds that an exponential partitioning function leads to an inverse power law length distribution.

...read moreread less

Abstract: This paper develops a relation between the partitioning properties of computer logic and the distribution of connection lengths. The computation of length distributions is important for wirability analysis and delay estimation. The principal result is that an exponential partitioning function leads to an inverse power law length distribution.

...read moreread less

Journal Article•DOI•

[...]

Dubois, Briggs

Bit-Serial Parallel Processing Systems

TL;DR: In this article, an analytical model for the program behavior of a multitasked system is introduced, including the behavior of each process and the interactions between processes with regard to the sharing of data blocks.

...read moreread less

Abstract: In many commercial multiprocessor systems, each processor accesses the memory through a private cache. One problem that could limit the extensibility of the system and its performance is the enforcement of cache coherence. A mechanism must exist which prevents the existence of several different copies of the same data block in different private caches. In this paper, we present an in-depth analysis of the effects of cache coherency in multiprocessors. A novel analytical model for the program behavior of a multitasked system is introduced. The model includes the behavior of each process and the interactions between processes with regard to the sharing of data blocks. An approximation is developed to derive the main effects of the cache coherency contributing to degradations in system performance.

...read moreread less

Journal Article•DOI•

[...]

Batcher

Bandwidth of Crossbar and Multiple-Bus Connections for Multiprocessors

TL;DR: Two bit-serial parallel processing systems are developed: an airborne associative processor and a ground based massively parallel processor.

...read moreread less

Abstract: About a decade ago, a bit-serial parallel processing system STARAN®1 was developed. It used standard integrated circuits that were available at that time. Now, with the availability of VLSI, a much greater processing capability can be packed in a unit volume. This has led to the recent development of two bit-serial parallel processing systems: an airborne associative processor and a ground based massively parallel processor.

...read moreread less

Journal Article•DOI•

[...]

Lang¹, Valero, Alegre•Institutions (1)

University of California, Berkeley¹

01 Dec 1982-IEEE Transactions on Computers

TL;DR: The effective bandwidth in a multiprocessor with shared memory with N processors and N memory modules is compared using as interconnection networks the crossbar or the multiple-bus.

...read moreread less

Abstract: In this paper we compare the effective bandwidth in a multiprocessor with shared memory using as interconnection networks the crossbar or the multiple-bus. We consider a system with N processors and N memory modules, in which the processor requests to the memory modules are independent and uniformly distributed random variables. We consider two cases: in the first the processor makes another request immediately after a memory service, and in the second there is some internal processing time.

...read moreread less

Journal Article•DOI•

The Burroughs Scientific Processor (BSP)

[...]

Kuck¹, Stokes•Institutions (1)

University of Illinois at Urbana–Champaign¹

Markov Models for Multiple Bus Multiprocessor Systems

TL;DR: The Burroughs Scientific Processor (BSP) as mentioned in this paper was a high-performance computer system that performed the Department of Energy LLL loops at roughly the speed of the CRAY-1.

...read moreread less

Abstract: The Burroughs Scientific Processor (BSP), a high-performance computer system, performed the Department of Energy LLL loops at roughly the speed of the CRAY-1. The BSP combined parallelism and pipelining, performing memory-to-memory operations. Seventeen memory units and two crossbar switch data alignment networks provided conflict-free access to most indexed arrays. Fast linear recurrence algorithms provided good performance on constructs that some machines execute serially. A system manager computer ran the operating system and a vectorizing Fortran compiler. An MOS file memory system served as a high bandwidth secondary memory.

...read moreread less

Journal Article•DOI•

[...]

Marsan¹, Gerla•Institutions (1)

Polytechnic University of Turin¹

01 Mar 1982-IEEE Transactions on Computers

TL;DR: Markovian models are developed for the performance analysis of multiprocessor systems intercommunicating via a set of buses and are found to be surprisingly accurate for a wide range of configurations.

...read moreread less

Abstract: Markovian models are developed for the performance analysis of multiprocessor systems intercommunicating via a set of buses. The performance index is the average number of active processors, called processing power. From processing power a variety of other performance measures can be derived as dictated by the specific processor application. Exact models are first introduced and are illustrated with a simple example. The computational complexity of the exact models is shown to increase very rapidly with system size, thus making the exact analysis impractical even for medium size systems. To overcome the complexity of computation, several approximate models are introduced. The approximate results are compared with the exact ones and found to be surprisingly accurate for a wide range of configurations. Simulation is used to validate the analytic models and to test their robustness.

...read moreread less

Journal Article•DOI•

Fast Algorithms for the 2-D Discrete Cosine Transform

[...]

Kamangar¹, Rao•Institutions (1)

University of Texas at Austin¹

Hardware Specification with Temporal Logic: An Example

TL;DR: Two types of efficient algorithms for fast implementation of the 2-D discrete cosine transform (2-D DCT) are developed and they reduce the number of multiplications significantly compared to the fast algorithm developed by Chen et al.

...read moreread less

Abstract: Two types of efficient algorithms for fast implementation of the 2-D discrete cosine transform (2-D DCT) are developed. One involves recursive structure which implies that the algorithm for (M/2 X N/2) block be extended to (M X N/2) (M/2 X M) and (M X N) blocks (M and N are integer powers of two). The second algorithm is nonrecursive and therefore it has to be tailored for each block size. Both algorithms involve real arithmetic and they reduce the number of multiplications significantly compared to the fast algorithm developed by Chen et al. [8], while the number of additions remain unchanged.

...read moreread less

Journal Article•DOI•

[...]

Bochmann¹•Institutions (1)

Université de Montréal¹

01 Mar 1982-IEEE Transactions on Computers

TL;DR: While traditional logic is useful for specifying combinational circuits, it is shown how the extensions of temporal logic apply to the specification of memory, as well as the safeness and liveness properties of active circuits representing processes.

...read moreread less

Abstract: The use of temporal logic for the specification of hardware modules is explored. Temporal logic is an extension of conventional logic. While traditional logic is useful for specifying combinational circuits, it is shown how the extensions of temporal logic apply to the specification of memory, as well as the safeness and liveness properties of active circuits representing processes. These ideas are demonstrated by the example of a self-timed arbiter. An implementation of the arbiter is also given, and its formal verification by a kind of reachability analysis is discussed. This verification approach is also useful for finding design errors, as demonstrated by an example.

...read moreread less

Journal Article•DOI•

On Embedding Rectangular Grids in Square Grids

[...]

Aleliunas¹, Rosenberg•Institutions (1)

University of Waterloo¹

Queueing Network Models for Parallel Processing with Asynchronous Tasks

TL;DR: The main results in this paper demonstrate that there exist pairs of integers 〈E, D〉 such that any n-vertex rectangular grid can be embedded into a square grid having at most En vertices, in such a way that images in the square grid of vertices that are adjacent in the rectangular grid are at most distance D apart.

...read moreread less

Abstract: The main results in this paper demonstrate that there exist pairs of integers 〈E, D〉 (for "area Expansion" and "edge Dilation," respectively) such that any n-vertex rectangular grid can be embedded into a square grid having at most En vertices, in such a way that images in the square grid of vertices that are adjacent in the rectangular grid are at most distance D apart. Several techniques for "squaring"-up rectangular grids are presented; sample values for the parameter-pair 〈E, D〉 are: 〈E = 1.2, D = 15〉, 〈E = 1.45, D = 9〉, 〈E = 1.8, D = 3〉. Note that these values of E and D hold for all rectangular grids, independent of number of vertices. The quest for these results was motivated by the question of whether or not one could automatically "square up" circuit layouts having aspect ratios very far from unity, without compromising the efficiency of the layout (in terms of area and length of the longest run of wire). The results reported here yield an affirmative answer to this question, at least in an idealized setting. One corollary of the embeddings presented here is that the side-2n½ square "king's-move" grid contains as a subgraph every n-vertex rectangular grid. Another way to think of this result is that this embellished grid can be "programmed" or "personalized," by setting switches, to represent any n-vertex rectangular grid.

...read moreread less

Journal Article•DOI•

[...]

Heidelberger¹, Trivedi•Institutions (1)

IBM¹