
Showing papers in "IEEE Transactions on Computers in 1993"


Journal ArticleDOI
TL;DR: An object recognition system based on the dynamic link architecture, an extension to classical artificial neural networks (ANNs), is presented and the implementation on a transputer network achieved recognition of human faces and office objects from gray-level camera images.
Abstract: An object recognition system based on the dynamic link architecture, an extension to classical artificial neural networks (ANNs), is presented. The dynamic link architecture exploits correlations in the fine-scale temporal structure of cellular signals to group neurons dynamically into higher-order entities. These entities represent a rich structure and can code for high-level objects. To demonstrate the capabilities of the dynamic link architecture, a program was implemented that can recognize human faces and other objects from video images. Memorized objects are represented by sparse graphs, whose vertices are labeled by a multiresolution description in terms of a local power spectrum, and whose edges are labeled by geometrical distance vectors. Object recognition can be formulated as elastic graph matching, which is performed here by stochastic optimization of a matching cost function. The implementation on a transputer network achieved recognition of human faces and office objects from gray-level camera images. The performance of the program is evaluated by a statistical analysis of recognition results from a portrait gallery comprising images of 87 persons.

1,973 citations


Journal ArticleDOI
TL;DR: It turns out that SWNs allow the representation of any color function in a structured form, so that any unconstrained high-level net can be transformed into a well-formed net.
Abstract: The class of stochastic well-formed colored nets (SWNs) was defined as a syntactic restriction of stochastic high-level nets. The interest of introducing restrictions in the model definition is the possibility of exploiting the symbolic reachability graph (SRG) to reduce the complexity of Markovian performance evaluation with respect to classical Petri net techniques. It turns out that SWNs allow the representation of any color function in a structured form, so that any unconstrained high-level net can be transformed into a well-formed net. Moreover, most constructs useful for the modeling of distributed computer systems and architectures directly match the "well-formed" restriction, without any need for transformation. A nontrivial example of the usefulness of the technique in the performance modeling and evaluation of multiprocessor architectures is included.

340 citations


Journal ArticleDOI
TL;DR: The author introduces a scheme to generate carry bits with block-carry-in 1 from the carries of a block with block-carry-in 0 to derive a more area-efficient implementation for both the carry-select and parallel-prefix adders.
Abstract: The carry-select and conditional-sum adders require carry-chain evaluations in each block for both values of the block carry-in, 0 and 1. The author introduces a scheme to generate the carry bits for block-carry-in 1 from the carries of a block with block-carry-in 0. This scheme is then applied to carry-select and parallel-prefix adders to derive a more area-efficient implementation in both cases. The proposed carry-select scheme is assessed relative to carry-ripple, classical carry-select, and carry-skip adders. The analytic evaluation is done with respect to the gate-count model for area and gate-delay units for time.

263 citations
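
The identity behind the scheme can be sketched in a few lines (my own illustration; the function name is hypothetical, not from the paper): for a block with propagate bits p[i] and generate bits g[i], the carries for block-carry-in 1 follow from the carry-in-0 carries as c1[i] = c0[i] OR (p[0] AND ... AND p[i-1]), so the second carry chain of a classical carry-select block is unnecessary.

```python
def block_carries(a_bits, b_bits):
    # Carries of one adder block, LSB first. The carry-in-0 chain is
    # rippled normally; the carry-in-1 carries are then derived as
    # c1[i] = c0[i] | (p[0] & ... & p[i-1]) -- no second carry chain.
    n = len(a_bits)
    p = [a ^ b for a, b in zip(a_bits, b_bits)]  # propagate
    g = [a & b for a, b in zip(a_bits, b_bits)]  # generate
    c0 = [0] * (n + 1)
    for i in range(n):
        c0[i + 1] = g[i] | (p[i] & c0[i])
    c1 = [1] * (n + 1)
    prefix = 1
    for i in range(n):
        prefix &= p[i]
        c1[i + 1] = c0[i + 1] | prefix
    return c0, c1
```

In hardware the prefix-AND of propagates is far cheaper than a full second carry chain, which is where the area saving comes from.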


Journal ArticleDOI
TL;DR: Algorithms that are efficient for solving a variety of problems involving graphs and digitized images are introduced that are asymptotically superior to those previously obtained for the mesh, the mesh with multiple broadcasting, the mesh with multiple buses, the mesh-of-trees, and the pyramid computer.
Abstract: The mesh with reconfigurable bus is presented as a model of computation. The reconfigurable mesh captures salient features from a variety of sources, including the CAAPP, CHiP, polymorphic-torus network, and bus automaton. It consists of an array of processors interconnected by a reconfigurable bus system that can be used to dynamically obtain various interconnection patterns between the processors. A variety of fundamental data-movement operations for the reconfigurable mesh are introduced. Based on these operations, algorithms that are efficient for solving a variety of problems involving graphs and digitized images are also introduced. The algorithms are asymptotically superior to those previously obtained for the aforementioned reconfigurable architectures, as well as to those previously obtained for the mesh, the mesh with multiple broadcasting, the mesh with multiple buses, the mesh-of-trees, and the pyramid computer. The power of reconfigurability is illustrated by solving some problems, such as the exclusive OR, more efficiently on the reconfigurable mesh than is possible on the parallel random-access machine (PRAM).

261 citations


Journal ArticleDOI
TL;DR: A theoretical analysis of error due to finite precision computation was undertaken to determine the necessary precision for successful forward retrieving and back-propagation learning in a multilayer perceptron.
Abstract: Through parallel processing, low precision fixed point hardware can be used to build a very high speed neural network computing engine where the low precision results in a drastic reduction in system cost. The reduced silicon area required to implement a single processing unit is taken advantage of by implementing multiple processing units on a single piece of silicon and operating them in parallel. The important question which arises is how much precision is required to implement neural network algorithms on this low precision hardware. A theoretical analysis of error due to finite precision computation was undertaken to determine the necessary precision for successful forward retrieving and back-propagation learning in a multilayer perceptron. This analysis can easily be further extended to provide a general finite precision analysis technique by which most neural network algorithms under any set of hardware constraints may be evaluated.

245 citations
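
The precision question can also be explored empirically. The toy sketch below is my own illustration, not the paper's analysis: it quantizes every operand and every sum of a tiny forward pass to fixed point and measures how the output deviation shrinks as fractional bits are added.

```python
import math
import random

def quantize(v, frac_bits):
    # Round to the nearest multiple of 2^-frac_bits (fixed point).
    scale = 1 << frac_bits
    return round(v * scale) / scale

def forward(x, w1, w2, frac_bits=None):
    # One-hidden-layer forward pass; with frac_bits set, every operand
    # and every accumulated sum is quantized, mimicking low-precision
    # fixed-point hardware.
    q = (lambda v: quantize(v, frac_bits)) if frac_bits is not None else (lambda v: v)
    h = [math.tanh(q(sum(q(xi) * q(w) for xi, w in zip(x, col)))) for col in w1]
    return [math.tanh(q(sum(q(hi) * q(w) for hi, w in zip(h, col)))) for col in w2]

def max_error(frac_bits, seed=7):
    # Largest output deviation from the full-precision forward pass
    # for one random input and random weights.
    rng = random.Random(seed)
    x = [rng.uniform(-1, 1) for _ in range(4)]
    w1 = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(5)]
    w2 = [[rng.uniform(-1, 1) for _ in range(5)] for _ in range(3)]
    exact = forward(x, w1, w2)
    quant = forward(x, w1, w2, frac_bits)
    return max(abs(a - b) for a, b in zip(exact, quant))
```

A theoretical analysis like the paper's predicts where this curve flattens, i.e. the fewest bits that still give correct retrieval and learning.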


Journal ArticleDOI
TL;DR: Hardware is described for implementing the fast modular multiplication algorithm developed by P.L. Montgomery (1985), showing that this algorithm is up to twice as fast as the best currently available and is more suitable for alternative architectures.
Abstract: Hardware is described for implementing the fast modular multiplication algorithm developed by P.L. Montgomery (1985). Comparison with previous techniques shows that this algorithm is up to twice as fast as the best currently available and is more suitable for alternative architectures. The gain in speed arises from the faster clock that results from simpler combinational logic.

238 citations
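
For reference, a bit-serial software sketch of Montgomery's method (assumptions: odd modulus n, radix R = 2^k, inputs below n; the function name is mine). The point is that the reduction uses only shifts and additions, never a division by n:

```python
def montgomery_multiply(a, b, n, k):
    # Bit-serial Montgomery multiplication: returns a * b * 2^-k mod n.
    # Requires n odd and a, b < n; R = 2^k is the Montgomery radix.
    acc = 0
    for i in range(k):
        acc += ((a >> i) & 1) * b   # add in the next partial product
        if acc & 1:                 # make the sum even ...
            acc += n
        acc >>= 1                   # ... so halving is exact mod n
    return acc - n if acc >= n else acc
```

Ordinary residues are first mapped to Montgomery form (multiplied by R mod n); long chains of multiplications, as in RSA exponentiation, amortize that conversion, which is why the technique suits the systolic design in the next paper as well.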


Journal ArticleDOI
TL;DR: A systolic array for modular multiplication is presented, based on the ideally suited algorithm of P.L. Montgomery (1985); its main use would be where many consecutive multiplications are done, as in RSA cryptosystems.
Abstract: A systolic array for modular multiplication is presented using the ideally suited algorithm of P.L. Montgomery (1985). Throughput is one modular multiplication every clock cycle, with a latency of 2n+2 cycles for multiplicands having n digits. Its main use would be where many consecutive multiplications are done, as in RSA cryptosystems.

229 citations


Journal ArticleDOI
TL;DR: A dependability evaluation method based on fault injection is described; it establishes the link between the experimental evaluation of the fault tolerance process and the fault occurrence process, and the interactions between the two processes are analyzed.
Abstract: The authors describe a dependability evaluation method based on fault injection that establishes the link between the experimental evaluation of the fault tolerance process and the fault occurrence process. The main characteristics of a fault injection test sequence aimed at evaluating the coverage of the fault tolerance process are presented. Emphasis is given to the derivation of experimental measures. The various steps by which the fault occurrence and fault tolerance processes are combined to evaluate dependability measures are identified and their interactions are analyzed. The method is illustrated by an application to the dependability evaluation of the distributed fault-tolerant architecture of the Esprit Delta-4 Project.

227 citations



Journal ArticleDOI
TL;DR: After a brief survey of the CORDIC algorithm, some new results are given that allow fast and easy signed-digit implementation of CORDIC without modifying the basic iteration step.
Abstract: After a brief survey of the CORDIC algorithm, some new results that allow fast and easy signed-digit implementation of CORDIC, without modifying the basic iteration step, are given. A slight modification would make it possible to use a carry-save representation of numbers, instead of a signed-digit one. The method, called the branching CORDIC method, consists of performing in parallel two classic CORDIC rotations. It gives a constant normalization factor. An online implementation of the algorithm is proposed with an online delay equal to 5 for the sine and cosine functions.

171 citations
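
For readers unfamiliar with the basic iteration that branching CORDIC (and the later CORDIC papers in this issue) builds on, here is a plain rotation-mode sketch. Double-precision floats stand in for the fixed-point shift-and-add hardware; convergence requires |theta| ≤ Σ atan(2^-i) ≈ 1.743 rad.

```python
import math

def cordic_sin_cos(theta, iterations=40):
    # Rotation-mode circular CORDIC: each step rotates by +-atan(2^-i)
    # using only "shifts" (multiplications by 2^-i) and adds; the
    # constant scale factor K is folded into the initial x.
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    k = 1.0
    for i in range(iterations):
        k /= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = k, 0.0, theta
    for i in range(iterations):
        d = 1.0 if z >= 0.0 else -1.0   # steer the residual angle to 0
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    return x, y   # (cos theta, sin theta)
```

Because each d is +1 or -1, the product of the per-step scale factors is constant, which is the property the branching method preserves while choosing digits redundantly.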


Journal ArticleDOI
TL;DR: The design of a modular standard basis inversion for Galois fields GF(2^m) based on Euclid's algorithm for computing the greatest common divisor of two polynomials is presented, resulting in an AT-complexity of O(m^2).
Abstract: The design of a modular standard basis inversion for Galois fields GF(2^m) based on Euclid's algorithm for computing the greatest common divisor of two polynomials is presented. The asymptotic complexity is linear with m in both computation time and area requirement, thus resulting in an AT-complexity of O(m^2). This is a significant improvement over the best previous proposal, which only achieves an AT-complexity of O(m^3).
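
A software sketch of the underlying idea, inversion in GF(2^m) via the extended Euclidean algorithm over GF(2) (polynomials encoded as bit masks; function names are mine, and the hardware in the paper is of course far more structured than this):

```python
def gf2_poly_divmod(a, b):
    # Divide polynomial a by b over GF(2); polynomials are bit masks,
    # bit i representing x^i. Returns (quotient, remainder).
    q = 0
    db = b.bit_length()
    while a.bit_length() >= db:
        shift = a.bit_length() - db
        q ^= 1 << shift
        a ^= b << shift
    return q, a

def gf2m_inverse(a, modulus):
    # Inverse of a != 0 in GF(2^m) = GF(2)[x]/(modulus), modulus
    # irreducible. Maintains the invariant s1 * a == r1 (mod modulus),
    # so when r1 reaches 1, s1 is the inverse.
    r0, r1 = modulus, a
    s0, s1 = 0, 1
    while r1 != 1:
        q, r = gf2_poly_divmod(r0, r1)
        r0, r1 = r1, r
        prod = 0                      # carry-less product q * s1
        qq, ss = q, s1
        while qq:
            if qq & 1:
                prod ^= ss
            qq >>= 1
            ss <<= 1
        s0, s1 = s1, s0 ^ prod        # GF(2): subtraction is XOR
    return s1
```

With the AES field GF(2^8) modulo x^8+x^4+x^3+x+1 (0x11B), for example, the inverse of 0x53 is 0xCA, the classic S-box worked example.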

Journal ArticleDOI
TL;DR: An optical communication structure is proposed for multiprocessor arrays that exploits the high communication bandwidth of optical waveguides; time-division multiplexing of messages has the same effect as message pipelining on optical waveguides.
Abstract: An optical communication structure is proposed for multiprocessor arrays which exploits the high communication bandwidth of optical waveguides. The structure takes advantage of two properties of optical signal transmissions on waveguides, namely, unidirectional propagation and predictable propagation delays per unit length. Because of these two properties, time-division multiplexing (TDM) of messages has the same effect as message pipelining on optical waveguides. Two TDM approaches are proposed, and the combination of the two is used in the design of the optical communication structure. Analysis and simulation results are given to demonstrate the communication effectiveness of the system. A clock distribution method is proposed to address potential synchronization problems. Feasibility issues with current and future technologies are discussed.

Journal ArticleDOI
TL;DR: By removing the redundancy, a modified parallel multiplier is presented which is modular and has a lower circuit complexity.
Abstract: A Massey-Omura parallel multiplier of finite fields GF(2^m) contains m identical blocks whose inputs are cyclically shifted versions of one another. It is shown that for fields GF(2^m) generated by irreducible all one polynomials, a portion of the block is independent of the input cyclic shift; hence, the multiplier contains redundancy. By removing the redundancy, a modified parallel multiplier is presented which is modular and has a lower circuit complexity.

Journal ArticleDOI
TL;DR: It is shown that this method has better performance in terms of minimizing the number of classification errors than the squared error minimization method used in backpropagation.
Abstract: A pattern classification method called neural tree networks (NTNs) is presented. The NTN consists of neural networks connected in a tree architecture. The neural networks are used to recursively partition the feature space into subregions. Each terminal subregion is assigned a class label which depends on the training data routed to it by the neural networks. The NTN is grown by a learning algorithm, as opposed to multilayer perceptrons (MLPs), where the architecture must be specified before learning can begin. A heuristic learning algorithm based on minimizing the L1 norm of the error is used to grow the NTN. It is shown that this method has better performance in terms of minimizing the number of classification errors than the squared error minimization method used in backpropagation. An optimal pruning algorithm is given to enhance the generalization of the NTN. Simulation results are presented on Boolean function learning tasks and a speaker-independent vowel recognition task. The NTN compares favorably to both neural networks and decision trees.

Journal ArticleDOI
TL;DR: A branch target buffer (BTB) can reduce the performance penalty of branches in pipelined processors by predicting the path of the branch and caching information used by the branch; the design question addressed is how to achieve maximum performance with a limited number of bits allocated to the BTB implementation.
Abstract: A branch target buffer (BTB) can reduce the performance penalty of branches in pipelined processors by predicting the path of the branch and caching information used by the branch. Two major issues in the design of BTBs that achieve maximum performance with a limited number of bits allocated to the BTB implementation are discussed. The first is BTB management. A method for discarding branches from the BTB is examined. This method discards the branch with the smallest expected value for improving performance; it outperforms the least recently used (LRU) strategy by a small margin, at the cost of additional complexity. The second issue is the question of what information to store in the BTB. A BTB entry can consist of one or more of the following: branch tag, prediction information, the branch target address, and instructions at the branch target. Various BTB designs, with one or more of these fields, are evaluated and compared.
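
A minimal BTB model makes the entry fields concrete. This is my own sketch (direct-mapped, a 2-bit counter per entry), not the paper's design: the expected-value replacement policy and target-instruction caching are not modeled.

```python
class BranchTargetBuffer:
    # Direct-mapped BTB: each entry stores (tag, 2-bit counter, target).
    # predict() returns the cached target on a taken-predicted hit,
    # else None (fall through / predict not taken).
    def __init__(self, entries=16):
        self.size = entries
        self.table = [None] * entries

    def predict(self, pc):
        entry = self.table[pc % self.size]
        if entry is None or entry[0] != pc // self.size:
            return None               # miss: no information cached
        tag, counter, target = entry
        return target if counter >= 2 else None

    def update(self, pc, taken, target):
        idx, tag = pc % self.size, pc // self.size
        entry = self.table[idx]
        if entry is None or entry[0] != tag:
            # Allocate, biasing the counter toward the observed outcome.
            self.table[idx] = (tag, 2 if taken else 1, target)
        else:
            counter = min(3, entry[1] + 1) if taken else max(0, entry[1] - 1)
            self.table[idx] = (tag, counter, target)
```

Each field in the tuple corresponds to one of the bit budgets the paper trades off: the tag, the prediction information, and the target address.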

Journal ArticleDOI
TL;DR: A greedy algorithm which takes only O(n^2) operations is developed to perform CORDIC angle recoding, and it is proven that this algorithm is able to reduce the total number of required elementary rotation angles by at least 50% without affecting the computational accuracy.
Abstract: The coordinate rotation digital computer (CORDIC), an iterative arithmetic algorithm for computing generalized vector rotations without performing multiplications, is discussed. For applications where the angle of rotation is known in advance, a method to speed up the execution of the CORDIC algorithm by reducing the total number of iterations is presented. This is accomplished by using a technique called angle recoding, which encodes the desired rotation angle as a linear combination of very few elementary rotation angles. Each of these elementary rotation angles takes one CORDIC iteration to compute. The fewer the elementary rotation angles, the fewer iterations are required. A greedy algorithm which takes only O(n^2) operations is developed to perform CORDIC angle recoding. It is proven that this algorithm is able to reduce the total number of required elementary rotation angles by at least 50% without affecting the computational accuracy.
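
The greedy idea can be sketched as follows (my own simplified illustration of angle recoding, not the paper's exact algorithm): repeatedly subtract the signed elementary angle ±atan(2^-i) closest to the residual until the residual falls below the precision target.

```python
import math

def angle_recoding(theta, n_bits=16):
    # Greedy angle recoding sketch: represent theta as a signed
    # combination of few elementary angles atan(2^-i), instead of
    # using every angle exactly once as in classic CORDIC.
    elementary = [math.atan(2.0 ** -i) for i in range(n_bits)]
    terms = []                 # list of (sign, i) pairs
    residual = theta
    tol = 2.0 ** -n_bits
    while abs(residual) > tol:
        # Pick the elementary angle closest to the residual magnitude.
        i = min(range(n_bits),
                key=lambda j: abs(abs(residual) - elementary[j]))
        sign = 1 if residual >= 0 else -1
        terms.append((sign, i))
        residual -= sign * elementary[i]
    return terms
```

Unlike classic CORDIC, the set of rotations now depends on theta, so the scale factor must be computed from the chosen terms; that is acceptable precisely because the angle is known in advance.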

Journal ArticleDOI
TL;DR: The authors give an online algorithm for computing a canonical signed digit representation of minimal Hamming weight for any integer n and show that E(K_r) ≈ (r-1)k/(r+1) as k → ∞.
Abstract: The authors give an online algorithm for computing a canonical signed digit representation of minimal Hamming weight for any integer n. Using combinatorial techniques, the probability distributions Pr(K_r = h), where K_r is taken to be a random variable on the uniform probability space of k-digit integers, are computed. Also, using a Markov chain analysis, it is shown that E(K_r) ≈ (r-1)k/(r+1) as k → ∞.
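
For r = 2 this canonical representation is the familiar non-adjacent form (NAF), computable online from the least significant digit; a short sketch (the function name is mine):

```python
def naf(n):
    # Non-adjacent form of n > 0: signed digits in {-1, 0, 1}, least
    # significant first, no two adjacent nonzero digits, and minimal
    # Hamming weight among all signed binary representations of n.
    digits = []
    while n:
        if n & 1:
            d = 2 - (n & 3)   # +1 if n % 4 == 1, -1 if n % 4 == 3
            n -= d            # makes n divisible by 4: non-adjacency
        else:
            d = 0
        digits.append(d)
        n >>= 1
    return digits
```

For example 7 = 8 - 1 becomes [-1, 0, 0, 1], weight 2 instead of the binary weight 3; the expected weight ≈ k/3 matches the paper's (r-1)k/(r+1) formula at r = 2.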

Journal ArticleDOI
TL;DR: Two approaches for tackling the numerical accuracy problem of fixed-point CORDIC are described and arguments to support the use of such an architecture in certain special-purpose arrays are presented.
Abstract: The coordinate rotation digital computer (CORDIC) algorithm is used in numerous special-purpose systems for real-time signal processing applications. An analysis of fixed-point CORDIC in the Y-reduction mode, which allows computation of the inverse tangent function, shows that unnormalized input values can result in large numerical errors. The authors describe two approaches for tackling the numerical accuracy problem. The first approach builds on a fixed-point CORDIC unit and eliminates the problem by including additional hardware for normalization. A method for integrating the normalization operation with the CORDIC iterations for efficient implementation in O(n^1.5) hardware is provided. The second solution to the accuracy problem is to use a floating-point CORDIC unit but reduce the implementation complexity by using a hybrid architecture. Arguments to support the use of such an architecture in certain special-purpose arrays are presented.

Journal ArticleDOI
TL;DR: An efficient sequential circuit automatic test generation algorithm based on PODEM that uses a nine-valued logic model is presented; it saves both the good and the faulty machine states after finding a test, to aid subsequent test generation.
Abstract: This paper presents an efficient sequential circuit automatic test generation algorithm. The algorithm is based on PODEM and uses a nine-valued logic model. Among the novel features of the algorithm are use of the Initial Timeframe Algorithm and correct implementation of a solution to the Previous State Information Problem. The Initial Timeframe Algorithm, one of the most important aspects of the test generator, determines the number of timeframes required to excite the fault for which a test is to be derived and the number of timeframes required to observe the excited fault. Correct determination of the number of timeframes in which the fault should be excited (activated) and observed saves the test generator from performing unnecessary search in the input space. Test generation is unidirectional, i.e., it is done strictly in forward time, and flip-flops in the initial timeframe are never assigned a state that needs to be justified later. The algorithm saves both the good and the faulty machine states after finding a test to aid in subsequent test generation. The Previous State Information Problem, which has often been ignored by existing test generators, is presented and discussed in the paper. Experimental results are presented to demonstrate the effectiveness of the algorithm.

Journal ArticleDOI
TL;DR: A novel scheme for utilizing the regular structure of three neighborhood additive cellular automata (CAs) for pseudoexhaustive test pattern generation is introduced.
Abstract: A novel scheme for utilizing the regular structure of three neighborhood additive cellular automata (CAs) for pseudoexhaustive test pattern generation is introduced. The vector space generated by a CA can be decomposed into several cyclic subspaces. A cycle corresponding to an m-dimensional cyclic subspace has been shown to pseudoexhaustively test an n-input circuit (n >= m). Such a cycle is shown to supply an (m-1)-bit exhaustive pattern including the all-zeros (m-1)-tuple. Schemes have been reported specifying how one or more subsets of (m-1) cell positions of an n-cell CA can be identified to generate exhaustive patterns in an m-dimensional cyclic subspace.
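
A three-neighborhood additive CA of the kind used here updates each cell to an XOR of its neighborhood; the usual hybrid mixes rule 90 (left XOR right) and rule 150 (left XOR self XOR right) cells. A small sketch (my own illustration, with null boundaries) showing the update and the additivity that makes the vector-space decomposition possible:

```python
def ca_step(state, rules):
    # One step of a three-neighborhood additive CA with null boundaries.
    # rules[i] is 90 (left XOR right) or 150 (left XOR self XOR right).
    n = len(state)
    nxt = []
    for i in range(n):
        left = state[i - 1] if i > 0 else 0
        right = state[i + 1] if i < n - 1 else 0
        v = left ^ right
        if rules[i] == 150:
            v ^= state[i]
        nxt.append(v)
    return nxt
```

Because every rule is XOR-linear, step(a XOR b) = step(a) XOR step(b); the state transition is a linear map over GF(2), and its invariant cyclic subspaces are exactly the structure the pattern-generation scheme exploits.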

Journal ArticleDOI
TL;DR: A very fast Jacobi-like algorithm for the parallel solution of symmetric eigenvalue problems is proposed; although only linear convergence is obtained for the simplest version of the new algorithm, the overall operation count decreases dramatically.
Abstract: A very fast Jacobi-like algorithm for the parallel solution of symmetric eigenvalue problems is proposed. It becomes possible by not focusing on the realization of the Jacobi rotation with a CORDIC processor, but by applying approximate rotations and adjusting them to single steps of the CORDIC algorithm, i.e., only one angle of the CORDIC angle sequence defines the Jacobi rotation in each step. This angle can be determined by a few shift, add, and compare operations. Although only linear convergence is obtained for the simplest version of the new algorithm, the overall operation count (shifts and adds) decreases dramatically. A slow increase in the number of involved CORDIC angles during the runtime retains quadratic convergence.

Journal ArticleDOI
TL;DR: The addition of a new parameter, the block size, to the two existing parameters of the fault distribution is proposed, which allows the unification of the existing models and, at the same time, adds a whole range of medium-size clustering models.
Abstract: It has been recognized that the yield of fault-tolerant VLSI circuits depends on the size of the fault clusters. Consequently, models for yield analysis have been proposed for large-area clustering and small-area clustering, based on the two-parameter negative-binomial distribution. The addition of a new parameter, the block size, to the two existing parameters of the fault distribution is proposed. This parameter allows the unification of the existing models and, at the same time, adds a whole range of medium-size clustering models. Thus, the flexibility in choosing the appropriate yield model is increased. Methods for estimating the newly defined block size are presented and the approach is validated through simulation and empirical data.

Journal ArticleDOI
TL;DR: Undetectable and redundant faults in synchronous sequential circuits are analyzed and a distinction is drawn between undetectable faults and faults that are never manifested as output errors.
Abstract: Undetectable and redundant faults in synchronous sequential circuits are analyzed. A distinction is drawn between undetectable faults and faults that are never manifested as output errors. The latter are classified as redundant. It is shown that there are faults for which a test sequence does not exist; however, under certain initial conditions (or initial states) of the circuit, faulty behavior may be observed. Such faults are called partially detectable faults. A partially detectable fault is undetectable, but is not redundant, as it affects circuit operation under some conditions. The author observes that the notion of redundancy cannot be separated from the mode of operation of the circuit. Two modes of operation are considered, representative of common modes, called the synchronization mode and the free mode. Accordingly, the identification of redundant faults calls for different test generation strategies. Two test strategies to generate tests for detectable faults and partial tests for partially detectable faults are defined, called the restricted test strategy and the unrestricted test strategy.

Journal ArticleDOI
TL;DR: In the final implementation of the technique the extra modulus has been inserted in the set of moduli of the residue system, avoiding redundancy.
Abstract: A technique for number comparison in the residue number system is presented, and its theoretical validity is proved. The proposed solution is based on using a diagonal function to obtain a magnitude order of the numbers. In a first approach the function is computed using a suitable extra modulus. In the final implementation of the technique the extra modulus has been inserted in the set of moduli of the residue system, avoiding redundancy. The technique is compared with other approaches.
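
To see why comparison is hard in RNS, consider the straightforward baseline the diagonal-function technique is designed to avoid: fully reconstructing both integers by the Chinese remainder theorem and comparing them. A sketch of that baseline (my own illustration, not the paper's method):

```python
from functools import reduce

def rns_to_int(residues, moduli):
    # Chinese-remainder reconstruction of x from its residues modulo
    # the pairwise-coprime moduli. This full reconstruction is exactly
    # the expensive step the diagonal-function technique sidesteps.
    M = reduce(lambda a, b: a * b, moduli)
    x = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        x += r * Mi * pow(Mi, -1, m)
    return x % M

def rns_compare(a_res, b_res, moduli):
    # Returns -1, 0, or 1 as a < b, a == b, a > b.
    a, b = rns_to_int(a_res, moduli), rns_to_int(b_res, moduli)
    return (a > b) - (a < b)
```

The residue digits themselves carry no magnitude order, which is why RNS addition and multiplication are cheap but comparison needs either reconstruction or an auxiliary monotone function such as the paper's diagonal function.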

Journal ArticleDOI
TL;DR: The authors optimize the cost of the fault-tolerant architecture by adding exactly k spare processors (while tolerating up to k processor and/or link faults) and minimizing the maximum number of links per processor.
Abstract: This paper presents several techniques for tolerating faults in d-dimensional mesh and hypercube architectures. The approach consists of adding spare processors and communication links so that the resulting architecture will contain a fault-free mesh or hypercube in the presence of faults. The authors optimize the cost of the fault-tolerant architecture by adding exactly k spare processors (while tolerating up to k processor and/or link faults) and minimizing the maximum number of links per processor. For example, when the desired architecture is a d-dimensional mesh and k=1, they present a fault-tolerant architecture that has the same maximum degree as the desired architecture (namely, 2d) and has only one spare processor. They also present efficient layouts for fault-tolerant two- and three-dimensional meshes, and show how multiplexers and buses can be used to reduce the degree of fault-tolerant architectures. Finally, they give constructions for fault-tolerant tori, eight-connected meshes, and hexagonal meshes.

Journal ArticleDOI
TL;DR: It is shown that the diameter of an n-dimensional hypercube can only increase by an additive constant of 1 when (n-1) faulty processors are present and it is proven that all the n-cubes with a fault-diameter of (n+2) are isomorphic.
Abstract: It is shown that the diameter of an n-dimensional hypercube can only increase by an additive constant of 1 when (n-1) faulty processors are present. The concept of forbidden faulty sets guarantees the connectivity of the cube in the presence of up to (2n-3) faulty processors. It is shown that the diameter of the n-cube increases to (n+2) as a result of (2n-3) processor failures. It is also shown that only those nodes whose Hamming distance is (n-2) have the potential to be located at the two ends of the diameter of the damaged cube. It is proven that all the n-cubes with (2n-3) faulty processors and a fault-diameter of (n+2) are isomorphic. A generalization of the subject study is presented.
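
Results of this kind are easy to check by brute force on small cubes. The sketch below (my own, usable only for small n) labels hypercube nodes by integers whose bits are coordinates, removes a set of faulty nodes, and computes the surviving diameter by breadth-first search:

```python
from collections import deque

def damaged_diameter(n, faulty):
    # Diameter of the n-cube with the faulty nodes removed, by BFS from
    # every surviving node. Assumes the surviving graph is connected.
    nodes = [v for v in range(1 << n) if v not in faulty]
    best = 0
    for s in nodes:
        dist = {s: 0}
        q = deque([s])
        while q:
            u = q.popleft()
            for i in range(n):            # neighbors differ in one bit
                v = u ^ (1 << i)
                if v not in faulty and v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        best = max(best, max(dist.values()))
    return best
```

For n = 4, removing three neighbors of node 0 (an (n-1)-fault pattern) raises the diameter from 4 to at most 5, consistent with the additive-constant-1 result.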

Journal ArticleDOI
TL;DR: The authors study these graphs by proving the existence of Hamiltonian cycles in any arrangement graph and proving that an arrangement graph contains cycles of all lengths ranging between 3 and the size of the graph.
Abstract: Arrangement graphs have been proposed as an attractive interconnection topology for large multiprocessor systems. The authors study these graphs by proving the existence of Hamiltonian cycles in any arrangement graph. They also prove that an arrangement graph contains cycles of all lengths ranging between 3 and the size of the graph. They show that an arrangement graph can be decomposed into node disjoint cycles in many different ways.

Journal ArticleDOI
TL;DR: Systems that can be modeled as graphs, such that nodes represent the components and the edges represent the fault propagation between the components, are considered, and the problem of detecting multiple faults is shown to be NP-complete.
Abstract: Systems that can be modeled as graphs, such that nodes represent the components and the edges represent the fault propagation between the components, are considered. Some components are equipped with alarms that ring in response to faulty conditions. In these systems, two types of problem are studied: fault diagnosis and alarm placement. The fault diagnosis problems deal with computing the set of all potential failure sources that correspond to a set of ringing alarms. Single faults, where exactly one component can become faulty at any time, are primarily considered. Systems are classified into zero-time and non-zero-time systems on the basis of fault propagation time. The latter are further classified on the basis of knowledge of propagation times. For each of these classes algorithms are presented for single fault diagnosis. The problem of detecting multiple faults is shown to be NP-complete. An alarm placement problem that requires a single fault to be uniquely diagnosed is examined.

Journal ArticleDOI
TL;DR: An accumulator-based compaction (ABC) scheme for parallel compaction of test responses is introduced and it is proven that the asymptotic coverage drop in ABC with binary adders is 2^-k, where k is the number of bits in the adder that the fault can reach.
Abstract: An accumulator-based compaction (ABC) scheme for parallel compaction of test responses is introduced. The asymptotic and transient coverage drop introduced by accumulators with binary and 1's complement adders is studied using Markov chain models. It is proven that the asymptotic coverage drop in ABC with binary adders is 2^-k, where k is the number of bits in the adder that the fault can reach. In ABC with 1's complement adders, the asymptotic coverage drop for a fairly general class of faults is (2^n - 1)^-1, where n is the total number of bits. The analysis of transient behavior relates the coverage drop with the probability of fault injection, the size of the accumulator, and the length of the test experiment. The process is characterized by damping factors derived for various values of these parameters.
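
A toy model makes the 2^-k aliasing figure tangible. This is my own simplified simulation, not the paper's Markov-chain analysis: responses are summed in a k-bit binary accumulator, a single random error is injected, and aliasing occurs only when the substituted word happens to equal the original modulo 2^k.

```python
import random

def compact(responses, k):
    # Accumulator-based compaction with a k-bit binary adder: the test
    # responses are summed modulo 2^k and only the final signature is
    # compared against the fault-free one.
    acc = 0
    for r in responses:
        acc = (acc + r) & ((1 << k) - 1)
    return acc

def aliasing_rate(k, trials=20000, length=100, seed=1):
    # Monte-Carlo estimate of the probability that a stream with one
    # injected error still produces the fault-free signature.
    rng = random.Random(seed)
    alias = 0
    for _ in range(trials):
        good = [rng.randrange(1 << k) for _ in range(length)]
        bad = list(good)
        bad[rng.randrange(length)] = rng.randrange(1 << k)  # inject
        if compact(bad, k) == compact(good, k):
            alias += 1
    return alias / trials
```

For k = 4 the estimate comes out near 1/16, matching the asymptotic 2^-k coverage drop for binary adders.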

Journal ArticleDOI
TL;DR: The authors show that with load forwarding, the three types of code expanding optimizations jointly improve the performance of small caches and have little effect on large caches.
Abstract: Shows that code expanding optimizations have strong and nonintuitive implications on instruction cache design. Three types of code expanding optimizations are studied in this paper: instruction placement, function inline expansion, and superscalar optimizations. Overall, instruction placement reduces the miss ratio of small caches. Function inline expansion improves the performance for small cache sizes, but degrades the performance of medium caches. Superscalar optimizations increase the miss ratio for all cache sizes. However, they also increase the sequentiality of instruction access so that a simple load forwarding scheme effectively cancels the negative effects. Overall, the authors show that with load forwarding, the three types of code expanding optimizations jointly improve the performance of small caches and have little effect on large caches.