
Showing papers in "IEEE Transactions on Very Large Scale Integration (VLSI) Systems" in 1994


Journal ArticleDOI
TL;DR: A power analysis technique, applied to two commercial microprocessors, is developed that can be employed to evaluate the power cost of embedded software and to help verify whether a design meets its specified power constraints.
Abstract: Embedded computer systems are characterized by the presence of a dedicated processor and the software that runs on it. Power constraints are increasingly becoming the critical component of the design specification of these systems. At present, however, power analysis tools can only be applied at the lower levels of the design-the circuit or gate level. It is either impractical or impossible to use the lower level tools to estimate the power cost of the software component of the system. This paper describes the first systematic attempt to model this power cost. A power analysis technique is developed that has been applied to two commercial microprocessors-the Intel 486DX2 and the Fujitsu SPARClite 934. This technique can be employed to evaluate the power cost of embedded software. This can help in verifying whether a design meets its specified power constraints. Further, it can also be used to search the design space in software power optimization. Examples with power reductions of up to 40%, obtained by rewriting code using the information provided by the instruction-level power model, illustrate the potential of this idea.
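To make the instruction-level idea concrete, the sketch below sums an assumed base current per opcode plus an assumed inter-instruction overhead over a trace; the opcode names, currents, and cycle counts are invented for illustration and are not the measured 486DX2 or SPARClite data.

    # Hedged sketch of an instruction-level energy estimate; all numbers are assumed.
    VDD_V = 3.3
    CYCLE_NS = 25.0  # assumed clock period

    BASE_CURRENT_mA = {"mov": 300.0, "add": 310.0, "mul": 330.0, "load": 410.0}
    PAIR_OVERHEAD_mA = {("mov", "add"): 15.0, ("add", "mul"): 25.0}  # circuit-state overhead
    CYCLES = {"mov": 1, "add": 1, "mul": 2, "load": 2}

    def program_energy_pJ(trace):
        """E = sum over the trace of Vdd * (base + overhead) * cycles * cycle time."""
        total, prev = 0.0, None
        for op in trace:
            current = BASE_CURRENT_mA[op] + PAIR_OVERHEAD_mA.get((prev, op), 0.0)
            total += VDD_V * current * CYCLES[op] * CYCLE_NS  # V * mA * ns = pJ
            prev = op
        return total

    print(program_energy_pJ(["load", "mov", "add", "mul"]))

Rewriting a program then amounts to searching for a trace with the same behavior but a lower sum, which is how power reductions such as the reported 40% can be obtained.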

1,055 citations


Journal ArticleDOI
TL;DR: A review of the power estimation techniques that have recently been proposed for very large scale integrated (VLSI) circuits is presented.
Abstract: With the advent of portable and high-density microelectronic devices, the power dissipation of very large scale integrated (VLSI) circuits is becoming a critical concern. Accurate and efficient power estimation during the design phase is required in order to meet the power specifications without a costly redesign process. In this paper, we present a review of the power estimation techniques that have recently been proposed.
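Most of the techniques reviewed ultimately evaluate the standard dynamic power expression from per-node switching activities; the sketch below applies it with assumed node capacitances and activities.

    # Dynamic power from switching activity (assumed capacitances and activities).
    VDD = 3.3      # volts
    FREQ = 50e6    # clock frequency, hertz

    nodes = [      # (node capacitance in farads, expected transitions per cycle)
        (20e-15, 0.10),
        (35e-15, 0.25),
        (50e-15, 0.04),
    ]

    # P = sum over nodes of 0.5 * alpha * C * Vdd^2 * f
    power_W = sum(0.5 * alpha * c * VDD**2 * FREQ for c, alpha in nodes)
    print(f"estimated dynamic power: {power_W * 1e6:.2f} uW")

The difficult part, and the subject of the review, is estimating the activities alpha accurately without exhaustive simulation.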

696 citations


Journal ArticleDOI
TL;DR: The dissipation of the adiabatic amplifier is compared to that of conventional switching circuits, both for the case of a fixed voltage swing and the case when the voltage swing can be scaled to reduce power dissipation.
Abstract: Adiabatic switching is an approach to low-power digital circuits that differs fundamentally from other practical low-power techniques. When adiabatic switching is used, the signal energies stored on circuit capacitances may be recycled instead of dissipated as heat. We describe the fundamental adiabatic amplifier circuit and analyze its performance. The dissipation of the adiabatic amplifier is compared to that of conventional switching circuits, both for the case of a fixed voltage swing and the case when the voltage swing can be scaled to reduce power dissipation. We show how combinational and sequential adiabatic-switching logic circuits may be constructed and describe the timing restrictions required for adiabatic operation. Small chip-building experiments have been performed to validate the techniques and to analyze the associated circuit overhead.
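The quantitative point is that a conventional switch dissipates a fixed (1/2)CV^2 per transition, while an adiabatic, ramped charge transfer dissipates roughly (RC/T)CV^2, which shrinks as the ramp time T grows. A numerical sketch with assumed component values:

    # Energy per charging event (R, C, V values are assumed for illustration).
    R = 1e3        # ohms, effective switch resistance
    C = 100e-15    # farads, load capacitance
    V = 3.3        # volts, final voltage swing

    E_conventional = 0.5 * C * V**2          # dissipated regardless of switching speed

    def E_adiabatic(T):
        """Dissipation for a ramped charge transfer of duration T (valid for T >> RC)."""
        return (R * C / T) * C * V**2

    for T in (1e-9, 10e-9, 100e-9):
        print(f"T = {T*1e9:5.0f} ns  conventional = {E_conventional*1e15:.1f} fJ  "
              f"adiabatic = {E_adiabatic(T)*1e15:.2f} fJ")

Slower ramps trade speed for energy, which is the fundamental tradeoff behind the timing restrictions the abstract mentions.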

609 citations


Journal ArticleDOI
TL;DR: This work presents a powerful sequential logic optimization method based on selectively precomputing the output logic values of the circuit one clock cycle before they are required, and using the precomputed values to reduce internal switching activity in the succeeding clock cycle.
Abstract: We address the problem of optimizing logic-level sequential circuits for low power. We present a powerful sequential logic optimization method that is based on selectively precomputing the output logic values of the circuit one clock cycle before they are required, and using the precomputed values to reduce internal switching activity in the succeeding clock cycle. We present two different precomputation architectures which exploit this observation. The primary optimization step is the synthesis of the precomputation logic, which computes the output values for a subset of input conditions. If the output values can be precomputed, the original logic circuit can be "turned off" in the next clock cycle and will have substantially reduced switching activity. The size of the precomputation logic determines the power dissipation reduction, area increase, and delay increase relative to the original circuit. Given a logic-level sequential circuit, we present an automatic method of synthesizing precomputation logic so as to achieve maximal reductions in power dissipation. We present experimental results on various sequential circuits. Up to 75% reductions in average switching activity and power dissipation are possible with marginal increases in circuit area and delay.
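A commonly cited illustration of the precomputation idea (used here only as an example; the paper's benchmark circuits are not reproduced) is an n-bit comparator whose result is fixed by the most significant bits whenever they differ, allowing the rest of the circuit to be disabled for that cycle.

    # Sketch: precomputation for an n-bit comparator C = (A > B).
    # If the MSBs differ, the result is known one cycle early and the registers
    # feeding the full comparator need not be clocked in the next cycle.
    def precompute_msb(a_msb: int, b_msb: int):
        """Return (known, value): the output is decided by the MSBs alone when they differ."""
        if a_msb != b_msb:
            return True, a_msb > b_msb
        return False, None

    def compare_with_gating(a: int, b: int, nbits: int = 8):
        a_msb, b_msb = (a >> (nbits - 1)) & 1, (b >> (nbits - 1)) & 1
        known, value = precompute_msb(a_msb, b_msb)
        if known:
            return value, "full comparator disabled this cycle"
        return a > b, "full comparator active"

    print(compare_with_gating(0b10010011, 0b01110000))  # decided by the MSBs alone
    print(compare_with_gating(0b00010011, 0b00110000))  # needs the full comparator

For uniformly random inputs the MSBs differ half the time, so the large comparator is idle in roughly 50% of cycles; that idleness is where the switching-activity savings come from.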

326 citations


Journal ArticleDOI
TL;DR: The combination of supply-voltage scaling and self-timed circuitry, which has some unique advantages, is described, together with a thorough analysis of the power savings that are possible using this technique.
Abstract: Recent research has demonstrated that for certain types of applications like sampled audio systems, self-timed circuits can achieve very low power consumption, because unused circuit parts automatically turn into a stand-by mode. Additional savings may be obtained by combining the self-timed circuits with a mechanism that adaptively adjusts the supply voltage to the lowest value possible while maintaining the performance requirements. This paper describes such a mechanism, analyzes the possible power savings, and presents a demonstrator chip that has been fabricated and tested. The idea of voltage scaling has been used previously in synchronous circuits, and the contributions of the present paper are: 1) the combination of supply scaling and self-timed circuitry, which has some unique advantages, and 2) the thorough analysis of the power savings that are possible using this technique.
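The mechanism can be caricatured as a feedback loop: if the self-timed core misses its throughput target, raise the supply; if it has slack, lower it. The delay-versus-voltage model and the step size below are assumptions for illustration only, not the regulator used on the demonstrator chip.

    # Toy feedback loop for adaptive supply-voltage scaling (all numbers assumed).
    VT = 0.7                        # threshold voltage, volts

    def stage_delay_ns(vdd):
        """Crude delay-vs-supply model: delay grows sharply as Vdd approaches VT."""
        return 10.0 * vdd / (vdd - VT) ** 2

    REQUIRED_DELAY_NS = 40.0        # performance target per operation
    vdd, step = 3.3, 0.05
    for _ in range(100):
        if stage_delay_ns(vdd) > REQUIRED_DELAY_NS:   # too slow: raise the supply
            vdd += step
        else:                                         # meeting the target: try lowering it
            vdd -= step
    print(f"supply settles near {vdd:.2f} V, delay {stage_delay_ns(vdd):.1f} ns")

Because the circuit is self-timed, running at whatever speed the lowered supply allows is functionally safe, which is the unique advantage the abstract points to.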

255 citations


Journal ArticleDOI
TL;DR: A novel way of implementing the leading zero detector (LZD) circuit is presented based on an algorithmic approach resulting in a modular and scalable circuit for any number of bits.
Abstract: A novel way of implementing the leading zero detector (LZD) circuit is presented. The implementation is based on an algorithmic approach, resulting in a modular and scalable circuit for any number of bits. We designed 32- and 64-bit leading zero detector circuits in CMOS and ECL technology. The CMOS version was designed using both logic synthesis and the algorithmic approach. The algorithmic implementation is compared with the results obtained using modern logic synthesis tools in the same 0.6 μm CMOS technology. The implementation based on the algorithmic approach showed an advantage compared to the results produced by logic synthesis. The ECL implementation of the 64-bit LZD circuit was simulated to perform in under 200 ps at nominal speed.
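The modular, width-scalable structure follows from a simple recurrence: the leading-zero count of a 2k-bit word is derived from the counts of its two k-bit halves. A behavioral sketch of that recurrence (not the gate-level circuit):

    def lzd(bits):
        """Leading-zero count of a bit string, built by pairwise combination.

        Each level combines two half-word results: if the upper half is all zeros,
        add its width to the lower half's count; otherwise keep the upper half's
        count. This mirrors a modular, width-scalable circuit structure."""
        n = len(bits)
        if n == 1:
            return 0 if bits == "1" else 1
        upper, lower = bits[: n // 2], bits[n // 2:]
        u = lzd(upper)
        return u if u < len(upper) else len(upper) + lzd(lower)

    print(lzd("00000000000000000000000101100000"))  # -> 23 for this 32-bit word

Each level of the recursion corresponds to one layer of identical combining blocks, which is what makes the circuit scale cleanly to any word width.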

167 citations


Journal ArticleDOI
TL;DR: The results are extremely encouraging, indicating that double edge triggered flip-flops are capable of significant energy savings, for only a small overhead in complexity.
Abstract: In this paper we study the power savings possible using double edge triggered (DET) instead of conventional single edge triggered (SET) flip-flops. We begin the paper by introducing a set of novel D-type double edge triggered flip-flops which can be implemented with fewer transistors than any previous design. The power dissipation in these flip-flops and single edge triggered flip-flops is compared via architectural level studies, analytical considerations and simulations. The analysis includes an implementation independent study on the effect of input sequences on the energy dissipation of single and double edge triggered flip-flops. The system level energy savings possible by using registers consisting of double edge triggered flip-flops, instead of single edge triggered flip-flops, are subsequently explored. The results are extremely encouraging, indicating that double edge triggered flip-flops are capable of significant energy savings, for only a small overhead in complexity.
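The dominant system-level effect is on the clock network: a DET register captures data on both edges, so the clock can run at half the frequency for the same data throughput. A back-of-the-envelope sketch with assumed clock-load capacitances:

    # Clock-network power for SET vs DET registers (all values assumed for illustration).
    VDD = 3.3
    DATA_RATE = 100e6                 # words per second required
    C_CLOCK_SET = 10e-12              # clock load of SET register + distribution, farads
    C_CLOCK_DET = 12e-12              # DET cells typically load the clock somewhat more

    # The clock toggles twice per period, so P_clk = C * Vdd^2 * f_clk.
    p_set = C_CLOCK_SET * VDD**2 * DATA_RATE          # SET needs f_clk equal to the data rate
    p_det = C_CLOCK_DET * VDD**2 * (DATA_RATE / 2)    # DET needs only half the clock rate

    print(f"SET clock power {p_set*1e3:.2f} mW, DET clock power {p_det*1e3:.2f} mW")

With these assumed numbers the slightly larger DET clock load is more than offset by halving the clock rate; the paper quantifies this tradeoff in detail at the flip-flop and register level.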

149 citations


Journal ArticleDOI
TL;DR: A polynomial time optimal algorithm is developed for computing an area-minimum mapping solution without node duplication for a K-bounded general Boolean network, which makes a significant step towards complete understanding of the general area minimization problem in FPGA technology mapping.
Abstract: In this paper, we study the area and depth trade-off in lookup-table (LUT) based FPGA technology mapping. Starting from a depth-optimal mapping solution, we perform a sequence of depth relaxation operations and area-minimizing mapping procedures to produce a set of mapping solutions for a given design with smooth area and depth trade-off. As the core of the area minimization step, we have developed a polynomial time optimal algorithm for computing an area-minimum mapping solution without node duplication for a K-bounded general Boolean network, which makes a significant step towards complete understanding of the general area minimization problem in FPGA technology mapping. The experimental results on MCNC benchmark circuits show that our solution sets outperform the solutions produced by most existing mapping algorithms in terms of both area and depth minimization.

144 citations


Journal ArticleDOI
TL;DR: Three schemes for concurrent error detection in multilevel circuits are proposed, with which all single stuck-at faults in the circuit can be detected concurrently.
Abstract: Conventional logic synthesis systems are targeted towards reducing the area required by a logic block, as measured by the literal count or gate count; or improving the performance in terms of gate delays; or improving the testability of the synthesized circuit, as measured by the irredundancy of the resultant circuit. In this paper, we address the problem of developing reliability driven logic synthesis algorithms for multilevel logic circuits, which are integrated within the MIS synthesis system. Our procedures are based on concurrent error detection techniques that have been proposed in the past for two-level circuits, and on adapting those techniques to multilevel logic synthesis algorithms. Three schemes for concurrent error detection in a multilevel circuit are proposed in this paper, with which all single stuck-at faults in the circuit can be detected concurrently. The first scheme uses duplication of a given multilevel circuit with the addition of a totally self-checking comparator. The second scheme proposes a procedure to generate the multilevel circuit from a two-level representation under some constraint such that the Berger code of the output vector can be used to detect any single fault inside the circuit, except at the inputs. A constrained technology mapping procedure is also presented in this paper. The third scheme is based on parity codes on the outputs. The outputs are partitioned using a novel partitioning algorithm, and each partition is implemented using a multilevel circuit. Some additional parity coded outputs are generated. In all three schemes, all the necessary checkers are generated automatically and the whole circuit is placed and routed using the Timberwolf layout package. The area overheads for several benchmark examples are reported in this paper. The entire procedure is integrated into a new system called RSYN.
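The second scheme's check relies on the Berger code, whose check symbol is simply the number of 0's in the information word; any unidirectional error changes that count and is therefore caught. A small sketch of the encoding and the check (the synthesis constraint that internal faults cause only unidirectional output errors is what makes this check sufficient):

    def berger_check_bits(word_bits: str) -> int:
        """Berger check symbol: the number of 0's in the information word."""
        return word_bits.count("0")

    def berger_ok(word_bits: str, check: int) -> bool:
        return berger_check_bits(word_bits) == check

    outputs = "10110100"                        # circuit outputs (information bits)
    check = berger_check_bits(outputs)          # generated alongside the outputs
    print(berger_ok(outputs, check))            # True: fault-free case

    faulty = "10111100"                         # a unidirectional 0 -> 1 error
    print(berger_ok(faulty, check))             # False: the error is detected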

116 citations


Journal ArticleDOI
TL;DR: This paper studies the simultaneous driver and wire sizing (SDWS) problem under two objective functions: i) delay minimization only, or ii) combined delay and power dissipation minimization, and presents efficient algorithms for computing optimal SDWS solutions under the two objectives.
Abstract: In this paper, we study the simultaneous driver and wire sizing (SDWS) problem under two objective functions: i) delay minimization only, or ii) combined delay and power dissipation minimization. We present general formulations of the SDWS problem under these two objectives based on the distributed Elmore delay model with consideration of both capacitive power dissipation and short-circuit power dissipation. We show several interesting properties of the optimal SDWS solutions under the two objectives, including an important result which reveals the relationship between driver sizing and optimal wire sizing. These results lead to polynomial time algorithms for computing the lower and upper bounds of optimal SDWS solutions under the two objectives, and efficient algorithms for computing optimal SDWS solutions under the two objectives. We have implemented these algorithms and compared them with existing design methods for driver sizing only or independent driver and wire sizing. Accurate SPICE simulation shows that our methods reduce delay by 12%-49% and power dissipation by 26%-63% compared with existing design methods.
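The quantity being optimized is the distributed Elmore delay of the driver-plus-wire stage. The sketch below evaluates that delay for a uniform-width wire as driver size and wire width vary; the unit resistances and capacitances are assumed values, and the actual formulation also allows per-segment wire widths.

    # Elmore delay of a driver plus a uniform wire driving a load (assumed unit values).
    R_DRV_UNIT = 10e3       # ohms for a minimum-size driver
    R_WIRE_SQ = 0.08        # ohms per square of wire
    C_WIRE_AREA = 0.06e-15  # farads per square micron of wire
    C_LOAD = 50e-15         # receiver load, farads
    LENGTH_UM = 2000.0

    def elmore_delay(driver_size, wire_width_um):
        r_d = R_DRV_UNIT / driver_size
        r_w = R_WIRE_SQ * LENGTH_UM / wire_width_um
        c_w = C_WIRE_AREA * LENGTH_UM * wire_width_um
        # The driver resistance sees all downstream capacitance; the distributed
        # wire resistance sees half its own capacitance plus the load.
        return r_d * (c_w + C_LOAD) + r_w * (0.5 * c_w + C_LOAD)

    for size, width in [(1, 1.0), (8, 1.0), (8, 3.0), (32, 3.0)]:
        print(f"driver x{size:2d}, width {width:.1f} um -> {elmore_delay(size, width)*1e9:.2f} ns")

The interplay visible even in this toy model (a wider wire helps only once the driver is large enough to drive the added capacitance) is the kind of driver-wire relationship the paper characterizes.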

114 citations


Journal ArticleDOI
TL;DR: An assessment of the strengths and weaknesses of using FPGA's for floating-point arithmetic is presented.
Abstract: We present empirical results describing the implementation of an IEEE Standard 754 compliant floating-point adder/multiplier using field programmable gate arrays. The use of FPGA's permits fast and accurate quantitative evaluation of a variety of circuit design tradeoffs for addition and multiplication. FPGA's also permit accurate assessments of the area and time costs associated with various features of the IEEE floating-point standard, including rounding and gradual underflow. These costs are analyzed, along with the effects of architectural correlation, a phenomenon that occurs when the cost of combining architectural features exceeds the sum of the costs of their separate implementations. We conclude with an assessment of the strengths and weaknesses of using FPGA's for floating-point arithmetic.

Journal ArticleDOI
TL;DR: The inability to contain faults within single cells and the need for fast reconfiguration are identified as the key obstacles to obtaining a significant increase in yield.
Abstract: The fine granularity and reconfigurable nature of field-programmable gate arrays (FPGA's) suggest that defect-tolerant methods can be readily applied to these devices in order to increase their maximum economic sizes through increased yield. This paper identifies the inability to contain faults within single cells and the need for fast reconfiguration as the key obstacles to obtaining a significant increase in yield. Monte Carlo defect modeling of the photolithographic layers of VLSI FPGA's is used as a foundation for the yield modeling of various defect-tolerant architectures. Results suggest that a medium-grain architecture is the best solution, offering a substantial increase in size without significant side effects. This architecture is shown to produce greater gate densities than the alternative approach of realizing ultralarge-scale FPGA's: multichip modules.
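The yield-modeling flow can be imitated in miniature: scatter random defects over an array of cells and ask whether the spare resources of a given defect-tolerant architecture can repair the chip. The array size, defect rate, and row-sparing scheme below are arbitrary assumptions, not the layer-by-layer photolithographic model used in the paper.

    import math
    import random

    # Monte Carlo yield of a cell array with one spare row (all parameters assumed).
    ROWS = 16
    MEAN_DEFECTS = 1.5          # average defects per chip
    SPARE_ROWS = 1
    TRIALS = 20000

    def poisson(lam):
        """Draw a Poisson-distributed defect count (Knuth's method, no dependencies)."""
        limit, k, p = math.exp(-lam), 0, 1.0
        while True:
            p *= random.random()
            if p <= limit:
                return k
            k += 1

    def chip_is_repairable():
        defects = poisson(MEAN_DEFECTS)
        bad_rows = {random.randrange(ROWS) for _ in range(defects)}  # row hit by each defect
        return len(bad_rows) <= SPARE_ROWS      # repairable if the spares cover the damage

    random.seed(1)
    estimate = sum(chip_is_repairable() for _ in range(TRIALS)) / TRIALS
    print(f"estimated yield with row sparing: {estimate:.3f}")

Changing the sparing granularity (cell, row, or column spares) and re-running the same loop is the miniature analogue of comparing the defect-tolerant architectures studied in the paper.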

Journal ArticleDOI
TL;DR: This paper provides the first in-depth formal analysis of the structure of the constraints in high-level synthesis, and shows how to exploit that structure in a well-designed ILP formulation by adding new valid inequalities.
Abstract: In integer linear programming (ILP), formulating a "good" model is of crucial importance to solving that model. In this paper, we begin with a mathematical analysis of the structure of the assignment, timing, and resource constraints in high-level synthesis, and then evaluate the structure of the scheduling polytope described by these constraints. We then show how the structure of the constraints can be exploited to develop a well-structured ILP formulation, which can serve as a solid theoretical foundation for future improvement. As a start in that direction, we also present two methods to further tighten the formulation. The contribution of this paper is twofold: 1) it provides the first in-depth formal analysis of the structure of the constraints, and it shows how to exploit that structure in a well-designed ILP formulation, and 2) it shows how to further improve a well-structured formulation by adding new valid inequalities.
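For orientation, the assignment, timing, and resource constraints in a time-indexed scheduling model typically take the following generic form (a standard formulation from the high-level synthesis literature; the paper analyzes and tightens the polytope of such a formulation):

    \sum_{t} x_{i,t} = 1 \quad \forall i \qquad \text{(assignment: each operation starts exactly once)}

    \sum_{t} t\,x_{j,t} - \sum_{t} t\,x_{i,t} \ge d_i \quad \forall\, i \to j \qquad \text{(timing/precedence)}

    \sum_{i \in \mathrm{ops}(k)} \; \sum_{\tau = t-d_i+1}^{t} x_{i,\tau} \le M_k \quad \forall k, t \qquad \text{(resources)}

    x_{i,t} \in \{0, 1\}

Here x_{i,t} = 1 when operation i starts in control step t, d_i is its latency, and M_k is the number of functional units of type k. The valid inequalities mentioned in the abstract are extra constraints that cut off fractional solutions of the LP relaxation without removing any integer schedule.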

Journal ArticleDOI
TL;DR: Analysis of the reliability model indicates that, for some circuits, the reliability obtained with majority voting techniques is significantly greater than predicted by any previous model.
Abstract: The effect of compensating module faults on the reliability of majority voting based VLSI fault-tolerant circuits is investigated using a fault injection simulation method. This simulation method facilitates consideration of multiple faults in the replicated circuit modules as well as the majority voting circuits to account for the fact that, in VLSI implementations, the majority voting circuits are constructed from components of the same reliability as those used to construct the circuit modules. From the fault injection simulation, a survivability distribution is obtained which, when combined with an area overhead expression, leads to a more accurate reliability model for majority voting based VLSI fault-tolerant circuits. The new model is extended to facilitate the calculation of reliability of fault-tolerant circuits which have sustained faults but continue to operate properly. Analysis of the reliability model indicates that, for some circuits, the reliability obtained with majority voting techniques is significantly greater than predicted by any previous model.
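The baseline being refined is the classical triple modular redundancy (TMR) expression with an imperfect voter; compensating faults matter because two faulty modules do not necessarily defeat the vote. A sketch of the classical model and of a crude compensation-aware variant (the compensation probability here is an assumed parameter, not the paper's simulation-derived survivability distribution):

    # Reliability of triple modular redundancy with an imperfect voter.
    def r_tmr(r_module, r_voter):
        """Classical model: the voter works and at most one module has failed."""
        return r_voter * (3 * r_module**2 - 2 * r_module**3)

    def r_tmr_compensating(r_module, r_voter, p_comp=0.3):
        """Variant: with probability p_comp (assumed), two faulty modules err on
        disjoint outputs and the majority vote still succeeds."""
        p_two_faulty = 3 * r_module * (1 - r_module) ** 2
        return r_tmr(r_module, r_voter) + r_voter * p_comp * p_two_faulty

    for rm in (0.99, 0.95, 0.90):
        print(rm, round(r_tmr(rm, 0.995), 4), round(r_tmr_compensating(rm, 0.995), 4))

The compensation term is exactly the effect that makes the measured reliability higher than the classical model predicts for some circuits.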

Journal ArticleDOI
TL;DR: This paper describes a new high-level synthesis system based on the hierarchical production based specification (PBS); the system automatically constructs a controlling machine from the PBS, and this process is not impacted by the possibly exponentially larger deterministic state space of the designs.
Abstract: This paper describes a new high-level synthesis system based on the hierarchical production based specification (PBS). Advantages of this form of specification are that the designer does not describe the control flow in terms of explicit states or control variables, and that the designer does not describe a particular form of implementation. The production-based specification also separates the specification of the control aspects and data-flow aspects of the design. The control is implicitly described via the production hierarchy, while the data-flow is described as action computations. This approach is a hardware analog of popular software engineering techniques. The Clairvoyant system automatically constructs a controlling machine from the PBS and this process is not impacted by the possibly exponentially larger deterministic state space of the designs. The encodings generated by the constructions compare favorably to encodings derived using graph-based state encoding techniques in terms of logic complexity and logic depth. These construction techniques utilize recent advances in BDD techniques.

Journal ArticleDOI
TL;DR: A compact analog synapse cell, not biased in the subthreshold region, is presented for fully-parallel operation; it can approximate a Gaussian function with accuracy around 98% in the ideal case.
Abstract: Back-propagation neural networks with Gaussian function synapses have better convergence properties than those with linear-multiplying synapses. In digital simulation, more computing time is spent on Gaussian function evaluation. We present a compact analog synapse cell which is not biased in the subthreshold region for fully-parallel operation. This cell can approximate a Gaussian function with accuracy around 98% in the ideal case. Device mismatch induced by the fabrication process will cause some degradation of this approximation. The Gaussian synapse cell can also be used in unsupervised learning. Programmability of the proposed Gaussian synapse cell is achieved by changing the stored synapse weight W_ji, the reference current, and the sizes of transistors in the differential pair.

Journal ArticleDOI
Sandip Kundu1
TL;DR: A diagnosis system that can diagnose faults in a scan chain is described, so that the manufacturing process or physical design can be fixed to improve yield.
Abstract: Testing screens for good chips. However, when test fallout is high (low yield), it becomes necessary to diagnose faults so that the manufacturing process or physical design can be fixed to improve yield. Several scan based diagnostic schemes are used in industry. They work when the scan chain itself is fault free. In this paper we describe a diagnosis system that can diagnose faults in a scan chain.

Journal ArticleDOI
TL;DR: It is shown that by sizing transistors judiciously it is possible to gain significant speed improvements at the cost of only a slight increase in power and hence a better power-delay product.
Abstract: An approach to designing CMOS adders for both high speed and low power is presented by analyzing the performance of three types of adders: linear time adders, log N time adders, and constant time adders. The representative adders used are a ripple carry adder, a blocked carry lookahead adder, and several signed-digit adders, respectively. Some of the tradeoffs that are possible during the logic design of an adder to improve its power-delay product are identified. An effective way of improving the speed of a circuit is by transistor sizing, which unfortunately increases power dissipation to a large extent. It is shown that by sizing transistors judiciously it is possible to gain significant speed improvements at the cost of only a slight increase in power and hence a better power-delay product. Perflex, an in-house performance driven layout generator, is used to systematically generate sized layouts.

Journal ArticleDOI
TL;DR: An Integer Linear Programming model for the self-recovering microarchitecture synthesis problem is presented; the resulting ILP formulation can minimize either the number of voters or the overall hardware, subject to constraints on the number of clock cycles, the retry period, and the number of checkpoints.
Abstract: The growing trend towards VLSI implementation of crucial tasks in critical applications has increased both the demand for and the scope of fault-tolerant VLSI systems. In this paper, we present a self-recovering microarchitecture synthesis system. In a self-recovering microarchitecture, intermediate results are compared at regular intervals and, if correct, saved in registers (checkpointing). On the other hand, on detecting a fault, the self-recovering microarchitecture rolls back to a previous checkpoint and retries. The proposed synthesis system comprises a heuristic and an optimal subsystem. The heuristic synthesis subsystem has two components. Whereas the checkpoint insertion algorithm identifies good checkpoints by successively eliminating clock cycle boundaries that either have a high checkpoint overhead or violate the retry period constraint, the novel edge-based scheduler assigns edges to clock cycle boundaries, in addition to scheduling nodes to clock cycles. Also, checkpoint insertion and edge-based scheduling are intertwined using a flexible synthesis methodology. We additionally show an Integer Linear Programming model for the self-recovering microarchitecture synthesis problem. The resulting ILP formulation can minimize either the number of voters or the overall hardware, subject to constraints on the number of clock cycles, the retry period, and the number of checkpoints.

Journal ArticleDOI
TL;DR: This paper describes a formal method for the diagnosis and correction of logic design errors in an incorrect gate-level implementation, which is robust and covers all simple design errors described by Abadir et al. (1988).
Abstract: Logic verification tools are often used to verify a gate-level implementation of a digital system in terms of its functional specification. If the implementation is found not to be functionally equivalent to the specification, it is important to correct the implementation automatically. This paper describes a formal method for the diagnosis and correction of logic design errors in an incorrect gate-level implementation. We use Boolean equation techniques to search for potential error locations. An efficient search and pruning algorithm is developed by introducing the notion of an immediate dominator set. Two correction procedures are proposed. Gate correction corrects errors such as wrong gate type, missing inverters, etc.; line correction corrects errors such as missing wires and wrong connections. Our method is robust and covers all simple design errors described by Abadir et al. (1988). Experimental results for a set of ISCAS and MCNC benchmark circuits demonstrate the effectiveness of the proposed techniques.

Journal ArticleDOI
Jacob Savir1, S. Patil1
TL;DR: This paper concentrates on the generation of broad-side delay test vectors, shows the results of experiments conducted on the ISCAS sequential benchmarks, and discusses some concerns of the broad-side delay test strategy.
Abstract: A broad-side delay test is a form of a scan-based delay test, where the first vector of the pair is scanned into the chain, and the second vector of the pair is the combinational circuit's response to this first vector. This delay test form is called "broad-side" since the second vector of the delay test pair is provided in a broad-side fashion, namely through the logic. This paper concentrates on the generation of broad-side delay test vectors, shows the results of experiments conducted on the ISCAS sequential benchmarks, and discusses some concerns of the broad-side delay test strategy.
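The defining property of a broad-side pair is that the second vector is never scanned in: it is whatever the combinational logic produces from the first vector after the launch clock. A tiny sketch for an assumed 3-bit next-state function:

    # Broad-side (launch-off-capture) delay test pair for a toy circuit.
    def next_state(bits):
        """Assumed combinational next-state function of a 3-bit machine."""
        a, b, c = bits
        return (a ^ b, b & c, a | c)

    v1 = (0, 1, 1)          # scanned into the flip-flops through the scan chain
    v2 = next_state(v1)     # applied "through the logic" on the next functional clock
    print("launch vector:", v1, "second vector:", v2)

Only pairs of the form (v, F(v)) are applicable, which is the main limitation on achievable fault coverage compared with an arbitrary two-vector delay test.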

Journal ArticleDOI
TL;DR: A new, parallel, nearest-neighbor (NN) pattern classifier, based on a 2D Cellular Automaton (CA) architecture, is presented, which produces piece-wise linear discriminant curves between clusters of points of complex shape (nonlinearly separable).
Abstract: A new, parallel, nearest-neighbor (NN) pattern classifier, based on a 2D Cellular Automaton (CA) architecture, is presented in this paper. The proposed classifier is both time and space efficient, when compared with already existing NN classifiers, since it does not require complex distance calculations and ordering of distances, and storage requirements are kept minimal since each cell stores information only about its nearest neighborhood. The proposed classifier produces piece-wise linear discriminant curves between clusters of points of complex shape (nonlinearly separable) using the computational geometry concept known as the Voronoi diagram, which is established through CA evolution. These curves are established during an "off-line" operation and, thus, the subsequent classification of unknown patterns is achieved very fast. The VLSI design and implementation of a nearest neighborhood processor of the proposed 2D CA architecture is also presented in this paper.
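The classification mechanism can be mimicked in a few lines: labeled seed cells expand one cell per CA step, each empty cell adopts the label of the first wave that reaches it, and the resulting regions approximate a (city-block) Voronoi partition. The grid size and seeds below are arbitrary examples.

    from collections import deque

    # Cellular-automaton-style wave expansion that carves a grid into nearest-neighbor
    # regions (a city-block Voronoi partition); the seeds are assumed training points.
    N = 12
    seeds = {(2, 2): "A", (9, 3): "B", (5, 9): "C"}     # training points with class labels

    label = {pos: cls for pos, cls in seeds.items()}
    frontier = deque(seeds)
    while frontier:                                     # breadth-first pass == successive CA steps
        x, y = frontier.popleft()
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nx < N and 0 <= ny < N and (nx, ny) not in label:
                label[(nx, ny)] = label[(x, y)]         # the first wave to arrive wins
                frontier.append((nx, ny))

    print(label[(8, 8)])   # an unknown pattern is classified by the region it lands in

Because the region map is built once, off-line, classifying a new pattern afterwards is a single table lookup, which is the speed advantage the abstract describes.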

Journal ArticleDOI
TL;DR: The commonly employed models, most notably the large area clustering negative binomial distribution, do not provide a sufficiently good match for these large area VLSI IC's; only the recently proposed medium size clustering model is close enough to the empirical distribution.
Abstract: Defect maps of 57 wafers containing large area VLSI IC's were analyzed in order to find a good match between the empirical distribution of defects and a theoretical model. Our main result is that the commonly employed models, most notably, the large area clustering negative binomial distribution, do not provide a sufficiently good match for these large area IC's. Only the recently proposed medium size clustering model is close enough to the empirical distribution. An even better match can be obtained either by combining two theoretical distributions or by a "censoring" procedure in which the worst chips are ignored. Another goal of the study was to find out whether certain portions of either the chip or the wafer had more defects than the others.
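The distinction being tested is essentially which yield formula fits the data: the Poisson model assumes independent defects, while the negative binomial model adds a clustering parameter alpha. A sketch comparing the two for assumed chip area and defect density (the wafer data themselves are of course not reproduced here):

    import math

    # Yield models: Poisson vs. clustered negative binomial (all parameters assumed).
    def yield_poisson(area_cm2, d0_per_cm2):
        return math.exp(-area_cm2 * d0_per_cm2)

    def yield_neg_binomial(area_cm2, d0_per_cm2, alpha):
        """alpha -> infinity recovers the Poisson model; small alpha means heavy clustering."""
        return (1.0 + area_cm2 * d0_per_cm2 / alpha) ** (-alpha)

    AREA, D0 = 2.0, 0.8          # cm^2 per chip, defects per cm^2 (assumed)
    print("Poisson            :", round(yield_poisson(AREA, D0), 3))
    for alpha in (0.5, 2.0, 10.0):
        print(f"neg. binomial a={alpha:4.1f}:", round(yield_neg_binomial(AREA, D0, alpha), 3))

Fitting alpha (and the size of the region over which defects cluster) against the empirical defect maps is what separates the large-area and medium-size clustering models compared in the paper.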

Journal ArticleDOI
TL;DR: This paper examines the transition delay of a circuit and provides a procedure for directly calculating the transition delay, which outputs a vector sequence that may be timing simulated to certify static timing verification.
Abstract: Most research in timing verification has implicitly assumed a single vector floating mode computation of delay which is an approximation of the multivector transition delay. In this paper we examine the transition delay of a circuit and demonstrate that the transition delay of a circuit can differ from the floating delay of a circuit. We then provide a procedure for directly calculating the transition delay of a circuit. The most practical benefit of this procedure is the fact that it not only results in a delay calculation but outputs a vector sequence that may be timing simulated to certify static timing verification.

Journal ArticleDOI
TL;DR: This paper presents methods for scheduling and partitioning behavioral descriptions (e.g., CDFG's) in order to synthesize application-specific multiprocessor systems, and proposes an iterative partitioning heuristic for large applications.
Abstract: In this paper, we present methods for scheduling and partitioning behavioral descriptions (e.g., CDFG's) in order to synthesize application-specific multiprocessor systems. Our target application domain is digital signal processing (DSP). In order to meet the user given constraints (such as timing), maximizing the system throughput and minimizing the amount of communication between processors are important. A model of a target processor and the communication device (i.e., bus, FIFO and delay element) is defined as a basis for the synthesis. We use an integer linear programming formulation to solve the partitioning and scheduling problems simultaneously. The optimization complexity for large applications can be reduced by using a simplified formulation. For even larger applications, we propose an iterative partitioning heuristic. Finally, the formulations are extended to take into account conditional branches, loops, and critical signals.

Journal ArticleDOI
TL;DR: This paper presents two novel sorting network-based architectures for computing high sample rate nonrecursive rank order filters that are based on bubble-sort and Batcher's odd-even merge sort.
Abstract: This paper presents two novel sorting network-based architectures for computing high sample rate nonrecursive rank order filters. The proposed architectures consist of significantly fewer comparators than existing sorting network-based architectures that are based on bubble-sort and Batcher's odd-even merge sort. The reduction in the number of comparators is obtained by sorting the columns of the window only once, and by merging the sorted columns in a way such that the number of candidate elements for the output is very small. The number of comparators per output is reduced even further by processing a block of outputs at a time. Block processing procedures that exploit the computational overlap between consecutive windows are developed for both the proposed networks.
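The comparator savings come from the observation that adjacent windows of a sliding rank order filter share columns, so each column need only be sorted once and the sorted columns merged per window. A behavioral sketch of a 3x3 median filter organized this way (Python's sort stands in for the comparator network):

    # Median (rank-order) filtering along one image row using shared sorted columns.
    # Each 3-pixel column is sorted once and reused by every window that contains it.
    image = [
        [12, 200,  14,  13,  90,  11],
        [10,  15,  16, 180,  17,  12],
        [ 9,  14, 220,  15,  16,  13],
    ]

    def sorted_column(j):
        return sorted(image[r][j] for r in range(3))

    columns = [sorted_column(j) for j in range(len(image[0]))]   # sort each column only once

    def median_at(j):
        window = columns[j - 1] + columns[j] + columns[j + 1]    # merge three sorted columns
        return sorted(window)[4]                                 # rank 5 of 9 = the median

    print([median_at(j) for j in range(1, len(image[0]) - 1)])

In hardware the per-window merge is done by a small comparator network rather than a full sort, and block processing amortizes even that merge over several adjacent outputs.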

Journal ArticleDOI
TL;DR: This paper describes a VLSI architecture for a real-time dynamic code book generator and encoder of 512×512 images at 30 frames/s for image compression applications.
Abstract: Image compression applications use vector quantization (VQ) for its high compression ratio and image quality. The current VQ hardware employs static instead of dynamic code book generation, as the latter demands intensive computation and correspondingly expensive hardware even though it offers better image quality. This paper describes a VLSI architecture for a real-time dynamic code book generator and encoder of 512×512 images at 30 frames/s. The four-chip 0.8 μm CMOS design implements a tree of Kohonen self-organizing maps, and consists of two VQ processors and two image buffer memory chips. The pipelined VQ processor contains a computational core for both code book generation and encoding, and is scalable to processing larger frames.
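The encoding half of such an architecture maps each small image block to the index of its nearest codeword; only the index is stored or transmitted. A minimal sketch with an assumed 2x2-pixel block size and a hand-made code book (the real design generates the code book dynamically with a tree of Kohonen self-organizing maps):

    # Vector quantization encoding: map each image block to its nearest codeword index.
    def sq_dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))

    codebook = [            # assumed 4-entry code book of flattened 2x2 blocks
        (0, 0, 0, 0),
        (255, 255, 255, 255),
        (0, 0, 255, 255),
        (255, 0, 255, 0),
    ]

    def encode(block):
        """Return the index of the closest codeword (the value actually transmitted)."""
        return min(range(len(codebook)), key=lambda i: sq_dist(block, codebook[i]))

    blocks = [(10, 5, 0, 20), (250, 10, 240, 3), (30, 12, 230, 244)]
    print([encode(b) for b in blocks])   # compression: one small index per block

The distance computations are what the systolic VQ processor parallelizes; dynamic code book generation additionally updates the codewords as new image statistics arrive.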

Journal ArticleDOI
TL;DR: Simulation results show that this high-speed image compression VLSI processor based on the systolic architecture of difference-codebook binary tree-searched vector quantization is applicable to many types of image data and capable of producing good reconstructed data quality at high compression ratios.
Abstract: A high-speed image compression VLSI processor based on the systolic architecture of difference-codebook binary tree-searched vector quantization has been developed to meet the increasing demands on large-volume data communication and storage requirements. Simulation results show that this design is applicable to many types of image data and capable of producing good reconstructed data quality at high compression ratios. Various design aspects of the binary tree-searched vector quantizer including the algorithm, architecture, and detailed functional design are thoroughly investigated for VLSI implementation. An 8-level difference-codebook binary tree-searched vector quantizer can be implemented on a custom VLSI chip that includes a systolic array of eight identical processors and a hierarchical memory of eight subcodebook memory banks. The total transistor count is about 300,000 and the die size is about 8.67 × 7.72 mm² in a 1.0 μm CMOS technology. The throughput rate of this high-speed VLSI compression system is approximately 25 Mpixels per second and its equivalent computation power is 600 million instructions per second.

Journal ArticleDOI
TL;DR: This research breaks new ground by guaranteeing globally optimal architectures for multichip systems for a specific objective function, and supporting interchip communication delay, interchip bus allocation, and other complex interface constraints.
Abstract: An optimization approach to the high level synthesis of VLSI multichip architectures is presented in this paper. This research is important for industry since it is well known that these early high level decisions have the greatest impact on the final VLSI implementation. Optimal application-specific architectures are synthesized here to minimize latency given constraints on chip area, I/O pin count, and interchip communication delays. A mathematical integer programming (IP) model for simultaneously partitioning, scheduling, and allocating hardware (functional units, I/O pins, and interchip busses) is formulated. By exploiting the problem structure, using polyhedral theory, the size of the search space is decreased and a new variable selection strategy is introduced based on the branch and bound algorithm. Multichip optimal architectures for several examples are synthesized in practical CPU times. Execution times are comparable to previous heuristic approaches; however, there are significant improvements in optimal schedules and allocations of multichips. This research breaks new ground by 1) simultaneously partitioning, scheduling, and allocating in practical CPU times, 2) guaranteeing globally optimal architectures for multichip systems for a specific objective function, and 3) supporting interchip communication delay, interchip bus allocation, and other complex interface constraints.

Journal ArticleDOI
TL;DR: An optimal and a heuristic approach are presented to solve the binding problem which occurs in high-level synthesis of digital systems; the heuristic is based on a network flow model and also considers floorplanning during the design process to minimize the interconnection area.
Abstract: In this paper we present an optimal and a heuristic approach to solve the binding problem which occurs in high-level synthesis of digital systems. The optimal approach is based on an integer linear programming formulation. Given that such an approach is not practical for large problems, we then derive a heuristic from the ILP formulation which produces very good solutions in a matter of seconds. The heuristic is based on a network flow model and also considers floorplanning during the design process to minimize the interconnection area.