Showing papers in "IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems in 1999"


Journal ArticleDOI
TL;DR: A finite-state, abstract system model for power-managed systems based on Markov decision processes is introduced and the problem of finding policies that optimally trade off performance for power can be cast as a stochastic optimization problem and solved exactly and efficiently.
Abstract: Dynamic power management schemes (also called policies) reduce the power consumption of complex electronic systems by trading off performance for power in a controlled fashion, taking system workload into account. In a power-managed system it is possible to set components into different states, each characterized by performance and power consumption levels. The main function of a power management policy is to decide when to perform component state transitions and which transition should be performed, depending on system history, workload, and performance constraints. In the past, power management policies have been formulated heuristically. The main contribution of this paper is to introduce a finite-state, abstract system model for power-managed systems based on Markov decision processes. Under this model, the problem of finding policies that optimally trade off performance for power can be cast as a stochastic optimization problem and solved exactly and efficiently. The applicability and generality of the approach are assessed by formulating the Markov model and optimizing power management policies for several systems.

459 citations
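
The optimization the abstract describes works over a Markov decision process whose states combine the power-managed component's state with the workload state. As a rough illustration only (the paper solves a constrained stochastic optimization exactly; this is an unconstrained value-iteration sketch with invented transition probabilities, power numbers, and penalty weights), the following shows how a policy trading power against a latency penalty can be computed for a hypothetical two-state device:

```python
# Minimal value-iteration sketch of the power/performance tradeoff for a
# hypothetical two-state device with a two-state workload.  All numbers are
# invented; the paper's formulation is an exact constrained optimization over
# a Markov decision process, not this unconstrained penalty form.
POWER = {"ON": 1.0, "SLEEP": 0.1}            # W, assumed
WAKEUP_ENERGY = 0.5                           # extra cost per SLEEP->ON switch, assumed
LATENCY_PENALTY = 2.0                         # cost of a request arriving while asleep
WORKLOAD_P = {"idle": {"idle": 0.9, "busy": 0.1},
              "busy": {"idle": 0.3, "busy": 0.7}}

states = [(d, w) for d in ("ON", "SLEEP") for w in ("idle", "busy")]
actions = ("ON", "SLEEP")                     # action = device state for the next period
GAMMA = 0.95

def cost(device, workload, action):
    c = POWER[action]
    if device == "SLEEP" and action == "ON":
        c += WAKEUP_ENERGY
    if workload == "busy" and action == "SLEEP":
        c += LATENCY_PENALTY                  # performance loss folded into the cost
    return c

V = {s: 0.0 for s in states}
for _ in range(200):                          # value iteration to convergence
    V = {(d, w): min(cost(d, w, a)
                     + GAMMA * sum(p * V[(a, w2)] for w2, p in WORKLOAD_P[w].items())
                     for a in actions)
         for (d, w) in states}

policy = {(d, w): min(actions,
                      key=lambda a: cost(d, w, a)
                      + GAMMA * sum(p * V[(a, w2)] for w2, p in WORKLOAD_P[w].items()))
          for (d, w) in states}
print(policy)
```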


Journal ArticleDOI
TL;DR: This paper studies the semantics of hierarchical finite state machines that are composed using various concurrency models, particularly dataflow, discrete-events, and synchronous/reactive modeling, and argues that all three combinations are useful, and that the concurrency model can be selected independently of the decision to use hierarchical FSM's.
Abstract: This paper studies the semantics of hierarchical finite state machines (FSM's) that are composed using various concurrency models, particularly dataflow, discrete-events, and synchronous/reactive modeling. It is argued that all three combinations are useful, and that the concurrency model can be selected independently of the decision to use hierarchical FSM's. In contrast, most formalisms that combine FSM's with concurrency models, such as statecharts (and its variants) and hybrid systems, tightly integrate the FSM semantics with the concurrency semantics. An implementation that supports three combinations is described.

349 citations


Journal ArticleDOI
TL;DR: A new solution of the multiple constant multiplication problem based on the common subexpression elimination technique is presented and it is shown that the number of add/subtract operations can be reduced significantly this way.
Abstract: The problem of an efficient hardware implementation of multiplications with one or more constants is encountered in many different digital signal-processing areas, such as image processing or digital filter optimization. In a more general form, this is a problem of common subexpression elimination, and as such it also occurs in compiler optimization and many high-level synthesis tasks. An efficient solution of this problem can yield significant improvements in important design parameters like implementation area or power consumption. In this paper, a new solution of the multiple constant multiplication problem based on the common subexpression elimination technique is presented. The performance of our method is demonstrated primarily on a finite-duration impulse response filter design. The idea is to implement a set of constant multiplications as a set of add-shift operations and to optimize these with respect to the common subexpressions afterwards. We show that the number of add/subtract operations can be reduced significantly this way. The applicability of the presented algorithm to the different high-level synthesis tasks is also indicated. Benchmarks demonstrating the algorithm's efficiency are included as well.

297 citations
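
The core idea, implementing constant multiplications as shift-and-add networks and then sharing common subexpressions, can be sketched as follows. This is not the paper's algorithm, only the flavor of shift-add common subexpression elimination: each constant is written in canonical signed-digit form, two-digit patterns are counted across the coefficient set, and the most frequent pattern is shared.

```python
from collections import Counter

def csd(n):
    """Canonical signed-digit representation of n as a list of (sign, shift)."""
    digits, shift = [], 0
    while n:
        if n & 1:
            d = 2 - (n & 3)          # +1 if the low bits are ...01, -1 if ...11
            digits.append((d, shift))
            n -= d
        n >>= 1
        shift += 1
    return digits

def two_term_patterns(terms):
    """All pairs of signed digits, normalized to shift offset 0."""
    pats = Counter()
    for i in range(len(terms)):
        for j in range(i + 1, len(terms)):
            (s1, k1), (s2, k2) = terms[i], terms[j]
            base = min(k1, k2)
            pats[((s1, k1 - base), (s2, k2 - base))] += 1
    return pats

constants = [7, 11, 13, 23]                   # example coefficient set
decomps = {c: csd(c) for c in constants}
adds_before = sum(len(d) - 1 for d in decomps.values())

shared = Counter()
for d in decomps.values():
    shared += two_term_patterns(d)
best, count = shared.most_common(1)[0]
# Each reuse of the most frequent pattern saves one adder after the first build.
print("add/sub ops without sharing:", adds_before)
print("most frequent subexpression:", best, "occurs", count, "times")
print("add/sub ops with one shared pattern:", adds_before - (count - 1))
```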


Journal ArticleDOI
TL;DR: This work develops a design methodology for low-power core-based real-time SOCs based on dynamically variable voltage hardware and proposes a nonpreemptive scheduling heuristic that yields solutions very close to optimal for many test cases.
Abstract: The growing class of portable systems, such as personal computing and communication devices, has resulted in a new set of system design requirements, mainly characterized by dominant importance of power minimization and design reuse. The energy efficiency of systems-on-a-chip (SOC) could be much improved if one were to vary the supply voltage dynamically at run time. We developed the design methodology for the low-power core-based real-time SOC based on dynamically variable voltage hardware. The key challenge is to develop effective scheduling techniques that treat voltage as a variable to be determined, in addition to the conventional task scheduling and allocation. Our synthesis technique also addresses the selection of the processor core and the determination of the instruction and data cache size and configuration so as to fully exploit dynamically variable voltage hardware, which results in significantly lower power consumption for a set of target applications than existing techniques. The highlight of the proposed approach is the nonpreemptive scheduling heuristic, which results in solutions very close to optimal ones for many test cases. The effectiveness of the approach is demonstrated on a variety of modern industrial strength multimedia and communication applications.

270 citations
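
The benefit of treating supply voltage as a scheduling variable comes from the quadratic dependence of dynamic energy on voltage versus the milder growth of delay as voltage drops. A back-of-envelope sketch (an assumed alpha-power-style delay model with invented constants, not the paper's scheduling formulation) shows why filling deadline slack by lowering the voltage saves energy:

```python
# Rough illustration (not the paper's model): with dynamic energy E ~ C*V^2 and
# gate delay ~ V / (V - Vt)^2, running a task slower at a reduced supply voltage
# within its deadline saves energy versus running fast and idling.
VT = 0.6            # threshold voltage (V), assumed
VMAX = 3.3          # nominal supply (V), assumed
CYCLES = 1.0e6      # task length in cycles, assumed
DEADLINE = 2.0e-3   # seconds, assumed
K_DELAY = 1.0e-9    # chosen so the cycle time is 1 ns at VMAX

def cycle_time(v, k=K_DELAY):
    return k * v / (v - VT) ** 2 * (VMAX - VT) ** 2 / VMAX

def energy(v, cycles=CYCLES, c_eff=1e-12):
    return c_eff * v * v * cycles             # dynamic energy only

# Find the lowest voltage that still meets the deadline (simple linear scan).
v = VMAX
while v > VT + 0.1 and cycle_time(v - 0.01) * CYCLES <= DEADLINE:
    v -= 0.01

print(f"run at {VMAX} V: time {cycle_time(VMAX)*CYCLES*1e3:.2f} ms, "
      f"energy {energy(VMAX)*1e6:.1f} uJ")
print(f"run at {v:.2f} V: time {cycle_time(v)*CYCLES*1e3:.2f} ms, "
      f"energy {energy(v)*1e6:.1f} uJ")
```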


Journal ArticleDOI
TL;DR: Methods for estimating leakage at the circuit level are outlined, and heuristic and exact algorithms are proposed to accomplish the same task for random combinational logic.
Abstract: Subthreshold leakage current in deep submicron MOS transistors is becoming a significant contributor to power dissipation in CMOS circuits as threshold voltages and channel lengths are reduced. Consequently, estimation of leakage current and identification of minimum and maximum leakage conditions are becoming important, especially in low power applications. In this paper we outline methods for estimating leakage at the circuit level and then propose heuristic and exact algorithms to accomplish the same task for random combinational logic. In most cases the heuristic is found to obtain bounds on leakage that are close and often identical to bounds determined by a complete branch and bound search. Methods are also demonstrated to show how estimation accuracy can be traded off against execution time. The proposed algorithms have potential application in power management or quiescent-current (I_DDQ) testing if one wishes to control leakage by applying appropriate input vectors. For a variety of benchmark circuits, leakage was found to vary by as much as a factor of six over the space of possible input vectors.

199 citations
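
The state dependence that the minimum/maximum leakage search exploits can be seen on a toy netlist. The sketch below brute-forces all input vectors of a three-gate circuit with invented per-input-state NAND leakage values; it stands in for the paper's heuristic and branch-and-bound algorithms, which scale this idea to real combinational logic:

```python
from itertools import product

# Assumed per-gate leakage (arbitrary units) for each input state of a 2-input
# NAND; real values depend on the transistor stacks and technology.
NAND_LEAKAGE = {(0, 0): 1.0, (0, 1): 2.5, (1, 0): 3.0, (1, 1): 6.0}
def nand(a, b): return int(not (a and b))

# Tiny combinational netlist: g1 = NAND(x1, x2), g2 = NAND(x2, x3),
# g3 = NAND(g1, g2).  Total leakage = sum of state-dependent gate leakages.
def leakage(x1, x2, x3):
    g1, g2 = nand(x1, x2), nand(x2, x3)
    return (NAND_LEAKAGE[(x1, x2)] + NAND_LEAKAGE[(x2, x3)]
            + NAND_LEAKAGE[(g1, g2)])

vectors = list(product((0, 1), repeat=3))
lo = min(vectors, key=lambda v: leakage(*v))
hi = max(vectors, key=lambda v: leakage(*v))
print("min-leakage vector", lo, "->", leakage(*lo))
print("max-leakage vector", hi, "->", leakage(*hi))
print("spread:", leakage(*hi) / leakage(*lo))
```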


Journal ArticleDOI
TL;DR: This paper provides easily computable expressions for crosstalk amplitude and pulse width in resistive, capacitively coupled lines; the expressions hold for nets with an arbitrary number of pins and arbitrary topology under any specified input excitation.
Abstract: We address the problem of crosstalk computation and reduction using circuit and layout techniques in this paper. We provide easily computable expressions for crosstalk amplitude and pulse width in resistive, capacitively coupled lines. The expressions hold for nets with an arbitrary number of pins and arbitrary topology under any specified input excitation. Experimental results show that the average error is about 10% and the maximum error is less than 20%. The expressions are used to motivate circuit techniques, such as transistor sizing, and layout techniques, such as wire ordering and wire width optimization, to reduce crosstalk.

165 citations
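
The paper's closed-form expressions are not reproduced here, but the kind of quantity they compute can be illustrated with a textbook single-lump coupled-RC approximation of peak noise on a quiet victim (assumed parameter values; this is a generic bound, not the paper's multi-pin, arbitrary-topology formulas):

```python
import math

def crosstalk_peak(vdd, rd_victim, c_ground, c_couple, t_rise):
    """One-lump coupled-RC estimate of peak noise on a quiet victim.

    Victim: resistance rd_victim to ground (holding driver), c_ground to
    ground, c_couple to an aggressor ramping 0 -> vdd in t_rise seconds.
    Not the paper's expressions -- a textbook single-pole approximation.
    """
    tau = rd_victim * (c_ground + c_couple)
    return (vdd * rd_victim * c_couple / t_rise) * (1.0 - math.exp(-t_rise / tau))

# Assumed numbers: 500-ohm holding driver, 100 fF ground cap, 60 fF coupling
# cap, 1.8 V swing, 200 ps aggressor transition.
vpeak = crosstalk_peak(1.8, 500.0, 100e-15, 60e-15, 200e-12)
print(f"estimated peak noise: {vpeak*1000:.0f} mV "
      f"(charge-sharing limit {1.8*60/160*1000:.0f} mV)")
```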


Journal ArticleDOI
TL;DR: It is shown that optimizing delay alone cannot fix all of the noise violations and that the performance penalty induced by optimizing both delay and noise as opposed to only delay is less than 2%.
Abstract: Interconnect-driven optimization is an increasingly important step in high-performance design. Algorithms for buffer insertion have been successfully utilized to reduce delay in global interconnect paths; however, existing techniques only optimize delay and timing slack. With the continually increasing ratio of coupling capacitance to total capacitance and the use of aggressive dynamic logic circuit families, noise analysis and avoidance is becoming a major design bottleneck. Hence, timing and noise must be simultaneously optimized to achieve maximum performance. This paper presents comprehensive buffer insertion techniques for noise and delay optimization. Three algorithms are presented: the first for noise avoidance for single-sink trees, the second for noise avoidance for multiple-sink trees, and the last for simultaneous noise and delay optimization. We prove the optimality of each algorithm (under various assumptions) and present other theoretical results as well. We ran experiments on a high-performance microprocessor design and show that our approach fixes all noise violations. Our approach was separately verified by a detailed, simulation-based noise analysis tool. Further, we show that optimizing delay alone cannot fix all of the noise violations and that the performance penalty induced by optimizing both delay and noise as opposed to only delay is less than 2%.

135 citations


Journal ArticleDOI
TL;DR: The proposed area model transforms the given multi-output Boolean function description into an equivalent single-output function; the model is empirical, and results demonstrating its feasibility and utility are presented.
Abstract: High-level power estimation, when given only a high-level design specification such as a functional or register-transfer level (RTL) description, requires high-level estimation of the circuit average activity and total capacitance. Considering that total capacitance is related to circuit area, this paper addresses the problem of computing the "area complexity" of multi-output combinational logic given only its functional description, i.e., Boolean equations, where area complexity refers to the number of gates required for an optimal multilevel implementation of the combinational logic. The proposed area model is based on transforming the multi-output Boolean function description into an equivalent single-output function. The area model is empirical and results demonstrating its feasibility and utility are presented. Also, a methodology for converting the gate count estimates, obtained from the area model, into capacitance estimates is presented. High-level power estimates based on the total capacitance estimates and average activity estimates are also presented.

119 citations


Journal ArticleDOI
TL;DR: A metric for noise immunity is defined, and a static noise analysis methodology based on this noise-stability metric is introduced to demonstrate how noise can be analyzed systematically on a full-chip basis using simulation-based transistor-level analysis.
Abstract: As technology scales into the deep submicron regime, noise immunity is becoming a metric of comparable importance to area, timing, and power for the analysis and design of very large scale integrated (VLSI) systems. A metric for noise immunity is defined, and a static noise analysis methodology based on this noise-stability metric is introduced to demonstrate how noise can be analyzed systematically on a full-chip basis using simulation-based transistor-level analysis. We then describe Harmony, a two-level (macro and global) hierarchical implementation of static noise analysis. At the macro level, simplified interconnect models and timing assumptions guide efficient analysis. The global level involves a careful combination of static noise analysis, static timing analysis, and detailed interconnect macromodels based on reduced-order modeling techniques. We describe how the interconnect macromodels are practically employed to perform coupling analysis and how timing constraints can be used to limit pessimism in the analysis.

113 citations


Journal ArticleDOI
TL;DR: A test vector simulation-based approach for multiple design error diagnosis and correction in digital VLSI circuits that is applicable to circuits with no global binary decision diagram representation.
Abstract: With the increase in the complexity of digital VLSI circuit design, logic design errors can occur during synthesis. In this paper, we present a test vector simulation-based approach for multiple design error diagnosis and correction. Diagnosis is performed through an implicit enumeration of the erroneous lines in an effort to avoid the exponential explosion of the error space as the number of errors increases. Resynthesis during correction is as little as possible so that most of the engineering effort invested in the design is preserved. Since both steps are based on test vector simulation, the proposed approach is applicable to circuits with no global binary decision diagram representation. Experiments on ISCAS'85 benchmark circuits exhibit the robustness and error resolution of the proposed methodology. Experiments also indicate that test vector simulation is indeed an attractive technique for multiple design error diagnosis and correction in digital VLSI circuits.

112 citations


Journal ArticleDOI
TL;DR: This paper compresses the instruction segment of the executable running on the embedded system and shows how to design a run-time decompression unit to decompress code on the fly before execution.
Abstract: In this paper, we present a method for reducing the memory requirements of an embedded system by using code compression. We compress the instruction segment of the executable running on the embedded system, and we show how to design a run-time decompression unit to decompress code on the fly before execution. Our algorithm uses arithmetic coding in combination with a Markov model, which is adapted to the instruction set and the application. We provide experimental results on two architectures, Analog Devices' SHARC and ARM's ARM and Thumb instruction sets, and show that programs can often be reduced by more than 50%. Furthermore, we suggest a table-based design that allows multibit decoding to speed up decompression.
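
A rough feel for how much an instruction-set-adapted Markov model can buy comes from measuring the conditional entropy of the code under a simple order-1 byte model; an ideal arithmetic coder driven by that model cannot beat this bound. The sketch below is only a proxy (byte-level, two-pass, model not transmitted), not the paper's compressor, and the executable path is just an example:

```python
import math
from collections import Counter, defaultdict

def markov1_entropy_bits(data: bytes) -> float:
    """Bits needed by an ideal coder driven by an order-1 byte-level Markov
    model (a rough stand-in for the paper's instruction-set-adapted model)."""
    ctx_counts = defaultdict(Counter)
    for prev, cur in zip(data, data[1:]):
        ctx_counts[prev][cur] += 1
    totals = {ctx: sum(c.values()) for ctx, c in ctx_counts.items()}
    bits = 8.0                                   # first byte sent raw
    for prev, cur in zip(data, data[1:]):
        p = ctx_counts[prev][cur] / totals[prev]
        bits += -math.log2(p)
    return bits

code = open("/bin/ls", "rb").read()              # any executable segment will do
orig_bits = 8 * len(code)
est_bits = markov1_entropy_bits(code)
print(f"original: {orig_bits/8/1024:.1f} KiB, "
      f"order-1 model bound: {est_bits/8/1024:.1f} KiB "
      f"({100*est_bits/orig_bits:.0f}% of original)")
```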

Journal ArticleDOI
TL;DR: This paper develops efficient wirelength estimation techniques for top-down floorplanning and placement of cell-based designs, including new wirelength estimates that are functions of a block's complexity (number of cell instances) and aspect ratio.
Abstract: Wirelength estimation in very large scale integration layout is fundamental to any predetailed routing estimate of timing or routability. In this paper, we develop efficient wirelength estimation techniques appropriate for wirelength estimation during top-down floorplanning and placement of cell-based designs. Our methods give accurate, linear-time approaches, typically with sublinear time complexity for dynamic updating of estimates (e.g., for annealing placement). Our techniques offer advantages not only for early on-line wirelength estimation during top-down placement, but also for a posteriori estimation of routed wirelength given a final placement. In developing these new estimators, we have made several contributions, including (1) insight into the contrast between region-based and bounding box-based rectilinear Steiner minimal tree (RStMT) estimation techniques; (2) empirical assessment of the correlations between pin placements of a multipin net that is contained in a block; and (3) new wirelength estimates that are functions of a block's complexity (number of cell instances) and aspect ratio.
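
As a point of reference for what such estimators refine, the sketch below computes the common per-net half-perimeter bounding-box (HPWL) estimate with a pin-count-dependent Steiner correction; the factors beyond three pins are illustrative placeholders rather than the paper's fitted, aspect-ratio-aware values:

```python
def hpwl(pins):
    """Half-perimeter of the bounding box of a net's pin locations."""
    xs, ys = zip(*pins)
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

# Pin-count-dependent factors approximating the RSMT/HPWL ratio.  For 2- and
# 3-pin nets the ratio is exactly 1; the larger factors are illustrative
# placeholders, not the paper's fitted values.
CORRECTION = {2: 1.00, 3: 1.00, 4: 1.08, 5: 1.15}

def estimated_wirelength(nets):
    return sum(CORRECTION[min(len(pins), 5)] * hpwl(pins) for pins in nets)

nets = [
    [(0, 0), (3, 4)],                          # two-pin net
    [(1, 1), (5, 2), (2, 6)],                  # three-pin net
    [(0, 0), (4, 0), (0, 4), (4, 4), (2, 2)],  # five-pin net
]
print("estimated total wirelength:", estimated_wirelength(nets))
```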

Journal ArticleDOI
TL;DR: A serial fault emulation algorithm, enhanced by two speed-up techniques, uses a field programmable gate array (FPGA)-based emulation system for fault grading; performance estimates show that this approach could be several orders of magnitude faster than existing software approaches for large sequential designs.
Abstract: In this paper, we introduce a method that uses the field programmable gate array (FPGA)-based emulation system for fault grading. The real-time simulation capability of a hardware emulator could significantly improve the performance of fault grading, which is one of the most time consuming tasks in the circuit design and test process. We employ a serial fault emulation algorithm enhanced by two speed-up techniques. First, a set of independent faults can be injected and emulated at the same time. Second, multiple dependent faults can be simultaneously injected within a single FPGA configuration by adding extra circuitry. Because the reconfiguration time of mapping the numerous faulty circuits into the FPGA's is pure overhead and could be the bottleneck of the entire process, using extra circuitry for injecting a large number of faults can reduce the number of FPGA reconfigurations and thus improve the performance significantly. In addition, we address the issue of handling potentially detected faults in this hardware emulation environment by using dual-railed logic. The performance estimation shows that this approach could be several orders of magnitude faster than the existing software approaches for large sequential designs.

Journal ArticleDOI
TL;DR: This paper presents a fast eigendecomposition technique that accelerates operator application in BEM methods and avoids the dense-matrix storage while taking all of the substrate boundary effects into account explicitly.
Abstract: Industry trends aimed at integrating higher levels of circuit functionality have triggered a proliferation of mixed analog-digital systems. Magnified noise coupling through the common chip substrate has made the design and verification of such systems an increasingly difficult task. In this paper we present a fast eigendecomposition technique that accelerates operator application in BEM methods and avoids the dense-matrix storage while taking all of the substrate boundary effects into account explicitly. This technique can be used for accurate and efficient modeling of substrate coupling effects in mixed-signal integrated circuits.

Journal ArticleDOI
B. Chess, T. Larrabee
TL;DR: This work demonstrates that if information is removed from a fault dictionary, its ability to diagnose unmodeled faults may be severely curtailed even if dictionary quality metrics remain unaffected; it presents a new dictionary organization based on error sets, which is amenable to standard data-compression techniques.
Abstract: Diagnostic fault simulation can generate enormous amounts of data. The techniques used to manage this data can have a significant effect on the outcome of the fault diagnosis procedure. We first demonstrate that if information is removed from a fault dictionary, its ability to diagnose unmodeled faults may be severely curtailed even if dictionary quality metrics remain unaffected; we therefore focus on methods for producing small, lossless dictionaries. We present a new dictionary organization based on error sets, which is amenable to standard data-compression techniques. We compare several dictionary organizations and the effect of standard data-compression techniques on each of them. An appropriate organization and encoding makes dictionary-based diagnosis practical for very large circuits.

Journal ArticleDOI
TL;DR: In this paper, the authors introduce a new design style called extended burst-mode, which can synthesize multiple-input change asynchronous finite state machines and many circuits that are difficult or impossible to synthesize automatically using existing methods.
Abstract: We introduce a new design style called extended burst-mode. The extended burst-mode design style covers a wide spectrum of sequential circuits ranging from delay-insensitive to synchronous. We can synthesize multiple-input change asynchronous finite state machines and many circuits that fall in the gray area (hard to classify as synchronous or asynchronous) which are difficult or impossible to synthesize automatically using existing methods. Our implementation of extended burst-mode machines uses standard CMOS logic, generates low-latency outputs, and guarantees freedom from hazards at the gate level. In Part I, we formally define the extended burst-mode specification, provide an overview of the synthesis methods, and describe the hazard-free synthesis requirements for two different next-state logic synthesis methods: two-level sums-of-products implementation and generalized C-elements implementation. We also present an extension to existing theories for hazard-free combinational synthesis to handle nonmonotonic input changes.

Journal ArticleDOI
TL;DR: Techniques that attempt to reduce glitching power consumption by minimizing propagation of glitches in the RTL circuit are developed, which include restructuring multiplexer networks, clocking control signals, and inserting selective rising/falling delays, in order to kill the propagation of glitches from control as well as data signals.
Abstract: We present design-for-low-power techniques for register-transfer level (RTL) controller/data path circuits. We analyze the generation and propagation of glitches in both the control and data path parts of the circuit. In data-flow intensive designs, glitching power is primarily due to the chaining of arithmetic functional units. In control-flow intensive designs, on the other hand, multiplexer networks and registers dominate the total circuit power consumption, and the control logic can generate a significant amount of glitches at its outputs, which in turn propagate through the data path to account for a large portion of the glitching power in the entire circuit. Our analysis also highlights the relationship between the propagation of glitches from control signals and the bit-level correlation between data signals. Based on the analysis, we develop techniques that attempt to reduce glitching power consumption by minimizing propagation of glitches in the RTL circuit. Our techniques include restructuring multiplexer networks (to enhance data correlations and eliminate glitchy control signals), clocking control signals, and inserting selective rising/falling delays, in order to kill the propagation of glitches from control as well as data signals. In addition, we present a procedure to automatically perform the well-known power-reduction technique of clock gating through an efficient structural analysis of the RTL circuit, while avoiding the introduction of glitches on the clock signals. Application of the proposed power optimization techniques to several RTL circuits shows significant power savings, with negligible area and delay overheads.

Journal ArticleDOI
TL;DR: A software generation methodology is proposed that takes advantage of a restricted class of specifications and allows for tight control over the implementation cost, and exploits several techniques from the domain of Boolean function optimization.
Abstract: Software components for embedded reactive real-time applications must satisfy tight code size and run-time constraints. Cooperating finite state machines provide a convenient intermediate format for embedded system co-synthesis, between high-level specification languages and software or hardware implementations. We propose a software generation methodology that takes advantage of a restricted class of specifications and allows for tight control over the implementation cost. The methodology exploits several techniques from the domain of Boolean function optimization. We also describe how the simplified control/data-flow graph used as an intermediate representation can be used to accurately estimate the size and timing cost of the final executable code.

Journal ArticleDOI
TL;DR: Techniques are presented to compactly represent substrate noise currents injected by digital networks; device-level simulation and standard benchmark circuits are used to verify the validity of the assumptions and to measure the accuracy of the obtained power spectra.
Abstract: Techniques are presented to compactly represent substrate noise currents injected by digital networks. Using device-level simulation, every gate in a given library is modeled by means of the signal waveform it injects into the substrate, depending on its input transition scheme. For a given sequence of input vectors, the switching activity of every node in the Boolean network is computed. Assuming that technology mapping has been performed, each node corresponds to a gate in the library, hence, to a specific injection waveform. The noise contribution of each node is computed by convolving its switching activity with the associated injection waveforms. The total injected noise for the digital block is then obtained by summing all the noise contributions in the circuit. The resulting injected noise can be viewed as a random process, whose power spectrum is computed using standard signal processing techniques. A study was performed on a number of standard benchmark circuits to verify the validity of the assumptions and to measure the accuracy of the obtained power spectra.
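
The convolution step at the heart of this flow is easy to picture numerically. The sketch below uses invented injection templates and switching rates in place of device-level-characterized waveforms and logic-simulation activities, then sums the contributions and takes a power spectrum:

```python
import numpy as np

FS = 10e9                                    # simulation sample rate, assumed (10 GS/s)
T = np.arange(0, 2e-9, 1 / FS)               # 2 ns injection-template window

def injection_waveform(peak_ua, decay_ps):
    """Assumed per-event template: a decaying-exponential substrate current
    spike (a real flow would use device-level-simulated waveforms per gate)."""
    return peak_ua * 1e-6 * np.exp(-T / (decay_ps * 1e-12))

rng = np.random.default_rng(0)
n_samples = int(100e-9 * FS)                 # 100 ns of switching activity
gates = [                                    # invented library characterization
    {"waveform": injection_waveform(40.0, 150.0), "events_per_ns": 0.20},
    {"waveform": injection_waveform(15.0, 300.0), "events_per_ns": 0.45},
]

total = np.zeros(n_samples)
for g in gates:
    p = g["events_per_ns"] / (FS * 1e-9)     # per-sample switching probability
    events = (rng.random(n_samples) < p).astype(float)
    total += np.convolve(events, g["waveform"])[:n_samples]

spectrum = np.abs(np.fft.rfft(total)) ** 2 / n_samples   # power spectrum of the noise
freqs = np.fft.rfftfreq(n_samples, 1 / FS)
print("peak injected current: %.1f uA" % (1e6 * total.max()))
print("strongest non-DC bin: %.2f GHz" % (freqs[1 + np.argmax(spectrum[1:])] / 1e9))
```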

Journal ArticleDOI
TL;DR: A matrix-based derivation of the error between the original circuit transfer function and the reduced-order transfer function generated using the PVL technique is presented; this error measure may be used to develop an automated termination of the Lanczos process in the PVL technique and achieve the desired accuracy of the approximate transfer function.
Abstract: Recently, there has been a great deal of interest in using the Pade Via Lanczos (PVL) technique to analyze the transfer functions and impulse responses of large-scale linear circuits. In this paper, a matrix-based derivation of the error between the original circuit transfer function and the reduced-order transfer function generated using the PVL technique is presented. This error measure may be used to develop an automated termination of the Lanczos process in the PVL technique and to achieve the desired accuracy of the approximate transfer function. PVL coupled with such an error bound will be referred to as the PVL-WEB algorithm.

Journal ArticleDOI
Yuejian Wu, S.M.I. Adham
TL;DR: This paper presents a novel BIST fault diagnostic technique for scan-based VLSI devices based on faulty signature information that is applicable to all voltage-detectable faults, and applies naturally to multifrequency BIST.
Abstract: Existing built-in self-test (BIST) diagnostic techniques assume the existence of a few bit errors in a test response sequence. This assumption is unrealistic since in a BIST environment a single defect can usually cause hundreds or thousands of errors in a test response sequence. Without making the above assumption, this paper presents a novel BIST fault diagnostic technique for scan-based VLSI devices. Based on faulty signature information, our scheme guarantees correct identification of the scan flip-flops that capture errors during test, regardless of the number of errors the circuit may produce. In addition, it is able to identify failing test vectors with a better diagnostic capacity than existing techniques. The proposed scheme does not assume any specific fault model. Thus, it is applicable to all voltage-detectable faults. It also applies naturally to multifrequency BIST. This paper analyzes the efficiency of the scheme in terms of diagnostic coverage. Experimental results on several large ISCAS89 benchmark circuits and industrial circuits are also reported.

Journal ArticleDOI
TL;DR: A modeling technique for CMOS gates, based on the reduction of each gate to an equivalent inverter, is presented and can be incorporated in existing timing simulators in order to improve their accuracy.
Abstract: In this paper, a modeling technique for CMOS gates, based on the reduction of each gate to an equivalent inverter, is presented. The proposed method can be incorporated in existing timing simulators in order to improve their accuracy. The conducting and parasitic behavior of parallel and serially connected transistors is accurately analyzed and an equivalent transistor is extracted for each case, taking into account the actual operating conditions of each device in the structure. The proposed model incorporates short-channel effects, the influence of body effect and is developed for nonzero transition time inputs. The exact time point when the gate starts conducting is efficiently calculated improving significantly the accuracy of the method. A mapping algorithm for reducing every possible input pattern of a gate to an equivalent signal is introduced and the "weight" of each transistor position in the gate structure is extracted. Complex gates are treated by first mapping every possible structure to a NAND/NOR gate and then by collapsing this gate to an equivalent inverter. Results are validated by comparisons to SPICE and ILLIADS2 for three submicron technologies.
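
The first-order collapsing rules behind such a reduction are the familiar ones: parallel transistors add their widths, while series stacks combine like parallel resistors. The sketch below applies only these rules; the paper refines them with position weights, body effect, and the actual operating conditions of each device:

```python
def parallel_equiv_width(widths):
    """Parallel transistors conduct together: effective width is the sum."""
    return sum(widths)

def series_equiv_width(widths):
    """Series transistors add resistance: for equal lengths, the effective
    width follows the reciprocal-sum rule (first order only, ignoring body
    effect and the position weights the paper introduces)."""
    return 1.0 / sum(1.0 / w for w in widths)

# 3-input NAND with 3 um NMOS pull-down stack and 2 um PMOS pull-up devices:
# collapse to an equivalent inverter for the all-inputs-switching case.
wn_eq = series_equiv_width([3.0, 3.0, 3.0])     # series NMOS stack -> 1.0 um
wp_eq = parallel_equiv_width([2.0, 2.0, 2.0])   # parallel PMOS -> 6.0 um
print(f"equivalent inverter: Wn = {wn_eq:.2f} um, Wp = {wp_eq:.2f} um")
```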

Journal ArticleDOI
TL;DR: This work presents an analytical strategy for exploring the on-chip memory architecture for a given application, based on a memory performance estimation scheme, and demonstrates that its estimations closely follow the actual simulated performance at significantly reduced run times.
Abstract: Embedded processor-based systems allow for the tailoring of the on-chip memory architecture based on application specific requirements. We present an analytical strategy for exploring the on-chip memory architecture for a given application, based on a memory performance estimation scheme. The analytical technique has the important advantage of enabling a fast evaluation of candidate memory architectures in the early stages of system design. Many digital signal-processing applications involve array accesses and loop nests that can benefit from such an exploration. Our experiments demonstrate that our estimations closely follow the actual simulated performance at significantly reduced run times.

Journal ArticleDOI
TL;DR: This paper proves that using information about (partial) symmetries for the minimization of reduced ordered binary decision diagrams (ROBDD's) leads to improvements in ROBDD sizes of up to 70%.
Abstract: In this paper we study the effect of using information about (partial) symmetries for the minimization of reduced ordered binary decision diagrams (ROBDD's). The influence of symmetries for the integration in dynamic variable ordering is studied for both completely and incompletely specified Boolean functions. The problems above are studied from a theoretical and practical point of view. Statistical results and benchmark results are reported to underline the efficiency of the approach. They prove that our techniques lead to improvements of the ROBDD sizes by up to 70%.

Journal ArticleDOI
TL;DR: To reduce processor time and memory requirements at high drain voltage, a self-consistent option based on a solution of the current continuity equation restricted to a thin slab of the channel is developed.
Abstract: We present a hierarchical approach to the "atomistic" simulation of aggressively scaled sub-0.1-µm MOSFETs. These devices are so small that their characteristics depend on the precise location of dopant atoms within them, not just on their average density. A full-scale three-dimensional drift-diffusion atomistic simulation approach is first described and used to verify more economical, but restricted, options. To reduce processor time and memory requirements at high drain voltage, we have developed a self-consistent option based on a solution of the current continuity equation restricted to a thin slab of the channel. This is coupled to the solution of the Poisson equation in the whole simulation domain in the Gummel iteration cycles. The accuracy of this approach is investigated in comparison to the full self-consistent solution. At low drain voltage, a single solution of the nonlinear Poisson equation is sufficient to extract the current with satisfactory accuracy. In this case, the current is calculated by solving the current continuity equation in a drift approximation only, also in a thin slab containing the MOSFET channel. The regions of applicability for the different components of this hierarchical approach are illustrated in example simulations covering the random dopant-induced threshold voltage fluctuations, threshold voltage lowering, threshold voltage asymmetry, and drain current fluctuations.

Journal ArticleDOI
TL;DR: A novel test methodology that not only substantially reduces the total test pattern number for multiple circuits but also allows a single input data line to support multiple scan chains and provides a low-cost and high-performance method to integrate the boundary scan and scan architectures.
Abstract: Scan designs can alleviate test difficulties of sequential circuits by replacing the memory elements with scannable registers. However, scan operations usually result in long test application time. Most classical methods to solving this problem either perform test compaction to obtain fewer test vectors or use multiple scan chain design to reduce the scan time. For a large system, test vector compaction is a time-consuming process, while multiple scan chains either require extra pin overhead or need the sharing of normal I/O and scan I/O pins. In this paper, we present a novel test methodology that not only substantially reduces the total test pattern number for multiple circuits but also allows a single input data line to support multiple scan chains. Our main idea is to explore the "sharing" property of test patterns among all circuits under test (CUT's). By appropriately connecting the inputs of all CUT's during the automatic test-pattern generation process such that the generated test patterns can be broadcast to all scan chains when the actual testing operation is executed, the above-mentioned problems can be solved effectively. Our method also provides a low-cost and high-performance method to integrate the boundary scan and scan architectures. Experimental results show that 157 test patterns are enough to detect all detectable faults in the ten ISCAS'85 combinational circuits, while 280 are enough for the ten largest ISCAS'89 scan-based sequential circuits.

Journal ArticleDOI
TL;DR: In this paper, the critical area for shorts in a circuit layout is computed in O(n log n) time, where n is the size of the input, and is based on the concept of Voronoi diagrams.
Abstract: In this paper, we present a new approach for computing the critical area for shorts in a circuit layout. The critical area calculation is the main computational problem in very large scale integration yield prediction. The method is based on the concept of Voronoi diagrams and computes the critical area for shorts (for all possible defect radii, assuming square defects) accurately in O(n log n) time, where n is the size of the input. The method is presented for rectilinear layouts and layouts containing edges of slope ±1. As a byproduct, we briefly sketch how to speed up the grid method of Wagner and Koren [1995].
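
For a single pair of parallel wires the quantity being accumulated is simple to state: a square defect of side r causes a short when its center falls within r minus the spacing of the gap, and the expected critical area integrates this against a defect size distribution (commonly taken proportional to 1/r^3). The sketch below computes that per-pair value directly; the paper's contribution is computing it for whole layouts in O(n log n) with Voronoi diagrams:

```python
def critical_area_parallel(spacing, overlap_len, r):
    """Critical area for a short between two parallel wires for a square
    defect of side r: the defect center must fall in a band of width
    (r - spacing) along the overlap (end effects ignored)."""
    return overlap_len * max(0.0, r - spacing)

def average_critical_area(spacing, overlap_len, r0=0.1, r_max=10.0, steps=10000):
    """Integrate A(r) against the common defect size distribution
    D(r) = 2*r0^2 / r^3 for r >= r0 (normalized on [r0, infinity))."""
    dr = (r_max - r0) / steps
    total = 0.0
    for i in range(steps):
        r = r0 + (i + 0.5) * dr
        density = 2.0 * r0 ** 2 / r ** 3
        total += critical_area_parallel(spacing, overlap_len, r) * density * dr
    return total

# Two 100-um parallel segments at 0.5-um spacing (units: um, um^2).
print("avg critical area: %.3f um^2" % average_critical_area(0.5, 100.0))
```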

Journal ArticleDOI
TL;DR: In this paper, the authors present a communication estimation model and show, by the use of this model, the importance of integrating communication protocol selection with hardware/software partitioning, which is illustrated by a number of design space exploration experiments performed within the LYCOS cosynthesis system, using models of the PCI and USB protocols.
Abstract: This paper explores the problem of determining the characteristics of the communication links in a computer system as well as determining the best functional partitioning. In particular, we present a communication estimation model and show, by the use of this model, the importance of integrating communication protocol selection with hardware/software partitioning. The communication estimation model allows for fast estimation but is still sufficiently detailed as to allow the designer or design tool to efficiently explore tradeoffs between throughputs, bus widths, burst/nonburst transfers, operating frequencies of system components such as buses, CPU's, ASIC's, software code size, hardware area, and component prices. A distinct feature of the model is the modeling of driver processing of data (packing, splitting, compression, etc.) and its impact on communication throughput. The integration of communication protocol selection and communication driver design with hardware/software partitioning is illustrated by a number of design space exploration experiments carried out within the LYCOS cosynthesis system, using models of the PCI and USB protocols.
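
A drastically simplified transfer-time estimate in the spirit of such a model (the parameters and the formula are illustrative, not the LYCOS model or real PCI/USB timing) might look like this:

```python
from math import ceil

def transfer_time(payload_bytes, bus_width_bits, bus_mhz,
                  burst_words=8, setup_cycles=4, pack_cycles_per_word=1):
    """Rough channel estimate in the spirit of the paper's model (not its
    actual equations): per-burst setup overhead plus per-word transfer and
    driver packing cost."""
    word_bytes = bus_width_bits // 8
    words = ceil(payload_bytes / word_bytes)
    bursts = ceil(words / burst_words)
    cycles = bursts * setup_cycles + words * (1 + pack_cycles_per_word)
    return cycles / (bus_mhz * 1e6)

# Compare a 32-bit 33-MHz PCI-like link with an 8-bit 12-MHz USB-like link
# for a 4-KiB block (purely illustrative parameter values).
for name, width, mhz in (("PCI-like", 32, 33.0), ("USB-like", 8, 12.0)):
    t = transfer_time(4096, width, mhz)
    print(f"{name}: {t*1e6:.1f} us  ({4096/t/1e6:.1f} MB/s)")
```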

Journal ArticleDOI
TL;DR: An algorithm is developed, targeted to the decompression hardware embedded in the Xilinx XC6200 series field-programmable gate array architecture, that can radically reduce the amount of data that must be transferred during reconfiguration.
Abstract: One of the major overheads in reconfigurable computing is the time it takes to reconfigure the devices in the system. This overhead limits the speedups possible in this exciting new paradigm. In this paper we explore one technique for reducing this overhead: the compression of configuration datastreams. We develop an algorithm, targeted to the decompression hardware embedded in the Xilinx XC6200 series field-programmable gate array architecture, that can radically reduce the amount of data that must be transferred during reconfiguration. This results in an overall reduction of about a factor of four in total bandwidth required for reconfiguration.
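
The redundancy such a scheme exploits is visible even with a generic stand-in: consecutive configuration writes that carry the same cell data collapse into runs, much as the XC6200 wildcard registers let one write configure many matching addresses. The sketch below is that stand-in, not the paper's wildcard-based algorithm:

```python
def rle_compress(writes):
    """Collapse consecutive configuration writes that carry identical data
    into (data, count) runs -- a generic stand-in for the XC6200 wildcard
    mechanism, which matches sets of addresses in one write."""
    runs = []
    for _, data in writes:
        if runs and runs[-1][0] == data:
            runs[-1][1] += 1
        else:
            runs.append([data, 1])
    return runs

# Synthetic configuration stream: many identical cell configurations in a row,
# as is typical for regular datapaths (values invented for illustration).
writes = ([(addr, 0x2A) for addr in range(0, 64)]
          + [(addr, 0x15) for addr in range(64, 80)]
          + [(addr, 0x2A) for addr in range(80, 96)])
runs = rle_compress(writes)
print(f"{len(writes)} writes -> {len(runs)} runs "
      f"({len(writes) / len(runs):.0f}x fewer transfers)")
```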

Journal ArticleDOI
TL;DR: A formula-specific method for implementing Boolean satisfiability solver circuits in configurable hardware using a template generator, which realizes a large amount of fine-grained parallelism and has broad applications in the very large scale integration CAD area.
Abstract: The issues of software compute time and complexity are very important in current computer-aided design (CAD) tools. As field-programmable gate array (FPGA) speeds and densities increase, the opportunity for effective hardware accelerators built from FPGA technology has opened up. This paper describes and evaluates a formula-specific method for implementing Boolean satisfiability solver circuits in configurable hardware. That is, using a template generator, we create circuits specific to the problem instance to be solved. This approach yields impressive runtime speedups of up to several hundred times compared to the software approaches. The high performance comes from realizing fine-grained parallelism inherent in the clause evaluation and implication and from direct mapping of Boolean relations into logic gates. Our implementation uses a commercially available hardware system for proof of concept. This system yields more than 100 times run-time speedup on many problems, even though the clock rate of the hardware is 100 times slower than that of the workstation running the software solver. While the time to compile the solver circuit to configurable hardware can be quite long on current platforms (20-40 min per chip), this paper discusses new approaches to overcome this compilation overhead. More broadly, we view this work as a case study in the burgeoning domain of high performance configurable computing. Our approach realizes a large amount of fine-grained parallelism and has broad applications in the very large scale integration CAD area.
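
The "direct mapping of Boolean relations into logic gates" is the easiest part to sketch: each CNF clause becomes an OR of possibly inverted variable wires and the clause outputs are ANDed. The generator below emits such an evaluator as Verilog text for a small formula; the real template generator also builds the implication and backtracking control logic, which is not modeled here:

```python
def cnf_to_verilog(clauses, n_vars, name="sat_eval"):
    """Emit a combinational evaluator for a CNF formula: one OR per clause,
    one AND across clauses.  A sketch of the formula-specific-circuit idea
    only -- not the paper's full solver circuit."""
    lines = [f"module {name}(input [{n_vars - 1}:0] x, output sat);"]
    for i, clause in enumerate(clauses):
        lits = [f"{'~' if lit < 0 else ''}x[{abs(lit) - 1}]" for lit in clause]
        lines.append(f"  wire c{i} = " + " | ".join(lits) + ";")
    lines.append("  assign sat = "
                 + " & ".join(f"c{i}" for i in range(len(clauses))) + ";")
    lines.append("endmodule")
    return "\n".join(lines)

# (x1 | ~x3) & (x2 | x3 | ~x1) in DIMACS-style literal numbering.
print(cnf_to_verilog([[1, -3], [2, 3, -1]], n_vars=3))
```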