Showing papers in "IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems in 1996"


Journal Article•DOI•
TL;DR: This paper proposes a finite solution space in which each packing is represented by a pair of module name sequences, called a sequence-pair, and applies it to the biggest MCNC benchmark, ami49, with a conventional wiring area estimation method, obtaining a highly promising placement.
Abstract: The earliest and the most critical stage in VLSI layout design is the placement. The background is the rectangle packing problem: given a set of rectangular modules of arbitrary sizes, place them without overlap on a plane within a rectangle of minimum area. Since the variety of the packing is uncountably infinite, the key issue for successful optimization is the introduction of a finite solution space which includes an optimal solution. This paper proposes such a solution space where each packing is represented by a pair of module name sequences, called a sequence-pair. Searching this space by simulated annealing, hundreds of modules have been packed efficiently as demonstrated. For applications to VLSI layout, we attack the biggest MCNC benchmark ami49 with a conventional wiring area estimation method, and obtain a highly promising placement.

687 citations
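
The sequence-pair representation described in the abstract above can be illustrated with a small decoder. This is a minimal sketch, not the authors' implementation: it assumes the usual sequence-pair convention (a module that precedes another in both sequences lies to its left; a module that follows another in the first sequence but precedes it in the second lies below it), and the names `pack`, `seq_pos`, `seq_neg`, and `sizes` are hypothetical.

```python
def pack(seq_pos, seq_neg, sizes):
    """Decode a sequence-pair into lower-left coordinates.

    seq_pos, seq_neg: the two module-name sequences (hypothetical names).
    sizes: dict mapping module name -> (width, height).
    Returns (coords, total_width, total_height).
    """
    pos_index = {m: i for i, m in enumerate(seq_pos)}
    x, y = {}, {}
    # Visit modules in seq_neg order, so every module already placed
    # precedes the current one in the second sequence.
    for b in seq_neg:
        # Placed modules that also precede b in seq_pos are left of b.
        x[b] = max((x[a] + sizes[a][0] for a in x
                    if pos_index[a] < pos_index[b]), default=0)
        # Placed modules that follow b in seq_pos are below b.
        y[b] = max((y[a] + sizes[a][1] for a in y
                    if pos_index[a] > pos_index[b]), default=0)
    coords = {m: (x[m], y[m]) for m in seq_neg}
    width = max(x[m] + sizes[m][0] for m in seq_neg)
    height = max(y[m] + sizes[m][1] for m in seq_neg)
    return coords, width, height


sizes = {"a": (2, 1), "b": (1, 2), "c": (1, 1)}
coords, width, height = pack(["c", "a", "b"], ["a", "b", "c"], sizes)
print(coords, width, height)
```

This quadratic-time decoder only evaluates one point of the solution space; a search method such as the simulated annealing mentioned in the abstract would call it repeatedly while perturbing the two sequences.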


Journal Article•DOI•
TL;DR: The problem formulation for solving the multiple constant multiplication (MCM) problem is introduced where first the minimum number of shifts that are needed is computed, and then the number of additions is minimized using common subexpression elimination.
Abstract: Many applications in DSP, telecommunications, graphics, and control have computations that either involve a large number of multiplications of one variable with several constants, or can easily be transformed to that form. A proper optimization of this part of the computation, which we call the multiple constant multiplication (MCM) problem, often results in a significant improvement in several key design metrics, such as throughput, area, and power. However, until now little attention has been paid to the MCM problem. After defining the MCM problem, we introduce an effective problem formulation for solving it where first the minimum number of shifts that are needed is computed, and then the number of additions is minimized using common subexpression elimination. The algorithm for common subexpression elimination is based on an iterative pairwise matching heuristic. The power of the MCM approach is augmented by preprocessing the computation structure with a new scaling transformation that reduces the number of shifts and additions. An efficient branch and bound algorithm for applying the scaling transformation has also been developed. The flexibility of the MCM problem formulation enables the application of the iterative pairwise matching algorithm to several other important and common high level synthesis tasks, such as the minimization of the number of operations in constant matrix-vector multiplications, linear transforms, and single and multiple polynomial evaluations. All applications are illustrated by a number of benchmarks.

362 citations
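
The idea of sharing common subexpressions among several constant multiplications can be sketched on plain binary constants. The sketch below is a simplified illustration, not the paper's algorithm: it performs a single pairwise match on unshifted, unsigned binary representations, and the helper names (`popcount`, `shift_add`, `adds_naive`, `adds_after_one_match`) are hypothetical.

```python
from itertools import combinations


def popcount(c):
    """Number of set bits in a non-negative integer."""
    return bin(c).count("1")


def shift_add(x, c):
    """Multiply x by constant c using only shifts and additions."""
    return sum(x << b for b in range(c.bit_length()) if (c >> b) & 1)


def adds_naive(constants):
    """Additions needed when every constant is built independently."""
    return sum(popcount(c) - 1 for c in constants)


def adds_after_one_match(constants):
    """Additions needed after sharing the best common bit pattern once."""
    best, bi, bj = 0, None, None
    for i, j in combinations(range(len(constants)), 2):
        common = constants[i] & constants[j]
        if popcount(common) > popcount(best):
            best, bi, bj = common, i, j
    if popcount(best) < 2:
        return adds_naive(constants)  # nothing worth sharing
    total = popcount(best) - 1  # build the shared term once
    for k, c in enumerate(constants):
        if k in (bi, bj):
            # Each leftover bit costs one addition onto the shared term.
            total += popcount(c & ~best)
        else:
            total += popcount(c) - 1
    return total


constants = [11, 7, 5]
print(adds_naive(constants), adds_after_one_match(constants))
```

Here 11x and 7x both contain the pattern 3x = x + 2x, so computing 3x once and reusing it saves one addition. An iterative version would repeat the matching step until no pair shares two or more bits.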


Journal Article•DOI•
TL;DR: A new synthesis strategy is presented that fully automates the path from an analog circuit topology and performance specifications to a sized circuit schematic, relying on asymptotic waveform evaluation to predict circuit performance and on simulated annealing to solve a novel unconstrained optimization formulation of the circuit synthesis problem.
Abstract: We present a new synthesis strategy that can automate fully the path from an analog circuit topology and performance specifications to a sized circuit schematic. This strategy relies on asymptotic waveform evaluation to predict circuit performance and simulated annealing to solve a novel unconstrained optimization formulation of the circuit synthesis problem. We have implemented this strategy in a pair of tools called ASTRX and OBLX. To show the generality of our new approach, we have used this system to resynthesize essentially all the analog synthesis benchmarks published in the past decade; ASTRX/OBLX has resynthesized circuits in an afternoon that, for some prior approaches, had required months. To show the viability of the approach on difficult circuits, we have resynthesized a recently published (and patented), high-performance operational amplifier; ASTRX/OBLX achieved performance comparable to the expert manual design. And finally, to test the limits of the approach on industrial-sized problems, we have synthesized the component cells of a pipelined A/D converter; ASTRX/OBLX successfully generated cells 2-3× more complex than those published previously.

347 citations


Journal Article•DOI•
TL;DR: The algorithm, Test Generation Using Satisfiability (TEGUS), solves a simplified test set characteristic equation using straightforward but powerful greedy heuristics, ordering the variables using depth-first search and selecting a variable from the next unsatisfied clause at each branching point.
Abstract: We present a robust, efficient algorithm for combinational test generation using a reduction to satisfiability (SAT). The algorithm, Test Generation Using Satisfiability (TEGUS), solves a simplified test set characteristic equation using straightforward but powerful greedy heuristics, ordering the variables using depth-first search and selecting a variable from the next unsatisfied clause at each branching point. For difficult faults, the computation of global implications is iterated, which finds more implications than previous approaches and subsumes structural heuristics such as unique sensitization. Without random tests or fault simulation, TEGUS completes on every fault in the ISCAS networks, demonstrating its robustness, and is ten times faster for those networks which have been completed by previous algorithms. Our implementation of TEGUS can be used as a base line for comparing test generation algorithms; we present comparisons with 45 recently published algorithms. TEGUS combines the advantages of the elegant organization of SAT-based algorithms with the efficiency of structural algorithms.

329 citations


Journal Article•DOI•
TL;DR: HOPE is an efficient parallel fault simulator for synchronous sequential circuits that employs the parallel version of the single fault propagation technique; it builds on an earlier fault simulator called PROOFS and adds three techniques that make it about 1.6 times faster than PROOFS for 16 benchmark circuits.
Abstract: HOPE is an efficient parallel fault simulator for synchronous sequential circuits that employs the parallel version of the single fault propagation technique. HOPE is based on an earlier fault simulator called PROOFS, which employs several heuristics to efficiently drop faults and to avoid simulation of many inactive faults. In this paper, we propose three new techniques that substantially speed up parallel fault simulation: (1) reduction of faults simulated in parallel through mapping nonstem faults to stem faults, (2) a new fault injection method called functional fault injection, and (3) a combination of a static fault ordering method and a dynamic fault ordering method. Based on our experiments, our fault simulator, HOPE, which incorporates the proposed techniques, is about 1.6 times faster than PROOFS for 16 benchmark circuits.

301 citations


Journal Article•DOI•
TL;DR: A new transformation for incompletely specified Mealy-type machines is described that makes them suitable for gated-clock implementation with a limited increase in complexity, and identifies highly-probable idle conditions that will be exploited for the optimal synthesis of the logic block that controls the local clock of the FSM.
Abstract: The automatic synthesis of low power finite-state machines (FSM's) with gated clocks relies on efficient algorithms for synthesis and optimization of dedicated clock-stopping circuitry. We describe a new transformation for incompletely specified Mealy-type machines that makes them suitable for gated-clock implementation with a limited increase in complexity. The transformation is probabilistic-driven, and identifies highly-probable idle conditions that will be exploited for the optimal synthesis of the logic block that controls the local clock of the FSM. We formulate and solve a new logic optimization problem, namely, the synthesis of a subfunction of a Boolean function that is minimal in size under a constraint on its probability to be true. We describe the relevance of this problem for the optimal synthesis of gated clocks. A prototype tool has been implemented and its performance, although influenced by the initial structure of the FSM, shows that sizable power reductions can be obtained using our technique.

193 citations


Journal Article•DOI•
TL;DR: This paper presents a methodology for interfacing empirical gate models to reduced order RC interconnect models in terms of a nonlinear iteration procedure and generates a linear equivalent gate model which accurately captures the delays at the interconnect fan-out nodes.
Abstract: For efficiency, the performance of digital CMOS gates is often expressed in terms of empirical models. Both delay and short-circuit power dissipation are sometimes characterized as a function of load capacitance and input signal transition time. However, gate loads can no longer be modeled by purely capacitive loads for high performance CMOS due to the RC metal interconnect effects. This paper presents a methodology for interfacing empirical gate models to reduced order RC interconnect models in terms of a nonlinear iteration procedure. The delay and power are calculated with errors on the same order as those for the original empirical equations. Moreover, a linear equivalent gate model is generated which accurately captures the delays at the interconnect fan-out nodes.

172 citations


Journal Article•DOI•
TL;DR: A methodology for the automatic synthesis of full-custom IC layout with analog constraints is presented, guaranteeing that all performance constraints are met when feasible, or otherwise, infeasibility is detected as soon as possible, thus providing a robust and efficient design environment.
Abstract: A methodology for the automatic synthesis of full-custom IC layout with analog constraints is presented. The methodology guarantees that all performance constraints are met when feasible, or otherwise, infeasibility is detected as soon as possible, thus providing a robust and efficient design environment. In the proposed approach, performance specifications are translated into lower-level bounds on parasitics or geometric parameters, using sensitivity analysis. Bounds can be used by a set of specialized layout tools performing stack generation, placement, routing, and compaction. For each tool, a detailed description is provided of its functionality, of the way constraints are mapped and enforced, and of its impact on the design flow. Examples drawn from industrial applications are reported to illustrate the effectiveness of the approach.

162 citations


Journal Article•DOI•
TL;DR: This paper presents an efficient algorithm for identifying f-redundant path delay faults and a sufficient condition for functional redundancy, and shows that a significant percentage of path delay faults are f-redundant for ISCAS'85 benchmark circuits.
Abstract: Recently published results have shown that, for many circuits, only a small percentage of path delay faults is robust testable. Among the robust untestable faults, a significant percentage is not nonrobust testable either. In this paper, we take a closer look at the properties of these nonrobust untestable faults with the goal of determining whether and how these faults should be tested. We define a path delay fault to be functional redundant (f-redundant) if, regardless of the delays at all other signals, the circuit performance will not be determined by the path. These paths are false paths, regardless of the delays of all signals. Therefore, these paths cannot and need not be tested. We present a sufficient condition for functional redundancy. We will show that nonrobust untestable faults are not necessarily f-redundant. For those nonrobust untestable but functional irredundant (f-irredundant) faults, the corresponding path may become a true path, and thus may determine the circuit performance under the faulty condition. We present an efficient algorithm for identifying f-redundant path delay faults. Results show that a significant percentage of path delay faults are f-redundant for ISCAS'85 benchmark circuits. Identification of f-redundant faults has two important applications: 1) it provides a more realistic fault coverage measure (as the number of detected faults divided by the total number of f-irredundant faults), and 2) for circuits with a large number of paths, where testing only a subset of paths is common practice, the path selection process can be guided to avoid selecting f-redundant paths. To illustrate the second application, we present an algorithm for selecting a set of f-irredundant path delay faults that includes at least one of the longest f-irredundant paths for each signal in the circuits.

148 citations


Journal Article•DOI•
TL;DR: The algorithm solves the previously open problem of synthesizing CA for all practical applications and shows that two CA exist for each irreducible polynomial, solving the previously open CA existence conjecture.
Abstract: This paper presents a method for the synthesis of a one-dimensional linear hybrid cellular automaton (CA) from a given irreducible polynomial. A detailed description of the algorithm is given, together with an outline of the theoretical background. It is shown that two CA exist for each irreducible polynomial, solving the previously open CA existence conjecture. An in-depth example of the synthesis is presented, along with timing benchmarks and an operation count. The algorithm solves the previously open problem of synthesizing CA for all practical applications.

146 citations
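
The relationship between a linear hybrid CA and its polynomial can be explored with a brute-force search. This is a sketch under stated assumptions, not the synthesis algorithm of the paper: it assumes a null-boundary CA built from rules 90 and 150 (encoded here as 0 and 1), uses the standard tridiagonal recurrence for the characteristic polynomial over GF(2), and tries every rule vector, which is only feasible for very small lengths. The function names and the bitmask encoding of polynomials are hypothetical.

```python
from itertools import product


def char_poly(rules):
    """Characteristic polynomial over GF(2) of a null-boundary 90/150 CA.

    rules: sequence of 0 (rule 90) or 1 (rule 150), one per cell.
    Polynomials are integers whose bit k is the coefficient of x^k.
    Uses D_k = (x + d_k) * D_(k-1) + D_(k-2), with D_0 = 1 and D_(-1) = 0.
    """
    prev, cur = 0, 1
    for d in rules:
        nxt = (cur << 1) ^ (cur if d else 0) ^ prev
        prev, cur = cur, nxt
    return cur


def find_ca(poly, n):
    """Exhaustively list all length-n rule vectors whose polynomial is poly."""
    return [r for r in product((0, 1), repeat=n) if char_poly(r) == poly]


# x^3 + x + 1 is irreducible over GF(2); its bitmask is 0b1011.
print(find_ca(0b1011, 3))
```

For this small case the search returns exactly two rule vectors, each the reversal of the other, which is consistent with the existence result stated in the abstract.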


Journal Article•DOI•
TL;DR: An accurate and practical method of estimating interconnect capacitances for a given circuit layout and the resulting model capacitance values are found to be within 10% of both the measured data and 3D simulations of structures that are prevalent in typical VLSI chips.
Abstract: We report an accurate and practical method of estimating interconnect capacitances for a given circuit layout. The method allows extraction of the complete circuit level capacitances at each node in the circuit. The layout geometry is reduced into base elements that consist of different vertical profiles at each node in the layout. Accurate analytical models are developed for calculating capacitances of multilayer structures using a 2D capacitance simulator TDTL. These models are then transformed into 3D geometry. The resulting model capacitance values are found to be within 10% of both the measured data and 3D simulations of structures that are prevalent in typical VLSI chips. The models and their coefficients for different vertical profiles are stored in the capacitance extraction tool CUP, which is coupled to the layout extractor HILEX. As each base element has a unique vertical profile, the corresponding capacitance can easily be calculated for each node and is then written out to a circuit netlist. The comparisons of the models with the measured data, as well as with 3D simulation results, are also discussed.

Journal Article•DOI•
TL;DR: This paper describes an activity-sensitive power analysis strategy for datapath, memory, control path, and interconnect elements using a new Activity-Based Control model and a hierarchical interconnect analysis strategy that enables estimates of chip area as well as power consumption.
Abstract: Prompted by demands for portability and low-cost packaging, the electronics industry has begun to view power consumption as a critical design criterion. As such there is a growing need for tools that can accurately predict power consumption early in the design process. Many high-level power analysis models do not adequately model activity, however, leading to inaccurate results. This paper describes an activity-sensitive power analysis strategy for datapath, memory, control path, and interconnect elements. Since datapath and memory modeling has been described in a previous publication, this paper focuses mainly on a new Activity-Based Control (ABC) model and on a hierarchical interconnect analysis strategy that enables estimates of chip area as well as power consumption. Architecture-level estimates are compared to switch-level measurements based on net lists extracted from the layouts of three chips: a digital filter, a global controller, and a microprocessor. The average power estimation error is about 9% with a standard deviation of 10%, and the area estimates err on average by 14% with a standard deviation of 6%.

Journal Article•DOI•
TL;DR: Symbolic algorithms are presented that compute the steady-state probabilities of very large finite state machines (up to 10^27 states); they are based on Algebraic Decision Diagrams (ADD's) and solve the corresponding Chapman-Kolmogorov equations.
Abstract: Regarding finite state machines as Markov chains facilitates the application of probabilistic methods to very large logic synthesis and formal verification problems. In this paper we present symbolic algorithms to compute the steady-state probabilities for very large finite state machines (up to 10^27 states). These algorithms, based on Algebraic Decision Diagrams (ADD's), an extension of BDD's that allows arbitrary values to be associated with the terminal nodes of the diagrams, determine the steady-state probabilities by regarding finite state machines as homogeneous, discrete-parameter Markov chains with finite state spaces, and by solving the corresponding Chapman-Kolmogorov equations. We first consider finite state machines with state graphs composed of a single terminal strongly connected component; for this type of system we have implemented two solution techniques: one is based on the Gauss-Jacobi iteration, the other one is based on simple matrix multiplication. Then we extend our treatment to the most general case of systems which can be modelled as finite state machines with arbitrary transition structures; here our approach exploits structural information to decompose and simplify the state graph of the machine. We report experimental results obtained for problems on which traditional methods fail.
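
The solution technique based on simple matrix multiplication can be illustrated on a tiny explicit chain. This is a minimal sketch only: it uses a dense list-of-lists transition matrix rather than the symbolic ADD representation of the paper, it assumes the chain is irreducible and aperiodic so that repeated multiplication converges, and the function name `steady_state` and its parameters are hypothetical.

```python
def steady_state(P, tol=1e-12, max_iter=10_000):
    """Approximate the stationary distribution of a row-stochastic matrix P.

    Starts from the uniform distribution and repeatedly multiplies it by P
    until successive distributions differ by less than tol in every entry.
    """
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    raise RuntimeError("power iteration did not converge")


# Two-state machine: state 0 stays with probability 0.9,
# state 1 returns to state 0 with probability 0.5.
P = [[0.9, 0.1],
     [0.5, 0.5]]
print(steady_state(P))
```

For this chain the balance condition 0.1 * pi0 = 0.5 * pi1 gives the distribution (5/6, 1/6), which the iteration approaches. An explicit matrix like this is only practical for small state spaces, which is the limitation the symbolic approach in the abstract is meant to address.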

Journal Article•DOI•
Mitiko Miura-Mattausch, U. Feldmann, A. Rahm, M. Bollu, D. Savignac
TL;DR: The unified treatment of the complete MOSFET model allows all transistor characteristics to be calculated without any nonphysical fitting parameters, and the calculation time is drastically reduced in comparison with a conventional piece-wise model.
Abstract: In this paper, we describe a complete MOSFET model developed for circuit simulation based on a fully consistent physical concept. The model describes all transistor characteristics as a function of surface potentials, which are calculated iteratively at each applied voltage under the charge-sheet approximation. The key idea of this development is to put as much physics as possible into the equations describing the surface potentials. Since the model includes both the drift and the diffusion contributions, a single equation is valid from the subthreshold to the saturation regions. Contrary to expectation, the results show that our semi-implicit model including the iteration procedures can even reduce the CPU time significantly in comparison with a conventional model similar to BSIM2 including short-channel effects. This is due to the consistent description of the model equations for all transistor characteristics, which results in more straightforward device equations, once the surface potentials have been computed.

Journal Article•DOI•
TL;DR: A power estimation technique for digital integrated circuits that operates at the register transfer level (RTL) that is based on the use of entropy as a measure of the average activity to be expected in the final implementation of a circuit, given only its Boolean functional description.
Abstract: We present a power estimation technique for digital integrated circuits that operates at the register transfer level (RTL). Such a high-level power estimation capability is required in order to provide early warning of any power problems before the circuit-level design has been specified. With such early warning, the designer can explore design trade-offs at a higher level of abstraction than previously possible, reducing design time and cost. Our estimator is based on the use of entropy as a measure of the average activity to be expected in the final implementation of a circuit, given only its Boolean functional description. This technique has been implemented and tested on a variety of circuits. The empirical results to be presented are very promising and demonstrate the feasibility and utility of this approach.

Journal Article•DOI•
TL;DR: An efficient algorithm for technology mapping targeting table look-up (TLU) blocks capable of minimizing either the number of TLUs used or the depth of the produced circuit is proposed.
Abstract: This paper proposes an efficient algorithm for technology mapping targeting table look-up (TLU) blocks. It is capable of minimizing either the number of TLUs used or the depth of the produced circuit. Our approach consists of two steps. First, a network of super nodes is created. Next, a Boolean function of each super node with an appropriate don't care set is decomposed into a network of TLUs. To minimize the circuit's depth, several rules are applied on the critical portion of the mapped circuit.

Journal Article•DOI•
TL;DR: It is demonstrated that the average switching activity in the circuit can be calculated using either entropy or informational energy averages and the proposed switching activity estimation technique does not require simulation and is thus extremely fast, yet produces sufficiently accurate estimates.
Abstract: This paper considers the problem of estimating the power consumption at logic and register-transfer levels of design from an information theoretical point of view. In particular, it is demonstrated that the average switching activity in the circuit can be calculated using either entropy or informational energy averages. For control circuits and random logic, the output entropy (informational energy) per bit is calculated as a function of the input entropy (informational energy) per bit and an implementation dependent information scaling factor. For data-path circuits, the output entropy (informational energy) is calculated from the input entropy (informational energy) using a compositional technique which has linear complexity in terms of the circuit size. Finally, from these input and output values, the entropy (informational energy) per circuit line is calculated and used as an estimate for the average switching activity. The proposed switching activity estimation technique does not require simulation and is thus extremely fast, yet produces sufficiently accurate estimates.
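
The link between entropy, informational energy, and switching activity can be checked numerically for a single circuit line. The sketch below is a textbook illustration rather than the estimation procedure of the paper: it assumes a binary signal whose values in consecutive cycles are independent with signal probability `p`, and it omits the information scaling factor and the compositional technique mentioned in the abstract. All function names are hypothetical.

```python
import math


def entropy(p):
    """Binary entropy in bits of a line that is 1 with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def informational_energy(p):
    """Informational energy of a binary line: sum of squared probabilities."""
    return p * p + (1 - p) * (1 - p)


def switching_activity(p):
    """Probability that the line toggles between two independent cycles."""
    return 2 * p * (1 - p)


for p in (0.1, 0.3, 0.5, 0.9):
    sw = switching_activity(p)
    # Under temporal independence: sw == 1 - energy and sw <= entropy / 2.
    print(p, sw, 1 - informational_energy(p), entropy(p) / 2)
```

Under this independence assumption the activity equals one minus the informational energy exactly, and half the entropy is an upper bound that is tight at p = 0.5, which is why either quantity can serve as an activity estimate without simulation.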

Journal Article•DOI•
TL;DR: A prototype system named GATTO is used to assess the effectiveness of the approach in terms of result quality and CPU time requirements and the results are the best ones reported in the literature for most of the largest standard benchmark circuits.
Abstract: This paper deals with automated test pattern generation for large synchronous sequential circuits and describes an approach based on genetic algorithms. A prototype system named GATTO is used to assess the effectiveness of the approach in terms of result quality and CPU time requirements. An account is also given of a distributed version of the same algorithm, named GATTO*. Being based on the PVM library, it runs on any network of workstations and is able to either reduce the required time, or improve the result quality with respect to the monoprocessor version. In the latter case, in terms of Fault Coverage, the results are the best ones reported in the literature for most of the largest standard benchmark circuits. The flexibility of GATTO enables users to easily trade off fault coverage and CPU time to suit their needs.

Journal Article•DOI•
M.R. Corazao, M. Khalaf, L.M. Guerra, Miodrag Potkonjak, Jan M. Rabaey
TL;DR: This paper introduces a new approach to performance-driven template mapping for high-level synthesis that focuses on datapath-intensive ASIC design, though the concepts are also highly applicable to compiler development.
Abstract: This paper introduces a new approach to performance-driven template mapping for high-level synthesis. Template mapping, the process of mapping high-level algorithmic descriptions to specialized hardware libraries or instruction sets, involves template matching, template selection, and clock selection. Efficient algorithms for each are presented, and novel issues such as partial matching are addressed. The paper focuses on datapath-intensive ASIC design, though the concepts are also highly applicable to compiler development. Experimental results on examples from real applications show significant improvements in throughput with limited area overhead.

Journal Article•DOI•
TL;DR: A time-domain, non-Monte Carlo method for computer simulation of electrical noise in nonlinear dynamic circuits with arbitrary excitations and arbitrary large-signal waveforms is presented, based on results from the theory of stochastic differential equations.
Abstract: A time-domain, non-Monte Carlo method for computer simulation of electrical noise in nonlinear dynamic circuits with arbitrary excitations and arbitrary large-signal waveforms is presented. This time-domain noise simulation method is based on results from the theory of stochastic differential equations. The noise simulation method is general in the following sense. Any nonlinear dynamic circuit with any kind of excitation, which can be simulated by the transient analysis routine in a circuit simulator, can be simulated by our noise simulator in time-domain to produce the noise variances and covariances of circuit variables as a function of time, provided that noise models for the devices in the circuit are available. Noise correlations between circuit variables at different time points can also be calculated. Previous work on computer simulation of noise in electronic circuits is reviewed with comparisons to our method. Shot, thermal, and flicker noise models for integrated-circuit devices, in the context of our time-domain noise simulation method, are discussed. The implementation of this noise simulation method in a circuit simulator (SPICE) is described. Two examples of noise simulation (a CMOS inverter and a BJT active mixer) are given.

Journal Article•DOI•
TL;DR: This paper presents logic optimization techniques for multilevel combinational networks that simplify the circuit by applying a sequence of perturbations, namely wire/gate additions and removals guided by ATPG-based reasoning.
Abstract: In this paper, we present logic optimization techniques for multilevel combinational networks. Our techniques apply a sequence of perturbations which result in simplification of the circuit. The perturbation and simplification are achieved through wire/gate addition and removal guided by Automatic Test Pattern Generation (ATPG) based reasoning. The main operations of our approaches are incremental transformations of the circuit (such as adding wires/gates and changing a gate's functionality) to remove some particular wire. At each iteration, summary information on such wire/gate additions and removals is precomputed first. Then, a transformation is chosen to remove several wires at once. We have performed experiments on MCNC benchmarks and compared the results to those of misII and RAMBO. Experimental results are very encouraging.

Journal Article•DOI•
TL;DR: A new operation (exorlink) and an algorithm for minimizing Exclusive-OR Sum-of-Products expressions (ESOPs) are presented; evaluation on benchmark functions shows the program to be superior to those known from the literature.
Abstract: This paper presents a new operation (exorlink) and an algorithm to minimize Exclusive-OR Sum-of-Products expressions (ESOPs) for multiple valued input, two valued output, incompletely specified functions. Exorlink is a more powerful operation than any other existing one for this problem. Evaluation on benchmark functions is given and it proves the superiority of the program to those known from the literature.

Journal Article•DOI•
TL;DR: Algorithms for disjunctive and nondisjunctive decomposition of Boolean functions and Boolean methods for identifying common subfunctions from multiple Boolean functions are presented, along with results from applying them to the synthesis of look-up table based field programmable gate arrays.
Abstract: This paper presents algorithms for disjunctive and nondisjunctive decomposition of Boolean functions and Boolean methods for identifying common subfunctions from multiple Boolean functions. Ordered binary decision diagrams are used to represent and manipulate Boolean functions so that the proposed methods can be implemented concisely. These techniques are applied to the synthesis of look-up table based field programmable gate arrays and results are presented.

Journal Article•DOI•
H. Cho, Gary D. Hachtel, Enrico Macii, B. Plessier, Fabio Somenzi
TL;DR: In this paper, the original finite state machine is partitioned into component submachines, and each of them is traversed separately; the result of the computation is an over-estimation of the set of reachable states of the original machine.
Abstract: This paper presents algorithms for approximate finite state machine traversal based on state space decomposition. The original finite state machine is partitioned into component submachines, and each of them is traversed separately; the result of the computation is an over-estimation of the set of reachable states of the original machine. Different traversal strategies, which reduce the effects of the degrees of freedom introduced by the decomposition, are discussed. Efficient partitioning is a key point for the performance of the traversal techniques; a method to heuristically find a good decomposition of the overall finite state machine, based on the exploration of its state variable dependency graph, is proposed. Applications of the approximate traversal methods to logic optimization of sequential circuits and behavioral verification of finite state machines are described; experimental results for such applications, together with data concerning pure traversal, are reported.

Journal Article•DOI•
TL;DR: It is shown that the power consumption of a static CMOS circuit is a convex function of the active area, and an analytical formulation for the power dissipation of a circuit in terms of the transistor size is derived which includes both the capacitive and the short circuit power dissipation.
Abstract: A direct approach to transistor sizing for minimizing the power consumption of a CMOS circuit under a delay constraint is presented. In contrast to the existing assumption that the power consumption of a static CMOS circuit is proportional to the active area of the circuit, it is shown that the power consumption is a convex function of the active area. An analytical formulation for the power dissipation of a circuit in terms of the transistor size is derived which includes both the capacitive and the short circuit power dissipation. SPICE circuit simulation results are presented to confirm the correctness of the analytical model. Based on the intuitions drawn from the analytical model, heuristics for initial transistor sizing on critical and noncritical paths for minimum power consumption are developed. Further, fast heuristics to perform transistor sizing in CMOS circuits for minimizing power consumption while meeting the given delay constraints are presented.

Journal Article•DOI•
TL;DR: A CAD tool for analog circuit synthesis is presented that uses fuzzy-logic based reasoning to select one topology among a fixed set of alternatives and to synthesize analog cells with different circuit topologies.
Abstract: A CAD tool for analog circuit synthesis is presented. This tool, called FASY, uses fuzzy-logic based reasoning to select one topology among a fixed set of alternatives. For the selected topology, a two-phase optimizer sizes all elements to satisfy the performance constraints while minimizing a cost function. In FASY, the decision rules used in the topology selection process are introduced by an expert designer or automatically generated by means of a learning process that uses the optimizer mentioned above. The capability of learning topology selection rules by experience is unique in FASY. Practical examples demonstrate the ability of this tool to learn topology selection rules and to synthesize analog cells with different circuit topologies.

Journal Article•DOI•
TL;DR: This paper utilizes the fact that introducing clock skew at an edge-triggered flip-flop has an effect similar to moving the flip-flop across combinational logic to find an optimal retiming, and views the circuit hierarchically, first solving the clock skew problem at one level above the gate level, and then using local transformations at the gate level to perform retiming for the optimal clock period.
Abstract: The introduction of clock skew at an edge-triggered flip-flop has an effect that is similar to the movement of the flip-flop across combinational logic module boundaries, and these are continuous and discrete optimizations with the same effect. While this fact has been recognized before, this paper, for the first time, utilizes this information to find an optimal retiming. The clock period is guaranteed to be at most one gate delay larger than the optimal clock period found using skew alone; note that since skew is a continuous optimization, it is possible that the optimal period may not be achievable. The method views the circuit hierarchically, first solving the clock skew problem at one level above the gate level, and then using local transformations at the gate level to perform retiming for the optimal clock period. The solution is thus divided into two phases. In Phase A, the clock skew optimization problem is solved with the objective of minimizing the clock period, while ensuring that the difference between the maximum and the minimum skew is minimized. Next, in Phase B, retiming is employed and some flip-flops are relocated across gates in an attempt to set the values of all skews to be as close to zero as possible.

Journal Article•DOI•
TL;DR: A framework for a class of algorithms solving shortest path related problems, such as the one-to-one shortest path problem, the one-to-many shortest paths problem and the minimum spanning tree problem, in the presence of obstacles is introduced.
Abstract: We introduce a framework for a class of algorithms solving shortest path related problems, such as the one-to-one shortest path problem, the one-to-many shortest paths problem and the minimum spanning tree problem, in the presence of obstacles. For these algorithms, the search space is restricted to a sparse strong connection graph that is implicitly represented, and its searched portion is constructed incrementally on-the-fly during search. The time and space requirements of these algorithms essentially depend on actual search behavior. Therefore, additional techniques or heuristics can be incorporated into the search procedure to further improve the performance of the algorithms. These algorithms are suitable for large VLSI design applications with many obstacles.

Journal Article•DOI•
TL;DR: This work investigates the use of multiplexed parity trees (MPTs) for zero-aliasing space compaction, and presents two design techniques based on MPTs (output selection and fanout insertion) that eliminate aliasing for both deterministic and pseudorandom test sets.
Abstract: Built-in self-testing requires test response streams from many observation points to be merged (space compaction) and compressed (time compaction) into a short signature. The compaction circuits should be transparent to error propagation in order to minimize aliasing, which occurs when a faulty response maps to the fault-free signature. We investigate the use of multiplexed parity trees (MPTs) for zero-aliasing space compaction. MPTs combine the error propagation properties of multiplexers and parity trees, and ensure zero aliasing via multistep compaction. We present two design techniques based on MPTs (output selection and fanout insertion) that eliminate aliasing for both deterministic and pseudorandom test sets. Our experiments with the ISCAS benchmark circuits show that zero aliasing can be achieved with small test sets and moderate hardware overhead. We also demonstrate that a very high percentage of single stuck-line faults in the compaction circuit are detected by the test patterns applied to the circuit under test.

Journal Article•DOI•
TL;DR: This work proposes new Steiner and arborescence FPGA routing algorithms that produce routing solutions with optimal source-sink pathlengths, and with wirelength on par with the best existing Steiner tree heuristics.
Abstract: Motivated by the goal of increasing the performance of FPGA-based designs, we propose new Steiner and arborescence FPGA routing algorithms. Our Steiner tree constructions significantly outperform the best known ones and have provably good performance bounds. Our arborescence heuristics produce routing solutions with optimal source-sink pathlengths, and with wirelength on par with the best existing Steiner tree heuristics. We have incorporated these algorithms into an actual FPGA router, which routed a number of industrial circuits using channel width considerably smaller than is achievable by previous routers. Our routing results for both the 3000- and 4000-series Xilinx parts are currently the best known in the literature.