Showing papers by "Srinivas Devadas" published in 1995


Proceedings ArticleDOI
01 Jan 1995
TL;DR: This work surveys state-of-the-art optimization methods that target low power dissipation in VLSI circuits and considers the circuit, logic, architectural and system levels.
Abstract: We survey state-of-the-art optimization methods that target low power dissipation in VLSI circuits. Optimizations at the circuit, logic, architectural and system levels are considered.

257 citations


Journal ArticleDOI
TL;DR: This work describes a comprehensive framework for exact and approximate switching activity estimation in a sequential circuit and shows that the approximation scheme is within 1-3% of the exact method, but is orders of magnitude faster for large circuits.
Abstract: Recently developed methods for power estimation have primarily focused on combinational logic. We present a framework for the efficient and accurate estimation of average power dissipation in sequential circuits. Switching activity is the primary cause of power dissipation in CMOS circuits. Accurate switching activity estimation for sequential circuits is considerably more difficult than that for combinational circuits, because the probability of the circuit being in each of its possible states has to be calculated. The Chapman-Kolmogorov equations can be used to compute the exact state probabilities in steady state. However, this method requires the solution of a linear system of equations of size 2^N, where N is the number of flip-flops in the machine. We describe a comprehensive framework for exact and approximate switching activity estimation in a sequential circuit. The basic computation step is the solution of a nonlinear system of equations which is derived directly from a logic realization of the sequential machine. Increasing the number of variables or the number of equations in the system results in increased accuracy. For a wide variety of examples, we show that the approximation scheme is within 1-3% of the exact method, but is orders of magnitude faster for large circuits. Previous sequential switching activity estimation methods can have significantly greater inaccuracies.

144 citations
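
The exact steady-state computation mentioned in the abstract above can be illustrated on a toy machine. The sketch below is not the authors' framework: it assumes a hypothetical 2-bit counter (N = 2, so 2^N = 4 states) whose enable input is 1 with probability 1/2, solves the 4-equation linear system for the state probabilities by plain Gaussian elimination, and then derives the expected number of flip-flop toggles per cycle. The names `steady_state`, `P`, `pi` and `activity` are illustrative only.

```python
from fractions import Fraction

def steady_state(P):
    """Solve pi = pi * P with sum(pi) = 1 by Gauss-Jordan elimination.

    P[i][j] is the probability of moving from state i to state j.
    Assumes the chain is irreducible so the system has a unique solution.
    """
    n = len(P)
    # Rows of (P^T - I) pi = 0; the last equation is replaced by sum(pi) = 1.
    A = [[Fraction(P[j][i]) - (1 if i == j else 0) for j in range(n)]
         for i in range(n)]
    b = [Fraction(0)] * n
    A[-1] = [Fraction(1)] * n
    b[-1] = Fraction(1)
    for col in range(n):
        # Pick a row with a nonzero entry in this column as the pivot.
        pivot = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(n):
            if r != col and A[r][col] != 0:
                f = A[r][col] / A[col][col]
                A[r] = [x - f * y for x, y in zip(A[r], A[col])]
                b[r] -= f * b[col]
    return [b[i] / A[i][i] for i in range(n)]

# Hypothetical 2-bit counter: stay with probability 1/2, else count up.
half = Fraction(1, 2)
P = [[half if t in (s, (s + 1) % 4) else Fraction(0) for t in range(4)]
     for s in range(4)]
pi = steady_state(P)

# Expected flip-flop toggles per cycle: weight each transition by the
# Hamming distance between the present-state and next-state encodings.
activity = sum(pi[s] * P[s][t] * bin(s ^ t).count("1")
               for s in range(4) for t in range(4))
print(pi, activity)
```

The system has one equation per state, which is why the exact approach becomes impractical as the number of flip-flops grows.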


Proceedings ArticleDOI
27 Mar 1995
TL;DR: This work presents a framework for code size minimization where the compressed data consists of a dictionary and a skeleton, which can be computed using popular text compression algorithms.
Abstract: We address the problem of code size minimization in VLSI systems with embedded DSP processors. Reducing code size reduces the production cost of embedded systems. We use data compression methods to develop code size minimization strategies. We present a framework for code size minimization where the compressed data consists of a dictionary and a skeleton. The dictionary can be computed using popular text compression algorithms. We describe two methods to execute the compressed code that have varying performance characteristics and varying degrees of freedom in compressing the code. Experimental results obtained with a TMS320C25 code generator are presented.

104 citations
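
The dictionary-plus-skeleton idea in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's method: the dictionary is built with a naive count of repeated fixed-length instruction windows rather than a real text compression algorithm, the `("REF", k)` marker is a hypothetical stand-in for whatever mechanism executes dictionary entries, and the instruction strings are made up for the example.

```python
from collections import Counter

def compress(code, length=2):
    """Split `code` into a dictionary of repeated sequences and a skeleton."""
    # Count every window of `length` consecutive instructions.
    windows = Counter(tuple(code[i:i + length])
                      for i in range(len(code) - length + 1))
    # Keep only sequences that occur more than once, most frequent first.
    dictionary = [seq for seq, n in windows.most_common() if n > 1]
    index = {seq: k for k, seq in enumerate(dictionary)}
    skeleton, i = [], 0
    while i < len(code):
        seq = tuple(code[i:i + length])
        if seq in index:
            skeleton.append(("REF", index[seq]))  # pointer into the dictionary
            i += length
        else:
            skeleton.append(code[i])              # instruction kept verbatim
            i += 1
    return dictionary, skeleton

def decompress(dictionary, skeleton):
    """Expand dictionary references to recover the original code."""
    out = []
    for item in skeleton:
        if isinstance(item, tuple) and item[0] == "REF":
            out.extend(dictionary[item[1]])
        else:
            out.append(item)
    return out

code = ["LAC a", "ADD b", "SACL c",
        "LAC a", "ADD b", "SACL d",
        "LAC a", "ADD b"]
dictionary, skeleton = compress(code)
print(dictionary)
print(skeleton)
```

Whether this saves space depends on the cost of storing the dictionary and of each reference, which this sketch does not model.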


Proceedings ArticleDOI
01 Jun 1995
TL;DR: This paper proves that for the case of a single address register the decision problem is NP-complete and generalizes the problem to multiple address registers, and presents a formulation of the problem of optimal storage assignment such that explicit instructions for address arithmetic are minimized.
Abstract: DSP architectures typically provide indirect addressing modes with auto-increment and decrement. In addition, indexing mode is not available, and there are usually few, if any, general-purpose registers. Hence, it is necessary to use address registers and perform address arithmetic to access automatic variables. Subsuming the address arithmetic into auto-increment and auto-decrement modes improves the size of the generated code. In this paper we present a formulation of the problem of optimal storage assignment such that explicit instructions for address arithmetic are minimized. We prove that for the case of a single address register the decision problem is NP-complete. We then generalize the problem to multiple address registers. For both cases heuristic algorithms are given. Our experimental results indicate an improvement of 3.

98 citations
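
For the single-address-register case described in the abstract above, the cost being minimized can be sketched directly. The code below is an illustration under assumptions, not the paper's heuristic: it assumes each variable occupies one memory word, that moving the address register by at most one word is free (auto-increment or auto-decrement), and that every larger jump costs one explicit instruction. The exhaustive search is only usable for tiny inputs, consistent with the problem being NP-complete. The names `address_arithmetic_cost` and `best_layout` are hypothetical.

```python
from itertools import permutations

def address_arithmetic_cost(layout, accesses):
    """Count consecutive accesses that need an explicit address update.

    layout:   variables listed in memory order.
    accesses: the sequence in which the variables are accessed.
    """
    pos = {v: i for i, v in enumerate(layout)}
    return sum(1 for a, b in zip(accesses, accesses[1:])
               if abs(pos[a] - pos[b]) > 1)

def best_layout(accesses):
    """Exhaustively try every memory order (exponential; toy sizes only)."""
    variables = sorted(set(accesses))
    return min(permutations(variables),
               key=lambda layout: address_arithmetic_cost(layout, accesses))

accesses = ["a", "c", "a", "d", "b", "d"]
naive_cost = address_arithmetic_cost(["a", "b", "c", "d"], accesses)
best = best_layout(accesses)
best_cost = address_arithmetic_cost(best, accesses)
print(naive_cost, best, best_cost)
```

In this example the declaration-order layout needs an explicit update on every access, while a layout that places frequently consecutive variables next to each other needs none.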


Proceedings ArticleDOI
01 Dec 1995
TL;DR: It is shown that optimal instruction selection on a DAG in the case of accumulator-based architectures requires a partial scheduling of nodes in the DAG, and the binate covering formulation is augmented to minimize spills and reloads.
Abstract: We address the problem of instruction selection in code generation for embedded DSP microprocessors. Such processors have highly irregular data-paths, and conventional code generation methods typically result in inefficient code. Instruction selection can be formulated as directed acyclic graph (DAG) covering. Conventional methods for instruction selection use heuristics that break up the DAG into a forest of trees and then cover them independently. This breakup can result in suboptimal solutions for the original DAG. Alternatively, the DAG covering problem can be formulated as a binate covering problem, and solved exactly or heuristically using branch-and-bound methods. We show that optimal instruction selection on a DAG in the case of accumulator-based architectures requires a partial scheduling of nodes in the DAG, and we augment the binate covering formulation to minimize spills and reloads. We show how the irregular data transfer costs of typical DSP data-paths can be modeled in the binate covering formulation.

88 citations


Proceedings ArticleDOI
01 Jan 1995
TL;DR: This paper formulates and solves some optimization problems that arise in code generation for processors with irregular datapaths, and presents optimal and heuristic algorithms that determine an instruction schedule simultaneously optimizing accumulator spilling and mode selection.
Abstract: We address the problem of code optimization for embedded DSP microprocessors. Such processors (e.g., those in the TMS320 series) have highly irregular datapaths, and conventional code generation methods typically result in inefficient code. In this paper we formulate and solve some optimization problems that arise in code generation for processors with irregular datapaths. In addition to instruction scheduling and register allocation, we also formulate the accumulator spilling and mode selection problems that arise in DSP microprocessors. We present optimal and heuristic algorithms that determine an instruction schedule simultaneously optimizing accumulator spilling and mode selection. Experimental results are presented.

80 citations


Proceedings ArticleDOI
23 Apr 1995
TL;DR: It is shown how user-specified sequences and programs can be modeled using a finite state machine, termed an input-modeling finite state machine or IMFSM, to aid the design of programmable controllers or processors.
Abstract: We describe an approach to estimate the average power dissipation in sequential logic circuits under user-specified input sequences or programs. This approach will aid the design of programmable controllers or processors, by enabling the estimation of the power dissipated when the controller or processor is running specific application programs. Current approaches to sequential circuit power estimation are limited by the fact that the input sequences to the sequential circuit are assumed to be uncorrelated. In reality, the inputs come from other sequential circuits, or are application programs. In this paper we show how user-specified sequences and programs can be modeled using a finite state machine, termed an input-modeling finite state machine or IMFSM. Power estimation can be carried out using existing sequential circuit power estimation methods on a cascade circuit consisting of the IMFSM and the original sequential circuit.

34 citations


Proceedings ArticleDOI
27 Mar 1995
TL;DR: New precomputation architectures for both combinational and sequential logic and new precomputation-based logic synthesis methods that optimize logic circuits for low power are presented.
Abstract: Precomputation is a recently proposed logic optimization technique which selectively disables the inputs of a sequential logic circuit, thereby reducing switching activity and power dissipation, without changing logic functionality. In this paper, we present new precomputation architectures for both combinational and sequential logic and describe new precomputation-based logic synthesis methods that optimize logic circuits for low power. We present a general precomputation architecture for sequential logic circuits and show that it is significantly more powerful than the architectures previously treated in the literature. In this architecture, output values required in a particular clock cycle are selectively precomputed one clock cycle earlier, and the original logic circuit is "turned off" in the succeeding clock cycle. The very power of this architecture makes the synthesis of precomputation logic a challenging problem and we present a method to automatically synthesize precomputation logic for this architecture. We introduce a powerful precomputation architecture for combinational logic circuits that uses transmission gates or transparent latches to disable parts of the logic. Unlike in the sequential circuit architecture, precomputation occurs in an early portion of a clock cycle, and parts of the combinational logic circuit are "turned off" in a later portion of the same clock cycle. Further we are not restricted to perform precomputation on the primary inputs. Preliminary results obtained using the described methods are presented. Up to 66 percent reductions in switching activity and power dissipation are possible using the proposed architectures. For many examples, the proposed architectures result in significantly less power dissipation than previously developed methods.

32 citations
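
The core idea of precomputation in the abstract above, that a subset of inputs sometimes fixes the output so the rest of the logic can be disabled, can be illustrated with a small enumeration. This is a sketch under assumptions and not the synthesis method of the paper: it assumes uniformly distributed, independent inputs, uses a hypothetical 2-bit comparator as the example function, and the names `disable_probability` and `greater_than` are illustrative.

```python
from itertools import product

def disable_probability(f, n, subset):
    """Fraction of assignments to `subset` that fix f regardless of the
    remaining inputs (inputs assumed uniform and independent)."""
    others = [i for i in range(n) if i not in subset]
    hits = 0
    for sub_vals in product([0, 1], repeat=len(subset)):
        outputs = set()
        for other_vals in product([0, 1], repeat=len(others)):
            x = [0] * n
            for i, v in zip(subset, sub_vals):
                x[i] = v
            for i, v in zip(others, other_vals):
                x[i] = v
            outputs.add(f(x))
        # The output is precomputable when only one value is reachable.
        hits += len(outputs) == 1
    return hits / 2 ** len(subset)

def greater_than(x):
    """2-bit comparator A > B with x = [a1, a0, b1, b0]."""
    a = 2 * x[0] + x[1]
    b = 2 * x[2] + x[3]
    return int(a > b)

msb_only = disable_probability(greater_than, 4, [0, 2])
lsb_only = disable_probability(greater_than, 4, [1, 3])
print(msb_only, lsb_only)
```

Here the two most significant bits decide the comparison whenever they differ, so under the stated assumptions the remaining logic could be disabled in half of the cycles, whereas the least significant bits alone never decide it.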


Journal ArticleDOI
TL;DR: A new method for directly synthesizing a hazard-free multilevel logic implementation from a given logic specification based on free/ordered Binary Decision Diagrams is described, which is naturally applicable to multiple-output logic functions.
Abstract: We describe a new method for directly synthesizing a hazard-free multilevel logic implementation from a given logic specification. The method is based on free/ordered Binary Decision Diagrams (BDD's), and is naturally applicable to multiple-output logic functions. Given an incompletely-specified (multiple-output) Boolean function, the method produces a multilevel logic network that is hazard-free for a specified set of multiple-input changes. We assume an arbitrary (unbounded) gate and wire delay model under a pure delay (PD) assumption, we permit multiple-input changes, and we consider both static and dynamic hazards under the fundamental-mode assumption. Our framework is thus general and powerful. While it is not always possible to generate hazard-free implementations using our technique, we show that in some cases hazard-free multilevel implementations can be generated when hazard-free two-level representations cannot be found. This problem is generally regarded as a difficult problem and it has important applications in the field of asynchronous design. The method has been automated and applied to a number of examples.

29 citations


Journal ArticleDOI
TL;DR: It is shown that functions difficult to verify using reduced, ordered binary decision diagrams can be verified using the free Boolean diagrams package using substantially less memory.
Abstract: We propose a data structure for Boolean functions termed "the free Boolean diagram." A free Boolean diagram allows decision vertices as in the conventional binary decision diagram, but also allows function vertices corresponding to the AND and XOR functions. It has been shown previously that the equivalence of two free Boolean diagrams can be decided probabilistically in polynomial time. Based on the equivalence checking method, we develop a set of algorithms for the probabilistic construction of free Boolean diagrams from multilevel combinational logic circuits, and for their manipulation. These algorithms are modified versions of reduced, ordered binary decision diagram manipulation methods. We provide the implementation details of a free Boolean diagram package. We show that functions difficult to verify using reduced, ordered binary decision diagrams can be verified using the free Boolean diagrams package using substantially less memory.

14 citations