
Showing papers by "Srinivas Devadas" published in 1996


Journal ArticleDOI
TL;DR: This article proves that for the case of a single address register the decision problem is NP-complete, even for a single basic block, and generalizes the problem to multiple address registers.
Abstract: DSP architectures typically provide indirect addressing modes with autoincrement and decrement. In addition, indexing mode is generally not available, and there are usually few, if any, general-purpose registers. Hence, it is necessary to use address registers and perform address arithmetic to access automatic variables. Subsuming the address arithmetic into autoincrement and decrement modes improves the size of the generated code. In this article we present a formulation of the problem of optimal storage assignment such that explicit instructions for address arithmetic are minimized. We prove that for the case of a single address register the decision problem is NP-complete, even for a single basic block. We then generalize the problem to multiple address registers. For both cases heuristic algorithms are given, and experimental results are presented.

177 citations
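
To make the storage-assignment objective in the abstract above concrete, here is a minimal Python sketch for the single-address-register case. It assumes that moving the address register by -1, 0, or +1 between consecutive accesses is free (subsumed by autoincrement/decrement) and that any larger jump costs one explicit address-arithmetic instruction. The function names, the access sequence, and the exhaustive search are illustrative assumptions, not the article's heuristic algorithm.

```python
from itertools import permutations


def address_arithmetic_cost(access_sequence, layout):
    """Count accesses that need an explicit address-register update.

    `layout` maps each variable to its memory offset. A move of at most
    one location between consecutive accesses is assumed to be free
    (autoincrement/decrement); a larger jump costs one instruction.
    """
    cost = 0
    for prev, cur in zip(access_sequence, access_sequence[1:]):
        if abs(layout[cur] - layout[prev]) > 1:
            cost += 1
    return cost


def best_layout_exhaustive(access_sequence):
    """Try every ordering of the variables (only feasible for tiny inputs,
    which is consistent with the decision problem being NP-complete)."""
    variables = sorted(set(access_sequence))
    best_cost, best_layout = None, None
    for order in permutations(variables):
        layout = {v: i for i, v in enumerate(order)}
        cost = address_arithmetic_cost(access_sequence, layout)
        if best_cost is None or cost < best_cost:
            best_cost, best_layout = cost, layout
    return best_cost, best_layout


# Hypothetical access sequence for one basic block.
sequence = ["a", "c", "a", "c", "b", "d"]
declaration_order = {"a": 0, "b": 1, "c": 2, "d": 3}

naive_cost = address_arithmetic_cost(sequence, declaration_order)
optimal_cost, optimal_layout = best_layout_exhaustive(sequence)
```

For this sequence, placing the variables in declaration order forces several explicit updates, while a layout that keeps frequently adjacent accesses in neighboring locations needs none.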


Proceedings ArticleDOI
10 Nov 1996
TL;DR: A new metric for measuring the extent of design verification provided by a set of functional simulation vectors is proposed, which can be used uniformly for all designs and computes observability information to determine whether effects of errors that are activated by the program stimuli can be observed at the circuit outputs.
Abstract: Functional simulation is the most widely used method for design verification. At various levels of abstraction, e.g., behavioral, register-transfer level and gate level, the designer simulates the design using a large number of vectors attempting to debug and verify the design. A major problem with functional simulation is the lack of good metrics and tools to evaluate the quality of a set of functional vectors. Metrics used currently are based on instruction counts and are quite simplistic. Designers are forced to use ad-hoc methods to terminate functional simulation, e.g., CPU time limitations. We propose a new metric for measuring the extent of design verification provided by a set of functional simulation vectors. This metric is universal, and can be used uniformly for all designs. Our metric computes observability information to determine whether effects of errors that are activated by the program stimuli can be observed at the circuit outputs. We provide preliminary experimental evidence that supports the validity of the proposed metric. We believe that using this metric in design verification will result in higher-quality functional tests and improved correctness checking.

123 citations


Proceedings ArticleDOI
01 Jun 1996
TL;DR: This paper presents a scheduling algorithm which maximizes the "shut-down" period of execution units in a system and shows that this scheduling technique can save up to 40% in power dissipation.
Abstract: "Shut-down" techniques are effective in reducing the power dissipation of logic circuits. Recently, methods have been developed that identify conditions under which the output of a module in a logic circuit is not used for a given clock cycle. When these conditions are met, input latches for that module are disabled, thus eliminating any switching activity and power dissipation. In this paper, we introduce these power management techniques in behavioral synthesis. We present a scheduling algorithm which maximizes the "shut-down" period of execution units in a system. Given a throughput constraint and the number of execution units available, the algorithm first schedules operations that generate controlling signals and activates only those modules whose result is eventually used. We present results which show that this scheduling technique can save up to 40% in power dissipation.

106 citations


Dissertation
01 Jan 1996
TL;DR: This thesis presents techniques for code generation and optimization that target embedded digital signal processors that have proven to be effective in improving the performance and reducing the size of compiled software.
Abstract: The advent of deep submicron processing technology has made it possible and desirable to integrate a processor core, a program ROM, and application-specific circuitry all on a single IC. As the complexity of embedded software grows, high-level languages such as C and C++ are increasingly employed in writing embedded software. Consequently, high-level language compilers have become an essential tool in the development of embedded systems. Fixed-point digital signal processors are among the most commonly embedded cores, due to their favorable performance–cost characteristics. However, these architectures are usually designed and optimized for their application domain, and pose challenges for compiler technology. Traditional compiler optimizations, though necessary, are insufficient for generating efficient and compact code. Therefore, new optimizations are required to produce code of the highest quality in a reasonable amount of time. In this thesis the author presents techniques for code generation and optimization that target embedded digital signal processors. These techniques have proven to be effective in improving the performance and reducing the size of compiled software. This thesis emphasizes optimization techniques; only by gaining a deeper understanding of the problems involved can we then apply them to a wider class of architectures. Keywords: compiler optimizations, digital signal processors, embedded systems. Thesis Supervisor: Srinivas Devadas. Title: Associate Professor of Electrical Engineering and Computer Science.

93 citations


Journal ArticleDOI
TL;DR: An exact method of estimating power in pipelined sequential circuits that accurately models the correlation between the vectors applied to the combinational logic of the circuit and is significantly more efficient than methods based on solving Chapman–Kolmogorov equations.
Abstract: Switching activity is a primary cause of power dissipation in combinational and sequential circuits. In this paper, we present a retiming method that targets the power dissipation of a sequential circuit by reducing the switching activity of nodes driving large capacitive loads. We explore the implications of the observation that the switching activity at flip-flop outputs in a synchronous sequential circuit can be significantly less than the activity at the flip-flop inputs. The method automatically determines positions of flip-flops in the circuit so as to heuristically minimize weighted switching activities summed over all the gates and flip-flops in the circuit. We extend this method to minimize power dissipation with a specified clock period. For this work we need to obtain efficiently an estimation of the switching activity of every node in the circuit. We give an exact method of estimating power in pipelined sequential circuits that accurately models the correlation between the vectors applied to the combinational logic of the circuit. This method is significantly more efficient than methods based on solving Chapman–Kolmogorov equations. Experimental results are presented on a variety of circuits.

85 citations


Book ChapterDOI
01 Jan 1996
TL;DR: The advent of 0.5μ processing that allows for the integration of 5 million transistors on a single integrated circuit has brought forth new challenges and opportunities in embedded-system design.
Abstract: The advent of 0.5μ processing that allows for the integration of 5 million transistors on a single integrated circuit has brought forth new challenges and opportunities in embedded-system design. This high level of integration makes it possible and desirable to integrate a processor core, a program ROM, and an ASIC together on a single IC. To justify the design costs of such an IC, these embedded-system designs must be sold in large volumes and, as a result, they are very cost-sensitive. The cost of an IC is most closely linked to its size, which is derived from the final circuit area. It is not unusual for the ROM that stores the program code to be the largest contributor to the area of such ICs. Thus the incremental value of using logic optimization to reduce the size of the ASIC is smaller because the ASIC takes up a relatively smaller percentage of the final circuit area. On the other hand, the potential for cost reduction through diminishing the size of the program ROM is great. There are also often strong real-time performance requirements on the final code; hence, there is a necessity for producing high-performance code as well.

58 citations


Book
30 Nov 1996
TL;DR: This book covers power estimation for combinational and sequential circuits and optimization techniques for low-power design, including retiming and precomputation.
Abstract:
1 Introduction.- 1.1 Power as a Design Constraint.- 1.2 Organization of this Book.- References.
2 Power Estimation.- 2.1 Power Dissipation Model.- 2.2 Switching Activity Estimation.- 2.2.1 Simulation-Based Techniques.- 2.2.2 Issues in Probabilistic Estimation Techniques.- 2.2.3 Probabilistic Techniques.- 2.3 Summary.- References.
3 A Power Estimation Method for Combinational Circuits.- 3.1 Symbolic Simulation.- 3.2 Transparent Latches.- 3.3 Modeling Inertial Delay.- 3.4 Power Estimation Results.- 3.5 Summary.- References.
4 Power Estimation for Sequential Circuits.- 4.1 Pipelines.- 4.2 Finite State Machines: Exact Method.- 4.2.1 Modeling Temporal Correlation.- 4.2.2 State Probability Computation.- 4.2.3 Power Estimation given State Probabilities.- 4.3 Finite State Machines: Approximate Method.- 4.3.1 Basis for the Approximation.- 4.3.2 Computing Present State Line Probabilities.- 4.3.3 Picard-Peano Method.- 4.3.4 Newton-Raphson Method.- 4.3.5 Improving Accuracy using m-Expanded Networks.- 4.3.6 Improving Accuracy using k-Unrolled Networks.- 4.3.7 Redundant State Lines.- 4.4 Results on Sequential Power Estimation Techniques.- 4.5 Modeling Correlation of Input Sequences.- 4.5.1 Completely and Incompletely Specified Input Sequences.- 4.5.2 Assembly Programs.- 4.5.3 Experimental Results.- 4.6 Summary.- References.
5 Optimization Techniques for Low Power Circuits.- 5.1 Power Optimization by Transistor Sizing.- 5.2 Combinational Logic Level Optimization.- 5.2.1 Path Balancing.- 5.2.2 Don't-care Optimization.- 5.2.3 Logic Factorization.- 5.2.4 Technology Mapping.- 5.3 Sequential Optimization.- 5.3.1 State Encoding.- 5.3.2 Encoding in the Datapath.- 5.3.3 Gated Clocks.- 5.4 Summary.- References.
6 Retiming for Low Power.- 6.1 Review of Retiming.- 6.1.1 Basic Concepts.- 6.1.2 Applications of Retiming.- 6.2 Retiming for Low Power.- 6.2.1 Cost Function.- 6.2.2 Verifying a Given Clock Period.- 6.2.3 Retiming Constraints.- 6.2.4 Executing the Retiming.- 6.3 Experimental Results.- 6.4 Conclusions.- References.
7 Precomputation.- 7.1 Subset Input Disabling Precomputation.- 7.1.1 Subset Input Disabling Precomputation Architecture.- 7.1.2 An Example.- 7.1.3 Synthesis of Precomputation Logic.- 7.1.4 Multiple-Output Functions.- 7.1.5 Examples of Precomputation Applied to Datapath Modules.- 7.1.6 Multiple Cycle Precomputation.- 7.1.7 Experimental Results for the Subset Input Disabling Architecture.- 7.2 Complete Input Disabling Precomputation.- 7.2.1 Complete Input Disabling Precomputation Architecture.- 7.2.2 An Example.- 7.2.3 Synthesis of Precomputation Logic.- 7.2.4 Simplifying the Original Combinational Logic Block.- 7.2.5 Multiple-Output Functions.- 7.2.6 Experimental Results for the Complete Input Disabling Architecture.- 7.3 Combinational Precomputation.- 7.3.1 Combinational Logic Precomputation.- 7.3.2 Precomputation at the Inputs.- 7.3.3 Precomputation for Arbitrary Sub-Circuits in a Circuit.- 7.3.4 Experimental Results for the Combinational Precomputation Architecture.- 7.4 Multiplexor-Based Precomputation.- 7.5 Conclusions.- References.
8 High-Level Power Estimation and Optimization.- 8.1 Register Transfer Level Power Estimation.- 8.1.1 Functional Modules.- 8.1.2 Controller.- 8.1.3 Interconnect.- 8.2 Behavioral Level Synthesis for Low Power.- 8.2.1 Transformation Techniques.- 8.2.2 Scheduling Techniques.- 8.2.3 Allocation Techniques.- 8.2.4 Optimizations at the Register-Transfer Level.- 8.3 Conclusions.- References.
9 Conclusion.- 9.1 Power Estimation at the Logic Level.- 9.2 Optimization Techniques at the Logic Level.- 9.3 Estimation and Optimization Techniques at the RT Level.- References.

35 citations


Journal ArticleDOI
22 Nov 1996
TL;DR: A survey of state-of-the-art power estimation methods and optimization techniques targeting low power VLSI circuits.
Abstract: We present a survey of state-of-the-art power estimation methods and optimization techniques targeting low-power VLSI circuits. Estimation and optimizations at the circuit and logic levels are considered.

6 citations


Proceedings ArticleDOI
18 Nov 1996
TL;DR: A low bandwidth protocol for wireless multi-media terminals targeted towards low power consumption on the terminal side and error correction and retransmission methods capable of dealing with burst error noise up to BERs of 10^-3.
Abstract: We present a low bandwidth protocol for wireless multi-media terminals targeted towards low power consumption on the terminal side. With the widespread use of portable computing devices, low power has become a major design criterion. One way of minimizing power consumption is to perform all tasks, other than managing hardware for the display and input, on a stationary workstation and exchange information between that workstation and the portable terminal via a wireless link. A protocol for such a system that emphasizes low bandwidth and low power requirements is presented. Such a protocol should address the issue of noisy wireless channels. We describe error correction and retransmission methods capable of dealing with burst error noise up to BERs of 10^-3. The final average bandwidth required is 140 kbits/sec for 8-bit color applications.

6 citations


Proceedings ArticleDOI
12 May 1996
TL;DR: A simulator that transforms the switching activity in a given CMOS combinational circuit into a set of weighted Boolean clauses in terms of the two-vector sequence is presented and transformational rules that account for differing arrival times of the input signals, gate sizes, load capacitances, and signal glitching phenomena are developed.
Abstract: Electromigration of metal wires and excessive heat dissipation are two common reliability problems facing IC designers today. Techniques for fast estimation of peak supply current and power dissipation in ICs to ensure proper design and long-term reliable operation are of great interest. However, finding such peak quantities in VLSI combinational circuits has always been a difficult problem due to the exponential dependence on the input vectors applied to the given circuit. Static CMOS combinational circuits complicate the problem further because such worst-case quantities depend on the application of two successive input-vector patterns. In this paper, we present a simulator that transforms the switching activity in a given CMOS combinational circuit into a set of weighted Boolean clauses in terms of the two-vector sequence. We develop transformational rules that account for differing arrival times of the input signals, gate sizes, load capacitances, and signal glitching phenomena in a given circuit. Satisfying a maximum number of the Boolean clauses, weighted appropriately by the node capacitances and strengths of transistors driving the nodes, yields the desired sequence of two input vectors. PEGS is developed to formulate the switching activity, find the worst-case input vector pattern, and calculate the peak supply current waveform as well as power dissipation from circuit netlists. Experimental results show, on the average, no more than 10% deviation from those of SPICE and a significant speedup of our simulator over SPICE.

6 citations
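
The abstract above describes finding the two-vector sequence that maximizes capacitance-weighted switching. The Python sketch below illustrates only that objective on a toy circuit by exhaustive enumeration. It is not the weighted-clause formulation used by PEGS: it assumes a zero-delay model (no glitching or arrival-time effects), and the circuit, node names, and capacitance values are hypothetical.

```python
from itertools import product

# Hypothetical 3-input circuit: each internal node is a Boolean function
# of the primary inputs (a, b, c).
NODES = {
    "n1": lambda a, b, c: a and b,
    "n2": lambda a, b, c: b or c,
    "out": lambda a, b, c: (a and b) != (b or c),  # XOR of n1 and n2
}

# Hypothetical node capacitances used as switching weights.
CAPACITANCE = {"n1": 1.0, "n2": 2.0, "out": 3.0}


def weighted_switching(v1, v2):
    """Sum the capacitance of every node whose value differs between the
    two input vectors (zero-delay model, so each node toggles at most once)."""
    total = 0.0
    for name, fn in NODES.items():
        if bool(fn(*v1)) != bool(fn(*v2)):
            total += CAPACITANCE[name]
    return total


def worst_case_pair(num_inputs=3):
    """Enumerate all ordered pairs of input vectors and return the
    (weight, v1, v2) triple with the largest weighted switching."""
    vectors = list(product([False, True], repeat=num_inputs))
    candidates = (
        (weighted_switching(v1, v2), v1, v2)
        for v1 in vectors
        for v2 in vectors
    )
    return max(candidates, key=lambda t: t[0])


peak_weight, first_vector, second_vector = worst_case_pair()
```

The enumeration grows as 4^n in the number of inputs, which is the exponential dependence on input vectors that the abstract cites as the reason a more scalable formulation is needed.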


01 Jan 1996
TL;DR: A methodology for low power design based on selectively precomputing the output logic values of a circuit one clock cycle before they are required, and using the precomputed values to reduce internal switching activity in the succeeding clock cycle is developed.
Abstract: Rapid increases in chip complexity, increasingly faster clocks, and the proliferation of portable devices have combined to make power dissipation an important design parameter. The power consumption of a digital system determines its heat dissipation as well as battery life. For some systems, power has become the most critical design constraint. In this thesis we develop a methodology for low power design. We first present techniques for estimating the average power dissipation of a logic circuit. At the logic level, power dissipation is directly related to switching activity. We describe a symbolic simulation method to accurately and efficiently compute the switching activity in logic circuits. This method is extended to handle sequential logic circuits, namely by modeling correlation in time and by calculating the probabilities of present state lines. In the second part of this thesis we develop methods for the reduction of switching activity in logic circuits. We present a retiming method for low power. Registers are re-positioned such that the overall glitching in the circuit is minimized. We then propose a powerful optimization method that is based on selectively precomputing the output logic values of a circuit one clock cycle before they are required, and using the precomputed values to reduce internal switching activity in the succeeding clock cycle. Finally we describe a scheduling method that maximizes the inactivity period of the modules in a circuit. (Copies available exclusively from MIT Libraries, Rm. 14-0551, Cambridge, MA 02139-4307. Ph. 617-253-5668; Fax 617-253-1690.)
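
As an illustration of the precomputation idea in the abstract above, the Python sketch below uses an n-bit comparator, a commonly cited example rather than one taken from this abstract. It assumes unsigned operands and uniformly distributed inputs. When the most significant bits of the two operands differ, the output is already determined, so the remaining input bits would not need to be loaded in the next cycle. All names are hypothetical.

```python
from itertools import product

WIDTH = 4  # hypothetical operand width in bits


def greater_than(a, b):
    """Reference behavior of the full comparator (unsigned a > b)."""
    return a > b


def precompute(a_msb, b_msb):
    """Return the comparator output if the two most significant bits alone
    decide it, otherwise None (the full comparison is still required)."""
    if a_msb != b_msb:
        return a_msb > b_msb
    return None


def disabled_fraction(width=WIDTH):
    """Check the precomputed value against the full comparator for every
    input pair and return the fraction of pairs for which the lower-order
    inputs could stay disabled."""
    disabled = 0
    total = 0
    for a, b in product(range(2 ** width), repeat=2):
        early = precompute(a >> (width - 1), b >> (width - 1))
        if early is not None:
            # The precomputed value must agree with the real output.
            assert early == greater_than(a, b)
            disabled += 1
        total += 1
    return disabled / total


fraction = disabled_fraction()
```

Under the uniform-input assumption, the two most significant bits differ for half of all input pairs, so the lower-order inputs could be held for half of the cycles in this toy case.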

Journal ArticleDOI
TL;DR: The robust nature of the gate delay fault tests corresponding to Theorems 7 and 8 in the original paper is clarified and described in greater detail.
Abstract: For original paper see ibid., vol. 11, pp. 87-101 (Jan. 1992). The robust nature of the gate delay fault tests corresponding to Theorems 7 and 8 in the original paper is clarified and described in greater detail. There are two types of robust tests for gate delay faults: a hazard-free robust test for a gate delay fault on a gate g is a robust test where only paths that pass through g are event sensitized; a general robust test for a gate delay fault on a gate g is a robust test where paths that do not pass through g can be event sensitized. The two types of robust tests are illustrated.