
Showing papers by "Massoud Pedram published in 2006"


Journal Article•DOI•
25 Sep 2006
TL;DR: A brief discussion of key sources of power dissipation and their temperature relation in CMOS VLSI circuits, and techniques for full-chip temperature calculation with special attention to its implications on the design of high-performance, low-power V LSI circuits is presented.
Abstract: The growing packing density and power consumption of very large scale integration (VLSI) circuits have made thermal effects one of the most important concerns of VLSI designers. The increasing variability of key process parameters in nanometer CMOS technologies has resulted in a larger impact of the substrate and metal line temperatures on the reliability and performance of the devices and interconnections. Recent data shows that more than 50% of all integrated circuit failures are related to thermal issues. This paper presents a brief discussion of key sources of power dissipation and their temperature relation in CMOS VLSI circuits, and techniques for full-chip temperature calculation with special attention to its implications on the design of high-performance, low-power VLSI circuits. The paper is concluded with an overview of techniques to improve the full-chip thermal integrity by means of off-chip versus on-chip and static versus adaptive methods.

420 citations


Journal Article•DOI•
TL;DR: The proposed high-level model, which relies on online current and voltage measurements, correctly accounts for the temperature and cycle aging effects and has a maximum of 5% error between simulated and predicted data.
Abstract: Predicting the residual energy of the battery source that powers a portable electronic device is imperative in designing and applying an effective dynamic power management policy for the device. This paper starts by showing that a 30% error in predicting the battery capacity of a lithium-ion battery can result in up to 20% performance degradation for a dynamic voltage and frequency scaling algorithm. Next, this paper presents a closed-form analytical expression for predicting the remaining capacity of a lithium-ion battery. The proposed high-level model, which relies on online current and voltage measurements, correctly accounts for the temperature and cycle aging effects. The accuracy of the high-level model is validated by comparing it with DUALFOIL simulation results, demonstrating a maximum of 5% error between simulated and predicted data.
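The abstract's high-level capacity model can be caricatured as a rated capacity derated for temperature and cycle aging, then reduced by the charge drawn so far. The model form and all coefficients below are illustrative assumptions, not the paper's closed-form expression:

```python
# Hypothetical sketch of a high-level remaining-capacity estimator for a
# Li-ion cell: derate the rated capacity for temperature and cycle aging,
# then subtract charge already drawn (coulomb counting from online current
# measurements). Coefficients are made up for illustration.

def remaining_capacity_mAh(rated_mAh, drawn_mAh, temp_C, cycles,
                           k_temp=0.005, k_cycle=0.0002, ref_temp_C=25.0):
    """Estimate remaining capacity (mAh) of a Li-ion cell.

    rated_mAh : nominal capacity of a fresh cell at the reference temperature
    drawn_mAh : charge already drawn in this discharge cycle
    temp_C    : current cell temperature
    cycles    : completed charge/discharge cycles (aging)
    """
    # Temperature derating: usable capacity drops below the reference temp.
    temp_factor = 1.0 - k_temp * max(0.0, ref_temp_C - temp_C)
    # Cycle-aging derating: capacity fades with cycle count.
    age_factor = 1.0 - k_cycle * cycles
    effective = rated_mAh * temp_factor * age_factor
    return max(0.0, effective - drawn_mAh)
```

The 30%-prediction-error sensitivity quoted in the abstract is exactly why the temperature and aging terms matter: omitting either factor biases `effective` and hence every downstream DVFS decision.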

271 citations


Journal Article•DOI•
TL;DR: The proposed backlight scaling technique is capable of efficiently computing the flickering effect online and subsequently using a measure of the temporal distortion to appropriately adjust the slack on the intra-frame spatial distortion, thereby achieving a good balance between the two sources of distortion while maximizing the backlight dimming-driven energy saving in the display system and meeting an overall video quality figure of merit.
Abstract: Liquid crystal displays (LCDs) have appeared in applications ranging from medical equipment to automobiles, gas pumps, laptops, and handheld portable computers. These display components present a cascaded energy attenuator to the battery of the handheld device and are responsible for about half of the energy drain at maximum display intensity. As such, the display components become the main focus of any effort to maximize the embedded system's battery lifetime. This paper proposes an approach for pixel transformation of the displayed image to increase the potential energy saving of the backlight scaling method. The proposed approach takes advantage of human visual system (HVS) characteristics and tries to minimize distortion between the perceived brightness values of the individual pixels in the original image and those of the backlight-scaled image. This is in contrast to previous backlight scaling approaches, which simply match the luminance values of the individual pixels in the original and backlight-scaled images. Furthermore, this paper proposes a temporally-aware backlight scaling technique for video streams. The goal is to maximize energy saving in the display system by means of dynamic backlight dimming subject to a video distortion tolerance. The video distortion comprises: 1) an intra-frame (spatial) distortion component due to frame-sensitive backlight scaling and transmittance function tuning and 2) an inter-frame (temporal) distortion component due to large-step backlight dimming across frames modulated by the psychophysical characteristics of the human visual system.
The proposed backlight scaling technique is capable of efficiently computing the flickering effect online and subsequently using a measure of the temporal distortion to appropriately adjust the slack on the intra-frame spatial distortion, thereby achieving a good balance between the two sources of distortion while maximizing the backlight dimming-driven energy saving in the display system and meeting an overall video quality figure of merit. The proposed dynamic backlight scaling approach is amenable to highly efficient hardware realization and has been implemented on the Apollo Testbed II. Actual current measurements demonstrate the effectiveness of the proposed technique compared to previous backlight dimming techniques, which have ignored the temporal distortion effect.

76 citations


Proceedings Article•DOI•
24 Jan 2006
TL;DR: A three-phase solution framework, which integrates power management scheduling and task voltage assignment, is proposed, which outperforms existing methods by an average of 18% in terms of the system-wide energy savings.
Abstract: This paper addresses the problem of minimizing the energy consumption of a computer system performing periodic hard real-time tasks with precedence constraints. In the proposed approach, dynamic power management and voltage scaling techniques are combined to reduce the energy consumption of the CPU and devices. The optimization problem is first formulated as an integer programming problem. Next, a three-phase solution framework, which integrates power management scheduling and task voltage assignment, is proposed. Experimental results show that the proposed approach outperforms existing methods by an average of 18% in terms of the system-wide energy savings.
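The leverage that voltage scaling gives this kind of scheduler comes from the standard CMOS switching-energy relation, in which dynamic energy grows quadratically with supply voltage. A minimal sketch (effective capacitance and cycle counts are illustrative, not from the paper):

```python
# Dynamic CMOS switching energy for a task: E = C_eff * Vdd^2 * cycles.
# Running a task at a lower voltage (and correspondingly lower frequency)
# trades execution time for a quadratic energy saving, which is what a
# combined DPM + voltage-scaling scheduler exploits.

def task_energy_J(c_eff_F, vdd_V, cycles):
    """Energy of a task switching c_eff_F farads per cycle at vdd_V volts."""
    return c_eff_F * vdd_V ** 2 * cycles
```

Halving Vdd cuts the task's dynamic energy by 4x, which is why slack on non-critical tasks is so valuable to the voltage-assignment phase.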

69 citations


Proceedings Article•DOI•
24 Jul 2006
TL;DR: A new current-based cell delay model is utilized, which can accurately compute the output waveform for input waveforms of arbitrary shapes subjected to noise, and the cell parasitic capacitances are pre-characterized by lookup tables to improve the accuracy.
Abstract: A statistical model for the purpose of logic cell timing analysis in the presence of process variations is presented. A new current-based cell delay model is utilized, which can accurately compute the output waveform for input waveforms of arbitrary shapes subjected to noise. The cell parasitic capacitances are pre-characterized by lookup tables to improve the accuracy. To capture the effect of process parameter variations on the cell behavior, the output voltage waveform of logic cells is modeled by a stochastic Markovian process in which the voltage value probability distribution at each time instance is computed from that of the previous time instance. Next, the probability distribution of a given %Vdd crossing time, i.e., the hitting time of the output voltage stochastic process, is computed. Experimental results demonstrate the high accuracy of our cell delay model compared to Monte-Carlo-based SPICE simulations.

68 citations


Proceedings Article•DOI•
06 Mar 2006
TL;DR: The paper introduces the notion of quantum factored forms and presents a canonical and concise representation of quantum logic circuits in the form of quantum decision diagrams (QDD's), which are amenable to efficient manipulation and optimization including recursive unitary functional bi-decomposition.
Abstract: Quantum information processing technology is in its pioneering stage and no proficient method for synthesizing quantum circuits has been introduced so far. This paper introduces an effective analysis and synthesis framework for quantum logic circuits. The proposed synthesis algorithm and flow can generate a quantum circuit using the most basic quantum operators, i.e., the rotation and controlled-rotation primitives. The paper introduces the notion of quantum factored forms and presents a canonical and concise representation of quantum logic circuits in the form of quantum decision diagrams (QDDs), which are amenable to efficient manipulation and optimization including recursive unitary functional bi-decomposition. This paper concludes by presenting the QDD-based algorithm for automatic synthesis of quantum circuits.

60 citations


Journal Article•DOI•
TL;DR: It is shown that the variational balanced truncation technique produces reduced systems that accurately follow the time- and frequency-domain responses of the original system when variations in the circuit parameters are taken into consideration.
Abstract: This paper presents a spectrally weighted balanced truncation (SBT) technique for tightly coupled integrated circuit interconnects, when the interconnect circuit parameters change as a result of statistical variations in the manufacturing process. The salient features of this algorithm are the inclusion of the parameter variation in the RLCK interconnect, the guaranteed passivity of the reduced transfer function, and the availability of provable spectrally weighted error bounds for the reduced-order system. This paper shows that the variational balanced truncation technique produces reduced systems that accurately follow the time- and frequency-domain responses of the original system when variations in the circuit parameters are taken into consideration. Experimental results show that the new variational SBT attains, on average, 30% more accuracy than the variational Krylov-subspace-based model-order reduction techniques.

58 citations


Proceedings Article•DOI•
06 Mar 2006
TL;DR: Simulation results with a 65 nm process demonstrate that this technique can reduce the total leakage power dissipation of a 64 Kb SRAM by more than 50% and incurs neither area nor delay overhead.
Abstract: Aggressive CMOS scaling results in low threshold voltage and thin oxide thickness for transistors manufactured in the very deep submicron regime. As a result, reducing the subthreshold and gate-tunneling leakage currents has become one of the most important criteria in the design of VLSI circuits. This paper presents a method based on dual-Vt and dual-Tox assignment to reduce the total leakage power dissipation of SRAMs while maintaining their performance. The proposed method is based on the observation that the read and write delays of a memory cell in an SRAM block depend on the physical distance of the cell from the sense amplifier and the decoder. Thus, the idea is to deploy different types of six-transistor SRAM cells corresponding to different threshold voltage and oxide thickness assignments for the transistors. Unlike other techniques for low-leakage SRAM design, the proposed technique incurs neither area nor delay overhead. In addition, it results in a minor change in the SRAM design flow. Simulation results with a 65 nm process demonstrate that this technique can reduce the total leakage power dissipation of a 64 Kb SRAM by more than 50%.
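The reason dual-Vt assignment is so effective is that subthreshold leakage falls exponentially with threshold voltage. A quick back-of-the-envelope check of that sensitivity (the subthreshold-slope factor and thermal voltage below are typical textbook values, not the paper's):

```python
import math

def subthreshold_ratio(vt_low_V, vt_high_V, n=1.5, v_thermal_V=0.026):
    """Approximate factor by which subthreshold leakage drops when the
    threshold voltage is raised from vt_low_V to vt_high_V, using the
    standard I_sub ~ exp(-Vt / (n * v_T)) dependence.
    n: subthreshold slope factor; v_thermal_V: kT/q at room temperature."""
    return math.exp((vt_high_V - vt_low_V) / (n * v_thermal_V))
```

Raising Vt by only 100 mV already buys roughly an order of magnitude in leakage, which is why swapping in high-Vt cells far from the sense amplifier costs so little delay relative to the leakage saved.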

49 citations


Patent•
02 Mar 2006
TL;DR: In this paper, the authors present a method for determining a pixel transformation function that maximizes backlight dimming while maintaining a pre-specified distortion level, based on the original image and the distortion level.
Abstract: An embodiment of the present invention is directed to a method for determining a pixel transformation function that maximizes backlight dimming while maintaining a pre-specified distortion level. The method includes determining a minimum dynamic range of pixel values in a transformed image based on an original image and the pre-specified distortion level and determining the pixel transformation function. The pixel transformation function takes a histogram of the original image to a uniform distribution histogram having the minimum dynamic range.
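The patent's two steps (find the minimum dynamic range allowed by the distortion budget, then transform pixels into that range so the backlight can be dimmed) can be sketched as follows. The percentile-style distortion measure and the linear pixel boost are our illustrative assumptions, not the claimed method:

```python
# Illustrative sketch: pick the smallest pixel range [0, r] such that
# clipping pixels above r affects at most clip_fraction of the pixels
# (a stand-in for the pre-specified distortion level), then boost pixel
# values so the backlight can be dimmed to r/full_scale with little
# perceived brightness change.

def min_dynamic_range(pixels, clip_fraction):
    """Smallest upper pixel value r under the clipping budget."""
    ordered = sorted(pixels)
    idx = min(len(ordered) - 1, int((1.0 - clip_fraction) * len(ordered)))
    return ordered[idx]

def transform(pixels, r, full_scale=255):
    """Boost pixels so brightness is preserved at backlight level r/full_scale;
    pixels above r saturate (the tolerated distortion)."""
    return [min(full_scale, round(p * full_scale / r)) for p in pixels]
```

With the transformed image displayed at a backlight level of `r / 255`, unclipped pixels keep their perceived brightness while backlight power drops roughly in proportion to the dimming factor.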

48 citations


Proceedings Article•DOI•
24 Jul 2006
TL;DR: The proposed charge recycling technique can save up to 46% of the mode transition energy while, in most cases, maintaining, or even improving, the wake up time of the original circuit.
Abstract: Designing an energy efficient power gating structure is an important and challenging task in multi-threshold CMOS (MTCMOS) circuit design. In order to achieve a very low power design, the large amount of energy consumed during mode transition in MTCMOS circuits should be avoided. In this paper, we propose an appropriate charge recycling technique to reduce energy consumption during the mode transition of MTCMOS circuits. The proposed method can save up to 46% of the mode transition energy while, in most cases, maintaining, or even improving, the wake up time of the original circuit. It also reduces the peak negative voltage value and the settling time of the ground bounce.

31 citations


Proceedings Article•DOI•
27 Mar 2006
TL;DR: Experimental results show that this technique can reduce the leakage-power dissipation of a 64Kb SRAM by more than 35% and improves the static noise margin under process variations.
Abstract: This paper presents a method based on dual threshold voltage assignment to reduce the leakage power dissipation of SRAMs while maintaining their performance. The proposed method is based on the observation that the read and write delays of a memory cell in an SRAM block depend on the physical distance of the cell from the sense amplifier and the decoder. The key idea is thus to realize and deploy different types of six-transistor SRAM cells corresponding to different threshold voltage assignments for individual transistors in the cell. Unlike other techniques for low-leakage SRAM design, the proposed technique incurs no area or delay overhead. In addition, it results only in a slight change in the SRAM design flow. Finally, it improves the static noise margin under process variations. Experimental results show that this technique can reduce the leakage-power dissipation of a 64Kb SRAM by more than 35%.

Proceedings Article•DOI•
24 Jul 2006
TL;DR: The proposed backlight scaling technique is capable of efficiently computing the flickering effect online and subsequently using a measure of the temporal distortion to appropriately adjust the slack on the intra-frame spatial distortion.
Abstract: This paper presents a temporally-aware backlight scaling (TABS) technique for video streams. The goal is to maximize energy saving in the display system by means of dynamic backlight dimming subject to a user-specified tolerance on the video distortion. The video distortion itself comprises (i) an intra-frame (spatial) distortion component due to frame-sensitive backlight scaling and transmittance function tuning and (ii) an inter-frame (temporal) distortion component due to large-step backlight dimming across multiple frames and modulated by the physiological characteristics of the human visual system. The proposed backlight scaling technique is capable of efficiently computing the flickering effect online and subsequently using a measure of the temporal distortion to appropriately adjust the slack on the intra-frame spatial distortion. The proposed technique has been implemented on the Apollo testbed II hardware platform. Actual current measurements on this platform demonstrate the superiority of TABS compared to previous backlight dimming techniques.

Proceedings Article•DOI•
05 Nov 2006
TL;DR: An efficient adaptive method to perform dynamic voltage and frequency management (DVFM) for minimizing the energy consumption of microprocessor chips is presented and leads to power savings of up to 60% for highly correlated workloads compared to DVFM systems based on fixed update intervals.
Abstract: An efficient adaptive method to perform dynamic voltage and frequency management (DVFM) for minimizing the energy consumption of microprocessor chips is presented. Instead of using a fixed update interval, the proposed DVFM system makes use of adaptive update intervals for optimal frequency and voltage scheduling. The optimization enables the system to rapidly track workload changes so as to meet soft real-time deadlines. The method, which is based on introducing the concept of an effective deadline, utilizes the correlation between consecutive values of the workload. In practice, because the frequency and voltage update rates are dynamically set based on variable update interval lengths, voltage fluctuations on the power network are also minimized. The technique, which may be implemented with simple hardware and is completely transparent to the application, leads to power savings of up to 60% for highly correlated workloads compared to DVFM systems based on fixed update intervals.
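The adaptive-interval idea can be caricatured as a simple controller: when consecutive workload samples are highly correlated, back off and update less often; when the workload jumps, shorten the interval to track it. This is our reading of the abstract with assumed parameters (threshold, interval bounds), not the paper's effective-deadline scheduler:

```python
# Sketch of adaptive update-interval selection for DVFM. Loads are
# normalized utilizations in [0, 1]; intervals are in arbitrary time units.
# Parameters below (threshold, min/max interval, x2 scaling) are assumptions.

def next_interval(prev_interval, prev_load, cur_load,
                  min_iv=1.0, max_iv=16.0, threshold=0.1):
    """Choose the next frequency/voltage update interval."""
    change = abs(cur_load - prev_load)
    if change < threshold:
        # Stable (highly correlated) workload: update less frequently,
        # which also reduces voltage fluctuations on the power network.
        return min(max_iv, prev_interval * 2.0)
    # Bursty workload: shorten the interval to track the change quickly.
    return max(min_iv, prev_interval / 2.0)
```

A fixed-interval DVFM system pays the tracking lag of the long interval on every burst; the adaptive version pays it only until the first missed threshold.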

Proceedings Article•DOI•
06 Mar 2006
TL;DR: This paper presents a timeout-driven DPM technique which relies on the theory of Markovian processes to determine the energy-optimal timeout values for a system with multiple power saving states while satisfying a set of user defined performance constraints.
Abstract: This paper presents a timeout-driven DPM technique which relies on the theory of Markovian processes. The objective is to determine the energy-optimal timeout values for a system with multiple power-saving states while satisfying a set of user-defined performance constraints. More precisely, a controllable Markovian process is exploited to model the power management behavior of a system under the control of a timeout policy. Starting with this model, a perturbation analysis technique is applied to develop an offline gradient-based approach to determine the optimal timeout values. Online implementation of this technique for a system with dynamically-varying system parameters is also described. Experimental results demonstrate the effectiveness of the proposed approach.

Dynamic power management (DPM), which refers to selective shut-off or slow-down of components that are idle or underutilized, has proven to be a particularly effective technique for reducing power dissipation in such systems. In the literature, various DPM techniques have been proposed, from heuristic methods presented in early works [1][2] to stochastic optimization approaches [3][4]. Among the heuristic DPM methods, the timeout policy is the most widely used approach in industry and has been implemented in many operating systems. Examples include the power management scheme incorporated into the Windows operating system, the low-power saving mode of the IEEE 802.11a-g protocols for wireless LAN cards, and the enhanced adaptive battery life extender (EABLE) for Hitachi disk drives. Most of these industrial DPM techniques provide mechanisms to adjust the timeout values at the user level.
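Behind any timeout policy sits the classical break-even point: sleeping only saves energy if the idle period is long enough to pay back the transition cost. A minimal single-sleep-state sketch with illustrative numbers (the paper's contribution, a perturbation-analysis optimizer over multiple states, is not reproduced here):

```python
# Break-even analysis for a single sleep state under a timeout policy.

def break_even_s(p_idle_W, p_sleep_W, e_transition_J):
    """Shortest idle period for which entering the sleep state saves energy."""
    return e_transition_J / (p_idle_W - p_sleep_W)

def energy_J(idle_s, timeout_s, p_idle_W, p_sleep_W, e_transition_J):
    """Energy consumed over one idle period of length idle_s when the
    device sleeps after timeout_s seconds of idleness."""
    if idle_s <= timeout_s:
        return p_idle_W * idle_s          # timeout never fired
    return (p_idle_W * timeout_s          # waiting out the timeout
            + p_sleep_W * (idle_s - timeout_s)
            + e_transition_J)             # sleep/wake transition cost
```

The energy-optimal timeout trades the transition cost against the distribution of idle-period lengths, which is exactly what the paper's Markovian model of that distribution makes tractable.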

Proceedings Article•DOI•
06 Mar 2006
TL;DR: Experimental results show average errors of less than 2% for the mean, variance and skewness of interconnect delay and slew while achieving orders of magnitude speedup with respect to a Monte Carlo simulation with 10^4 samples.
Abstract: This paper focuses on statistical interconnect timing analysis in a parameterized block-based statistical static timing analysis tool. In particular, a new framework for performing timing analysis of RLC networks with step inputs, under both Gaussian and non-Gaussian sources of variation, is presented. In this framework, the resistance, inductance, and capacitance of the RLC line are modeled in a canonical first-order form and used to produce the corresponding propagation delay and slew (time) in the canonical first-order form. To accomplish this step, the mean, variance, and skewness of the delay and slew distributions are obtained in an efficient, yet accurate, manner. The proposed framework can be extended to consider higher order terms of the various sources of variation. Experimental results show average errors of less than 2% for the mean, variance and skewness of interconnect delay and slew while achieving orders of magnitude speedup with respect to a Monte Carlo simulation with 10^4 samples.
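The moment extraction the abstract describes is straightforward once delay is in canonical first-order form, d = d0 + Σ aᵢXᵢ, with independent zero-mean, unit-variance sources Xᵢ whose third moments are known (non-Gaussian sources allowed). A sketch of that step, with made-up coefficients; the paper's derivation of the aᵢ themselves is not reproduced:

```python
# Mean, variance, and skewness of d = d0 + sum_i a_i * X_i, where the X_i
# are independent, zero-mean, unit-variance sources with given third
# moments m3_i. Follows directly from linearity and independence.

def moments_first_order(d0, coeffs, third_moments):
    mean = d0                                    # E[X_i] = 0
    var = sum(a * a for a in coeffs)             # Var = sum a_i^2
    third = sum(a ** 3 * m3                      # central third moment
                for a, m3 in zip(coeffs, third_moments))
    skew = third / var ** 1.5 if var > 0 else 0.0
    return mean, var, skew
```

With Gaussian sources all m3ᵢ are zero and the skewness vanishes, which is precisely the regime where a purely Gaussian framework would suffice; the non-zero-m3 case is what this framework adds.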

Proceedings Article•DOI•
24 Jan 2006
TL;DR: A new framework for performing statistical gate timing analysis for non-Gaussian sources of variation in block-based σTA is introduced and a statistical effective capacitance calculation method is presented to achieve the aforementioned objective.
Abstract: As technology scales down, timing verification of digital integrated circuits becomes an increasingly challenging task due to gate and wire variability. Therefore, statistical timing analysis (denoted by σTA) is becoming unavoidable. This paper introduces a new framework for performing statistical gate timing analysis for non-Gaussian sources of variation in block-based σTA. First, an approach is described to approximate a variational RC-π load by using a canonical first-order model. Next, an accurate variation-aware gate timing analysis based on statistical input transition, statistical gate timing library, and statistical RC-π load is presented. Finally, to achieve the aforementioned objective, a statistical effective capacitance calculation method is presented. Experimental results show an average error of 6% for gate delay and output transition time with respect to the Monte Carlo simulation with 10^4 samples while the runtime is nearly two orders of magnitude shorter.

01 Jan 2006
TL;DR: In this article, the authors present a brief discussion of key sources of power dissipation and their temperature relation in CMOS VLSI circuits, and techniques for full-chip temperature calculation with special attention to its implications on the design of high-performance, low-power very large scale integration (VLSI) circuits.
Abstract: The growing packing density and power consumption of very large scale integration (VLSI) circuits have made thermal effects one of the most important concerns of VLSI designers. The increasing variability of key process parameters in nanometer CMOS technologies has resulted in a larger impact of the substrate and metal line temperatures on the reliability and performance of the devices and interconnections. Recent data shows that more than 50% of all integrated circuit failures are related to thermal issues. This paper presents a brief discussion of key sources of power dissipation and their temperature relation in CMOS VLSI circuits, and techniques for full-chip temperature calculation with special attention to its implications on the design of high-performance, low-power VLSI circuits. The paper is concluded with an overview of techniques to improve the full-chip thermal integrity by means of off-chip versus on-chip and static versus adaptive methods.

Proceedings Article•DOI•
30 Apr 2006
TL;DR: Monte Carlo SPICE-based experimental results demonstrate the effectiveness of the proposed approach in accurately modeling the correlation-aware process variations and their impact on interconnect delay when crosstalk is present.
Abstract: Process variations have become a key concern of circuit designers because of their significant yet hard-to-predict impact on the performance and signal integrity of VLSI circuits. Statistical approaches have been suggested as the most effective substitute for corner-based approaches to deal with the variability of present process technology nodes. This paper introduces a statistical analysis of the crosstalk-aware delay of coupled interconnects considering process variations. The few existing works that have studied this problem suffer not only from shortcomings in their statistical models, but also from inaccurate crosstalk circuit models. We utilize an accurate distributed RC-π model of the interconnections to be able to model process variations close to reality. The considerable effect of correlation among the parameters of neighboring wire segments is also indicated. Statistical properties of the crosstalk-aware output delay are characterized and presented as closed-form expressions. Monte Carlo SPICE-based experimental results demonstrate the effectiveness of the proposed approach in accurately modeling the correlation-aware process variations and their impact on interconnect delay when crosstalk is present.

Proceedings Article•DOI•
27 Mar 2006
TL;DR: A new design methodology is introduced that minimizes the impact of virtual ground parasitic resistances on the performance of an MTCMOS circuit by using gate resizing and logic restructuring (i.e., gate replication).
Abstract: The Multi-Threshold CMOS (MTCMOS) technique can significantly reduce sub-threshold leakage currents during the circuit sleep (standby) mode by adding high-Vth power switches (sleep transistors) to low-Vth logic cells. During the active mode of the circuit, the high-Vth transistors and the virtual ground network can be modeled as resistors, which in turn cause the voltage of the virtual ground node to rise, thereby degrading the switching speed of the logic cells. This paper introduces a new design methodology that minimizes the impact of virtual ground parasitic resistances on the performance of an MTCMOS circuit by using gate resizing and logic restructuring (i.e., gate replication). Experimental results show that the proposed techniques are highly effective in making the MTCMOS circuits robust with respect to such parasitic resistance effects.

Patent•
22 Dec 2006
TL;DR: In this article, the authors proposed a method of forming a memory cell by coupling a first transistor between a supply rail and a node that is operable to accept a supply voltage.
Abstract: A method of forming a memory cell includes coupling a first transistor between a supply rail of a memory cell and a node operable to accept a supply voltage. The method further includes coupling a second transistor between a ground rail of the cell and a node operable to accept a ground. In one embodiment, the method includes forming the cell to accept selectively applied external voltages, wherein the external voltages are selected to minimize leakage current in the cell. In another embodiment, the method includes forming at least one of the first and the second transistors to have a channel width and/or a threshold voltage selected to minimize a total leakage current in the cell.

Proceedings Article•DOI•
22 Oct 2006
TL;DR: A hybrid simulation engine, named B2Sim for (cycle-characterized) Basic Block based Simulator, where a fast cache simulator (e.g., sim-cache) and a slow pipeline simulator (e.g., sim-outorder) are employed together to reduce the runtime of architectural simulation engines by making use of the instruction behavior within executed basic blocks.
Abstract: State-of-the-art architectural simulators support cycle-accurate pipeline execution of application programs. However, it can take days or even weeks to complete the simulation of even a moderate-size program. During the execution of a program, program behavior does not change randomly but changes over time in a predictable/periodic manner. This behavior provides the opportunity to limit the use of a pipeline simulator. More precisely, this paper presents a hybrid simulation engine, named B2Sim for (cycle-characterized) Basic Block based Simulator, where a fast cache simulator (e.g., sim-cache) and a slow pipeline simulator (e.g., sim-outorder) are employed together. B2Sim reduces the runtime of architectural simulation engines by making use of the instruction behavior within executed basic blocks. We have integrated B2Sim into SimpleScalar and have achieved, on average, a 3.3x speedup on the SPEC2000 benchmark and Media-bench programs compared to a conventional pipeline simulator, while maintaining the accuracy of the simulation results with less than 1% CPI error on average.
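The dispatch idea behind such a hybrid engine can be sketched in a few lines: characterize each basic block with the slow pipeline model once, then replay cached cycle counts for repeat occurrences. The interface and cost model below are illustrative assumptions, not B2Sim's actual implementation (which must also handle blocks whose timing changes with context):

```python
# Hybrid simulation dispatch: fall back to the slow, detailed model only
# for basic blocks not yet characterized; replay cached per-block cycle
# counts otherwise.

def simulate(trace, detailed_cycles):
    """trace: sequence of basic-block ids in execution order.
    detailed_cycles: callable standing in for the slow pipeline model."""
    cache, total, slow_calls = {}, 0, 0
    for block in trace:
        if block not in cache:
            cache[block] = detailed_cycles(block)   # slow path, once per block
            slow_calls += 1
        total += cache[block]                       # fast replay
    return total, slow_calls
```

Because program behavior is periodic, the trace revisits a small set of hot blocks, so the slow model runs only a tiny fraction of the time; that is the source of the reported speedup.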

Proceedings Article•DOI•
24 Jan 2006
TL;DR: A new timing-driven placement algorithm, which attempts to minimize zigzags and crisscrosses on the timing-critical paths of a circuit and integrates this idea into a recursive bipartitioning-based placement framework with a min-cut objective function.
Abstract: In this paper, we present a new timing-driven placement algorithm, which attempts to minimize zigzags and crisscrosses on the timing-critical paths of a circuit. We observed that most of the paths that cause timing problems in the circuit meander outside the minimum bounding box of the start and end nodes of the path. To limit this undesirable behavior, we impose a physical constraint on the placement problem, i.e., we assign a preferred signal direction to each critical path in the circuit. Starting from an initial placement solution, by using a move-based optimization strategy, these preferred directions force cells to move in a direction that maximizes the monotonic behavior of the timing-critical paths in the new placement solution. To make the direction assignment tractable, we implicitly group all circuit paths into a set of input-output conduits and assign a unique preferred direction to each such conduit. We integrated this idea into a recursive bipartitioning-based placement framework with a min-cut objective function. Experimental results on a set of standard placement benchmarks show that this approach improves the result of a state-of-the-art industrial placement tool for all the benchmark circuits while increasing the wire length by a tolerable amount.

Journal Article•DOI•
TL;DR: This paper presents sufficiently accurate and highly efficient filtering algorithms for interconnect timing as well as gate timing analysis, and shows accuracies that are quite comparable with sign-off delay calculators with more than a 65% reduction in computation time.
Abstract: Static timing analysis is a key step in the physical design optimization of VLSI designs. The lumped capacitance model for gate delay and the Elmore model for wire delay have been shown to be inadequate for wire-dominated designs. Using the effective capacitance model for the gate delay calculation and model-order reduction techniques for the wire delay calculation is prohibitively expensive. In this paper, we present sufficiently accurate and highly efficient filtering algorithms for interconnect timing as well as gate timing analysis. The key idea is to partition the circuit into low- and high-complexity circuits, whereby low-complexity circuits are handled with efficient algorithms, such as the total capacitance algorithm for gate delay and the Elmore metric for wire delay, and high-complexity circuits are handled with sign-off algorithms. Experimental results on microprocessor designs show accuracies that are quite comparable with sign-off delay calculators with more than a 65% reduction in computation time.
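The Elmore metric mentioned for the fast path is cheap precisely because it is a single pass over the net. For an RC chain (a special case of the general RC tree), the delay to the far end is the sum over segments of each resistance times all capacitance downstream of it; the values below are illustrative:

```python
# Elmore delay for an RC chain driver -> R1 -> C1 -> R2 -> C2 -> ... :
# delay to the last node = sum_i R_i * (total capacitance downstream of R_i).

def elmore_delay(rs, cs):
    """rs[i]: resistance into node i; cs[i]: capacitance at node i."""
    delay, downstream = 0.0, sum(cs)
    for r, c in zip(rs, cs):
        delay += r * downstream   # each R charges all capacitance after it
        downstream -= c           # drop this node's C from the running sum
    return delay
```

For a two-segment chain this reduces to R1·(C1+C2) + R2·C2, the textbook form; the filtering scheme reserves the expensive reduced-order models for nets where this estimate is known to be too crude.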

Proceedings Article•DOI•
30 Apr 2006
TL;DR: A new framework for handling the effect of Gaussian and Non-Gaussian process variations on coupled interconnects is proposed and Experimental results show that the proposed method is capable of accurately predicting delay variation in a coupled interConnect line.
Abstract: Process technology and environment-induced variability of gates and wires in VLSI circuits make timing analyses of such circuits a challenging task. Process variation can have a significant impact on both device (front-end of the line) and interconnect (back-end of the line) performance. Statistical static timing analysis techniques are being developed to tackle this important problem. Existing timing analysis tools divide the analysis into interconnect (wire) timing analysis and gate timing analysis. In this paper, we focus on statistical static timing analysis of coupled interconnects where crosstalk noise analysis is unavoidable. We propose a new framework for handling the effect of Gaussian and non-Gaussian process variations on coupled interconnects. The technique allows for closed-form computation of interconnect delay probability density functions (PDFs) given variations in relevant process parameters such as the line width, metal thickness, and dielectric thickness in the presence of crosstalk noise. To achieve this goal, we express the electrical parameters of the coupled interconnects in a first-order (linear) form as a function of changes in the physical parameters and subsequently use these forms to perform accurate timing and noise analysis to produce the propagation delay and slew in first-order forms. This work can be easily extended to consider the effect of higher order terms of the sources of variation. Experimental results show that the proposed method is capable of accurately predicting delay variation in a coupled interconnect line.

Book Chapter•DOI•
01 Jan 2006
TL;DR: This chapter reviews a number of RTL techniques for low power design of VLSI circuits, targeting both the dynamic and leakage components of power dissipation in CMOS VLSI circuits.
Abstract: This chapter reviews a number of RTL techniques for low power design of VLSI circuits, targeting both the dynamic and leakage components of power dissipation in CMOS VLSI circuits. A more detailed review of techniques for low power design of VLSI circuits and systems can be found in many references, including Reference 1.

01 Jan 2006
TL;DR: Key contributions of this thesis include the introduction of the complete set of generalized signatures of a Boolean function, development of efficient methods of recognizing variable symmetries, and presentation of a proficient algorithm for computing the canonical form of the class of NPN-equivalent Boolean functions based on the generalized signatures and variable symmetries.
Abstract: Boolean matching algorithms have many applications in logic synthesis, especially in technology mapping and combinational logic verification. Canonical form based Boolean matching has been studied by many researchers. However, none of the previous work has produced an algorithm with reasonable space and time complexities for the general Boolean matching problem. In contrast, this dissertation provides an efficient and compact canonical form for representing the set of all Boolean functions that are equivalent under permutation of input variables and complementation of input or output variables (i.e., NPN-equivalent Boolean functions). In particular, important properties of the proposed canonical form are investigated, and subsequently utilized to devise an effective algorithm for computing the proposed canonical form. The low average computational complexity of this algorithm allows it to be applied to large complex Boolean functions with no limitation on the number of input variables, as opposed to previous approaches, which are not capable of handling functions with more than seven inputs. Key contributions of this thesis include the introduction of the complete set of generalized signatures of a Boolean function, development of efficient methods of recognizing variable symmetries, and presentation of a proficient algorithm for computing the canonical form of the class of NPN-equivalent Boolean functions based on the generalized signatures and variable symmetries.
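For very small functions, the NPN equivalence class can be canonized by brute force over all input permutations and input/output complementations. The exponential sketch below only illustrates what the dissertation's signature-based algorithm achieves efficiently for large n (truth tables are plain integers here; the example functions are standard):

```python
from itertools import permutations, product

def transform(tt, n, perm, flips):
    """Apply an input permutation and input complementations to a
    truth table given as an integer over n variables."""
    out = 0
    for m in range(1 << n):
        m_old = 0
        for i in range(n):
            bit = ((m >> i) & 1) ^ flips[i]
            m_old |= bit << perm[i]
        if (tt >> m_old) & 1:
            out |= 1 << m
    return out

def npn_canonical(tt, n):
    """Smallest truth table over the NPN equivalence class of tt."""
    mask = (1 << (1 << n)) - 1
    best = mask
    for perm in permutations(range(n)):
        for flips in product((0, 1), repeat=n):
            f = transform(tt, n, perm, flips)
            best = min(best, f, f ^ mask)  # f ^ mask = output complement
    return best

# AND and OR are NPN-equivalent: OR(a, b) = NOT(AND(NOT a, NOT b))
print(npn_canonical(0b1000, 2), npn_canonical(0b1110, 2))  # → 1 1
```

The search space is n! * 2^(n+1) transforms, each over 2^n minterms, which is exactly why this approach stops scaling around seven inputs and a signature-based canonical form is needed.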

Proceedings Article•DOI•
30 Apr 2006
TL;DR: This paper presents a minimum area, low-power driven clustering algorithm for coarse-grained, antifuse-based FPGAs under delay constraints that substantially reduces the size of the duplicated logic, resulting in benefits in area, delay, and power dissipation.
Abstract: This paper presents a minimum area, low-power driven clustering algorithm for coarse-grained, antifuse-based FPGAs under delay constraints. The algorithm accurately predicts the logic replication caused by timing constraints during low-power driven clustering. This technique substantially reduces the size of the duplicated logic, resulting in benefits in area, delay, and power dissipation. First, we build power-delay curves at nodes with the aid of the prediction algorithm. Next, we choose the best cluster, starting from the primary outputs and moving backward through the circuit, based on these curves. Experimental results show 16% and 20% reductions in dynamic and leakage power dissipation, respectively, with an 18% area reduction compared to the results of clustering without replication prediction.

Proceedings Article•DOI•
06 Mar 2006
TL;DR: A cell delay model based on rate-of-current-change is presented, which accounts for the impact of the shape of the noisy waveform on the output voltage waveform.
Abstract: A cell delay model based on rate-of-current-change is presented, which accounts for the impact of the shape of the noisy waveform on the output voltage waveform. More precisely, a pre-characterized table of time derivatives of the output current as a function of input voltage and output load values is constructed. The data in this table, in combination with the Taylor series expansion of the output current, is utilized to progressively compute the output current waveform, which is then integrated to produce the output voltage waveform. Experimental results show the effectiveness and efficiency of this new delay model.
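The progressive computation described above reduces to two coupled update equations per time step: a Taylor step on the output current and a capacitor integration for the output voltage. A minimal sketch, with the caveat that the `didt` function below is an arbitrary analytic placeholder standing in for the paper's pre-characterized (input voltage, output load) table, and all units are normalized:

```python
def simulate_output(v_in_wave, c_load, dt, didt):
    """Progressively compute the output voltage waveform from a
    rate-of-current-change model."""
    i_out, v_out = 0.0, 0.0
    trace = []
    for v_in in v_in_wave:
        i_out += didt(v_in, v_out) * dt   # Taylor step: i(t+dt) ≈ i(t) + (di/dt)·dt
        v_out += (i_out / c_load) * dt    # integrate: C_L · dv_out/dt = i_out
        trace.append(v_out)
    return trace

# Placeholder di/dt: current ramps while the input is high and the
# output has not yet reached the (normalized) rail. Purely illustrative.
didt = lambda v_in, v_out: v_in * max(0.0, 1.0 - v_out)

ramp = [min(1.0, 0.05 * k) for k in range(100)]   # clean rising input
wave = simulate_output(ramp, c_load=1.0, dt=0.05, didt=didt)
```

Because `didt` takes the instantaneous input voltage, the same loop handles an arbitrary noisy input waveform, which is the point of the model.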

Journal Article•DOI•
TL;DR: This paper advances the state of the art by presenting a well-founded mathematical framework for modeling and manipulating Markov processes and presents a new state assignment technique to reduce dynamic power consumption in finite state machines.
Abstract: This paper advances the state of the art by presenting a well-founded mathematical framework for modeling and manipulating Markov processes. The key idea is based on the fact that a Markov process can be decomposed into a collection of directed cycles with positive weights, which are proportional to the probability of the cycle traversals in a random walk. Two applications of this new formalism in the computer-aided design area are studied. In the first application, the authors present a new state assignment technique to reduce dynamic power consumption in finite state machines. The technique consists of first decomposing the state machine into a set of cycles and then performing a state assignment using Gray codes. The proposed encoding algorithm reduces power consumption by an average of 15%. The second application is sequence compaction for improving the efficiency of dynamic power simulators. The proposed method is based on the cycle decomposition of the Markov process representing the given input sequence, followed by selecting a subset of these cycles to construct the compacted sequence.
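The intuition behind the Gray-code step is that states along a decomposed high-probability cycle receive consecutive reflected Gray codes, so each cycle edge flips exactly one state-register bit. An illustrative sketch (the 8-state cycle below is made up, not from the paper; a full-length cycle of 2^n states is also cyclic in Gray order, so even the wrap-around edge flips one bit):

```python
def gray(k):
    """k-th reflected Gray code."""
    return k ^ (k >> 1)

def encode_cycle(states):
    """Assign consecutive Gray codes to states in cycle order."""
    return {s: gray(k) for k, s in enumerate(states)}

cycle = ['S%d' % k for k in range(8)]   # hypothetical 8-state cycle
codes = encode_cycle(cycle)

# Hamming distance between the codes of consecutive states on the cycle
dists = [bin(codes[cycle[k]] ^ codes[cycle[(k + 1) % 8]]).count('1')
         for k in range(8)]
print(dists)  # → [1, 1, 1, 1, 1, 1, 1, 1]
```

Since dynamic power in the state register scales with bit toggles per transition, minimizing the Hamming distance along frequently traversed cycles directly targets the dominant switching activity.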

Proceedings Article•DOI•
24 Jan 2006
TL;DR: This paper introduces a new current-based cell timing analyzer, called CGTA, which has a higher performance than existing logic cell timing analysis tools and relies on a compact lookup table storing the output current gain of every logic cell as a function of its input voltage and output load.
Abstract: This paper introduces a new current-based cell timing analyzer, called CGTA, which has a higher performance than existing logic cell timing analysis tools. CGTA relies on a compact lookup table storing the output current gain (sensitivity) of every logic cell as a function of its input voltage and output load. The current gain values are subsequently used by the timing calculator to produce the output current value as a function of the applied input voltage. This current and the output load then uniquely determine the output voltage value. Therefore, CGTA is capable of efficiently and accurately computing the output voltage waveform of a logic cell, which has been subjected to an arbitrary noisy input voltage waveform. Experimental results are presented to assess the quality of CGTA compared to other existing approaches.
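The table lookup at the heart of this flow can be sketched with a one-dimensional slice: a small gain table di/dv_in versus input voltage for one fixed output load (a real characterization is per cell and two-dimensional over voltage and load). All numbers below are made up for illustration:

```python
import bisect

VGRID = [0.0, 0.3, 0.6, 0.9, 1.2]                # input voltage grid (V)
GAIN = [0.0, 0.2e-3, 1.5e-3, 0.4e-3, 0.0]        # di/dv_in (A/V), illustrative

def gain(v):
    """Piecewise-linear interpolation of the gain table."""
    v = min(max(v, VGRID[0]), VGRID[-1])
    j = bisect.bisect_left(VGRID, v)
    if j == 0:
        return GAIN[0]
    t = (v - VGRID[j - 1]) / (VGRID[j] - VGRID[j - 1])
    return GAIN[j - 1] + t * (GAIN[j] - GAIN[j - 1])

# Build the output current for a rising input by accumulating gain * dv,
# as the abstract describes (current from the gain, then voltage from
# the current and the load).
i_out, dv, v = 0.0, 0.01, 0.0
while v <= 1.2:
    i_out += gain(v) * dv
    v += dv
```

Because the table is indexed by the instantaneous input voltage, the same accumulation works for an arbitrary noisy input waveform, which is what lets CGTA handle noisy inputs without re-characterization.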