
Showing papers by "Massoud Pedram published in 2006"


Journal Article•DOI•
25 Sep 2006
TL;DR: A brief discussion of key sources of power dissipation and their temperature relation in CMOS VLSI circuits, and techniques for full-chip temperature calculation with special attention to its implications on the design of high-performance, low-power V LSI circuits is presented.
Abstract: The growing packing density and power consumption of very large scale integration (VLSI) circuits have made thermal effects one of the most important concerns of VLSI designers. The increasing variability of key process parameters in nanometer CMOS technologies has resulted in a larger impact of the substrate and metal line temperatures on the reliability and performance of the devices and interconnections. Recent data shows that more than 50% of all integrated circuit failures are related to thermal issues. This paper presents a brief discussion of key sources of power dissipation and their temperature relation in CMOS VLSI circuits, and techniques for full-chip temperature calculation with special attention to its implications on the design of high-performance, low-power VLSI circuits. The paper is concluded with an overview of techniques to improve the full-chip thermal integrity by means of off-chip versus on-chip and static versus adaptive methods.

420 citations


Journal Article•DOI•
TL;DR: The proposed high-level model, which relies on online current and voltage measurements, correctly accounts for the temperature and cycle aging effects and has a maximum of 5% error between simulated and predicted data.
Abstract: Predicting the residual energy of the battery source that powers a portable electronic device is imperative in designing and applying an effective dynamic power management policy for the device. This paper starts by showing that a 30% error in predicting the battery capacity of a lithium-ion battery can result in up to 20% performance degradation for a dynamic voltage and frequency scaling algorithm. Next, this paper presents a closed-form analytical expression for predicting the remaining capacity of a lithium-ion battery. The proposed high-level model, which relies on online current and voltage measurements, correctly accounts for the temperature and cycle aging effects. The accuracy of the high-level model is validated by comparing it with DUALFOIL simulation results, demonstrating a maximum of 5% error between simulated and predicted data.
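The abstract's high-level capacity model can be caricatured as a rated capacity derated for temperature and cycle aging, then reduced by the charge drawn so far. The model form and all coefficients below are illustrative assumptions, not the paper's closed-form expression:

```python
# Hypothetical sketch of a high-level remaining-capacity estimator for a
# Li-ion cell: derate the rated capacity for temperature and cycle aging,
# then subtract charge already drawn (coulomb counting from online current
# measurements). Coefficients are made up for illustration.

def remaining_capacity_mAh(rated_mAh, drawn_mAh, temp_C, cycles,
                           k_temp=0.005, k_cycle=0.0002, ref_temp_C=25.0):
    """Estimate remaining capacity (mAh) of a Li-ion cell.

    rated_mAh : nominal capacity of a fresh cell at the reference temperature
    drawn_mAh : charge already drawn in this discharge cycle
    temp_C    : current cell temperature
    cycles    : completed charge/discharge cycles (aging)
    """
    # Temperature derating: usable capacity drops below the reference temp.
    temp_factor = 1.0 - k_temp * max(0.0, ref_temp_C - temp_C)
    # Cycle-aging derating: capacity fades with cycle count.
    age_factor = 1.0 - k_cycle * cycles
    effective = rated_mAh * temp_factor * age_factor
    return max(0.0, effective - drawn_mAh)
```

The 30%-prediction-error sensitivity quoted in the abstract is exactly why the temperature and aging terms matter: omitting either factor biases `effective` and hence every downstream DVFS decision.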

271 citations


Journal Article•DOI•
TL;DR: The proposed backlight scaling technique is capable of efficiently computing the flickering effect online and subsequently using a measure of the temporal distortion to appropriately adjust the slack on the intra-frame spatial distortion, thereby achieving a good balance between the two sources of distortion while maximizing the backlight dimming-driven energy saving in the display system and meeting an overall video quality figure of merit.
Abstract: Liquid crystal displays (LCDs) have appeared in applications ranging from medical equipment to automobiles, gas pumps, laptops, and handheld portable computers. These display components present a cascaded energy attenuator to the battery of the handheld device and are responsible for about half of the energy drain at maximum display intensity. As such, the display components become the main focus of any effort to maximize the embedded system's battery lifetime. This paper proposes an approach for pixel transformation of the displayed image to increase the potential energy saving of the backlight scaling method. The proposed approach takes advantage of human visual system (HVS) characteristics and tries to minimize distortion between the perceived brightness values of the individual pixels in the original image and those of the backlight-scaled image. This is in contrast to previous backlight scaling approaches, which simply match the luminance values of the individual pixels in the original and backlight-scaled images. Furthermore, this paper proposes a temporally-aware backlight scaling technique for video streams. The goal is to maximize energy saving in the display system by means of dynamic backlight dimming subject to a video distortion tolerance. The video distortion comprises: 1) an intra-frame (spatial) distortion component due to frame-sensitive backlight scaling and transmittance function tuning and 2) an inter-frame (temporal) distortion component due to large-step backlight dimming across frames modulated by the psychophysical characteristics of the human visual system.
The proposed backlight scaling technique is capable of efficiently computing the flickering effect online and subsequently using a measure of the temporal distortion to appropriately adjust the slack on the intra-frame spatial distortion, thereby achieving a good balance between the two sources of distortion while maximizing the backlight dimming-driven energy saving in the display system and meeting an overall video quality figure of merit. The proposed dynamic backlight scaling approach is amenable to highly efficient hardware realization and has been implemented on the Apollo Testbed II. Actual current measurements demonstrate the effectiveness of the proposed technique compared to previous backlight dimming techniques, which have ignored the temporal distortion effect.

76 citations


Proceedings Article•DOI•
24 Jan 2006
TL;DR: A three-phase solution framework, which integrates power management scheduling and task voltage assignment, is proposed, which outperforms existing methods by an average of 18% in terms of the system-wide energy savings.
Abstract: This paper addresses the problem of minimizing the energy consumption of a computer system performing periodic hard real-time tasks with precedence constraints. In the proposed approach, dynamic power management and voltage scaling techniques are combined to reduce the energy consumption of the CPU and devices. The optimization problem is first formulated as an integer programming problem. Next, a three-phase solution framework, which integrates power management scheduling and task voltage assignment, is proposed. Experimental results show that the proposed approach outperforms existing methods by an average of 18% in terms of the system-wide energy savings.
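The leverage that voltage scaling gives this kind of scheduler comes from the standard CMOS switching-energy relation, in which dynamic energy grows quadratically with supply voltage. A minimal sketch (effective capacitance and cycle counts are illustrative, not from the paper):

```python
# Dynamic CMOS switching energy for a task: E = C_eff * Vdd^2 * cycles.
# Running a task at a lower voltage (and correspondingly lower frequency)
# trades execution time for a quadratic energy saving, which is what a
# combined DPM + voltage-scaling scheduler exploits.

def task_energy_J(c_eff_F, vdd_V, cycles):
    """Energy of a task switching c_eff_F farads per cycle at vdd_V volts."""
    return c_eff_F * vdd_V ** 2 * cycles
```

Halving Vdd cuts the task's dynamic energy by 4x, which is why slack on non-critical tasks is so valuable to the voltage-assignment phase.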

69 citations


Proceedings Article•DOI•
24 Jul 2006
TL;DR: A new current-based cell delay model is utilized, which can accurately compute the output waveform for input waveforms of arbitrary shapes subjected to noise, and the cell parasitic capacitances are pre-characterized by lookup tables to improve the accuracy.
Abstract: A statistical model for the purpose of logic cell timing analysis in the presence of process variations is presented. A new current-based cell delay model is utilized, which can accurately compute the output waveform for input waveforms of arbitrary shapes subjected to noise. The cell parasitic capacitances are pre-characterized by lookup tables to improve the accuracy. To capture the effect of process parameter variations on the cell behavior, the output voltage waveform of logic cells is modeled by a stochastic Markovian process in which the voltage value probability distribution at each time instance is computed from that of the previous time instance. Next, the probability distribution of a given %Vdd crossing time, i.e., the hitting time of the output voltage stochastic process, is computed. Experimental results demonstrate the high accuracy of our cell delay model compared to Monte-Carlo-based SPICE simulations.

68 citations


Proceedings Article•DOI•
06 Mar 2006
TL;DR: The paper introduces the notion of quantum factored forms and presents a canonical and concise representation of quantum logic circuits in the form of quantum decision diagrams (QDD's), which are amenable to efficient manipulation and optimization including recursive unitary functional bi-decomposition.
Abstract: Quantum information processing technology is in its pioneering stage and no proficient method for synthesizing quantum circuits has been introduced so far. This paper introduces an effective analysis and synthesis framework for quantum logic circuits. The proposed synthesis algorithm and flow can generate a quantum circuit using the most basic quantum operators, i.e., the rotation and controlled-rotation primitives. The paper introduces the notion of quantum factored forms and presents a canonical and concise representation of quantum logic circuits in the form of quantum decision diagrams (QDDs), which are amenable to efficient manipulation and optimization including recursive unitary functional bi-decomposition. This paper concludes by presenting the QDD-based algorithm for automatic synthesis of quantum circuits.

60 citations


Journal Article•DOI•
TL;DR: It is shown that the variational balanced truncation technique produces reduced systems that accurately follow the time- and frequency-domain responses of the original system when variations in the circuit parameters are taken into consideration.
Abstract: This paper presents a spectrally weighted balanced truncation (SBT) technique for tightly coupled integrated circuit interconnects, when the interconnect circuit parameters change as a result of statistical variations in the manufacturing process. The salient features of this algorithm are the inclusion of the parameter variation in the RLCK interconnect, the guaranteed passivity of the reduced transfer function, and the availability of provable spectrally weighted error bounds for the reduced-order system. This paper shows that the variational balanced truncation technique produces reduced systems that accurately follow the time- and frequency-domain responses of the original system when variations in the circuit parameters are taken into consideration. Experimental results show that the new variational SBT attains, on average, 30% more accuracy than the variational Krylov-subspace-based model-order reduction techniques.

58 citations


Proceedings Article•DOI•
06 Mar 2006
TL;DR: Simulation results with a 65 nm process demonstrate that this technique can reduce the total leakage power dissipation of a 64 Kb SRAM by more than 50% and incurs neither area nor delay overhead.
Abstract: Aggressive CMOS scaling results in low threshold voltage and thin oxide thickness for transistors manufactured in the very deep submicron regime. As a result, reducing the subthreshold and gate-tunneling leakage currents has become one of the most important criteria in the design of VLSI circuits. This paper presents a method based on dual-Vt and dual-Tox assignment to reduce the total leakage power dissipation of SRAMs while maintaining their performance. The proposed method is based on the observation that the read and write delays of a memory cell in an SRAM block depend on the physical distance of the cell from the sense amplifier and the decoder. Thus, the idea is to deploy different types of six-transistor SRAM cells corresponding to different threshold voltage and oxide thickness assignments for the transistors. Unlike other techniques for low-leakage SRAM design, the proposed technique incurs neither area nor delay overhead. In addition, it results in a minor change in the SRAM design flow. Simulation results with a 65 nm process demonstrate that this technique can reduce the total leakage power dissipation of a 64 Kb SRAM by more than 50%.
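The reason dual-Vt assignment is so effective is that subthreshold leakage falls exponentially with threshold voltage. A quick back-of-the-envelope check of that sensitivity (the subthreshold-slope factor and thermal voltage below are typical textbook values, not the paper's):

```python
import math

def subthreshold_ratio(vt_low_V, vt_high_V, n=1.5, v_thermal_V=0.026):
    """Approximate factor by which subthreshold leakage drops when the
    threshold voltage is raised from vt_low_V to vt_high_V, using the
    standard I_sub ~ exp(-Vt / (n * v_T)) dependence.
    n: subthreshold slope factor; v_thermal_V: kT/q at room temperature."""
    return math.exp((vt_high_V - vt_low_V) / (n * v_thermal_V))
```

Raising Vt by only 100 mV already buys roughly an order of magnitude in leakage, which is why swapping in high-Vt cells far from the sense amplifier costs so little delay relative to the leakage saved.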

49 citations


Patent•
02 Mar 2006
TL;DR: In this paper, the authors present a method for determining a pixel transformation function that maximizes backlight dimming while maintaining a pre-specified distortion level, based on the original image and the distortion level.
Abstract: An embodiment of the present invention is directed to a method for determining a pixel transformation function that maximizes backlight dimming while maintaining a pre-specified distortion level. The method includes determining a minimum dynamic range of pixel values in a transformed image based on an original image and the pre-specified distortion level and determining the pixel transformation function. The pixel transformation function takes a histogram of the original image to a uniform distribution histogram having the minimum dynamic range.
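The patent's two steps (find the minimum dynamic range allowed by the distortion budget, then transform pixels into that range so the backlight can be dimmed) can be sketched as follows. The percentile-style distortion measure and the linear pixel boost are our illustrative assumptions, not the claimed method:

```python
# Illustrative sketch: pick the smallest pixel range [0, r] such that
# clipping pixels above r affects at most clip_fraction of the pixels
# (a stand-in for the pre-specified distortion level), then boost pixel
# values so the backlight can be dimmed to r/full_scale with little
# perceived brightness change.

def min_dynamic_range(pixels, clip_fraction):
    """Smallest upper pixel value r under the clipping budget."""
    ordered = sorted(pixels)
    idx = min(len(ordered) - 1, int((1.0 - clip_fraction) * len(ordered)))
    return ordered[idx]

def transform(pixels, r, full_scale=255):
    """Boost pixels so brightness is preserved at backlight level r/full_scale;
    pixels above r saturate (the tolerated distortion)."""
    return [min(full_scale, round(p * full_scale / r)) for p in pixels]
```

With the transformed image displayed at a backlight level of `r / 255`, unclipped pixels keep their perceived brightness while backlight power drops roughly in proportion to the dimming factor.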

48 citations


Proceedings Article•DOI•
24 Jul 2006
TL;DR: The proposed charge recycling technique can save up to 46% of the mode transition energy while, in most cases, maintaining, or even improving, the wake up time of the original circuit.
Abstract: Designing an energy efficient power gating structure is an important and challenging task in multi-threshold CMOS (MTCMOS) circuit design. In order to achieve a very low power design, the large amount of energy consumed during mode transition in MTCMOS circuits should be avoided. In this paper, we propose an appropriate charge recycling technique to reduce energy consumption during the mode transition of MTCMOS circuits. The proposed method can save up to 46% of the mode transition energy while, in most cases, maintaining, or even improving, the wake up time of the original circuit. It also reduces the peak negative voltage value and the settling time of the ground bounce.

31 citations


Proceedings Article•DOI•
27 Mar 2006
TL;DR: Experimental results show that this technique can reduce the leakage-power dissipation of a 64Kb SRAM by more than 35% and improves the static noise margin under process variations.
Abstract: This paper presents a method based on dual threshold voltage assignment to reduce the leakage power dissipation of SRAMs while maintaining their performance. The proposed method is based on the observation that the read and write delays of a memory cell in an SRAM block depend on the physical distance of the cell from the sense amplifier and the decoder. The key idea is thus to realize and deploy different types of six-transistor SRAM cells corresponding to different threshold voltage assignments for individual transistors in the cell. Unlike other techniques for low-leakage SRAM design, the proposed technique incurs no area or delay overhead. In addition, it results only in a slight change in the SRAM design flow. Finally, it improves the static noise margin under process variations. Experimental results show that this technique can reduce the leakage-power dissipation of a 64Kb SRAM by more than 35%.

Proceedings Article•DOI•
24 Jul 2006
TL;DR: The proposed backlight scaling technique is capable of efficiently computing the flickering effect online and subsequently using a measure of the temporal distortion to appropriately adjust the slack on the intra-frame spatial distortion.
Abstract: This paper presents a temporally-aware backlight scaling (TABS) technique for video streams. The goal is to maximize energy saving in the display system by means of dynamic backlight dimming subject to a user-specified tolerance on the video distortion. The video distortion itself comprises (i) an intra-frame (spatial) distortion component due to frame-sensitive backlight scaling and transmittance function tuning and (ii) an inter-frame (temporal) distortion component due to large-step backlight dimming across multiple frames and modulated by the physiological characteristics of the human visual system. The proposed backlight scaling technique is capable of efficiently computing the flickering effect online and subsequently using a measure of the temporal distortion to appropriately adjust the slack on the intra-frame spatial distortion. The proposed technique has been implemented on the Apollo testbed II hardware platform. Actual current measurements on this platform demonstrate the superiority of TABS compared to previous backlight dimming techniques.

Proceedings Article•DOI•
05 Nov 2006
TL;DR: An efficient adaptive method to perform dynamic voltage and frequency management (DVFM) for minimizing the energy consumption of microprocessor chips is presented and leads to power savings of up to 60% for highly correlated workloads compared to DVFM systems based on fixed update intervals.
Abstract: An efficient adaptive method to perform dynamic voltage and frequency management (DVFM) for minimizing the energy consumption of microprocessor chips is presented. Instead of using a fixed update interval, the proposed DVFM system makes use of adaptive update intervals for optimal frequency and voltage scheduling. The optimization enables the system to rapidly track workload changes so as to meet soft real-time deadlines. The method, which is based on introducing the concept of an effective deadline, utilizes the correlation between consecutive values of the workload. In practice, because the frequency and voltage update rates are dynamically set based on variable update interval lengths, voltage fluctuations on the power network are also minimized. The technique, which may be implemented with simple hardware and is completely transparent to the application, leads to power savings of up to 60% for highly correlated workloads compared to DVFM systems based on fixed update intervals.
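The adaptive-interval idea can be caricatured as a simple controller: when consecutive workload samples are highly correlated, back off and update less often; when the workload jumps, shorten the interval to track it. This is our reading of the abstract with assumed parameters (threshold, interval bounds), not the paper's effective-deadline scheduler:

```python
# Sketch of adaptive update-interval selection for DVFM. Loads are
# normalized utilizations in [0, 1]; intervals are in arbitrary time units.
# Parameters below (threshold, min/max interval, x2 scaling) are assumptions.

def next_interval(prev_interval, prev_load, cur_load,
                  min_iv=1.0, max_iv=16.0, threshold=0.1):
    """Choose the next frequency/voltage update interval."""
    change = abs(cur_load - prev_load)
    if change < threshold:
        # Stable (highly correlated) workload: update less frequently,
        # which also reduces voltage fluctuations on the power network.
        return min(max_iv, prev_interval * 2.0)
    # Bursty workload: shorten the interval to track the change quickly.
    return max(min_iv, prev_interval / 2.0)
```

A fixed-interval DVFM system pays the tracking lag of the long interval on every burst; the adaptive version pays it only until the first missed threshold.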

Proceedings Article•DOI•
06 Mar 2006
TL;DR: This paper presents a timeout-driven DPM technique which relies on the theory of Markovian processes to determine the energy-optimal timeout values for a system with multiple power saving states while satisfying a set of user defined performance constraints.
Abstract: This paper presents a timeout-driven DPM technique which relies on the theory of Markovian processes. The objective is to determine the energy-optimal timeout values for a system with multiple power-saving states while satisfying a set of user-defined performance constraints. More precisely, a controllable Markovian process is exploited to model the power management behavior of a system under the control of a timeout policy. Starting with this model, a perturbation analysis technique is applied to develop an offline gradient-based approach to determine the optimal timeout values. Online implementation of this technique for a system with dynamically-varying system parameters is also described. Experimental results demonstrate the effectiveness of the proposed approach.

Dynamic power management (DPM), which refers to selective shut-off or slow-down of components that are idle or underutilized, has proven to be a particularly effective technique for reducing power dissipation in such systems. In the literature, various DPM techniques have been proposed, from heuristic methods presented in early works [1][2] to stochastic optimization approaches [3][4]. Among the heuristic DPM methods, the timeout policy is the most widely used approach in industry and has been implemented in many operating systems. Examples include the power management scheme incorporated into the Windows operating system, the low-power saving mode of the IEEE 802.11a-g protocols for wireless LAN cards, and the enhanced adaptive battery life extender (EABLE) for Hitachi disk drives. Most of these industrial DPM techniques provide mechanisms to adjust the timeout values at the user level.
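Behind any timeout policy sits the classical break-even point: sleeping only saves energy if the idle period is long enough to pay back the transition cost. A minimal single-sleep-state sketch with illustrative numbers (the paper's contribution, a perturbation-analysis optimizer over multiple states, is not reproduced here):

```python
# Break-even analysis for a single sleep state under a timeout policy.

def break_even_s(p_idle_W, p_sleep_W, e_transition_J):
    """Shortest idle period for which entering the sleep state saves energy."""
    return e_transition_J / (p_idle_W - p_sleep_W)

def energy_J(idle_s, timeout_s, p_idle_W, p_sleep_W, e_transition_J):
    """Energy consumed over one idle period of length idle_s when the
    device sleeps after timeout_s seconds of idleness."""
    if idle_s <= timeout_s:
        return p_idle_W * idle_s          # timeout never fired
    return (p_idle_W * timeout_s          # waiting out the timeout
            + p_sleep_W * (idle_s - timeout_s)
            + e_transition_J)             # sleep/wake transition cost
```

The energy-optimal timeout trades the transition cost against the distribution of idle-period lengths, which is exactly what the paper's Markovian model of that distribution makes tractable.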

Proceedings Article•DOI•
06 Mar 2006
TL;DR: Experimental results show average errors of less than 2% for the mean, variance and skewness of interconnect delay and slew while achieving orders of magnitude speedup with respect to a Monte Carlo simulation with 10^4 samples.
Abstract: This paper focuses on statistical interconnect timing analysis in a parameterized block-based statistical static timing analysis tool. In particular, a new framework for performing timing analysis of RLC networks with step inputs, under both Gaussian and non-Gaussian sources of variation, is presented. In this framework, the resistance, inductance, and capacitance of the RLC line are modeled in a canonical first-order form and used to produce the corresponding propagation delay and slew (time) in the canonical first-order form. To accomplish this step, the mean, variance, and skewness of the delay and slew distributions are obtained in an efficient, yet accurate, manner. The proposed framework can be extended to consider higher order terms of the various sources of variation. Experimental results show average errors of less than 2% for the mean, variance and skewness of interconnect delay and slew while achieving orders of magnitude speedup with respect to a Monte Carlo simulation with 10^4 samples.
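The moment extraction the abstract describes is straightforward once delay is in canonical first-order form, d = d0 + Σ aᵢXᵢ, with independent zero-mean, unit-variance sources Xᵢ whose third moments are known (non-Gaussian sources allowed). A sketch of that step, with made-up coefficients; the paper's derivation of the aᵢ themselves is not reproduced:

```python
# Mean, variance, and skewness of d = d0 + sum_i a_i * X_i, where the X_i
# are independent, zero-mean, unit-variance sources with given third
# moments m3_i. Follows directly from linearity and independence.

def moments_first_order(d0, coeffs, third_moments):
    mean = d0                                    # E[X_i] = 0
    var = sum(a * a for a in coeffs)             # Var = sum a_i^2
    third = sum(a ** 3 * m3                      # central third moment
                for a, m3 in zip(coeffs, third_moments))
    skew = third / var ** 1.5 if var > 0 else 0.0
    return mean, var, skew
```

With Gaussian sources all m3ᵢ are zero and the skewness vanishes, which is precisely the regime where a purely Gaussian framework would suffice; the non-zero-m3 case is what this framework adds.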

Proceedings Article•DOI•
24 Jan 2006
TL;DR: A new framework for performing statistical gate timing analysis for non-Gaussian sources of variation in block-based σTA is introduced and a statistical effective capacitance calculation method is presented to achieve the aforementioned objective.
Abstract: As technology scales down, timing verification of digital integrated circuits becomes an increasingly challenging task due to gate and wire variability. Therefore, statistical timing analysis (denoted by σTA) is becoming unavoidable. This paper introduces a new framework for performing statistical gate timing analysis for non-Gaussian sources of variation in block-based σTA. First, an approach is described to approximate a variational RC-π load by using a canonical first-order model. Next, an accurate variation-aware gate timing analysis based on statistical input transition, statistical gate timing library, and statistical RC-π load is presented. Finally, to achieve the aforementioned objective, a statistical effective capacitance calculation method is presented. Experimental results show an average error of 6% for gate delay and output transition time with respect to the Monte Carlo simulation with 10^4 samples while the runtime is nearly two orders of magnitude shorter.

01 Jan 2006
TL;DR: In this article, the authors present a brief discussion of key sources of power dissipation and their temperature relation in CMOS VLSI circuits, and techniques for full-chip temperature calculation with special attention to its implications on the design of high-performance, low-power very large scale integration (VLSI) circuits.
Abstract: The growing packing density and power consumption of very large scale integration (VLSI) circuits have made thermal effects one of the most important concerns of VLSI designers. The increasing variability of key process parameters in nanometer CMOS technologies has resulted in a larger impact of the substrate and metal line temperatures on the reliability and performance of the devices and interconnections. Recent data shows that more than 50% of all integrated circuit failures are related to thermal issues. This paper presents a brief discussion of key sources of power dissipation and their temperature relation in CMOS VLSI circuits, and techniques for full-chip temperature calculation with special attention to its implications on the design of high-performance, low-power VLSI circuits. The paper is concluded with an overview of techniques to improve the full-chip thermal integrity by means of off-chip versus on-chip and static versus adaptive methods.

Proceedings Article•DOI•
30 Apr 2006
TL;DR: Monte Carlo SPICE-based experimental results demonstrate the effectiveness of the proposed approach in accurately modeling the correlation-aware process variations and their impact on interconnect delay when crosstalk is present.
Abstract: Process variations have become a key concern of circuit designers because of their significant yet hard-to-predict impact on the performance and signal integrity of VLSI circuits. Statistical approaches have been suggested as the most effective substitute for corner-based approaches to deal with the variability of present process technology nodes. This paper introduces a statistical analysis of the crosstalk-aware delay of coupled interconnects considering process variations. The few existing works that have studied this problem suffer not only from shortcomings in their statistical models, but also from inaccurate crosstalk circuit models. We utilize an accurate distributed RC-π model of the interconnections to be able to model process variations close to reality. The considerable effect of correlation among the parameters of neighboring wire segments is also indicated. Statistical properties of the crosstalk-aware output delay are characterized and presented as closed-form expressions. Monte Carlo SPICE-based experimental results demonstrate the effectiveness of the proposed approach in accurately modeling the correlation-aware process variations and their impact on interconnect delay when crosstalk is present.

Proceedings Article•DOI•
27 Mar 2006
TL;DR: A new design methodology is introduced that minimizes the impact of virtual ground parasitic resistances on the performance of an MTCMOS circuit by using gate resizing and logic restructuring (i.e., gate replication).
Abstract: The Multi-Threshold CMOS (MTCMOS) technique can significantly reduce sub-threshold leakage currents during the circuit sleep (standby) mode by adding high-Vth power switches (sleep transistors) to low-Vth logic cells. During the active mode of the circuit, the high-Vth transistors and the virtual ground network can be modeled as resistors, which in turn cause the voltage of the virtual ground node to rise, thereby degrading the switching speed of the logic cells. This paper introduces a new design methodology that minimizes the impact of virtual ground parasitic resistances on the performance of an MTCMOS circuit by using gate resizing and logic restructuring (i.e., gate replication). Experimental results show that the proposed techniques are highly effective in making the MTCMOS circuits robust with respect to such parasitic resistance effects.

Patent•
22 Dec 2006
TL;DR: In this article, the authors proposed a method of forming a memory cell by coupling a first transistor between a supply rail and a node that is operable to accept a supply voltage.
Abstract: A method of forming a memory cell includes coupling a first transistor between a supply rail of a memory cell and a node operable to accept a supply voltage. The method further includes coupling a second transistor between a ground rail of the cell and a node operable to accept a ground. In one embodiment, the method includes forming the cell to accept selectively applied external voltages, wherein the external voltages are selected to minimize leakage current in the cell. In another embodiment, the method includes forming at least one of the first and the second transistors to have a channel width and/or a threshold voltage selected to minimize a total leakage current in the cell.

Proceedings Article•DOI•
22 Oct 2006
TL;DR: A hybrid simulation engine, named B2Sim for (cycle-characterized) Basic Block based Simulator, where a fast cache simulator (e.g., sim-cache) and a slow pipeline simulator (e.g., sim-outorder) are employed together to reduce the runtime of architectural simulation engines by making use of the instruction behavior within executed basic blocks.
Abstract: State-of-the-art architectural simulators support cycle-accurate pipeline execution of application programs. However, it can take days or even weeks to complete the simulation of even a moderate-size program. During the execution of a program, program behavior does not change randomly but changes over time in a predictable/periodic manner. This behavior provides the opportunity to limit the use of a pipeline simulator. More precisely, this paper presents a hybrid simulation engine, named B2Sim for (cycle-characterized) Basic Block based Simulator, where a fast cache simulator (e.g., sim-cache) and a slow pipeline simulator (e.g., sim-outorder) are employed together. B2Sim reduces the runtime of architectural simulation engines by making use of the instruction behavior within executed basic blocks. We have integrated B2Sim into SimpleScalar and have achieved, on average, a 3.3x speedup on the SPEC2000 benchmark and Media-bench programs compared to a conventional pipeline simulator, while maintaining the accuracy of the simulation results with less than 1% CPI error on average.
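The dispatch idea behind such a hybrid engine can be sketched in a few lines: characterize each basic block with the slow pipeline model once, then replay cached cycle counts for repeat occurrences. The interface and cost model below are illustrative assumptions, not B2Sim's actual implementation (which must also handle blocks whose timing changes with context):

```python
# Hybrid simulation dispatch: fall back to the slow, detailed model only
# for basic blocks not yet characterized; replay cached per-block cycle
# counts otherwise.

def simulate(trace, detailed_cycles):
    """trace: sequence of basic-block ids in execution order.
    detailed_cycles: callable standing in for the slow pipeline model."""
    cache, total, slow_calls = {}, 0, 0
    for block in trace:
        if block not in cache:
            cache[block] = detailed_cycles(block)   # slow path, once per block
            slow_calls += 1
        total += cache[block]                       # fast replay
    return total, slow_calls
```

Because program behavior is periodic, the trace revisits a small set of hot blocks, so the slow model runs only a tiny fraction of the time; that is the source of the reported speedup.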

Proceedings Article•DOI•
24 Jan 2006
TL;DR: A new timing-driven placement algorithm, which attempts to minimize zigzags and crisscrosses on the timing-critical paths of a circuit and integrates this idea into a recursive bipartitioning-based placement framework with a min-cut objective function.
Abstract: In this paper, we present a new timing-driven placement algorithm, which attempts to minimize zigzags and crisscrosses on the timing-critical paths of a circuit. We observed that most of the paths that cause timing problems in the circuit meander outside the minimum bounding box of the start and end nodes of the path. To limit this undesirable behavior, we impose a physical constraint on the placement problem, i.e., we assign a preferred signal direction to each critical path in the circuit. Starting from an initial placement solution, by using a move-based optimization strategy, these preferred directions force cells to move in a direction that maximizes the monotonic behavior of the timing-critical paths in the new placement solution. To make the direction assignment tractable, we implicitly group all circuit paths into a set of input-output conduits and assign a unique preferred direction to each such conduit. We integrated this idea into a recursive bipartitioning-based placement framework with a min-cut objective function. Experimental results on a set of standard placement benchmarks show that this approach improves the result of a state-of-the-art industrial placement tool for all the benchmark circuits while increasing the wire length by a tolerable amount.

Journal Article•DOI•
TL;DR: This paper presents sufficiently accurate and highly efficient filtering algorithms for interconnect timing as well as gate timing analysis, and shows accuracies that are quite comparable with sign-off delay calculators with more than a 65% reduction in computation time.
Abstract: Static timing analysis is a key step in the physical design optimization of VLSI designs. The lumped capacitance model for gate delay and the Elmore model for wire delay have been shown to be inadequate for wire-dominated designs. Using the effective capacitance model for the gate delay calculation and model-order reduction techniques for the wire delay calculation is prohibitively expensive. In this paper, we present sufficiently accurate and highly efficient filtering algorithms for interconnect timing as well as gate timing analysis. The key idea is to partition the circuit into low- and high-complexity circuits, whereby low-complexity circuits are handled with efficient algorithms, such as the total capacitance algorithm for gate delay and the Elmore metric for wire delay, and high-complexity circuits are handled with sign-off algorithms. Experimental results on microprocessor designs show accuracies that are quite comparable with sign-off delay calculators with more than a 65% reduction in computation time.
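The Elmore metric mentioned for the fast path is cheap precisely because it is a single pass over the net. For an RC chain (a special case of the general RC tree), the delay to the far end is the sum over segments of each resistance times all capacitance downstream of it; the values below are illustrative:

```python
# Elmore delay for an RC chain driver -> R1 -> C1 -> R2 -> C2 -> ... :
# delay to the last node = sum_i R_i * (total capacitance downstream of R_i).

def elmore_delay(rs, cs):
    """rs[i]: resistance into node i; cs[i]: capacitance at node i."""
    delay, downstream = 0.0, sum(cs)
    for r, c in zip(rs, cs):
        delay += r * downstream   # each R charges all capacitance after it
        downstream -= c           # drop this node's C from the running sum
    return delay
```

For a two-segment chain this reduces to R1·(C1+C2) + R2·C2, the textbook form; the filtering scheme reserves the expensive reduced-order models for nets where this estimate is known to be too crude.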

Proceedings Article•DOI•
30 Apr 2006
TL;DR: A new framework for handling the effect of Gaussian and Non-Gaussian process variations on coupled interconnects is proposed and Experimental results show that the proposed method is capable of accurately predicting delay variation in a coupled interConnect line.
Abstract: Process technology and environment-induced variability of gates and wires in VLSI circuits make timing analyses of such circuits a challenging task. Process variation can have a significant impact on both device (front-end of the line) and interconnect (back-end of the line) performance. Statistical static timing analysis techniques are being developed to tackle this important problem. Existing timing analysis tools divide the analysis into interconnect (wire) timing analysis and gate timing analysis. In this paper, we focus on statistical static timing analysis of coupled interconnects where crosstalk noise analysis is unavoidable. We propose a new framework for handling the effect of Gaussian and non-Gaussian process variations on coupled interconnects. The technique allows for closed-form computation of interconnect delay probability density functions (PDFs) given variations in relevant process parameters such as the line width, metal thickness, and dielectric thickness in the presence of crosstalk noise. To achieve this goal, we express the electrical parameters of the coupled interconnects in a first-order (linear) form as a function of changes in the physical parameters and subsequently use these forms to perform accurate timing and noise analysis to produce the propagation delay and slew in first-order forms. This work can be easily extended to consider the effect of higher order terms of the sources of variation. Experimental results show that the proposed method is capable of accurately predicting delay variation in a coupled interconnect line.

Book Chapter•DOI•
01 Jan 2006
TL;DR: This chapter reviews a number of RTL techniques for low power design of VLSI circuits, targeting both the dynamic and leakage components of power dissipation in CMOS VLSI circuits.
Abstract: This chapter reviews a number of RTL techniques for low power design of VLSI circuits, targeting both the dynamic and leakage components of power dissipation in CMOS VLSI circuits. A more detailed review of techniques for low power design of VLSI circuits and systems can be found in many references, including Reference 1.

01 Jan 2006
TL;DR: Key contributions of this thesis include the introduction of the complete set of generalized signatures of a Boolean function, development of efficient methods of recognizing variable symmetries, and presentation of a proficient algorithm for computing the canonical form of the class of NPN-equivalent Boolean functions based on the generalized signatures and variable symmetries.
Abstract: Boolean matching algorithms have many applications in logic synthesis, especially in technology mapping and combinational logic verification. Canonical form based Boolean matching has been studied by many researchers. However, none of the previous work has produced an algorithm with reasonable space and time complexities for the general Boolean matching problem. In contrast, this dissertation provides an efficient and compact canonical form for representing the set of all Boolean functions that are equivalent under permutation of input variables and complementation of input or output variables (i.e., NPN-equivalent Boolean functions). In particular, important properties of the proposed canonical form are investigated, and subsequently utilized to devise an effective algorithm for computing the proposed canonical form. The low average computational complexity of this algorithm allows it to be applied to large complex Boolean functions with no limitation on the number of input variables, as opposed to previous approaches, which are not capable of handling functions with more than seven inputs. Key contributions of this thesis include the introduction of the complete set of generalized signatures of a Boolean function, development of efficient methods of recognizing variable symmetries, and presentation of a proficient algorithm for computing the canonical form of the class of NPN-equivalent Boolean functions based on the generalized signatures and variable symmetries.
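For very small functions, the NPN equivalence class can be canonized by brute force over all input permutations and input/output complementations. The exponential sketch below only illustrates what the dissertation's signature-based algorithm achieves efficiently for large n (truth tables are plain integers here; the example functions are standard):

```python
from itertools import permutations, product

def transform(tt, n, perm, flips):
    """Apply an input permutation and input complementations to a
    truth table given as an integer over n variables."""
    out = 0
    for m in range(1 << n):
        m_old = 0
        for i in range(n):
            bit = ((m >> i) & 1) ^ flips[i]
            m_old |= bit << perm[i]
        if (tt >> m_old) & 1:
            out |= 1 << m
    return out

def npn_canonical(tt, n):
    """Smallest truth table over the NPN equivalence class of tt."""
    mask = (1 << (1 << n)) - 1
    best = mask
    for perm in permutations(range(n)):
        for flips in product((0, 1), repeat=n):
            f = transform(tt, n, perm, flips)
            best = min(best, f, f ^ mask)  # f ^ mask = output complement
    return best

# AND and OR are NPN-equivalent: OR(a, b) = NOT(AND(NOT a, NOT b))
print(npn_canonical(0b1000, 2), npn_canonical(0b1110, 2))  # → 1 1
```

The search space is n! * 2^(n+1) transforms, each over 2^n minterms, which is exactly why this approach stops scaling around seven inputs and a signature-based canonical form is needed.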

Proceedings Article•DOI•
30 Apr 2006
TL;DR: This paper presents a minimum area, low-power driven clustering algorithm for coarse-grained, antifuse-based FPGAs under delay constraints that substantially reduces the size of the duplicated logic, resulting in benefits in area, delay, and power dissipation.
Abstract: This paper presents a minimum area, low-power driven clustering algorithm for coarse-grained, antifuse-based FPGAs under delay constraints. The algorithm accurately predicts the logic replication caused by timing constraints during low-power driven clustering. This technique substantially reduces the size of the duplicated logic, resulting in benefits in area, delay, and power dissipation. First, we build power-delay curves at nodes with the aid of the prediction algorithm. Next, we choose the best cluster, starting from the primary outputs and moving backward through the circuit, based on these curves. Experimental results show 16% and 20% reductions in dynamic and leakage power dissipation, respectively, with an 18% area reduction compared to the results of clustering without replication prediction.

Proceedings Article•DOI•
06 Mar 2006
TL;DR: A cell delay model based on rate-of-current-change is presented, which accounts for the impact of the shape of the noisy waveform on the output voltage waveform.
Abstract: A cell delay model based on rate-of-current-change is presented, which accounts for the impact of the shape of the noisy waveform on the output voltage waveform. More precisely, a pre-characterized table of time derivatives of the output current as a function of input voltage and output load values is constructed. The data in this table, in combination with the Taylor series expansion of the output current, is utilized to progressively compute the output current waveform, which is then integrated to produce the output voltage waveform. Experimental results show the effectiveness and efficiency of this new delay model.
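The progressive computation described above reduces to two coupled update equations per time step: a Taylor step on the output current and a capacitor integration for the output voltage. A minimal sketch, with the caveat that the `didt` function below is an arbitrary analytic placeholder standing in for the paper's pre-characterized (input voltage, output load) table, and all units are normalized:

```python
def simulate_output(v_in_wave, c_load, dt, didt):
    """Progressively compute the output voltage waveform from a
    rate-of-current-change model."""
    i_out, v_out = 0.0, 0.0
    trace = []
    for v_in in v_in_wave:
        i_out += didt(v_in, v_out) * dt   # Taylor step: i(t+dt) ≈ i(t) + (di/dt)·dt
        v_out += (i_out / c_load) * dt    # integrate: C_L · dv_out/dt = i_out
        trace.append(v_out)
    return trace

# Placeholder di/dt: current ramps while the input is high and the
# output has not yet reached the (normalized) rail. Purely illustrative.
didt = lambda v_in, v_out: v_in * max(0.0, 1.0 - v_out)

ramp = [min(1.0, 0.05 * k) for k in range(100)]   # clean rising input
wave = simulate_output(ramp, c_load=1.0, dt=0.05, didt=didt)
```

Because `didt` takes the instantaneous input voltage, the same loop handles an arbitrary noisy input waveform, which is the point of the model.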

Journal Article•DOI•
TL;DR: This paper advances the state of the art by presenting a well-founded mathematical framework for modeling and manipulating Markov processes and presents a new state assignment technique to reduce dynamic power consumption in finite state machines.
Abstract: This paper advances the state of the art by presenting a well-founded mathematical framework for modeling and manipulating Markov processes. The key idea is based on the fact that a Markov process can be decomposed into a collection of directed cycles with positive weights, which are proportional to the probability of the cycle traversals in a random walk. Two applications of this new formalism in the computer-aided design area are studied. In the first application, the authors present a new state assignment technique to reduce dynamic power consumption in finite state machines. The technique consists of first decomposing the state machine into a set of cycles and then performing a state assignment using Gray codes. The proposed encoding algorithm reduces power consumption by an average of 15%. The second application is sequence compaction for improving the efficiency of dynamic power simulators. The proposed method is based on the cycle decomposition of the Markov process representing the given input sequence, followed by selecting a subset of these cycles to construct the compacted sequence.
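The intuition behind the Gray-code step is that states along a decomposed high-probability cycle receive consecutive reflected Gray codes, so each cycle edge flips exactly one state-register bit. An illustrative sketch (the 8-state cycle below is made up, not from the paper; a full-length cycle of 2^n states is also cyclic in Gray order, so even the wrap-around edge flips one bit):

```python
def gray(k):
    """k-th reflected Gray code."""
    return k ^ (k >> 1)

def encode_cycle(states):
    """Assign consecutive Gray codes to states in cycle order."""
    return {s: gray(k) for k, s in enumerate(states)}

cycle = ['S%d' % k for k in range(8)]   # hypothetical 8-state cycle
codes = encode_cycle(cycle)

# Hamming distance between the codes of consecutive states on the cycle
dists = [bin(codes[cycle[k]] ^ codes[cycle[(k + 1) % 8]]).count('1')
         for k in range(8)]
print(dists)  # → [1, 1, 1, 1, 1, 1, 1, 1]
```

Since dynamic power in the state register scales with bit toggles per transition, minimizing the Hamming distance along frequently traversed cycles directly targets the dominant switching activity.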

Proceedings Article•DOI•
24 Jan 2006
TL;DR: This paper introduces a new current-based cell timing analyzer, called CGTA, which has a higher performance than existing logic cell timing analysis tools and relies on a compact lookup table storing the output current gain of every logic cell as a function of its input voltage and output load.
Abstract: This paper introduces a new current-based cell timing analyzer, called CGTA, which has a higher performance than existing logic cell timing analysis tools. CGTA relies on a compact lookup table storing the output current gain (sensitivity) of every logic cell as a function of its input voltage and output load. The current gain values are subsequently used by the timing calculator to produce the output current value as a function of the applied input voltage. This current and the output load then uniquely determine the output voltage value. Therefore, CGTA is capable of efficiently and accurately computing the output voltage waveform of a logic cell, which has been subjected to an arbitrary noisy input voltage waveform. Experimental results are presented to assess the quality of CGTA compared to other existing approaches.
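The table lookup at the heart of this flow can be sketched with a one-dimensional slice: a small gain table di/dv_in versus input voltage for one fixed output load (a real characterization is per cell and two-dimensional over voltage and load). All numbers below are made up for illustration:

```python
import bisect

VGRID = [0.0, 0.3, 0.6, 0.9, 1.2]                # input voltage grid (V)
GAIN = [0.0, 0.2e-3, 1.5e-3, 0.4e-3, 0.0]        # di/dv_in (A/V), illustrative

def gain(v):
    """Piecewise-linear interpolation of the gain table."""
    v = min(max(v, VGRID[0]), VGRID[-1])
    j = bisect.bisect_left(VGRID, v)
    if j == 0:
        return GAIN[0]
    t = (v - VGRID[j - 1]) / (VGRID[j] - VGRID[j - 1])
    return GAIN[j - 1] + t * (GAIN[j] - GAIN[j - 1])

# Build the output current for a rising input by accumulating gain * dv,
# as the abstract describes (current from the gain, then voltage from
# the current and the load).
i_out, dv, v = 0.0, 0.01, 0.0
while v <= 1.2:
    i_out += gain(v) * dv
    v += dv
```

Because the table is indexed by the instantaneous input voltage, the same accumulation works for an arbitrary noisy input waveform, which is what lets CGTA handle noisy inputs without re-characterization.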