Proceedings Article

Analysis and optimization of thermal issues in high-performance VLSI

TL;DR: It is shown that chip level thermal effects can have a significant impact on large-scale circuit optimization techniques, including the clock-skew minimization scheme, and can influence other physical design problem formulations.
Abstract: This paper provides an overview of various thermal issues in high-performance VLSI, with special attention to their implications for performance and reliability. More specifically, it examines the impact of thermal effects on both interconnect design and electromigration reliability, and discusses their impact on the allowable current density limits. It also discusses how thermal and reliability constrained current density limits may conflict with those obtained through purely performance-based criteria. Additionally, it is shown that chip level thermal effects can have a significant impact on large-scale circuit optimization techniques, including the clock-skew minimization scheme, and can influence other physical design problem formulations. Finally, high-current interconnect design rules for ESD and I/O circuits are examined.
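The link between chip level temperature nonuniformity and clock skew can be illustrated with a minimal sketch. It assumes a first-order Elmore delay for a driver, a uniform RC wire, and a load, with wire resistance rising linearly with temperature. All function names and numeric defaults (resistance and capacitance per micron, temperature coefficient, driver resistance, load capacitance) are hypothetical illustration values, not figures from the paper.

```python
def wire_elmore_delay(length_um, temp_c, r0_per_um=0.1, c_per_um=0.2e-15,
                      tcr_per_c=0.0039, t_ref_c=25.0,
                      r_driver=100.0, c_load=10e-15):
    """Elmore delay in seconds of driver + uniform RC wire + load.

    Assumes the whole wire sits at one temperature and that its
    resistance scales as R(T) = R0 * (1 + tcr * (T - T_ref)).
    All default values are illustrative assumptions.
    """
    r_per_um = r0_per_um * (1.0 + tcr_per_c * (temp_c - t_ref_c))
    r_wire = r_per_um * length_um
    c_wire = c_per_um * length_um
    # The driver sees all capacitance; the distributed wire sees half of
    # its own capacitance plus the full load capacitance.
    return r_driver * (c_wire + c_load) + r_wire * (0.5 * c_wire + c_load)


def thermal_skew(length_um, temp_hot_c, temp_cool_c):
    """Delay difference between two equal-length branches at different temperatures."""
    return (wire_elmore_delay(length_um, temp_hot_c)
            - wire_elmore_delay(length_um, temp_cool_c))


if __name__ == "__main__":
    skew = thermal_skew(1000.0, temp_hot_c=125.0, temp_cool_c=25.0)
    print(f"skew between hot and cool branch: {skew * 1e12:.2f} ps")
```

Under these assumptions, two branches that are balanced at a uniform temperature acquire nonzero skew as soon as one runs hotter, which is why a skew-minimization scheme that ignores the thermal profile can miss its target.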
Citations
Journal Article
TL;DR: An efficient 3-D transient thermal simulator based on the alternating direction implicit (ADI) method for temperature estimation in a 3-D environment, which not only has a linear runtime and memory requirement, but also is unconditionally stable.
Abstract: Recent studies show that nonuniform thermal distribution affects not only the substrate but also the interconnects. Hence, three-dimensional (3-D) thermal analysis is crucial to analyze these effects. In this paper, the authors present and develop an efficient 3-D transient thermal simulator based on the alternating direction implicit (ADI) method for temperature estimation in a 3-D environment. Their simulator, 3D Thermal-ADI, not only has a linear runtime and memory requirement, but also is unconditionally stable. Detailed analysis of the 3-D nonhomogeneous cases and boundary conditions for on-chip VLSI applications is introduced and presented. Extensive experimental results show that the algorithm is not only orders of magnitude faster than the traditional thermal simulation algorithms but also highly accurate and memory efficient. The steady-state temperature profile can also be reached in several iterations. This software will be released via the web for public usage.

221 citations


Cites background from "Analysis and optimization of therma..."

  • ...A comprehensive analysis of the thermal effects in high-performance VLSI has been discussed recently [1]–[4]....


Proceedings Article
24 Jul 2006
TL;DR: This work is the first attempt to study the performance benefits of 3D technology under the influence of thermal constraints, and it is shown that the 3D system registers large performance improvement for memory intensive applications.
Abstract: Three-dimensional (3-D) integrated circuits have emerged as promising candidates to overcome the interconnect bottlenecks of nanometer scale designs. While they offer several other advantages, it is expected that the benefits from this technology can potentially be off-set by thermal considerations which impact chip performance and reliability. The work presented in this paper is the first attempt to study the performance benefits of 3-D technology under the influence of such thermal constraints. Using a processor-cache-memory system and carefully chosen applications encompassing different memory behaviors, the performance of 3-D architecture is compared with a conventional planar (2-D) design. It is found that the substantial increase in memory bus frequency and bus width contribute to a significant reduction in execution time with a 3-D design. It is also found that increasing the clock frequency translates into larger gains in system performance with 3-D designs than for planar 2-D designs in memory intensive applications. The thermal profile of the vertically stacked chip is generated taking into account the highly temperature sensitive leakage power dissipation. The maximum allowed operating frequency imposed by temperature constraint is shown to be lower for 3-D than for 2-D designs. In spite of these constraints, it is shown that the 3-D system registers large performance improvement for memory intensive applications.

215 citations


Additional excerpts

  • ...strongly temperature sensitive [23][24]....


Patent
21 Mar 2008
TL;DR: In this paper, a thermal analysis is used to determine the temperature dependent power dissipation of a circuit and the temperature distribution of the circuit resulting from dissipating the heat created by the temperature-dependent power dissipation.
Abstract: Methods and apparatuses for circuit design to reduce power usage, such as reducing temperature dependent power usage, and/or to improve timing, such as reducing temperature dependent delay or transition time. At least one embodiment of the present invention reduces the power dissipation and improves the timing of an integrated circuit to optimize the design. A thermal analysis is used to determine the temperature dependent power dissipation of a circuit and the temperature distribution of the circuit resulting from dissipating the heat created by the temperature dependent power dissipation. Then, the components of the design are selectively transformed to reduce the power dissipation and to improve timing based on the temperature solution. The transformation may include placement changes and netlist changes, such as the change of transistor threshold voltages for cells or for blocks of the circuit chip.

207 citations

Patent
22 Jan 2009
TL;DR: In this article, a shielding mesh of at least two reference voltages (e.g., power and ground) is used to reduce both the capacitive coupling and the inductive coupling in routed signal wires in IC chips.
Abstract: Methods and apparatuses to design an Integrated Circuit (IC) with a shielding of wires. In at least one embodiment, a shielding mesh of at least two reference voltages (e.g., power and ground) is used to reduce both the capacitive coupling and the inductive coupling in routed signal wires in IC chips. In some embodiments, a type of shielding mesh (e.g., a shielding mesh with a window surrounded by a power ring, or a window with a sparser set of shielding wires) is selected to make more routing area available in locally congested areas. In other embodiments, the shielding mesh is used to create or add bypass capacitance. Other embodiments are also disclosed.

182 citations

Journal Article
TL;DR: In this paper, the authors systematically explore the limits for heat removal from a model chip in various configurations, and identify bottlenecks in the thermal performance of current generation packages and motivate lowering of thermal resistance through the board-side for efficient heat removal to meet ever increasing reliability and performance requirements.
Abstract: The drive for higher performance has led to greater integration and higher clock frequency of microprocessor chips. This translates into higher heat dissipation and, therefore, effective cooling of electronic chips is becoming increasingly important for their reliable performance. We systematically explore the limits for heat removal from a model chip in various configurations. First, the heat removal from a bare chip by pure heat conduction and convection is studied to establish the theoretical limit of heat removal from a bare die bound by an infinite medium. This is followed by an analysis of heat removal from a packaged chip by evaluating the thermal resistance due to individual packaging elements. The analysis results allow us to identify the bottlenecks in the thermal performance of current generation packages, and to motivate lowering of thermal resistance through the board-side for efficient heat removal to meet ever increasing reliability and performance requirements.

138 citations


Cites background from "Analysis and optimization of therma..."

  • ...At the chip level, the nonuniformity in temperature leads to a clock skew [3]....


References
Journal Article
TL;DR: It is found possible to define delay time and rise time in such a way that these quantities can be computed very simply from the Laplace system function of the network.
Abstract: When the transient response of a linear network to an applied unit step function consists of a monotonic rise to a final constant value, it is found possible to define delay time and rise time in such a way that these quantities can be computed very simply from the Laplace system function of the network. The usefulness of the new definitions is illustrated by applications to low pass, multi‐stage wideband amplifiers for which a number of general theorems are proved. In addition, an investigation of a certain class of two‐terminal interstage networks is made in an endeavor to find the network giving the highest possible gain—rise time quotient consistent with a monotonic transient response to a step function.

1,693 citations
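As a brief worked illustration of the definitions this abstract describes, the following sketch states the delay and rise time as moments of the impulse response and reads them off the coefficients of the system function. The symbols $h(t)$, $H(s)$, $a_i$, $b_i$, $T_D$, and $T_R$ are notation chosen here, and the normalization $H(0)=1$ is an assumption.

```latex
% Impulse response h(t) of a network with monotonic step response,
% normalized so that \int_0^\infty h(t)\,dt = H(0) = 1.
\begin{align}
  T_D &= \int_0^\infty t\,h(t)\,dt, \\
  T_R &= \sqrt{2\pi\left[\int_0^\infty t^2\,h(t)\,dt - T_D^2\right]}.
\end{align}
% For a system function written as
%   H(s) = \frac{1 + a_1 s + a_2 s^2 + \cdots}{1 + b_1 s + b_2 s^2 + \cdots},
% expanding \ln H(s) in powers of s gives
\begin{align}
  T_D &= b_1 - a_1, \\
  T_R &= \sqrt{2\pi\left[\,b_1^2 - a_1^2 + 2\,(a_2 - b_2)\right]}.
\end{align}
```

For a single-pole RC stage, $H(s) = 1/(1 + RCs)$, so $b_1 = RC$ and all other coefficients vanish, giving $T_D = RC$ and $T_R = \sqrt{2\pi}\,RC$.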

Journal Article
J.R. Black
TL;DR: In this article, it is shown that positive gradients, in terms of electron flow, of temperature, current density, or ion diffusion coefficient foreshorten conductor life because they present regions where vacancies condense to form voids.
Abstract: Recently, electromigration has been identified as a potential wear-out failure mode for semiconductor devices employing metal film conductors of inadequate cross-sectional area. A brief survey of electromigration indicates that although the effect has been known for several decades, a great deal of the processes involved is still unknown, especially for complex metals and solute ions. Earlier design equations are improved to account for conductor film cross-sectional area as well as film structure, film temperature, and current density. Design curves are presented which permit the construction of high reliability "infinite life" aluminum conductors for specific conditions of maximum current and temperature stress expected in use. It is also shown that positive gradients, in terms of electron flow, of temperature, current density, or ion diffusion coefficient foreshorten conductor life because they present regions where vacancies condense to form voids.

1,267 citations
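How film temperature and current density trade off against conductor lifetime can be sketched with the commonly cited form of Black's equation, MTTF = A · J^(-n) · exp(Ea / (k·T)). The exponent n = 2, the activation energy Ea = 0.7 eV, and the prefactor A = 1 used below are illustrative assumptions, as are the function names; the exact design equations in the paper may differ.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def black_mttf(j, temp_k, a=1.0, n=2.0, ea_ev=0.7):
    """Median time to failure from Black's equation (arbitrary units).

    j is current density, temp_k is metal temperature in kelvin.
    a, n, and ea_ev are assumed illustration values.
    """
    return a * j ** (-n) * math.exp(ea_ev / (BOLTZMANN_EV_PER_K * temp_k))


def derated_current_density(j_ref, t_ref_k, t_metal_k, n=2.0, ea_ev=0.7):
    """Current density at t_metal_k that gives the same MTTF as j_ref at t_ref_k."""
    exponent = ea_ev / (n * BOLTZMANN_EV_PER_K) * (1.0 / t_metal_k - 1.0 / t_ref_k)
    return j_ref * math.exp(exponent)


if __name__ == "__main__":
    j_ref = 1.0e6  # reference current density, e.g. in A/cm^2 (assumed)
    j_hot = derated_current_density(j_ref, t_ref_k=378.0, t_metal_k=398.0)
    print(f"allowed J at 398 K relative to 378 K: {j_hot / j_ref:.2f}")
```

With these assumed parameters, a conductor that runs hotter than the reference temperature must carry a lower current density to reach the same lifetime, which is the sense in which thermal effects tighten electromigration limits.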

Book
30 Jun 1995
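TL;DR: This book covers low-power design, including the hierarchy of limits of power (J.D. Meindl), sources of power consumption, voltage scaling approaches, DC power supply design in portable systems, adiabatic switching, minimizing switched capacitance, computer aided design tools, and low power programmable computation.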
Abstract: 1. Introduction. 2. Hierarchy of Limits of Power J.D. Meindl. 3. Sources of Power Consumption. 4. Voltage Scaling Approaches. 5. DC Power Supply Design in Portable Systems coauthored with A.J. Stratakos, et al. 6. Adiabatic Switching L. Svensson. 7. Minimizing Switched Capacitance. 8. Computer Aided Design Tools. 9. A Portable Multimedia Terminal. 10. Low Power Programmable Computation coauthored with M.B. Srivastava. 11. Conclusions. Subject Index.

1,024 citations

Journal Article
TL;DR: In this article, a deferred-merge embedding (DME) algorithm is proposed to construct a clock tree with zero skew while minimizing the total wirelength, which can be applied to either the Elmore or linear delay model.
Abstract: The deferred-merge embedding (DME) algorithm, which embeds any given connection topology to create a clock tree with zero skew while minimizing total wirelength, is presented. The algorithm always yields exact zero skew trees with respect to the appropriate delay model. Experimental results show an 8% to 15% wire length reduction over some previous constructions. The DME algorithm may be applied to either the Elmore or linear delay model, and yields optimal total wirelength for linear delay. DME is a very fast algorithm, running in time linear in the number of synchronizing elements. A unified BB+DME algorithm, which constructs a clock tree topology using a top-down balanced bipartition (BB) approach and then applies DME to that topology, is also presented. The experimental results indicate that both the topology generation and embedding components of the methodology are necessary for effective clock tree construction.

302 citations

Proceedings Article
01 May 1998
TL;DR: To achieve a non-iterative design flow, it is proposed that early synthesis stages should use "wireplanning" to distribute delays over the functional elements and interconnect, and layout synthesis should use its degrees of freedom to realize those delays.
Abstract: A shift is proposed in the design of VLSI circuits. In conventional design, higher levels of synthesis produce a netlist, from which layout synthesis builds a mask specification for manufacturing. Timing analysis is built into a feedback loop to detect timing violations which are then used to update specifications to synthesis. Such iteration is undesirable, and for very high performance designs, infeasible. The problem is likely to become much worse with future generations of technology. To achieve a non-iterative design flow, we propose that early synthesis stages should use "wireplanning" to distribute delays over the functional elements and interconnect, and layout synthesis should use its degrees of freedom to realize those delays. In this paper we attempt to quantify this problem for future technologies and propose some solutions for a "constant delay" methodology.

300 citations