
Thermal Modeling, Analysis, and Management in VLSI Circuits: Principles and Methods

Massoud Pedram, +1 more
- Vol. 94, Iss: 8, pp 1487-1501

Thermal Modeling, Analysis and Management in VLSI
Circuits: Principles and Methods
Massoud Pedram Shahin Nazarian
Dept. of Electrical Engineering, University of Southern California, Los Angeles, CA 90089
pedram@usc.edu shahin@usc.edu
ABSTRACT
The growing packing density and power consumption of
VLSI circuits have made thermal effects one of the most
important concerns of VLSI designers. The increasing
variability of key process parameters in nanometer CMOS
technologies has resulted in larger impact of the substrate
and metal line temperatures on the reliability and
performance of the devices and interconnections. Recent
data shows that more than 50% of all IC failures are
related to thermal issues. This article presents a brief
discussion of key sources of power dissipation and their
temperature relation in CMOS VLSI circuits, and
techniques for full-chip temperature calculation with
special attention to its implications on the design of high-
performance, low-power VLSI circuits. The article is
concluded with an overview of techniques to improve the
full-chip thermal integrity by means of off-chip vs. on-chip
and static vs. adaptive methods.
Keywords
Dynamic power, hot spots, leakage power, on-chip
temperature, thermal gradient
1 INTRODUCTION
"Smaller and faster" are the chief demands driving today's
electronic designs because they generally mean higher
performance. However, they also translate into high power
densities, higher operating temperatures and reduced
reliability. Furthermore, local hot spots, which have much
higher temperatures compared to the average die
temperature, are becoming more prevalent in VLSI circuits.
Elevated temperatures are a major contributor to lower
semiconductor reliability. If heat is not removed at a rate
equal to or greater than its rate of generation, junction
temperatures will rise. Higher junction temperatures reduce
mean time to failure (MTTF) for the devices. Device
reliability has a direct impact on the overall system
reliability. Removing heat from these devices is thus a
major task facing design engineers of modern electronic
systems concerned with improving reliability.
Understanding the effect of heat on the reliability of
electronic products and the integrity of manufacturing
processes is critical if problems are to be avoided. The
need to understand thermal management techniques and the
need for comprehensive data have therefore never
been greater. With passive cooling methods, the chip
temperature is determined by the efficacy of heat transfer
out of the device to the ambient. Equilibrium is achieved
when the heat generation rate matches the heat transfer
rate. The key mechanisms are: thermal conduction, thermal
convection, and thermal radiation. Of the three methods of
heat transfer, radiation is the simplest: it simply needs a
large area with good emissivity to transfer a large amount
of heat to the surroundings. Conduction and radiation can
be implemented with a fully passive heat transfer system,
whereas forced convection is an active method that requires
design overhead.
With component packages becoming more compact
and having smaller physical profiles, it is no longer
sufficient to merely add "a bigger fan" as a downstream fix
for thermal problems. Because heat conduction is playing a
bigger role while heat convection is playing a lesser role in
removing heat, thermal management is best accomplished
when it is incorporated starting at the beginning of the
design cycle. Heat flow must be planned and thermal
resistances minimized. In addition, although worst-case
heating conditions seldom arise in a circuit during its
lifetime, when they do arise, they can cause significant
problems, ranging from circuit transient timing errors to
complete catastrophic burnout. A package designed for the
worst case is excessive. To reduce packaging cost without
unnecessarily limiting performance, the package should be
designed for the worst typical application. Any application
that generates more heat than this cheaper package can
handle should engage an alternative, runtime thermal-
management technique. Since typical high-power
applications still operate 20% or more below the worst case
[1], this can lead to dramatic savings. This is the
philosophy behind the thermal design of the Intel Pentium
4. At the same time, the heat flux, or heat load per unit
area, for state-of-the-art microprocessors is currently at
10-15 W/cm$^2$, which is fast exceeding the limit of air cooling.
Temperature difference in the die is especially
important for sensitive analog circuits where such
differences can easily cause mismatches between signal
levels and bias currents, thus degrading the performance of
the analog circuit and reducing the noise margins.

Emerging circuit fabrics, such as vertically integrated
(3-D) ICs, are significantly impacted by thermal effects.
The 3-D architectures, which provide multiple layers of
active devices together with high-density local
interconnects, offer unique advantages both in terms of
density and circuit performance [2][3]. However, the power
density and temperature of these architectures can be quite
high. Thermal management is thus of critical importance
for the 3-D designs [4][5].
High on-chip temperatures can give rise to timing
failures and reliability concerns. In fact, many of the
electronic circuit failures are caused by or related to
elevated temperatures, sudden spatial or temporal
temperature variations, and presence of hot spots.
Temperature variations across a VLSI chip can result in
significant timing uncertainty, prompting wider timing
margins, and thus, lower circuit performance. Yet another
consideration is that the on-chip temperature gradient (the
difference in temperatures at different parts of the chip,
which is in turn caused by uneven power dissipation) can
produce mechanical stress, which may degrade the chip
reliability.
Leakage power consumption is known to be highly
dependent on the on-chip temperature profile, that is,
higher temperature results in larger power dissipation,
which in turn increases the on-chip temperatures. This can
result in a thermal runaway condition. Consequently, power
reduction and management interact with thermal effect
analysis and control and vice versa.
Dynamic and leakage power are the two main sources
of power consumption in VLSI circuits. In many new high-
performance designs, the leakage component is comparable
to the switching component. Reports indicate that 40% or
even a higher percentage of the total power consumption in
90nm process technology is due to the leakage of
transistors [6]. This percentage is expected to increase with
technology scaling. Simulation results in [7] predict that the
transistor off-state current per micron of transistor width
increases by a factor of 3-5 per generation. As will be
shown in Section 4, for a given package, the die
temperature can be modeled as a linear function of the total
power dissipation of the circuit. At the same time, the
leakage power increases exponentially with temperature.
These facts clearly motivate the need for leakage power
reduction techniques in existing designs.
Figure 1 illustrates the significant increase in leakage
power of a 15mm die fabricated in a 100nm technology
with a supply voltage of 0.7V as a function of substrate
temperature. If the thermal conductance of the package is
not large enough, this exponential dependence will cause
thermal runaway, where the die temperature increases
without bound and the chip fails [8]. Even when thermal
runaway does not occur, the operating temperature of the
chip can become larger than the designed value, which will
either increase the package cost or degrade the performance
as well as the long-term reliability of the chip [9].
Figure 1: Power consumption of a die as a function of
temperature. Courtesy of Vivek De, Intel.
This article focuses on thermal issues and the
techniques that deal with them. More precisely, the first
part of the article provides an overview of the major
sources of power consumption and their relation to the die
temperature. The second part of the article describes full-
chip thermal modeling and electrothermal simulation
techniques. The third part of the article focuses on the impact
of substrate and interconnect temperatures on gate and
interconnect delays. The article is concluded with an
overview of dynamic thermal management strategies for
microprocessor chips.
2 FULL-CHIP TEMPERATURE CALCULATION
Heat is generated in both the substrate and the
interconnections. The major source of heat generation is the
power dissipation of devices that are embedded in the
substrate. Some power dissipation also results from Joule
heating (or self-heating) caused by the flow of current in
the interconnect network. Although interconnect Joule
heating constitutes only a small fraction of the total power
dissipation in the chip, the temperature rise in the
interconnections due to Joule heating can be significant.
This is because interconnects are separated
from the silicon substrate and the heat sink by several
layers of insulating materials, which have lower thermal
conductivities than that of silicon.
Simply stated, the operating temperature of a VLSI
chip can be calculated from the following linear equation:
$$T_{chip} = T_a + R_\theta \frac{P_{tot}}{A} \qquad (1)$$
where $T_{chip}$ is the average chip (silicon junction)
temperature, $T_a$ is the ambient temperature ($T_a$ = 25°C),
$P_{tot}$ (in W) is the total power consumption, $A$ (in cm$^2$)
is the chip area, and $R_\theta$ is the equivalent thermal
resistance of the substrate (Si) layer plus the package and
heat sink (in cm$^2$°C/W). As this equation shows, to calculate
the chip temperature, one must have calculated the power
dissipation of the circuit ($P_{tot}$), constructed the chip
thermal model ($R_\theta$), and be given information about the
environment ($T_a$).
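As a quick numerical illustration of eqn. (1), the following minimal sketch evaluates the average chip temperature. The function name and the sample operating point (80 W, 2 cm$^2$, 0.5 cm$^2$°C/W) are illustrative assumptions, not data from this article.

```python
def chip_temperature(p_tot_w, area_cm2, r_theta_cm2c_per_w, t_ambient_c=25.0):
    """Average chip temperature (deg C) from eqn. (1): T_chip = T_a + R_theta * P_tot / A."""
    if area_cm2 <= 0:
        raise ValueError("chip area must be positive")
    return t_ambient_c + r_theta_cm2c_per_w * p_tot_w / area_cm2


# Hypothetical operating point (assumed values, for illustration only).
t_chip = chip_temperature(p_tot_w=80.0, area_cm2=2.0, r_theta_cm2c_per_w=0.5)
print(f"T_chip = {t_chip:.1f} C")
```

Because the relation is linear, halving the power density or the thermal resistance halves the temperature rise above ambient.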
The self-heating effect can be analyzed as follows
[10]. The metal temperature, $T_{metal}$, is given by
$$T_{metal} = T_{chip} + \Delta T_{self}, \qquad \Delta T_{self} = I_{rms}^2 R_E R_{\theta,self} \qquad (2)$$
where $\Delta T_{self}$ is the temperature rise of the metal
interconnect due to the flow of current, $I_{rms}$ is the rms
current through the interconnect, $R_E$ is the electrical
resistance of the interconnect, and $R_{\theta,self}$ is the
thermal impedance of the interconnect line to the substrate.
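The self-heating relation in eqn. (2) can be checked with a few lines of code. This is only a sketch: the function name and the sample values (2 mA rms, 50 Ω, 10$^5$ °C/W) are assumptions chosen for illustration.

```python
def metal_temperature(t_chip_c, i_rms_a, r_e_ohm, r_theta_self_c_per_w):
    """Metal line temperature (deg C) from eqn. (2).

    delta_T_self = I_rms^2 * R_E * R_theta_self
    T_metal      = T_chip + delta_T_self
    """
    joule_power_w = i_rms_a ** 2 * r_e_ohm      # electrical power dissipated in the line
    delta_t_self = joule_power_w * r_theta_self_c_per_w
    return t_chip_c + delta_t_self


# Assumed values: 2 mA rms through a 50-ohm line with 1e5 C/W to the substrate.
t_metal = metal_temperature(t_chip_c=85.0, i_rms_a=2e-3, r_e_ohm=50.0,
                            r_theta_self_c_per_w=1e5)
print(f"T_metal = {t_metal:.1f} C")
```

The example shows why a line that dissipates well under a milliwatt can still run noticeably hotter than the substrate when its thermal impedance is large.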
2.1 Power Dissipation Sources
Power dissipation in the substrate of a CMOS VLSI
circuit can be calculated as:
$$P_{Total} = P_{Dynamic} + P_{Short\text{-}Circuit} + P_{Static} \qquad (3)$$
where $P_{Dynamic}$ denotes the dynamic power consumption that
occurs when the output signal of a CMOS logic cell makes
a transition; $P_{Short\text{-}Circuit}$ represents the power
dissipation of the circuit when both n- and p-transistors are
simultaneously conducting, creating a direct path between
the power supply and the ground; and $P_{Static}$ is the static
power dissipation that is caused by the static current drawn
from the power supply. This component is mainly due to the
direct gate current and the sub-threshold conduction
current, which are collectively referred to as the leakage
current. Each component of the power dissipation in a
CMOS circuit is discussed in more detail next.
2.1.1 Dynamic Power
Dynamic or switching power is due to the signal switching
activity at the output of a CMOS logic cell. The dynamic
power component dominates during the active mode of the
cell operation. This component is expressed as:
$$P_{Dynamic} = 0.5\, \alpha\, C_{load} V_{DD}^2 f \qquad (4)$$
where $f$ is the clock frequency, $\alpha$ is the expected
number of output transitions in a clock period, and $C_{load}$
is the load capacitance (including gate input and interconnect
capacitances). During the output signal transition, the
output capacitance is charged to $V_{DD}$ or discharged to
ground as follows: while charging, half of the energy
supplied by $V_{DD}$ is stored in the output capacitance and the
other half is dissipated in the pull-up transistors. While
discharging, the stored energy is removed from the
output capacitance and dissipated in the pull-down
transistors.
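A short sketch of eqn. (4) makes the quadratic dependence on the supply voltage concrete. The function name and the cell parameters below (10 fF load, 1.0 V, 1 GHz, 0.2 transitions per cycle) are assumed values for illustration only.

```python
def dynamic_power(alpha, c_load_f, v_dd, f_hz):
    """Dynamic power (W) from eqn. (4): P = 0.5 * alpha * C_load * V_DD^2 * f."""
    return 0.5 * alpha * c_load_f * v_dd ** 2 * f_hz


# Assumed example: 10 fF load, 1.0 V supply, 1 GHz clock,
# 0.2 expected output transitions per clock period.
p_cell = dynamic_power(alpha=0.2, c_load_f=10e-15, v_dd=1.0, f_hz=1e9)
print(f"P_dynamic = {p_cell * 1e6:.2f} uW")

# Scaling V_DD by 0.8 at fixed frequency scales the dynamic power by 0.8^2.
ratio = dynamic_power(0.2, 10e-15, 0.8, 1e9) / p_cell
print(f"power ratio at 0.8 * V_DD: {ratio:.2f}")
```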
2.1.2 Short-Circuit Power
Short-circuit power ($I_{short\text{-}circuit} \times V_{DD}$) is due to direct current
flow from the power supply to the ground. The short-circuit
current, $I_{short\text{-}circuit}$, occurs when pull-up and pull-down
networks are conducting simultaneously. Short-circuit
current is dependent on the duration of the simultaneous
on-times of the pull-up and pull-down networks, the
transistor sizes, and the supply voltage level. In general, the
short-circuit power of a cell is minimized if the output
transition time is larger than its input transition time. The
derivation of an exact formula for the short-circuit power is
a complicated task; however, simple closed-form
expressions have been proposed by making simplifying
assumptions and/or considering special cases [11]-[14].
The short-circuit current is found to depend on the carrier
mobility and threshold voltage of the transistors, both of
which vary with temperature.
2.1.3 Static Power
Although static power also arises in circuits that have
constant sources of current between the power supplies,
leakage currents are the major source of static power
dissipation. Although there are several sources of
leakage current in a CMOS circuit, the three dominant ones
are (cf. Figure 2):
1. Reverse-biased junction leakage current ($I_{REV}$)
2. Gate direct tunneling leakage ($I_G$)
3. Subthreshold (weak inversion) leakage ($I_{SUB}$)
Figure 2: Leakage current components in an NMOS transistor.
[Diagram labels: gate, source, drain, insulator, halo, N
regions, P substrate; currents $I_G$, $I_{REV}$, $I_{SUB}$, $I_{GIDL}$.]
$I_{REV}$ flows from the source or drain to the substrate
through the reverse-biased diodes when a transistor is off.
The magnitude of $I_{REV}$ depends on the area of the drain
diffusion and the leakage current density, which is in turn
determined by the doping concentration. If both n and p
regions are heavily doped, band-to-band tunneling
dominates the pn junction leakage [15]. Junction leakage
has a rather high temperature dependency, as much as
50-100x/100°C; however, this is generally significant only in
circuits designed to operate at high temperatures greater
than 150°C. Junction reverse-bias leakage components
from both the source-drain diodes and the well diodes are
generally negligible with respect to the other leakage
sources.
Gate direct tunneling leakage, $I_G$, flows from the gate
through the insulator and to the substrate. In oxide layers
thicker than 3-4nm, $I_G$ is due to the Fowler-Nordheim
tunneling of electrons into the conduction band of the oxide
layer under a high applied electric field across the oxide
layer. In technology nodes of 0.15μm and below, which have
thinner oxides, direct tunneling through the
silicon oxide layer is the leading effect. $I_G$ of a p-transistor
is typically one order of magnitude smaller than that of an
n-transistor with identical gate oxide thickness, $T_{ox}$, when
SiO$_2$ is used as the gate dielectric. The magnitude of $I_G$
increases exponentially with decreasing $T_{ox}$ and increasing
$V_{DD}$. For example, for relatively thin oxide thicknesses on
the order of 2-3nm, at $V_{GS}$=1V, every 0.2nm reduction in
$T_{ox}$ causes a 10-fold increase in $I_G$ [16]. The temperature
dependency of $I_G$ is quite weak, i.e., only 2x/100°C.
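The oxide-thickness rule of thumb quoted above (a tenfold increase in $I_G$ for every 0.2nm reduction in $T_{ox}$) can be written as a small scaling helper. This is an extrapolation sketch only: treating the trend as a pure exponential over the whole range is an assumption, and the function name and sample thicknesses are hypothetical.

```python
def gate_leakage_ratio(tox_ref_nm, tox_new_nm, nm_per_decade=0.2):
    """Ratio I_G(new) / I_G(ref), assuming 10x per 0.2 nm of oxide thinning."""
    thinning_nm = tox_ref_nm - tox_new_nm
    return 10.0 ** (thinning_nm / nm_per_decade)


# Thinning the oxide from 2.6 nm to 2.2 nm (two 0.2 nm steps) gives roughly 100x.
print(f"{gate_leakage_ratio(2.6, 2.2):.1f}")
```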
Subthreshold leakage, $I_{SUB}$, is the drain-source current
of a transistor operating in the weak inversion region.
Unlike the strong inversion region in which the drift current
dominates, the subthreshold conduction is due to the
diffusion current of the minority carriers in the channel. In
current CMOS technologies, $I_{SUB}$ is much larger than the
other leakage current components [17]. This is mainly
because of the relatively low $V_T$ in modern CMOS devices.
According to the BSIM3v3.2 MOSFET transistor model
[18], the subthreshold drain current $I_{SUB}$ of a transistor in
the normal “off” state, $V_{ds}$ = $V_{DD}$ and $V_{gs}$ = 0, is expressed
by the following equation:
$$I_{SUB} = k_{tech} \left(\frac{W}{L}\right) 10^{-V_T/S} \qquad (5)$$
where $k_{tech}$ is a transistor geometry and CMOS technology
dependent parameter, $W$ and $L$ denote the transistor width
and length, $V_T$ denotes the threshold voltage of the device,
and $S$, which is called the subthreshold swing parameter, is
equal to the threshold voltage decrease required to
increase $I_{SUB}$ by a factor of 10. In fact, $S = 2.3\,n k_B T/q$, where
$n \geq 1$ is a device-dependent parameter, $k_B$ is Boltzmann's
constant, $T$ denotes the absolute temperature in kelvin,
and $q$ is the electron charge.
It is desirable to have as small an $S$ value as possible
since this is the parameter that determines the amount of
voltage swing necessary to switch a MOSFET from the off to
the on state. (Typical values of $S$ for bulk CMOS devices
are 70-90 mV/decade.) To minimize $S$, the thinnest
possible gate oxide, to increase the gate oxide capacitance
$C_{ox}$, and the lowest possible doping concentration in the
channel, to decrease the depletion capacitance $C_{dep}$, should
be used. Higher temperatures increase $S$, which in turn
increases the off-state leakage current.
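The dependence of the subthreshold current on $S$ and temperature can be sketched numerically from eqn. (5) and $S = 2.3\,n k_B T/q$. The sketch below assumes $n$ = 1.3, a temperature-independent $k_{tech}$, and a fixed $V_T$; these are simplifying assumptions (in a real device $V_T$ and $k_{tech}$ also change with temperature), and all names and values are hypothetical.

```python
K_B = 1.380649e-23      # Boltzmann constant, J/K
Q_E = 1.602176634e-19   # electron charge, C


def subthreshold_swing(temp_k, n=1.3):
    """S = 2.3 * n * k_B * T / q, in volts per decade. n = 1.3 is an assumed value."""
    return 2.3 * n * K_B * temp_k / Q_E


def subthreshold_current(k_tech, w, l, v_t, temp_k, n=1.3):
    """Eqn. (5): I_SUB = k_tech * (W / L) * 10^(-V_T / S)."""
    s = subthreshold_swing(temp_k, n)
    return k_tech * (w / l) * 10.0 ** (-v_t / s)


s_300 = subthreshold_swing(300.0)
print(f"S at 300 K: {s_300 * 1e3:.1f} mV/decade")

# Same device at two temperatures with V_T and k_tech held fixed (an assumption).
i_cold = subthreshold_current(k_tech=1e-6, w=1.0, l=0.1, v_t=0.3, temp_k=300.0)
i_hot = subthreshold_current(k_tech=1e-6, w=1.0, l=0.1, v_t=0.3, temp_k=400.0)
print(f"I_SUB(400 K) / I_SUB(300 K) = {i_hot / i_cold:.1f}")
```

With the assumed $n$, the computed swing at 300 K falls inside the 70-90 mV/decade range quoted above for bulk CMOS devices.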
$I_{SUB}$ is a function of temperature, threshold voltage,
device size, and the process parameters, of which the
threshold voltage ($V_T$) is dominant. The subthreshold
leakage current increases rapidly with temperature. This is
shown in Figure 3, which illustrates the leakage current
versus temperature for several technology nodes. $I_{SUB}$ has a
temperature sensitivity of 8-12x/100°C. The data also
confirm that the leakage power increases as the
technology moves forward.
Figure 3: $I_{SUB}$ ($V_{GS}$=0) trend as a function of temperature.
Courtesy of Vivek De, Intel.
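The three temperature sensitivities quoted in this section (50-100x, 2x, and 8-12x per 100°C) can be compared for a smaller temperature rise by reading "k x/100°C" as an exponential law. That reading, the use of range midpoints, and the helper name are assumptions made for this sketch.

```python
def leakage_growth(factor_per_100c, delta_temp_c):
    """Growth factor for a component quoted as 'k x/100 C', extrapolated exponentially."""
    return factor_per_100c ** (delta_temp_c / 100.0)


# Sensitivities quoted in the text; range midpoints are an assumption made here.
sensitivities = {"I_REV": 75.0, "I_G": 2.0, "I_SUB": 10.0}

for name, k in sensitivities.items():
    print(f"{name}: x{leakage_growth(k, 30.0):.2f} for a 30 C rise")
```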
It is seen that each component of power consumption
is a function of temperature. Reduction of the supply
voltage reduces the chip total power consumption, which in
turn reduces the chip temperature. As the chip temperature
is reduced, the leakage power is reduced dramatically.
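The feedback between power and temperature (power raises temperature through eqn. (1), and temperature raises leakage) can be explored with a simple fixed-point iteration. The sketch below is not the simulation method of this article; it assumes that leakage power grows by a constant factor per 100°C, that dynamic power is temperature independent, and that the package is described by a single lumped resistance. All numerical values are hypothetical.

```python
def settle_temperature(p_dyn_w, p_leak_ref_w, r_theta_c_per_w, t_ambient_c=25.0,
                       t_ref_c=25.0, factor_per_100c=10.0,
                       t_limit_c=250.0, tol_c=1e-6, max_iter=1000):
    """Iterate T = T_a + R * (P_dyn + P_leak(T)) until it converges.

    Returns the steady-state temperature in deg C, or None if the
    temperature passes t_limit_c (treated here as thermal runaway).
    r_theta_c_per_w is the lumped resistance R_theta / A from eqn. (1).
    """
    temp = t_ambient_c
    for _ in range(max_iter):
        p_leak = p_leak_ref_w * factor_per_100c ** ((temp - t_ref_c) / 100.0)
        new_temp = t_ambient_c + r_theta_c_per_w * (p_dyn_w + p_leak)
        if new_temp > t_limit_c:
            return None
        if abs(new_temp - temp) < tol_c:
            return new_temp
        temp = new_temp
    return None


good_package = settle_temperature(p_dyn_w=40.0, p_leak_ref_w=5.0, r_theta_c_per_w=0.5)
poor_package = settle_temperature(p_dyn_w=40.0, p_leak_ref_w=5.0, r_theta_c_per_w=1.5)
print(good_package, poor_package)
```

With the assumed numbers, the lower-resistance package settles at a stable temperature, whereas the higher-resistance package has no stable operating point and the iteration reports runaway.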
2.2 Full-Chip Thermal Modeling
Key to successful thermal management is the ability to
obtain comprehensive and accurate temperature data under
realistic operating conditions. The commonly used
method of gathering this temperature data by using point
contact methods (thermocouples) is limited by the large
number of points to be monitored and the small size of the
components. Connecting tens or hundreds of
thermocouples is very time consuming. Infrared (IR)
thermal imaging is a new technique which addresses these
issues by providing comprehensive two-dimensional maps
of thousands of temperatures in a matter of seconds. This is
accomplished without the need to make contact with the
components. This approach is however expensive and time-
consuming and can only be applied post-design. It is thus
important to have full-chip thermal models and simulation
tools that can provide the temperature profile of the die.
The heat diffusion equation is in general used to
describe the heat conduction in a chip and calculate the
temperature profile [19]:
$$\rho c_p \frac{\partial T(\vec{r},t)}{\partial t} = \nabla \cdot [k(\vec{r},T) \nabla T(\vec{r},t)] + g(\vec{r},t) \qquad (6)$$
which is subject to the general thermal convection
boundary condition:
$$k(\vec{r},T) \frac{\partial T(\vec{r},t)}{\partial n_i} = h_i (T_a - T(\vec{r},t)) \qquad (7)$$
In the above equations, $T$ is the temperature (°C), $k$ is
the thermal conductivity (W/(m°C)), $\rho$ is the density of
the material (kg/m$^3$), $c_p$ is the specific heat (J/(kg°C)), $g$ is
the power density of the heat sources (W/m$^3$), and $h_i$ is the
heat transfer coefficient in the direction of heat flow $\vec{i}$ on
the boundary surface of the chip (W/(m$^2$°C)). Note that
$h_i = 1/(A_i R_{\theta,i})$, where $A_i$ is the effective area normal to
$\vec{i}$ and $R_{\theta,i}$ denotes the equivalent thermal resistance.
$\partial/\partial n_i$ is the differentiation operator along the outward
direction normal to the boundary surface and $T_a$ is the
ambient temperature.
The term $\nabla \cdot [k(\vec{r},T) \nabla T(\vec{r},t)]$ in eqn. (6) can be
replaced by $k(T) \nabla^2 T(\vec{r},t)$ for homogeneous materials,
resulting in a second order parabolic partial differential
equation:
$$\rho c_p \frac{\partial T(\vec{r},t)}{\partial t} = k(T) \left( \frac{\partial^2 T(\vec{r},t)}{\partial x^2} + \frac{\partial^2 T(\vec{r},t)}{\partial y^2} + \frac{\partial^2 T(\vec{r},t)}{\partial z^2} \right) + g(\vec{r},t) \qquad (8)$$
The heat flow described by this differential equation has a
similar form to that for electrical current, and there is a well
known duality between them. The heat flow (W) passing
through a thermal resistor (°C/W) is equivalent to the
electrical current (Ampere) through an electrical resistance
(Ohm), and the temperature difference (°C) corresponds to
voltage difference (Volt). Likewise, the thermal
capacitance (J/°C), in which heat is absorbed, is the
equivalent of the electrical capacitance (Farad). More precisely,
$C_\theta = \rho c_p \Delta H$, which is the thermal capacitance, is
modeled by an electrical capacitance, whereas $g \Delta H$, which
is the heat flow coming from power generated by logic
cells in the control volume $\Delta H = \Delta x \Delta y \Delta z$, is modeled by an
electrical current source, $i_p$. Various thermal resistances,
which are defined in the x, y and z directions with values
inversely proportional to $k$ and determined by the cell
dimensions, are replaced by the corresponding electrical
resistances, $R_E$. Ambient temperature is expressed using an
independent voltage source, $v_0$. Node temperatures will then
correspond to node voltages in the electrical network
constructed in this way (cf. Figure 4).
$$C \frac{dv}{dt} = i_p + \frac{v_0 - v}{R_E}$$
Figure 4: Simple electrical model of heat flux and temperature
with the corresponding differential equation for a single heat
source on the chip. [Schematic labels: $v$, $R$, $C$, $v_0$, $i_p$.]
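The single-node model of Figure 4 can be integrated directly. The following sketch uses forward-Euler time stepping and compares the result with the closed-form step response of a first-order RC circuit; the integration scheme, function names, and element values are assumptions chosen for illustration.

```python
import math


def simulate_single_node(i_p, r_e, c, v0, t_end, dt):
    """Forward-Euler integration of C dv/dt = i_p + (v0 - v) / R_E, starting at v = v0."""
    v = v0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        dv_dt = (i_p + (v0 - v) / r_e) / c
        v += dv_dt * dt
    return v


def analytic_single_node(i_p, r_e, c, v0, t):
    """Closed-form step response: v(t) = v0 + i_p * R_E * (1 - exp(-t / (R_E * C)))."""
    return v0 + i_p * r_e * (1.0 - math.exp(-t / (r_e * c)))


# Assumed values: 50 W block, 0.8 C/W to ambient, 0.05 J/C capacitance, 25 C ambient.
params = dict(i_p=50.0, r_e=0.8, c=0.05, v0=25.0)
tau = params["r_e"] * params["c"]          # thermal time constant R_E * C, in seconds
v_num = simulate_single_node(t_end=5 * tau, dt=tau / 1000, **params)
v_ref = analytic_single_node(t=5 * tau, **params)
print(f"numerical {v_num:.3f} C, analytic {v_ref:.3f} C")
```

The node voltage plays the role of the block temperature: it starts at the ambient value and approaches $v_0 + i_p R_E$ with time constant $R_E C$.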
Now consider the general case of multiple heat sources
in the substrate connected via thermal resistances to each
other and to the ambient environment as depicted in Figure
5. Here, node $n_i$ represents a circuit block (logic cell or
groups of logic cells depending on the granularity of the
thermal model). The power consumption of each circuit
block is represented as a current source, $i_k$, associated with
the corresponding node. Between neighboring nodes, a
thermal resistance, $r_{ij}$, is added to model the lateral heat
conduction path. Thermal resistances, $r_{i0}$, are also added
between nodes and the ambient voltage terminal to
capture the vertical component of the thermal resistance
between the circuit block and the ambient (including the
effects of the substrate, package and heat sink). A thermal
capacitor, $c_{i0}$, at each node is included to model the heat
absorption and storage in the substrate and thereby derive
the chip temperature evolution over time. The ambient
temperature is modeled as an independent fixed voltage
source, $v_0$.
Figure 5: Equivalent circuit model for temperature
distribution, accounting for both lateral and vertical heat flux
and heat absorption.
One may use DC (or AC) analysis functions available
in SPICE-type circuit simulators to calculate chip
temperature [20][21]. Notice that it is critical to consider
the effect of metal interconnect on heat distribution since
the metal interconnects tend to provide a low thermal
impedance path for heat flux among various parts of the
substrate.
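For a small network of the kind shown in Figure 5, the DC (steady-state) solution can be obtained with plain nodal analysis, which is what a SPICE-type DC analysis computes. The sketch below is a hand-written stand-in for such a simulator, not the tool flow of [20][21]; the three-block layout, resistance values, and function names are assumptions.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting (small dense systems)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def steady_state_temperatures(power_w, r_vertical, r_lateral, t_ambient_c=25.0):
    """Nodal analysis of the resistive network with the capacitors open (DC).

    power_w[i]       : heat injected at node i (the current source)
    r_vertical[i]    : resistance r_i0 from node i to ambient
    r_lateral[(i, j)]: resistance r_ij between neighboring nodes i and j
    """
    n = len(power_w)
    g = [[0.0] * n for _ in range(n)]
    rhs = [power_w[i] + t_ambient_c / r_vertical[i] for i in range(n)]
    for i in range(n):
        g[i][i] += 1.0 / r_vertical[i]
    for (i, j), r in r_lateral.items():
        g[i][i] += 1.0 / r
        g[j][j] += 1.0 / r
        g[i][j] -= 1.0 / r
        g[j][i] -= 1.0 / r
    return solve_linear(g, rhs)


# Three blocks in a row; the middle one is a hot spot (all values assumed).
temps = steady_state_temperatures(
    power_w=[2.0, 10.0, 2.0],
    r_vertical=[4.0, 4.0, 4.0],
    r_lateral={(0, 1): 2.0, (1, 2): 2.0},
)
print([round(t, 2) for t in temps])
```

A useful sanity check is energy balance: the heat leaving through the vertical resistances must equal the total injected power.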
Figure 6: An interconnect line passing over the substrate,
separated by an insulation layer. [Diagram labels: substrate
($T_{sub}$), insulator ($k_{ins}$, $t_{ins}$), interconnect ($k_m$, $t_m$), $w$, $L$.]
The 1-D heat diffusion equation in a metal
interconnect under steady state can be written as:
$$\frac{d^2 T(x)}{dx^2} = -\frac{g(x,T)}{k_m} \qquad (9)$$
where $g(x,T)$ is the temperature-dependent power density
of heat sources (W/m$^3$) at $x$ and $k_m$ is the thermal
conductivity of the material (W/(m°C)). For the
interconnects, $g(x,T) = J_{rms}^2 \rho(x,T)$, where $J_{rms}$ is the rms
current density (A/m$^2$) and $\rho(x,T)$ is the temperature-
dependent metal resistivity at $x$ (Ωm).
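Eqn. (9) can be solved numerically with central differences once the temperatures at the two ends of the line are given. The sketch below assumes a uniform, temperature-independent $g$ (a simplification of $g(x,T)$) and solves the resulting tridiagonal system with the Thomas algorithm; the line length, current density, resistivity, and conductivity are assumed values.

```python
def interconnect_profile(length_m, n_points, g_w_per_m3, k_m, t_left_c, t_right_c):
    """Solve d2T/dx2 = -g / k_m with fixed end temperatures by central differences.

    Assumes a uniform, temperature-independent g and n_points >= 3.
    Returns the temperatures at n_points equally spaced positions.
    """
    h = length_m / (n_points - 1)
    n = n_points - 2                      # number of interior unknowns
    # Tridiagonal system: -T[i-1] + 2 T[i] - T[i+1] = h^2 * g / k_m
    diag = [2.0] * n
    rhs = [h * h * g_w_per_m3 / k_m] * n
    rhs[0] += t_left_c
    rhs[-1] += t_right_c
    # Thomas algorithm, forward elimination (both off-diagonals are -1).
    for i in range(1, n):
        w = -1.0 / diag[i - 1]
        diag[i] -= w * -1.0
        rhs[i] -= w * rhs[i - 1]
    # Back substitution.
    interior = [0.0] * n
    interior[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        interior[i] = (rhs[i] + interior[i + 1]) / diag[i]
    return [t_left_c] + interior + [t_right_c]


j_rms = 1e10          # A/m^2 (assumed)
rho = 2e-8            # ohm*m (assumed, temperature dependence ignored)
profile = interconnect_profile(100e-6, 101, j_rms ** 2 * rho, 400.0, 100.0, 100.0)
print(f"peak temperature: {max(profile):.2f} C")
```

For uniform $g$ and equal end temperatures the exact solution is a parabola with a peak rise of $g L^2 / (8 k_m)$ at the midpoint, which gives a direct check on the numerical result.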
