
Dynamic thermal management for FinFET-based circuits exploiting the temperature effect inversion phenomenon

11 Aug 2014-pp 105-110
TL;DR: Experimental results demonstrate 40% energy saving can be achieved by the proposed TEI-aware DTM approach compared to the best-in-class DTMs that are unaware of this phenomenon.
Abstract: Due to limits on the availability of the energy source in many mobile user platforms (ranging from handheld devices to portable electronics to deeply embedded devices) and concerns about how much heat can effectively be removed from chips, minimizing the power consumption has become a primary driver for system-on-chip designers. Because of their superb characteristics, FinFETs have emerged as a promising replacement for planar CMOS devices in sub-20nm CMOS technology nodes. However, based on extensive simulations, we have observed that the delay vs. temperature characteristics of FinFET-based circuits are fundamentally different from that of the conventional bulk CMOS circuits, i.e., the delay of a FinFET circuit decreases with increasing temperature even in the super-threshold supply voltage regime. Unfortunately, the leakage power dissipation of the FinFET-based circuits increases exponentially with the temperature. These two trends give rise to a tradeoff between delay and leakage power as a function of the chip temperature, and hence, lead to the definition of an optimum chip temperature operating point (i.e., one that balances concerns about the circuit speed and power efficiency.) This paper presents the results of our investigations into the aforesaid temperature effect inversion (TEI) and proposes a novel dynamic thermal management (DTM) algorithm, which exploits this phenomenon to minimize the energy consumption of FinFET-based circuits without any appreciable performance penalty. Experimental results demonstrate 40% energy saving (with no performance penalty) can be achieved by the proposed TEI-aware DTM approach compared to the best-in-class DTMs that are unaware of this phenomenon.

Summary (3 min read)

1. INTRODUCTION

  • With the dramatic downscaling of layout geometries, the traditional bulk CMOS technology has hit critical roadblocks, namely increasing leakage current and power consumption induced by the short-channel effects (SCEs) and the increasing variability levels.
  • This is an important point because the delay versus temperature behavior of FinFET devices and circuits is different from that of the conventional bulk CMOS devices operating in the super-threshold regime.
  • In the near/subthreshold regime [12, 13] or in high-vt devices [14], it has been reported that the delay of these circuits decreases with increasing temperature.
  • The authors effectively find the optimal temperature point to maximize energy efficiency of the circuits, and introduce new voltage scaling policies to make the circuits operate at the optimal point.

2. TEMPERATURE EFFECT INVERSION (TEI) PHENOMENON IN FinFETs

  • For VLSI circuits, the delay of a logic gate is directly affected by the driving current (Ion).
  • S, µ, and Vth are the temperature dependent parameters.
  • Meanwhile, conventional MOSFETs operating in the sub/near-threshold regime or high-vt devices have shown a similar phenomenon (indeed, more significant than what was observed in FinFETs with super-threshold Vdd): the circuit delay decreases with increasing temperature [12, 13, 14].
  • As a consequence, unlike in the super-threshold regime, where the slightly stronger effect of µ than that of Vth causes Ion to decrease with increasing T, the changes of Vth and S considerably increase Ion in the sub/near-threshold regime, and thus the gate can run much faster.
  • Ion of FinFETs operating in the sub/near-threshold regime also has the same exponential dependency on Vth and S. Combined with the tensile stress effect, FinFETs in the sub/near-threshold regime exhibit a significant delay reduction as the temperature rises.

3. POWER AND THERMAL MODELS

  • The power consumption of VLSI circuits has two components: a dynamic part and static part.
  • The authors use the conventional RC-circuit thermal model, which is shown in Figure 4 (a) [19].
  • Due to the strong dependence of Pstatic on Tdie and Vdd from (2), the amount of differences between the two Pcircuit levels from the high Vdd and low Vdd , which is indicated by the arrows in Figure 4 (b), increases super-linearly with increasing Tdie and Vdd .
  • Hence, for some high Vdd levels, it is possible that the corresponding Teq’s exceed the die temperature limit (e.g., 90°C), or such Teq’s do not exist at all.
  • The details will be explained in Section 5.

4.1 Influence of TEI on energy consumption

  • Due to the TEI phenomenon, the worst-case delays occur at the low temperature in FinFET circuits.
  • Therefore, for a given target clock frequency, the corresponding voltage level of the circuit should be set according to the worst-case circuit delay, which occurs at the lowest die temperatures.
  • Lowering the voltage level right after the rising temperature reaches Tth leads to two possible cases: (Case I) Tdie keeps increasing, or (Case II) Tdie begins decreasing.
  • Unlike Figure 5 (b) and (c), each of which considers only two available voltage levels, in reality there can be more than two available voltage levels that can meet the scheduled frequency condition in the whole temperature range.

4.2 Energy optimization

  • Given the deadline specification of a task, the required (minimum) operating frequency ftarget and corresponding base voltage level Vbase can be determined in order to finish task execution by the deadline.
  • Conventional DTMs try not to exceed the temperature limit Tlimit by lowering the frequency or stopping execution, which incurs performance penalties.
  • Therefore, the authors propose a policy as: Policy I: Check if there exists a k such that $T_{eq}^{V_k}$ ∈.
  • If k exists, the optimal voltage level is Vk and the optimal and stable temperature is $T_{eq}^{V_k}$.
  • Tdie keeps increasing in all the regions until the region i with $T_{eq}^{V_i}$ lower than $T_{th}^{V_i}$.

5. EXPERIMENTAL WORK

  • The authors validated their proposed DTM with various FinFET-based circuits, namely, a 50 FO4 inverter chain, a 16-bit carry-select adder, a 16-bit multiplier, and a 16-bit comparator, based on 10nm, 14nm, 16nm, and 20nm PTM-MG bulk FinFET libraries.
  • The delays were obtained from the worst case inputs of the circuits.
  • The authors found the scaling factor s, such that multiplying s to the power data from the 20nm based inverter chain makes the temperature increase of the circuit (working with 0.7V) follow the same trend of ARM Cortex-A8.
  • The authors defined Gain = (saved energy with the proposed DTM / energy consumption with the conventional DTM) × 100 (%).
  • The authors also determined the simulation conditions as follows: (i) the base Vdd in the simulation is assumed to be the minimum voltage level, that the circuit controlled by the conventional DTM can finish a given task with the base Vdd before the temperature exceeds 90°C or its Teq, and (ii) the circuit starts the operation at the ambient temperature (25°C).

6. CONCLUSION

  • This paper started by presenting a key observation of TEI phenomenon that the delay of a FinFET gate decreases with increasing die temperature both in the near and super-threshold voltage regimes, which is different from that exhibited by planar CMOS devices operating at the super-threshold Vdd .
  • Next it introduced the TEI-aware DTM algorithm to minimize the energy consumption of FinFET-based circuits without any appreciable performance penalty.
  • More precisely, instead of choosing the smallest possible voltage to complete a task within its specified deadline, the proposed DTM algorithm dynamically adjusts the supply voltage of the chip so as to maintain the chip temperature at or near its optimum operation point.
  • Experimental results showed 40% energy saving (with no performance penalty) can be achieved by the proposed TEI-aware DTM approach compared to the best-in-class DTMs that are unaware of this phenomenon.


Dynamic Thermal Management for FinFET-Based Circuits
Exploiting the Temperature Effect Inversion Phenomenon
Woojoo Lee, Yanzhi Wang, Tiansong Cui, Shahin Nazarian and Massoud Pedram
University of Southern California, CA, USA
{woojoole, yanzhiwa, tcui, snazaria, pedram}@usc.edu
1. INTRODUCTION
With the dramatic downscaling of layout geometries, the tradi-
tional bulk CMOS technology has hit critical roadblocks, namely
increasing leakage current and power consumption induced by the
short-channel effects (SCEs) and the increasing variability levels.
To overcome such drawbacks, FinFET devices, a special kind of
quasi-planar double gate (DG) devices, have been proposed as an
alternative for the bulk CMOS as technology scales down below the
20nm technology node [1, 2]. This is due to more effective channel
control, higher ON/OFF current ratios, and superior voltage scala-
bility features of FinFET devices.
DVFS (Dynamic Voltage and Frequency Scaling) is a well-known
technique for minimizing power in VLSI designs by reducing the
supply voltage and clock frequency to the minimum values that
are needed to meet a given performance level. Indeed, a number
ISLPED’14, August 11–13, 2014, La Jolla, CA, USA.
Copyright 2014 ACM 978-1-4503-2975-0/14/08 ...$15.00.
of recent studies of ultra-voltage scaled designs (i.e., circuits that
operate at near/sub-threshold supply voltage levels) have proven
the value of voltage scaling to very low supply voltage levels esp.
when the performance targets are loose [3, 4]. The wide-range volt-
age scalability of FinFET devices enables them to outperform bulk
CMOS devices in ultra-low power designs [5].
Meanwhile, as power density has continued to increase with the
technology scaling, the accompanying high rate of heat generation
has become a growing concern. The leakage current of a circuit in-
creases exponentially with the increasing temperature [6] and this
positive feedback mechanism between leakage power and temper-
ature can result in a thermal runaway situation. Dynamic thermal
management (DTM) has been proposed as an effective technique
to control the over-heating of the circuit by maintaining the cir-
cuit temperature below a critical temperature threshold, while af-
fecting circuit performance as little as possible. Several DTM re-
sponse mechanisms (control knobs) e.g., fetch-toggling, dynamic
thread migration, frequency throttling and DVFS, have been
introduced [7, 8, 9]. A few researchers have focused on developing
resource management, task assignment, and scheduling policies to
achieve the highest performance [6, 10] or the minimum energy
consumption [11] under the condition that the target system
hardware remains temperature-safe.
The previous DTM works have tackled the question of how to
limit the peak temperature on circuit substrates comprised of pla-
nar CMOS devices running in the super-threshold voltage regime
to save power or maximize performance. To the best of our knowl-
edge no previous work has studied the question of optimal DTM
policy design for FinFET-based VLSI circuits that can operate in
any of the super, near or sub-threshold regimes. This is an impor-
tant point because the delay versus temperature behavior of FinFET
devices and circuits is different from that of the conventional bulk
CMOS devices operating in the super-threshold regime.
For commercial bulk CMOS standard cell libraries operating at
super-threshold $V_{dd}$ supply voltages, the worst-case (longest) path
delay occurs at the highest temperature. However, in the
near/sub-threshold regime [12, 13] or in high-vt devices [14], it has been
reported that the delay of these circuits decreases with increasing
temperature. On the other hand, for various circuits designed using
the PTM-MG FinFET libraries under 20nm bulk CMOS technology [15],
a first observation from our SPICE simulations is that
the circuits run faster at higher temperatures in all supply voltage
regimes (including the super-threshold one). We call this
the Temperature Effect Inversion (TEI) phenomenon. A second
observation is that, in the near/sub-threshold regimes, the delay
decrease for a fixed amount of die temperature increase is larger in
FinFET-based designs than in planar CMOS based designs.
This paper starts from exploring the delay vs. temperature be-
havior of FinFET-based designs, which forces the worst-case delay
of these circuits to occur at low temperatures (e.g., -25°C). Our
objective is to minimize the circuit energy consumption without
any performance penalty. Given a DVFS schedule derived from
the worst-case (at, say, -25°C) delay at various voltage levels, the
motivation is to scale down the voltage level when the circuit
temperature is high enough that the delay at the lower voltage
level is no larger than the worst-case delay at the original
higher voltage level. This method can achieve significant energy
reduction without performance penalty for the following three
reasons: (i) lowering the voltage level quadratically reduces
the dynamic energy of the circuit and also reduces the leakage
energy/power, (ii) lowering the voltage level may slow
the temperature rise, or may even reduce the temperature
in the presence of a heatsink (e.g., the ambient environment for mobile
devices), which exponentially reduces the leakage power, and (iii)
the operating frequency determined by the worst-case delay of the
higher original voltage can be maintained after the voltage scaling.
Based on in-depth studies of the influence of the TEI on the
energy consumption of the FinFET circuits and the key idea de-
scribed above, we present a novel DVFS-based thermal manage-
ment method to minimize energy consumption with no performance
loss. In this proposed DTM, we effectively find the optimal tem-
perature point to maximize energy efficiency of the circuits, and
introduce new voltage scaling policies to make the circuits operate
at the optimal point.
Along with a detailed description of our experimental work, we
validate the proposed thermal management algorithm on the four
different FinFET circuits designed based on various PTM-MG tech-
nology libraries. We perform SPICE simulations on each circuit
with various voltage levels in the full (possible) operating tem-
perature range. Experimental results demonstrate some 40% en-
ergy saving (with no performance penalty) can be achieved by the
proposed TEI-aware DTM approach compared to the best-in-class
DTMs that are unaware of this phenomenon.
2. TEMPERATURE EFFECT INVERSION
(TEI) PHENOMENON IN FinFETs
For VLSI circuits, the delay of a logic gate is directly affected by
the driving current ($I_{on}$). As $I_{on}$ increases, the logic gate switches
faster, and vice versa. For a conventional MOSFET operating at
super-threshold $V_{dd}$ (e.g., 0.9 V), it is well known that rising
temperature reduces $I_{on}$ and thus degrades circuit speed.
That is why the worst-case timing corner for
commercial MOSFET standard cell libraries at super-threshold
$V_{dd}$ occurs at the highest temperature (e.g., 125°C).
It has been reported that fabricated FinFETs operating at
super-threshold $V_{dd}$ show the opposite behavior to MOSFETs, i.e., $I_{on}$
increases as the die temperature rises [16]. Some FinFET-based
circuits based on 32nm PTM have shown a similar result when
operating at super-threshold $V_{dd}$ [17]. Reference [18] analyzed this
opposite temperature influence on $I_{on}$, illustrating that this effect
results from the bandgap narrowing and carrier mobility changes,
which are induced by the tensile stress effect of the insulator in the
FinFET structure. As technology scales down (e.g., beyond 30nm),
the tensile stress from the insulator layer to the fin body (cf. Figure 2)
affects the device characteristics more significantly. In other
words, because the thinner fin body has larger stress, the
stress-induced bandgap narrowing results in a more significant decrease
of the threshold voltage $V_{th}$. With increasing temperature,
the tensile stress becomes larger, which decreases $V_{th}$ and
induces a slight change of the carrier mobility $\mu$ for FinFETs.
[Figure 2 labels: Si substrate; SiO2 (inside: Si fin body)]
Figure 2: Three-dimensional structure of the bulk FinFET
Finally, the changes of $V_{th}$ and $\mu$ directly affect $I_{on}$ of FinFETs
in the super-threshold operation regime. Generally, $I_{on}(T)$ as
a function of the temperature $T$ can be expressed as:
$$
I_{on}(T) =
\begin{cases}
\mu(T)\, e^{\frac{V_{gs} - V_{th}(T)}{S(T)}} & \text{if } V_{gs} < V_{th} \\
\mu(T)\, \left(V_{gs} - V_{th}(T)\right)^{\beta} & \text{otherwise,}
\end{cases}
\tag{1}
$$
where $V_{gs}$ is the gate-source voltage, $S$ is the subthreshold swing, and $\beta$
is the velocity saturation effect factor. $S$, $\mu$, and $V_{th}$ are the
temperature-dependent parameters. Due to the tensile stress with rising $T$,
the decreasing $V_{th}$, along with a slight change of $\mu$, results in an
increasing $I_{on}$, thereby decreasing the delay of the logic gate.
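The qualitative behavior of (1) can be sketched numerically. The following is a minimal illustration only: the linear temperature models for $V_{th}$, $\mu$ and $S$, the value of $\beta$, and every coefficient are hypothetical placeholders chosen to mimic the trends described above (falling $V_{th}$, slightly changing $\mu$, growing $S$), not parameters of any PTM-MG library. The two branches of (1) are also not matched at $V_{gs} = V_{th}$, so only trends within one regime are meaningful.

```python
import math

# Hypothetical temperature models (illustrative numbers, not fitted to any device).
def v_th(temp_c):
    return 0.30 - 0.0008 * (temp_c - 25.0)     # threshold voltage falls as T rises

def mu(temp_c):
    return 1.0 - 0.0005 * (temp_c - 25.0)      # slight mobility change with T

def swing(temp_c):
    return 0.035 * (temp_c + 273.15) / 298.15  # swing term (in volts) grows with T

def i_on(v_gs, temp_c, beta=1.3):
    """Evaluate the two branches of Eq. (1) in arbitrary units."""
    vth = v_th(temp_c)
    if v_gs < vth:
        # Sub-threshold branch: exponential in (Vgs - Vth) / S
        return mu(temp_c) * math.exp((v_gs - vth) / swing(temp_c))
    # Super-threshold branch: power law in (Vgs - Vth)
    return mu(temp_c) * (v_gs - vth) ** beta

for v_gs in (0.2, 0.8):
    ratio = i_on(v_gs, 100.0) / i_on(v_gs, -25.0)
    print(f"Vgs={v_gs:.1f} V: Ion(100C)/Ion(-25C) = {ratio:.2f}")
```

Under these assumed models the drive current rises with temperature at both gate voltages, and the relative increase is far larger in the sub-threshold case because of the exponential dependence.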
Meanwhile, conventional MOSFETs operating in the sub/near-threshold
regime or high-vt devices have shown a similar phenomenon
(indeed, more significant than what was observed in FinFETs
with super-threshold $V_{dd}$): the circuit delay decreases with
increasing temperature [12, 13, 14]. As temperature increases,
$\mu$ and $V_{th}$ of MOSFETs decrease while $S$ increases. From (1), $I_{on}$
in the sub-threshold regime is exponentially and dominantly dependent
on $V_{th}$ and $S$, which is different from the case that $I_{on}$ is
a nearly linear function of $V_{th}$ and $\mu$ in the super-threshold regime.
As a consequence, unlike in the super-threshold regime, where
the slightly stronger effect of $\mu$ than that of $V_{th}$ causes $I_{on}$ to
decrease with increasing $T$, the changes of $V_{th}$ and $S$ considerably
increase $I_{on}$ in the sub/near-threshold regime, and thus the gate can
run much faster.
$I_{on}$ of FinFETs operating in the sub/near-threshold regime also
has the same exponential dependency on $V_{th}$ and $S$. Combined with
the tensile stress effect, FinFETs in the sub/near-threshold regime
exhibit a significant delay reduction as the temperature rises.
We conclude that a temperature increase makes FinFETs run faster
at all supply voltage levels. As stated earlier, we call this
phenomenon temperature effect inversion (TEI) in FinFETs.
Figure 1 shows simulated results from four FinFET technologies:
20nm, 16nm, 14nm and 10nm. We can observe that all the
technologies beyond 20nm clearly show the TEI phenomenon. The
delay results of each technology are normalized by the delay at the
nominal $V_{dd}$ (in the super-threshold regime) at 125°C, which is shown as
the dashed line in the figure. We can see that the delay at 125°C is
no longer the worst case, but in fact the best case. Rather, the
worst-case delay for each $V_{dd}$ level occurs at the lowest temperature
(e.g., -25°C).
[Figure 1 panels: 20nm, 16nm, 14nm, and 10nm bulk FinFET; x-axis: Temperature (°C), -20 to 120; y-axis: Normalized delay; one curve per supply voltage level, ranging from 0.3V to 0.9V across the panels.]
Figure 1: Delay at different temperatures and supply voltage levels, from FinFET-based FO4 inverter chain simulations.

[Figure 3 plot: x-axis: Temperature (°C), -20 to 100; y-axis: Normalized power, 0 to 1.]
Figure 3: Leakage power at different temperatures and supply
voltage levels, based on the 20nm FinFET technology.
3. POWER AND THERMAL MODELS
The power consumption of VLSI circuits has two components: a
dynamic part and a static (leakage) part. The dynamic power $P_{dynamic}$
is given by $P_{dynamic} = \alpha C V_{dd}^{2} f$, where $\alpha$ is the activity factor, $C$ is
the switching capacitance, and $f$ is the clock frequency. It is known
that the static power $P_{static}$ depends on the die temperature
$T_{die}$ and $V_{dd}$, which can be expressed as:
$$
P_{static}(T_{die}, V_{dd}) = V_{dd} \left( c_1 T_{die}^{2}\, e^{\frac{c_2 V_{dd} + c_3}{T_{die}}} + c_4\, e^{(c_5 V_{dd} + c_6)} \right), \tag{2}
$$
where the first term is the subthreshold leakage, and the second
term after the plus symbol is the gate leakage; $c_1$ to $c_6$ are
technology-dependent parameters [11]. Figure 3 shows the changes of
$P_{static}$ as a function of the elevated $T_{die}$ at different $V_{dd}$ levels,
obtained from simulations based on the 20nm bulk FinFETs.
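The two power components can be sketched in a few lines. This is an illustrative sketch only: the coefficient tuple `EXAMPLE_COEFFS` is a hypothetical placeholder (the fitted $c_1$ to $c_6$ are not given in the text), the grouping of both leakage terms under the leading $V_{dd}$ factor follows the reconstruction of (2), and treating $T_{die}$ in kelvin is an assumption.

```python
import math

# Hypothetical c1..c6; placeholders only, not fitted to the 20nm FinFET data.
EXAMPLE_COEFFS = (1e-6, 1000.0, -3000.0, 1e-6, 5.0, -3.0)

def p_dynamic(alpha, cap, vdd, freq):
    """Dynamic power: alpha * C * Vdd^2 * f."""
    return alpha * cap * vdd ** 2 * freq

def p_static(t_die_k, vdd, coeffs=EXAMPLE_COEFFS):
    """Static power in the form of Eq. (2); t_die_k is assumed to be in kelvin."""
    c1, c2, c3, c4, c5, c6 = coeffs
    subthreshold = c1 * t_die_k ** 2 * math.exp((c2 * vdd + c3) / t_die_k)
    gate = c4 * math.exp(c5 * vdd + c6)
    return vdd * (subthreshold + gate)

# Dynamic power ratio when dropping 0.75V -> 0.70V at a fixed clock frequency.
ratio = p_dynamic(0.1, 1e-9, 0.70, 1e9) / p_dynamic(0.1, 1e-9, 0.75, 1e9)
print(f"P_dynamic(0.70V) / P_dynamic(0.75V) = {ratio:.3f}")
```

Because $P_{dynamic}$ scales with $V_{dd}^{2}$ at a fixed frequency, the ratio depends only on the two voltages: $(0.70/0.75)^2 \approx 0.87$, and $(0.65/0.75)^2 \approx 0.75$ for a two-step drop.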
We use the conventional RC-circuit thermal model, which is
shown in Figure 4 (a) [19]. In the figure, $P_{circuit}$ denotes the heat
generated by the circuit, which is the sum of $P_{dynamic}$ and $P_{static}$;
$P_{amb}$ is the heat dissipated to the ambient environment; $T_{amb}$ is the ambient
temperature; and $C_{die}$ and $R_{die\text{-}amb}$ are the thermal capacitance of
the circuit die and the thermal resistance from the die to the ambient
environment, respectively. Because we target the whole mobile device,
modeling the on-chip thermal variations within the device [20] is
less critical. Thus we do not account for thermal variations in this
paper. Additionally, we do not include a separate heat sink, because
our target device has none. Note that, if we target a
large-scale chip equipped with heat sinks or coolers, the spatial thermal
variations should be taken into consideration, which may require
more sophisticated thermal models and accompanying
control logic (e.g., a feedback controller) that is robust to
modeling errors. However, these are beyond the scope of this paper.
Applying Kirchhoff equations to the RC-circuit thermal model
in Figure 4 (a), we have:
$$
C_{die} \frac{dT_{die}}{dt} = P_{circuit} - \frac{T_{die} - T_{amb}}{R_{die\text{-}amb}}. \tag{3}
$$
Figure 4 (b) shows a conceptual relationship between $P_{circuit}$ and
$P_{amb}$, where the two $P_{circuit}$ levels result from the high $V_{dd}$
and the low $V_{dd}$. When $P_{circuit} = P_{amb}$, i.e., $dT_{die}/dt = 0$ in (3), $T_{die}$
is stable. We call this point the equilibrium temperature $T_{eq}$. $T_{eq}^{high}$
and $T_{eq}^{low}$ in the figure denote the equilibrium temperatures for the
high $V_{dd}$ case and the low $V_{dd}$ case, respectively.
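As a worked step, setting the left-hand side of (3) to zero makes the equilibrium condition explicit. The derivation below assumes, consistent with (3), that the heat dissipated to the ambient environment is $P_{amb} = (T_{die} - T_{amb})/R_{die\text{-}amb}$:

$$
0 = P_{circuit}(T_{eq}, V_{dd}) - \frac{T_{eq} - T_{amb}}{R_{die\text{-}amb}}
\quad\Longrightarrow\quad
T_{eq} = T_{amb} + R_{die\text{-}amb}\, P_{circuit}(T_{eq}, V_{dd}).
$$

Because $P_{circuit}$ itself grows with temperature through $P_{static}$ in (2), this is an implicit equation in $T_{eq}$. It has no solution when $P_{circuit}$ stays above $P_{amb}$ at every temperature, which is the situation in which no equilibrium temperature exists.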
Due to the strong dependence of $P_{static}$ on $T_{die}$ and $V_{dd}$ from (2),
the difference between the two $P_{circuit}$ levels from the
high $V_{dd}$ and low $V_{dd}$, which is indicated by the arrows in Figure 4
(b), increases super-linearly with increasing $T_{die}$ and $V_{dd}$. Similarly,
the difference between the two $T_{eq}$ values also follows the
super-linear trend for the given $R_{die\text{-}amb}$, a fixed design parameter. Hence,
for some high $V_{dd}$ levels, it is possible that the corresponding $T_{eq}$
exceeds the die temperature limit (e.g., 90°C), or that such a $T_{eq}$ does not
exist at all. While $R_{die\text{-}amb}$ directly affects $T_{eq}$, another design
parameter, $C_{die}$, influences how fast the die temperature reaches either
$T_{eq}$, if it exists, or the die temperature limit, otherwise. In particular,
the time to reach the die temperature limit is an important design
factor, because it determines how long a circuit can operate at
the high voltage level. Table 1 shows the $T_{eq}$ levels and the times
from 0°C to $T_{eq}$ or the die temperature limit, 90°C, from the 20nm
bulk FinFET test circuits. From the measurement on ARM Cortex-A8,
$R_{die\text{-}amb}$, $C_{die}$ and $T_{amb}$ are set to be 35.8 K/W, 9.0 J/K and
25°C, respectively [19]. The power from the test circuit is scaled
so that the circuit with $V_{dd}$ = 0.7V has the same trend of temperature
increase that ARM Cortex-A8 shows with the measured $R_{die\text{-}amb}$,
$C_{die}$ and $T_{amb}$. The details will be explained in Section 5.
[Figure 4 panels: (a) thermal RC circuit with $P_{circuit}$, $P_{amb}$, $C_{die}$, $R_{die\text{-}amb}$, $T_{die}$ and $T_{amb}$; (b) Power vs. $T_{die}$, showing $P_{amb}$ and the $P_{circuit}$ curves for high $V_{dd}$ and low $V_{dd}$, with $T_{amb}$, $T_{eq}^{low}$ and $T_{eq}^{high}$ marked.]
Figure 4: (a) RC-circuit thermal model, and (b) the effect of the
temperature and power variation
Table 1: Simulation results of $T_{eq}$ and the time to reach $T_{eq}$ or
90°C from 20nm FinFET test circuits
$V_{dd}$ (V)     0.50   0.55   0.60   0.65   0.70   0.75   0.80
$T_{eq}$ (°C)    31.8   34.2   38.2   44.5   N/A    N/A    N/A
Time (sec)    1310   1465   1873   2375   3231   1600   1039
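The role of $R_{die\text{-}amb}$ and $C_{die}$ can be illustrated by integrating (3) directly. The sketch below is a minimal forward-Euler simulation using the thermal parameters quoted above (35.8 K/W, 9.0 J/K, 25°C); the function name `simulate_tdie`, the time step, and the constant power levels in the usage lines are assumptions for illustration, and the sketch does not attempt to reproduce the scaled power data behind Table 1.

```python
def simulate_tdie(power_fn, t_start, t_amb=25.0, r_th=35.8, c_th=9.0,
                  dt=1.0, t_end=5000.0, t_limit=90.0):
    """Forward-Euler integration of Eq. (3).

    power_fn(temp) returns P_circuit in watts at die temperature temp (deg C).
    Stops when the temperature limit is reached or t_end elapses and
    returns (elapsed_time_sec, final_temperature).
    """
    t, temp = 0.0, t_start
    while t < t_end and temp < t_limit:
        d_temp = (power_fn(temp) - (temp - t_amb) / r_th) / c_th
        temp += d_temp * dt
        t += dt
    return t, temp

# Constant 0.5 W: settles near T_amb + R * P = 25 + 35.8 * 0.5 = 42.9 deg C.
print(simulate_tdie(lambda temp: 0.5, t_start=25.0))
# Constant 2.0 W: the equilibrium (96.6 deg C) lies above the 90 deg C limit,
# so the simulation stops at the limit instead.
print(simulate_tdie(lambda temp: 2.0, t_start=25.0))
```

With a constant power the equilibrium depends only on $R_{die\text{-}amb}$, while $C_{die}$ sets the time constant ($R_{die\text{-}amb} C_{die} \approx 322$ s here). A temperature-dependent `power_fn` built from (2) would be needed to see cases where no equilibrium exists.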
Previous work on DTM has mainly focused on cases where
$T_{eq}$ does not exist, and on how to avoid exceeding the die
temperature limit with inevitable performance penalties: for example,
lowering the clock frequency, or both the frequency and voltage
levels, to reduce $P_{circuit}$ and thereby lower $T_{die}$. Unlike
previous work, we present a novel DTM algorithm in the
following section, which exploits the TEI phenomenon to improve
the energy efficiency of the circuit while neither exceeding the die
temperature limit nor losing any performance.
4. TEI-AWARE DTM
4.1 Influence of TEI on energy consumption
Due to the TEI phenomenon, the worst-case delays occur at
low temperature in FinFET circuits. Therefore, for a given target
clock frequency, the corresponding voltage level of the circuit
should be set according to the worst-case circuit delay, which occurs
at the lowest die temperature. This is needed to guarantee correct
circuit operation in the full range of the operating temperature.
We call this voltage level the base voltage level, $V_{base}$, associated
with a target clock frequency, $f_{target}$.
Consider a FinFET-based circuit running at $f_{target}$. As time goes
by, the die temperature $T_{die}$ rises. Because of the TEI phenomenon,
the FinFET-based circuit gets faster with rising temperature,
which allows us to drop the supply voltage level below $V_{base}$ while
maintaining $f_{target}$. Of course, we have to wait for $T_{die}$ to reach a
predetermined level (which we will call the threshold temperature,
$T_{th}$) before we can drop the supply voltage level. This is because we
have a finite number of discrete supply voltage levels, so the move
from a higher initial voltage level to the next lower voltage level can
only happen when the delay decrease due to the temperature rise is
large enough to ensure correct circuit operation at the lower
voltage level. Note that if $T_{th}$ exists, then this can
significantly reduce the power consumption of the circuit due to the
quadratic dependence of $P_{dynamic}$ and the exponential dependence
of $P_{static}$ on the supply voltage level. Furthermore, unlike
the conventional DTM methods, our approach does not scale down
the clock frequency, so there will be no performance loss. Finally,
note that because power dissipation is going down, the temperature
rise in the substrate will be curbed.
[Figure 5 panels: (a) delay vs. temperature (-25 to 100°C) for 0.75V, 0.70V, 0.65V, and 0.6V, with $T_{th}^{0.75 \to 0.7}$, $T_{th}^{0.75 \to 0.65}$, and $T_{th}^{0.75 \to 0.6}$ marked; (b) Temperature (°C) and Normalized power vs. Time (sec) for $T_{die}$ and power from 0.75V and 0.70V; (c) the same axes for 0.75V and 0.65V; panels (b) and (c) annotate the threshold temperature, the voltage-down point, and the power reduction.]
Figure 5: (a) Threshold temperatures ($T_{th}$s) at different voltage levels, and two different cases after lowering the voltage level
at $T_{th}$: (b) $T_{die}$ increases, and (c) $T_{die}$ decreases, based on the 20nm FinFET based FO4 inverter chain simulation.
Figure 5 (a) shows an example of $T_{th}$ levels for multiple
voltage levels, based on the delay values with the 20nm FinFET
technology. Note that, for the figure and the remaining part of this
paper, we assume the lowest temperature of the test circuits is -25°C,
and the die temperature limit is 90°C. We also assume that
fine-grained (0.05V) input voltage control can be supported, similar to
existing voltage controllers that power the Intel CORE2 E6850
processor and ARM CORTEX-A8 with a 0.05V difference between adjacent
voltage levels. Then, in the figure, the operating frequency is set by
the worst-case delay at the base voltage level, 0.75V, at -25°C.
We use the notation $T_{th}^{\text{base voltage} \to \text{target voltage}}$ in the figure to denote $T_{th}$
in each case. While $T_{th}^{0.75 \to 0.6}$ exceeds the die temperature limit,
the other threshold temperatures can be exploited in DTM.
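One way to read the definition of $T_{th}$ is as a lookup over sampled delay curves: find the lowest temperature from which the delay at the lower voltage never exceeds the worst-case delay at the base voltage. The sketch below follows that reading; the function name `threshold_temperature` and the sample delay values are hypothetical and are not taken from Figure 5 (a).

```python
def threshold_temperature(worst_base_delay, temps_c, low_v_delays):
    """Return the lowest sampled temperature from which the lower-voltage
    delay stays within worst_base_delay up to the hottest sample, or None
    if even the hottest sample is too slow."""
    t_th = None
    # Walk from the hottest sample downward and stop at the first violation.
    for temp, delay in sorted(zip(temps_c, low_v_delays), reverse=True):
        if delay > worst_base_delay:
            break
        t_th = temp
    return t_th

# Hypothetical normalized delays of a lower voltage level (TEI: faster when hot).
temps = [-25, 0, 25, 50, 75, 100]
delays_low = [1.30, 1.22, 1.15, 1.08, 1.02, 0.97]
# Worst-case delay of the base voltage at -25 deg C, in the same normalization.
print(threshold_temperature(1.10, temps, delays_low))
```

A returned value above the die temperature limit (90°C in the paper's setup) would correspond to a threshold that cannot be exploited, as with $T_{th}^{0.75 \to 0.6}$.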
Lowering the voltage level right after the rising temperature reaches $T_{th}$ leads to two possible cases: (Case I) $T_{die}$ keeps increasing, or (Case II) $T_{die}$ begins decreasing. Case I occurs because the equilibrium temperature $T_{eq}$ of the lowered voltage level is higher than $T_{th}$, or such a $T_{eq}$ does not exist. Case II occurs because $T_{eq}$ of the lowered voltage level lies below $T_{th}$. For Case I, it is intuitive that the immediate voltage change at $T_{th}$ will not degrade the performance of the circuit, but gives us the opportunity to save energy. This is illustrated in Figure 5 (b). Because the 0.7V voltage level does not have a $T_{eq}$ (from Table 1), lowering the voltage from 0.75V to 0.7V at $T_{th}^{0.75 \to 0.7}$ = 18°C allows the circuit to operate at the scheduled frequency while consuming significantly less energy. On the other hand, the immediate voltage change at $T_{th}$ for Case II will result in a timing violation because the temperature will begin to decrease. Therefore, we have to wait for a certain amount of time, until $T_{die}$ exceeds $T_{th}$ by a certain amount. Then, we can lower the voltage level to reduce the power consumption, and keep the lowered voltage level until the decreasing temperature reaches $T_{th}$. This is illustrated in Figure 5 (c). Because $T_{th}^{0.75 \to 0.65}$ equals 61°C, and $T_{eq}$ corresponding to the 0.6V voltage level is 44.5°C (from Table 1), $T_{die}$ decreases after the voltage change.
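The distinction between the two cases reduces to a comparison between $T_{th}$ and the equilibrium temperature of the lowered voltage level. A minimal sketch of that comparison follows; the function name `classify_voltage_drop` is hypothetical, and treating $T_{eq} = T_{th}$ as Case II is an assumption, since the text does not address equality.

```python
from typing import Optional


def classify_voltage_drop(t_th: float, t_eq: Optional[float]) -> str:
    """Classify what T_die does right after lowering the voltage at t_th.

    t_eq is the equilibrium temperature of the lowered voltage level, or None
    when that level has no equilibrium temperature.
    """
    if t_eq is None or t_eq > t_th:
        return "Case I: T_die keeps increasing; switch immediately"
    # Assumption: equality is handled like Case II (the conservative choice).
    return "Case II: T_die begins decreasing; warm up past T_th before switching"


# Temperatures quoted in the text for Figure 5 (b) and Figure 5 (c).
case_b = classify_voltage_drop(18.0, None)
case_c = classify_voltage_drop(61.0, 44.5)
print(case_b)
print(case_c)
```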
Unlike Figure 5 (b) and (c), each of which considers only two available voltage levels, in practice there can be more than two voltage levels that meet the scheduled frequency condition within the whole temperature range. The availability of multiple voltage levels requires a more detailed analysis and a more elaborate DTM policy. The following subsections discuss all the possible cases in a DVFS schedule to complete a given task. The proposed optimal DTM policy can be generalized to arbitrary DVFS schedules.
4.2 Energy optimization
Given the deadline specification of a task, the required (minimum) operating frequency $f_{target}$ and the corresponding base voltage level $V_{base}$ can be determined so that task execution finishes by the deadline. Conventional DTMs keep the circuit from exceeding the temperature limit $T_{limit}$ by forcibly lowering the frequency or stopping execution, which incurs performance penalties. Our proposed DTM method aims to minimize the energy consumption for a given task, or a given set of tasks, without violating the operating frequency of the initial schedule, and thereby without any performance loss. At the same time, our DTM slows down the temperature increase, or stabilizes the die temperature at a certain point below $T_{limit}$, thereby avoiding the performance loss incurred when conventional DTMs inevitably lower the frequency or stop execution.
Among all the possible voltage levels, if a voltage level $V_i$ has a threshold temperature such that $T_{th}^{V_{base} \to V_i} < T_{limit}$, then $V_i$ may be exploited instead of $V_{base}$ in a certain temperature range. For the remainder of the paper, we use the simpler notation $T_{th}^{V_i}$ to denote $T_{th}^{V_{base} \to V_i}$. Then, we can separate the operating temperature range into regions delimited by the available $T_{th}$ values. More specifically, the $i$-th region is $R_i \triangleq [T_{th}^{V_i}, T_{th}^{V_{i+1}})$ for $1 \le i \le N$, where $N$ is the number of candidate voltage levels for the target frequency $f_{target}$. We have $V_1 = V_{base} > V_2 > \dots > V_N$.
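The region partition can be sketched as a sorted list of thresholds with a binary search on $T_{die}$. In the sketch below, the thresholds 18°C and 61°C and the limits -25°C and 90°C are the values quoted earlier for the 0.75V base level; taking -25°C as the lower bound of $R_1$ and $T_{limit}$ as the upper bound of $R_N$ are assumptions, and the helper names `build_regions` and `region_index` are hypothetical.

```python
from bisect import bisect_right
from typing import List, Tuple


def build_regions(t_th: List[float], t_limit: float) -> List[Tuple[float, float]]:
    """Pair each threshold with the next one to form half-open regions
    [t_th[i], t_th[i+1]); the last region is closed off at t_limit."""
    upper = t_th[1:] + [t_limit]
    return list(zip(t_th, upper))


def region_index(t_th: List[float], t_die: float) -> int:
    """Return the 0-based index of the region that contains t_die."""
    if t_die < t_th[0]:
        raise ValueError("t_die is below the lowest operating temperature")
    return bisect_right(t_th, t_die) - 1


voltages = [0.75, 0.70, 0.65]      # V_1 = V_base > V_2 > V_3
thresholds = [-25.0, 18.0, 61.0]   # lower bound of each region, in °C

regions = build_regions(thresholds, t_limit=90.0)
print(regions)  # [(-25.0, 18.0), (18.0, 61.0), (61.0, 90.0)]
print(voltages[region_index(thresholds, 30.0)])  # 0.7
```

Because the regions are half-open, a die temperature exactly at a threshold belongs to the region of the lower voltage level.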
Figure 6 (a) and (b) show an example with three candidate voltage levels, $V_{High} = V_{base}$, $V_{Mid}$, and $V_{Low}$, so the temperature range is divided into three regions. The red curves in both figures show the minimum energy consumption at each temperature, according to the lowest voltage level that lets the circuit work at $f_{target}$ at that temperature. As can be seen from the figure, the minimum energy point in each region is located at the temperature where the voltage level changes, i.e., the threshold temperature. Furthermore, from extensive simulations based on various FinFET libraries, we find that the energy consumption at $T_{th}^{V_{i+1}}$ is always higher than that at $T_{th}^{V_i}$. This is because the leakage power increases quickly as the temperature rises. Therefore, we start the optimization process from a premise:
• The minimum energy point in $R_i$ is always at $T_{th}^{V_i}$, and the corresponding energy consumption is smaller than that at $T_{th}^{V_{i+1}}$.
The equilibrium temperature $T_{eq}^{V_i}$ depends on the ambient temperature, and hence is an uncontrollable factor. A potential mismatch between $T_{eq}^{V_i}$ and $T_{th}^{V_i}$ prevents the circuit from operating at the stable temperature $T_{th}^{V_i}$. Suppose that the initial die temperature $T_{init}$ is in region $R_i$. Then the die temperature $T_{die}$ moves according to the following two rules:
• If $T_{init} < T_{eq}^{V_i}$, $T_{die}$ will increase until $T_{die} = \min\{T_{eq}^{V_i}, T_{th}^{V_{i+1}}\}$.
• If $T_{init} > T_{eq}^{V_i}$, $T_{die}$ will decrease until $T_{die} = \max\{T_{eq}^{V_i}, T_{th}^{V_i}\}$.
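The two movement rules can be written as a small function that returns where $T_{die}$ ends up while the voltage of region $R_i$ is held. This is only a sketch: the name `settle_in_region` and the equilibrium values in the examples are hypothetical, and the case $T_{init} = T_{eq}^{V_i}$, which the rules do not cover, is assumed to leave the temperature unchanged.

```python
def settle_in_region(t_init: float, t_eq: float,
                     t_th_low: float, t_th_high: float) -> float:
    """Return the temperature reached inside region [t_th_low, t_th_high).

    t_eq is the equilibrium temperature of the region's voltage level.
    """
    if t_init < t_eq:
        # Rule 1: heat up until equilibrium or the upper region boundary.
        return min(t_eq, t_th_high)
    if t_init > t_eq:
        # Rule 2: cool down until equilibrium or the lower region boundary.
        return max(t_eq, t_th_low)
    return t_init  # assumption: already at equilibrium, nothing moves


# Region [18 °C, 61 °C) with hypothetical equilibrium temperatures.
inside = settle_in_region(20.0, 44.5, 18.0, 61.0)
hits_upper = settle_in_region(20.0, 80.0, 18.0, 61.0)
hits_lower = settle_in_region(55.0, 5.0, 18.0, 61.0)
print(inside, hits_upper, hits_lower)  # 44.5 61.0 18.0
```

When the result equals a region boundary, the die temperature leaves the region and the rules are applied again with the neighboring voltage level.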
Figure 6: Case studies for (a) Policy I, and (b) Policy II.

We use Figure 6 (a) as an example of the above rules. Suppose that $T_{init}$ is in $R_{Mid}$; the temperature will eventually be stable at $T_{eq}^{Mid}$, which is also in $R_{Mid}$. Then, $T_{eq}^{Mid}$ is the optimal temperature point where the circuit can achieve the maximum energy saving for the given task. Similarly, suppose that $T_{init}$ is in $R_{Low}$, and $T_{eq}^{Mid}$ is still in $R_{Mid}$. Then $T_{eq}^{Mid}$ is still the optimal point. That is because $T_{eq}^{Low}$ is lower than $T_{eq}^{Mid}$, so $T_{die}$ with initial voltage $V_{Low}$ decreases from $T_{init}$ to $T_{th}^{Low}$. Then the voltage level switches to $V_{Mid}$ in order to maintain the speed of the circuit. Finally, $T_{die}$ will be stable at $T_{eq}^{Mid}$. The opposite case, in which $T_{init}$ is in $R_{High}$, results in the same outcome, because $T_{eq}^{High}$ is higher than $T_{eq}^{Mid}$ and this fact makes $T_{die}$ move to $T_{eq}^{Mid}$. Therefore, we propose the following policy:
• Policy I: Check whether there exists a $k$ such that $T_{eq}^{V_k} \in R_k$ for $1 \le k \le N$; we have proved that at most one such $k$ exists. If $k$ exists, the optimal voltage level is $V_k$ and the optimal, stable temperature is $T_{eq}^{V_k}$. Whatever region $T_{init}$ starts in, we use the voltage level corresponding to that region, i.e., the lowest voltage that meets the frequency condition in the region, and then keep changing the voltage level whenever $T_{die}$ reaches a region boundary. Eventually the die temperature will arrive at $T_{eq}^{V_k}$.
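The existence check in Policy I amounts to testing, for each candidate voltage level, whether its equilibrium temperature falls inside its own region. The sketch below is one way to express that check; the function name `policy_one`, the use of $T_{limit}$ as the upper bound of the last region, and the equilibrium temperatures in the examples are all assumptions made for illustration.

```python
from typing import List, Optional


def policy_one(t_th: List[float], t_eq: List[float],
               t_limit: float) -> Optional[int]:
    """Return the 0-based k whose equilibrium temperature lies in its own
    region [t_th[k], upper bound of region k), or None if there is no such k.

    The first match is returned because the text states that at most one
    such k exists.
    """
    for k in range(len(t_th)):
        upper = t_th[k + 1] if k + 1 < len(t_th) else t_limit
        if t_th[k] <= t_eq[k] < upper:
            return k
    return None


thresholds = [-25.0, 18.0, 61.0]  # regions for V_High, V_Mid, V_Low

# Hypothetical equilibria resembling Figure 6 (a): T_eq of V_Mid is in R_Mid.
k_found = policy_one(thresholds, [95.0, 40.0, 30.0], t_limit=90.0)
# Hypothetical equilibria for which no level settles inside its own region.
k_missing = policy_one(thresholds, [95.0, 70.0, 44.5], t_limit=90.0)
print(k_found, k_missing)  # 1 None
```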
Now we discuss the case when no such $k$ exists. In this case, $T_{die}$ keeps increasing in all the regions until it reaches a region $i$ with $T_{eq}^{V_i}$ lower than $T_{th}^{V_i}$, in which $T_{die}$ decreases. Then, the minimum energy consumption of the circuit is at $T_{th}^{V_i}$, because (i) using a voltage level higher than $V_i$ only makes $T_{die}$ increase, thus consuming more energy, and (ii) $T_{die}$ cannot decrease below $T_{th}^{V_i}$. This case is illustrated in Figure 6 (b). In the figure, $T_{eq}^{Mid}$ lies above $T_{th}^{Mid}$, and $T_{eq}^{High}$ should be higher than $T_{eq}^{Mid}$. Hence, $T_{die}$ always increases in both $R_{High}$ and $R_{Mid}$. But, because $T_{eq}^{Low}$ lies below the region $R_{Low}$ (in $R_{Mid}$ in the figure), $T_{die}$ will decrease in $R_{Low}$ if $V_{Low}$ is applied. Finally, the optimal temperature is $T_{th}^{Low}$.
Although we know the optimal temperature point in Figure 6 (b), it is impossible to keep operating at this point during circuit operation. Therefore, we propose to use $V_{Mid}$ for a certain amount of time to warm up. This process is indicated by (2) in the figure. We then (3) lower the voltage to $V_{Low}$, and (4) maintain voltage $V_{Low}$ until $T_{die}$ decreases to $T_{th}^{Low}$, where we need to (5) increase the voltage to $V_{Mid}$. Repeating this process makes the circuit operate near the optimal temperature without timing violation. We call this region the stationary region because once $T_{die}$ enters this region, it stays there through proper voltage switching. The blue-colored region in Figure 6 (b) shows an example of the stationary region.
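The warm-up and switch-back cycle behaves like a hysteresis controller around $T_{th}^{Low}$. The toy simulation below illustrates this under strong assumptions that are not part of the paper: the die temperature relaxes toward the equilibrium temperature of the active voltage level with a first-order time constant `tau`, the warm-up is expressed as a temperature `margin` rather than a time, and the value 80°C for the equilibrium of $V_{Mid}$ is invented. The function name `simulate_stationary` is hypothetical.

```python
def simulate_stationary(t_start, t_th_low, t_eq_mid, t_eq_low,
                        margin=2.0, tau=10.0, dt=0.1, steps=2000):
    """Simulate switching between V_Mid and V_Low around t_th_low.

    Returns a list of (t_die, level) pairs recorded at each decision point.
    """
    t_die, level = t_start, "mid"
    trace = []
    for _ in range(steps):
        if level == "mid" and t_die >= t_th_low + margin:
            level = "low"   # warm-up finished: lower the voltage
        elif level == "low" and t_die <= t_th_low:
            level = "mid"   # threshold reached: raise the voltage again
        trace.append((t_die, level))
        # Assumed first-order thermal model (forward Euler step).
        t_eq = t_eq_mid if level == "mid" else t_eq_low
        t_die += (t_eq - t_die) * dt / tau
    return trace


trace = simulate_stationary(t_start=61.0, t_th_low=61.0,
                            t_eq_mid=80.0, t_eq_low=44.5)
temps = [t for t, _ in trace]
print(round(min(temps), 2), round(max(temps), 2))
```

In this sketch the low voltage is only ever selected while the recorded temperature is above the threshold, and the temperature stays in a narrow band whose width is set by `margin`, which plays the role of the warm-up duration discussed in the text.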
Meanwhile, the duration of the warm-up process affects how far the circuit operates from the optimal point. The shorter the warm-up time, the higher the energy efficiency. The minimum warm-up time is determined by the voltage switching time (i.e., the voltage transition latency of DC-DC converters) that the voltage controller can provide. However, this responsiveness issue of the voltage controller is beyond the scope of this paper.
Based on the previous discussion, we propose the second policy of our DTM:
• Policy II: If Policy I cannot be applied, check whether there exists a $k$ such that $T_{eq}^{V_k} < T_{th}^{V_k}$. If so, find the smallest such $k$; the optimal temperature point is then $T_{th}^{V_{\min(k)}}$. Whatever region $T_{init}$ starts in, use the lowest voltage level corresponding to that region. Keep changing the voltage level whenever $T_{die}$ reaches a region boundary until $T_{die}$ enters the stationary region. In the stationary region, we keep performing steps (2), (3), (4), and (5).
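The selection step of Policy II can be sketched as a linear scan for the smallest index whose equilibrium temperature lies below its own threshold. The sketch assumes Policy I has already been ruled out; the function name `policy_two` and the equilibrium temperatures of the two higher voltage levels are hypothetical, while 61°C and 44.5°C are the values quoted earlier in the text.

```python
from typing import List, Optional


def policy_two(t_th: List[float], t_eq: List[float]) -> Optional[int]:
    """Return the smallest 0-based k with t_eq[k] < t_th[k], or None."""
    for k in range(len(t_th)):
        if t_eq[k] < t_th[k]:
            return k
    return None


thresholds = [-25.0, 18.0, 61.0]   # T_th for V_High, V_Mid, V_Low
equilibria = [95.0, 70.0, 44.5]    # first two values are invented

k = policy_two(thresholds, equilibria)
optimal_temp = thresholds[k] if k is not None else None
print(k, optimal_temp)  # 2 61.0
```

Here the scan stops at the lowest voltage level, so the stationary region sits just above its threshold temperature, as in Figure 6 (b).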
Finally, we point out that if there exists no $k$ such that $T_{eq}^{V_k} < T_{th}^{V_{k+1}}$, $T_{die}$ will eventually exceed $T_{limit}$ and the task will fail to finish in time. Of course, conventional DTMs that use only the base voltage level of the task will make $T_{die}$ reach $T_{limit}$ even earlier. Compared to conventional DTMs, the proposed DTM can save a considerable amount of energy before $T_{die}$ reaches $T_{limit}$, because it always selects the lowest possible voltage level in each region. Furthermore, using lower voltage levels slows down the temperature rise so that the circuit can operate at a high frequency for a longer time, while a circuit controlled by conventional DTMs would have to reduce the frequency earlier.
5. EXPERIMENTAL WORK
We validated our proposed DTM with various FinFET-based circuits, namely a 50 FO4 inverter chain, a 16-bit carry-select adder, a 16-bit multiplier, and a 16-bit comparator, based on 10nm, 14nm, 16nm, and 20nm PTM-MG bulk FinFET libraries. All the circuits are designed in the shorted-gate mode. We performed Hspice simulations to obtain the delay and power consumption of each circuit for different $V_{dd}$ setups and different temperatures. The delays were obtained from the worst-case inputs of the circuits. Note that we did not attempt to consider interconnect delays in our simulations. That is because the characteristics of interconnects used for deeply scaled FinFET-based circuit fabrics are unknown (i.e., although the R and C parasitic values of the interconnect go up with temperature, the current strength of the driver also improves, which can reduce the wire delay). We set the minimum and maximum temperatures at which the circuits operate to -25°C and 90°C, respectively. Based on the worst delay at -25°C for each $V_{dd}$, we found the available voltage levels, which are lower than the base $V_{dd}$ but have
