Dual-Threshold Voltage Technique for Asynchronous Pre-Charge Full Buffer
Linear-Pipelines
Behnam Ghavami, Hossein Pedram
ghavamib@aut.ac.ir, pedram@ce.aut.ac.ir
Computer Engineering Department, Amirkabir University of Technology (Tehran Polytechnic)
424 Hafez Ave, Tehran 15785, Iran
Abstract
Technology scaling and the reduction of feature size
in integrated circuits have made leakage power
consumption one of the main challenges in digital
design. Dual-threshold CMOS circuits, which have both
high and low threshold transistors in a single chip, can
be used to deal with the leakage problem in high
performance applications. This paper presents a
dual-threshold voltage technique for reducing the leakage
power dissipation of Pre-Charge Full Buffer
asynchronous linear-pipelines while still maintaining
high performance. We employ a Folded Dependency
Graph to produce a formal performance analysis. In
order to reduce leakage power, an algorithm for
assigning a high threshold voltage is proposed. The
results indicate that the proposed technique achieves
on average 30% savings in leakage power with no
performance penalty.
Keywords
Asynchronous Circuit, Leakage Power, Dual-Vt, PCFB.
1. Introduction
The shrinking size and growing number of
transistors in contemporary circuits aggravate the
problem of global synchronization. One solution is to
eliminate the global clock signal and take advantage of
asynchronous design methods. As asynchronous
circuits gain popularity due to their potential
advantages, such as dynamic power saving and high
performance, the complexity of design and synthesis
methods comes to the fore. Among the numerous
asynchronous design styles being developed,
template-based styles have reduced the design effort
[1].
As of today, high performance digital design is
formidably challenged by high power consumption. In
the sub-micron regime, leakage currents make up a
significant portion of the total power consumption in
high-performance digital circuits [2]. In asynchronous
circuits, as in other classes of VLSI circuits, leakage
power increases as CMOS manufacturing technology
scales into the deep sub-micron era. Hence, designers
require techniques that reduce leakage power while
maintaining the high performance of these circuits.
The main component of leakage power is the
subthreshold leakage current, which is becoming an
increasingly dominant component of overall power
consumption in deep sub-micron technologies [3].
There is a large body of research on subthreshold
leakage power reduction techniques for
synchronous circuits [2][4].
In this paper, we introduce an efficient
methodology for employing dual-threshold voltage
techniques in Pre-Charge Full Buffer (PCFB)
asynchronous linear-pipelines. We propose assigning
low threshold voltage (low-VT) and high
threshold voltage (high-VT) to the transistors in the
templates of a pipeline in such a way that the delay
remains the same as in an all-low-VT design, while
the subthreshold leakage current is reduced significantly.
The remainder of this paper is organized as follows.
Section 2 discusses Quasi Delay Insensitive (QDI)
asynchronous linear-pipelines. Section 3 reviews the
background of the dual-Vt technique. Section 4
elaborates the proposed methodology for dual-Vt
asynchronous linear pipelines. Section 5 presents our
experimental results on a set of benchmarks. Finally,
conclusions are drawn in Section 6.
2. QDI Asynchronous linear-Pipeline
Asynchronous circuits are a class of circuits that are
not controlled by a global clock but rely on local
request and acknowledge signaling for the purpose
of synchronization. An asynchronous circuit is called
delay-insensitive (DI) if it preserves its functionality
independent of the delays of gates and wires [5]. It has
been shown that the range of circuits that can be
implemented completely DI is very limited. Therefore,
different design styles make some timing assumptions
that must hold to ensure the correctness of the circuit.
Different techniques distinguish themselves in the
978-1-4244-1680-6/07/$25.00 ©2007 IEEE

choice of compromises to delay insensitivity.
Quasi delay insensitive (QDI) circuits are like DI
circuits with a weak timing constraint [5]. QDI
asynchronous circuits are composed of concurrent
processes connected through handshake channels.
These processes can be decomposed into fine-grained
processes that each fit a fine-grained template.
At present, most QDI pipelines are designed using
pre-charged half buffer (PCHB) and pre-charged full
buffer (PCFB) templates [1]. This paper discusses
leakage power reduction in the context of QDI
asynchronous linear PCFB pipelines.
A linear pipeline is defined as a pipeline structure
that contains no fork or join. A template at stage $n$
becomes active when it senses the presence of
incoming data from the function block of stage $n-1$.
It then performs the computation and sends the result via
its output channels to stage $n+1$. Communication through
channels is controlled by handshake protocols. One of
the major protocols used in asynchronous circuits is the
four-phase protocol [5]. The Buffer Cycle Time, $C_B$, is
the time needed by the buffer at stage $n$ to process a
complete data pattern. To isolate the pipeline from
environment effects, we assume that the starting point of
the pipeline (TX) produces a new token as soon as it
receives the acknowledgment, and that the ending point of
the pipeline (RX) produces the acknowledgment as soon as
it receives a new token. Figure 1 shows a PCFB linear
pipeline [8].
[Figure: function blocks FB_1 ... FB_N with buffers Buf_1 ... Buf_N]
Figure 1. PCFB linear-pipeline
3. Dual-Threshold voltage CMOS:
Background and Discussion
Researchers have proposed different circuit
techniques to reduce the subthreshold leakage power of
synchronous circuits [2].
Multiple-threshold CMOS circuits, which have both
high and low threshold transistors in a single chip, can
be used to deal with the subthreshold leakage problem
in low power and high performance applications. The
high threshold transistors suppress the subthreshold
leakage current, while the low threshold transistors are
used to achieve high performance. Several
multiple-threshold CMOS circuit design techniques
have been proposed recently.
In the dual-threshold technique, a higher threshold
voltage is assigned to some transistors in non-critical
paths so as to reduce leakage current, while
performance is maintained by the low threshold
transistors in the critical path(s). Figure 2 illustrates the
basic idea of a dual-Vth scheme in synchronous
circuits.
Figure 2. Dual Vth scheme in synchronous circuits
Since the dual-threshold design technique can reduce
both active and standby leakage power
without delay and area overheads, it is very attractive
for low voltage and high performance circuit design.
However, due to the complexity of a circuit, not all the
transistors in non-critical paths can be assigned a high
threshold voltage; otherwise, the critical path may
change, thereby increasing the critical delay.
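The caveat above can be illustrated with a toy calculation. The sketch below is not taken from the paper: the two paths, the gate names, the delay values and the 1.4x high-Vt slowdown factor are all hypothetical. It shows that raising the threshold of one gate on a non-critical path can leave the critical delay unchanged, while raising both gates turns that path into the new critical path.

```python
# Toy illustration (all names and numbers are hypothetical assumptions).
PATHS = [("g1", "g2", "g3"), ("g4", "g5")]           # two combinational paths
LOW_VT_DELAY = {"g1": 3.0, "g2": 3.0, "g3": 3.0,      # critical path: 9.0
                "g4": 3.5, "g5": 3.5}                 # non-critical path: 7.0
HIGH_VT_PENALTY = 1.4                                 # assumed slowdown of a high-Vt gate


def gate_delays(high_vt_gates):
    """Return per-gate delays when the given gates use the high threshold."""
    return {gate: delay * (HIGH_VT_PENALTY if gate in high_vt_gates else 1.0)
            for gate, delay in LOW_VT_DELAY.items()}


def critical_delay(delays):
    """Delay of the slowest path, i.e. the critical delay."""
    return max(sum(delays[gate] for gate in path) for path in PATHS)


baseline = critical_delay(gate_delays(set()))
one_gate = critical_delay(gate_delays({"g4"}))
both_gates = critical_delay(gate_delays({"g4", "g5"}))

print(baseline, one_gate, both_gates)
```

With these assumed numbers, one high-Vt gate on the shorter path fits within its slack, but two do not, which is why the assignment has to be checked incrementally rather than applied to every non-critical transistor at once.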
Recently, researchers have proposed many design
techniques for selecting and assigning an optimal high
threshold voltage to the gates of synchronous circuits,
which reduce leakage power under performance constraints
[2]. In asynchronous circuits, however, applying these
techniques raises serious problems, because estimating
and analyzing the performance of an asynchronous
circuit remains something of a stumbling block owing
to the dependencies between highly
concurrent events [6][7].
While synchronous performance estimation is based
on a static critical path analysis affected only by the
delay of components and interconnecting wires, it has
been shown that the performance estimation of
asynchronous circuits is more complex [6]. In
asynchronous circuits, the operation of a system
proceeds at a rate determined by the speed of its
individual components and by the sequencing of their
operations. The techniques required to analyze
asynchronous systems resemble those used to determine
the clock period of a synchronous system, that is,
summing the delays along the longest path through the
combinational logic connecting adjacent latches. In the
clocked case, the critical path has a clear beginning and
a clear end because all paths are broken by latches.
Importantly, no such clear separation is available in
asynchronous circuits. Analysis procedures must deal
directly with cyclic critical paths; thus, existing
critical-path analysis tools cannot be easily applied to
this problem.
In brief, traditional dual-Vth techniques cannot be
applied directly to asynchronous circuits. So, in the next
sections, we first introduce asynchronous linear
pipelines, and then propose an abstract performance
model of these circuits to which the dual-Vt problem can
be applied. Non-linear pipelines are under
investigation.
4. Dual-Vt PCFB linear-pipelines design
methodology
To employ the dual-Vt technique, a suitable
performance model for PCFB linear pipelines is
presented first, and then an efficient algorithm for
assigning high and low Vt to the components of a PCFB
pipeline is proposed.
4.1 Formal Performance-Analysis
To analyze the performance of a PCFB pipeline, a
formal and detailed equation is needed: an exact
equation makes it possible to assign a suitable Vt to the
circuit components without affecting performance. In
order to determine the cycle time of a pipeline, it is
necessary to analyze the dependencies of the required
sequence of transitions. These dependencies can be
drawn in a marked directed graph in which the nodes
correspond to specific rising or falling transitions of
circuit components, and the edges represent the
dependencies of each transition on the outputs of other
components. The delay of each transition is represented
by a value attached to the corresponding node in the
graph. These graphs are called “Dependency
Graphs” [8][9].
If all the stages have the same function blocks, the
graph can be folded. Each edge in the Folded Graph is
annotated with an integer weight giving the offset in
stage indices to which that dependency refers. Cycles in
the Folded Graph whose edge weights sum to zero
correspond to the cycles in the original Dependency
Graph; thus, the zero-weight cycle with the largest
sum of node delay values gives the cycle time. More
details can be found in [8]. Figure 3 shows the Folded
Dependency Graph of the PCFB pipeline.
Figure 3. The Folded Dependency Graph for
the PCFB pipeline
The possible loops of the PCFB pipeline are (F, C) and
(F, Ack, Int, Ack, Int, C), with an index of (+1), and
(Ack, C, Ack, Int), (Ack, C, Ack, C, F),
(Ack, C, Int) and (Ack, C, Int, C), with an index of (-1).
Finally, the loop (F, Ack, Int, Ack, C) has an
index of (0). The resultant cycle time equation is shown
in eq.1.
$$
\begin{aligned}
C_{PCFB} = \max\Big[\, & T_F + T_{Ack} + T_{Int} + T_{Ack} + T_C,\\
& \max\big[\, T_F + T_C,\; T_F + T_{Ack} + T_{Int} + T_{Ack} + T_{Int} + T_C \,\big]\\
& + \max\big[\, T_{Ack} + T_C + T_{Ack} + T_{Int},\; T_{Ack} + T_C + T_{Ack} + T_C + T_F,\\
& \qquad\quad\; T_{Ack} + T_C + T_{Int},\; T_{Ack} + T_C + T_{Int} + T_C \,\big] \Big]
\end{aligned}
\tag{1}
$$
The propagation delay through node $x$, denoted
$T(x)$ and evaluated separately for its rising and
falling transitions, defines how quickly the output
responds to a change in the input. Using an RC-delay
model, the dependence of the propagation delay on the
value of Vt is obtained [4].
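As a minimal sketch of how eq.1 can be evaluated, the code below sums the node delays of each listed loop, takes the worst loop of index (+1), the worst loop of index (-1) and the loop of index (0), and combines them as in eq.1. Two simplifying assumptions are made here that are not in the paper: each node uses a single delay value regardless of transition direction, and the delay values in the example are hypothetical.

```python
# Loops of the folded dependency graph, grouped by index as listed in the text.
LOOPS_PLUS_ONE = [("F", "C"), ("F", "Ack", "Int", "Ack", "Int", "C")]
LOOPS_MINUS_ONE = [
    ("Ack", "C", "Ack", "Int"),
    ("Ack", "C", "Ack", "C", "F"),
    ("Ack", "C", "Int"),
    ("Ack", "C", "Int", "C"),
]
LOOPS_ZERO = [("F", "Ack", "Int", "Ack", "C")]


def loop_delay(loop, delay):
    """Sum of node delays along one loop (a node may occur more than once)."""
    return sum(delay[node] for node in loop)


def pcfb_cycle_time(delay):
    """Evaluate eq.1: max(zero-index loop, worst (+1) loop + worst (-1) loop)."""
    zero = max(loop_delay(loop, delay) for loop in LOOPS_ZERO)
    plus = max(loop_delay(loop, delay) for loop in LOOPS_PLUS_ONE)
    minus = max(loop_delay(loop, delay) for loop in LOOPS_MINUS_ONE)
    return max(zero, plus + minus)


# Hypothetical delays in arbitrary time units (assumption, not measured data).
example_delay = {"F": 4.0, "C": 2.0, "Ack": 1.0, "Int": 1.0}
cycle = pcfb_cycle_time(example_delay)
print(cycle)  # 20.0 for the assumed delays
```

For the assumed delays, the zero-index loop sums to 9, while the worst (+1) and (-1) loops each sum to 10, so the combined term of 20 determines the cycle time.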
4.2 Dual-VT Assignment
We propose assigning low threshold voltage
(low-VT) and high threshold voltage (high-VT) to the
transistors in the PCFB template in such a way that the
delay remains the same as in an all-low-VT design, while
the sub-threshold leakage current is reduced significantly.
Circuits comprise linear and non-linear pipelines.
In this paper only linear pipelines are considered. The
first step of the algorithm is to initialize the circuit with a
single low threshold. The low threshold is determined
by the performance requirement. After initialization, all
delay parameters associated with each node are
computed. Using eq.1, the cycle time of the pipeline is
determined. At this step, all elements that do not
participate in the critical loop are identified and assigned
high-Vt. After the network is updated for high-Vt, the
circuit parameters are recomputed. The pseudo-code of
this procedure (Assign-Vt) is shown below.
5. Experimental Results
The method to reduce leakage power using
dual-threshold-voltage transistors has been implemented in C
under the Berkeley SIS environment. All the simulation
results were obtained using HSPICE with the BSIM3v3
model for a 0.18μm MOSIS process. The effective
channel length of the transistor is taken as 0.18μm and
the gate oxide thickness is taken as 40 Å. For
simplicity, all transistors are assumed to have the same
channel length of 0.18μm, while the channel widths for
nMOSFETs and pMOSFETs are assumed to be 0.54μm
and 1.62μm, respectively. The sub-threshold swing
coefficient is taken as 1.44, and the body effect
coefficients and DIBL coefficients are 0.03 and
0.21 for nMOSFETs and 0.02 and 0.11 for pMOSFETs,
respectively. For the active mode and standby mode of
the circuit, temperatures are assumed to be 110 °C and
25 °C, respectively. The supply voltage is assumed to be
1.0V, and the zero-bias low and high threshold
voltages used in our experiments are 0.2V and 0.5V,
respectively.
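To give a feel for why the threshold values above matter, the following sketch uses the textbook first-order subthreshold model, in which the leakage current scales as exp(-Vt / (n * kT/q)). This is an assumption for illustration only and is not the BSIM3v3 model used in the HSPICE simulations: it ignores DIBL, body effect and the temperature dependence of Vt itself. The swing coefficient, threshold voltages and temperatures are the ones quoted above; the function names are hypothetical.

```python
import math

BOLTZMANN_OVER_Q = 8.617333262e-5  # k/q in volts per kelvin


def thermal_voltage(temp_celsius):
    """Thermal voltage kT/q in volts at the given temperature."""
    return BOLTZMANN_OVER_Q * (temp_celsius + 273.15)


def leakage_ratio(vt_low, vt_high, swing_coeff, temp_celsius):
    """Ratio I_sub(low-Vt) / I_sub(high-Vt) under the first-order model."""
    return math.exp((vt_high - vt_low) / (swing_coeff * thermal_voltage(temp_celsius)))


# Values quoted in the experimental setup: n = 1.44, Vt = 0.2 V / 0.5 V.
ratio_active = leakage_ratio(0.2, 0.5, 1.44, 110.0)   # active mode, 110 degrees C
ratio_standby = leakage_ratio(0.2, 0.5, 1.44, 25.0)   # standby mode, 25 degrees C

print(f"active:  low-Vt leaks about {ratio_active:.0f}x more than high-Vt")
print(f"standby: low-Vt leaks about {ratio_standby:.0f}x more than high-Vt")
```

Under this simplified model, a single transistor's subthreshold current drops by several hundred times at 110 °C and by a few thousand times at 25 °C when its threshold is raised from 0.2V to 0.5V.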
Assign-Vt () {
1. For each pipeline of the circuit, do
   {
2.   Calculate the propagation delays Tphl(x), Tplh(x) of each node x.
3.   Determine the possible loops.
4.   Calculate the cycle time C_PCFB of the pipeline.
5.   While there is an unmarked node, repeat
     {
6.     Assign high-Vt to one unmarked node, update the network, and mark the node.
7.     Update the propagation delays Tphl(x), Tplh(x) of the updated node.
8.     Update the delays of the loops.
9.     Calculate the new C_PCFB of the pipeline.
10.    If (new C_PCFB <= C_PCFB): go to step 5.
       Else: remove high-Vt from the last node and go to step 5.
     }
   }
}
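The loop in steps 5 to 10 can be sketched as a greedy pass over the nodes: each node is tentatively switched to high-Vt and the switch is kept only if the recomputed cycle time does not exceed the all-low-Vt cycle time. The code below is an illustrative sketch under stated assumptions, not the authors' C implementation: the function names, the toy loops, the delay values and the 1.5x high-Vt slowdown are hypothetical, and the cycle-time function is passed in so that any model (for example eq.1) could be used.

```python
from typing import Callable, Dict, Iterable, Sequence, Set

Delays = Dict[str, float]


def assign_high_vt(nodes: Iterable[str], low_delay: Delays, high_delay: Delays,
                   cycle_time: Callable[[Delays], float]) -> Set[str]:
    """Greedy high-Vt assignment that never increases the baseline cycle time."""
    delay = dict(low_delay)            # step 1-2: start from the all-low-Vt design
    baseline = cycle_time(delay)       # step 4: reference cycle time
    high_vt: Set[str] = set()
    for node in nodes:                 # step 5: visit (mark) each node once
        delay[node] = high_delay[node]           # steps 6-8: try high-Vt
        if cycle_time(delay) <= baseline:        # steps 9-10: keep if no penalty
            high_vt.add(node)
        else:
            delay[node] = low_delay[node]        # otherwise revert to low-Vt
    return high_vt


# Hypothetical toy model: two loops sharing node "a" (not the PCFB graph).
TOY_LOOPS: Sequence[Sequence[str]] = [("a", "b", "c"), ("a", "d")]


def toy_cycle_time(delay: Delays) -> float:
    """Cycle time of the toy model: the slowest loop."""
    return max(sum(delay[n] for n in loop) for loop in TOY_LOOPS)


low = {"a": 3.0, "b": 2.0, "c": 2.0, "d": 1.0}
high = {node: 1.5 * d for node, d in low.items()}   # assumed high-Vt slowdown
chosen = assign_high_vt(["a", "b", "c", "d"], low, high, toy_cycle_time)
print(chosen)
```

In this toy model only node "d" lies outside the critical loop with enough slack, so it is the only one that ends up with high-Vt. The result of such a greedy pass generally depends on the order in which the nodes are visited.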
Persia [10], a QDI synthesis toolset, was
employed to synthesize our benchmarks. To satisfy the
requirement of the folded dependency graph model that
the functional units in a pipeline be similar,
technology mapping was used to map the circuits to a
library which contains NAND gates. We have tested our
approach on benchmark circuits; the experimental
results are presented in
Table 1. It lists the leakage power in the active and
standby modes of operation for the single-VT and
dual-VT realizations of each circuit. On average, the
dual-VT PCFB circuits reduce leakage power by 34% in
active mode and 26% in standby mode (the means of the
per-circuit percentages in Table 1).
6. Conclusion
Reduction in leakage power has become an
important concern in low power and high performance
applications. In this paper, we introduced a
dual-threshold design technique for asynchronous PCFB
linear pipelines. A formal performance analysis was
carried out using the folded-graph method. In order to
reduce leakage power under performance constraints,
starting from a single low-Vth circuit, an algorithm for
assigning a high threshold voltage was proposed. Results
show that both active and standby leakage power can be
reduced by more than 30% for some of the circuits.
Table 1: Active and standby leakage power savings for dual-Vth PCFB circuits
                        Leakage power (uW) in active mode        Leakage power (uW) in standby mode
Circuit       Gate #    Single-VT    Dual-VT    Saving (%)       Single-VT    Dual-VT    Saving (%)
DiffEq        548       311.4        214.8      31               17.4         13.74      21
GFAdder       358       195.3        140.6      28               9.6          7.77       19
HammingEnc    503       291.7        177.9      39               15.9         11.76      26
Syndrome      625       330.9        215.0      35               20.1         14.47      28
ChienForney   807       461.8        290.9      37               26.5         15.9       40
RiBM          1103      627.1        407.6      35               34.1         26.25      23
ReedSolomon   2505      1561.8       1030       34               73.4         52.84      28
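The percentage columns of Table 1 follow from the power values as 100 * (single-VT - dual-VT) / single-VT. The short sketch below recomputes them from the table entries and also takes the mean of the per-circuit percentages; the variable and function names are illustrative choices, not part of the original tool flow.

```python
# (active single-VT, active dual-VT, standby single-VT, standby dual-VT) in uW,
# copied from Table 1.
TABLE_1 = {
    "DiffEq":      (311.4, 214.8, 17.4, 13.74),
    "GFAdder":     (195.3, 140.6, 9.6, 7.77),
    "HammingEnc":  (291.7, 177.9, 15.9, 11.76),
    "Syndrome":    (330.9, 215.0, 20.1, 14.47),
    "ChienForney": (461.8, 290.9, 26.5, 15.9),
    "RiBM":        (627.1, 407.6, 34.1, 26.25),
    "ReedSolomon": (1561.8, 1030.0, 73.4, 52.84),
}


def saving_percent(single_vt, dual_vt):
    """Leakage power saving of the dual-VT design relative to single-VT, in %."""
    return 100.0 * (single_vt - dual_vt) / single_vt


# Per-circuit savings rounded to whole percent, as in the table.
savings = {
    name: (round(saving_percent(a_single, a_dual)),
           round(saving_percent(s_single, s_dual)))
    for name, (a_single, a_dual, s_single, s_dual) in TABLE_1.items()
}

mean_active = sum(active for active, _ in savings.values()) / len(savings)
mean_standby = sum(standby for _, standby in savings.values()) / len(savings)

for name, (active, standby) in savings.items():
    print(f"{name:12s} active {active:2d}%  standby {standby:2d}%")
print(f"mean active {mean_active:.1f}%  mean standby {mean_standby:.1f}%")
```

The recomputed per-circuit values match the percentage columns of the table; their means come out at roughly 34% for active mode and 26% for standby mode.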
7. References
[1] A. M. Lines, "Pipelined Asynchronous Circuits," MSc
Thesis, California Institute of Technology, June 1995,
revised 1998.
[2] A. Agarwal, S. Mukhopadhyay, A. Raychowdhury,
K. Roy, and C. H. Kim, "Leakage Power Analysis and
Reduction for Nanoscale Circuits," IEEE, 2006.
[3] S. Borkar, "Design Challenges of Technology Scaling,"
IEEE Micro, vol. 19, no. 4, July-Aug. 1999, pp. 23-29.
[4] S. Mutoh et al., "1-V Power Supply High-Speed Digital
Circuit Technology with Multithreshold Voltage
CMOS," IEEE J. Solid-State Circuits, vol. 30, no. 8,
Aug. 1995, pp. 847-854.
[5] J. Sparso and S. Furber, "Principles of Asynchronous
Circuit Design: A Systems Perspective," Kluwer
Academic Publishers, 2002.
[6] S. Kim and P. A. Beerel, "Pipeline Optimization
for Asynchronous Circuits: Complexity Analysis and an
Efficient Optimal Algorithm," IEEE Trans. on
Computer-Aided Design of Integrated Circuits and Systems,
vol. 25, no. 3, March 2006.
[7] C. V. Ramamoorthy and G. S. Ho, "Performance
Evaluation of Asynchronous Concurrent Systems Using
Petri Nets," IEEE Trans. Softw. Eng., vol. 6, no. 5, pp.
440-449, Sep. 1980.
[8] E. Yahya and M. Renaudin, "QDI Latches Characteristics
and Asynchronous Linear-Pipeline Performance
Analysis," Research Report TIMA-RR--06/-01--FR,
2006.
[9] T. Williams, "Performance of Iterative Computation in
Self-Timed Rings," Journal of VLSI Signal Processing,
vol. 7, pp. 17-31, 1994.
[10] Persia Site: http://www.async.ir/persia/persia.php