Signal Delay in RC Tree Networks

TL;DR: Upper and lower bounds for delay that are computationally simple are presented in this paper and can be used to bound the delay, given the signal threshold, and to certify that a circuit is "fast enough," given both the maximum delay and the voltage threshold.
Abstract: In MOS integrated circuits, signals may propagate between stages with fanout. The exact calculation of signal delay through such networks is difficult. However, upper and lower bounds for delay that are computationally simple are presented in this paper. The results can be used 1) to bound the delay, given the signal threshold, or 2) to bound the signal voltage, given a delay time, or 3) certify that a circuit is "fast enough," given both the maximum delay and the voltage threshold.

Summary (2 min read)

Introduction

  • In MOS integrated circuits, a given inverter or logic node may drive several gates, some of them through long wires whose distributed resistance and capacitance may not be negligible.
  • The work reported here has led to a computationally simple technique for finding upper and lower bounds for the delay.
  • The resistance of the metal line is neglected, but its parasitic capacitance remains.
  • Capacitances associated with the pullup source diffusion, contact cuts, and the gates being driven are included.
  • The work reported here actually applies to voltage sources other than steps, and an example appears below with a saturated ramp input source.

Analysis

  • Consider any resistor tree with no node at ground.
  • For simplicity the examples in this paper involve only lumped resistors and capacitors and uniform RC lines.
  • The tree representing the signal path is driven at the input with a unit step voltage.
  • It is assumed that the output voltages cannot be calculated easily.
  • For the moment consider only lumped capacitors; the theory is similar if the distributed lines are considered also.

The resistance

  • Rke is defined as the resistance of the portion of the path between the input and e that is common with the path between the input and node k.
  • The sum (over all the capacitors in the network) TDe = Σk Rke Ck has the dimensions of time, and is equal to the first-order moment of the impulse response, which has been called "delay" by Elmore [3].
  • While TRe is in general different for different output nodes, Tp is the same for all outputs, and TRe ≤ TDe ≤ Tp (4). For nonuniform RC lines (i.e., RC trees without side branches) TDe = Tp.
  • The general form of all these bounds is illustrated in Figure 4. From (17) it can be seen that bounds for the ramp response can be obtained simply by integrating the unit step bounds.

Practical Algorithms

  • One way to use the inequalities of the previous sections is to consider the overall RC tree, and compute for each capacitor the appropriate Rke and Rkk so that Tp, TDe , and TRe for each output can be found.
  • Of course for distributed lines the sums are replaced by appropriate integrals.
  • The calculations necessary for each output require time proportional to the square of the number of elements.
  • An alternate approach is to build up the network by construction, and calculate independently for each of the partially constructed networks enough information to permit the final calculation of Tp, TDe , and TRe.
  • Programs that implement this approach appear elsewhere, in both a restricted form [2] and a more general form [1].

Conclusions

  • A computationally efficient method for calculating the signal delay through MOS interconnect lines with fanout has been described.
  • Tight upper and lower bounds for the step response of RC trees have been presented.
  • Linear-time algorithms exist for calculating these bounds from an algebraic description of the tree.


SIGNAL DELAY IN RC TREE NETWORKS*
Paul Penfield, Jr.
Department of Electrical Engineering and Computer Science
Massachusetts Institute of Technology, Cambridge, MA 02139
Jorge Rubinstein
Digital Equipment Corporation
75 Reed Road, Hudson, MA 01749
Abstract
In MOS integrated circuits, signals may
propagate between stages with fanout. The MOS
interconnect may be modeled by an RC tree. Exact
calculation of signal delay through such networks is
difficult. However, upper and lower bounds for
delay that are computationally simple are presented
here. The results can be used (1) to bound the
delay, given the signal threshold; or (2) to bound
the signal voltage, given a delay time; or (3) to
certify that a circuit is "fast enough", given both
the maximum delay and the voltage threshold.
Introduction
In MOS integrated circuits, a given inverter or
logic node may drive several gates, some of them
through long wires whose distributed resistance and
capacitance may not be negligible. There does not
seem to be reported in the literature any simple
method for estimating signal propagation delay in
such circuits, nor is there any general theory of
the properties of RC trees, as distinct from RC
lines. The work reported here has led to a
computationally simple technique for finding upper
and lower bounds for the delay. The technique is of
importance for VLSI designs in which the delay
introduced by the interconnections may be comparable
to or longer than active-device delay. This can be
the case for polysilicon wires as short as 1 mm,
with 4-micron devices. The importance of this
technique grows as the wiring lengths increase or
feature sizes decrease.
*This work was supported in part by Digital
Equipment Corporation, in part by the Advanced
Research Projects Agency of the Department of
Defense and monitored by the Office of Naval
Research under Contract N00014-C-80-0622, and in
part by the Air Force under Contract number AFOSR
4-9620-80-0073.
Consider the circuit of Figure 1. The slowest
transition (and therefore presumably the one of most
interest) occurs when the driving inverter shuts off
and its output voltage rises from a small value to
VDD. During this process the various parasitic
capacitances on the output are charged through the
pullup transistor. Figure 2 shows a simple model of
this circuit for timing analysis. The pullup, which
is nonlinear, is approximated by a linear resistor,
and the transition is represented by a voltage
source going from 0 to VDD at time t = 0.
(Later, for simplicity, a unit step will be
considered instead.) The polysilicon lines are
represented by uniform RC lines. The resistance of
the metal line is neglected, but its parasitic
capacitance remains. Capacitances associated with
the pullup source diffusion, contact cuts, and the
gates being driven are included. Any nonlinear
capacitances are approximated by linear ones. The
work reported here actually applies to voltage
sources other than steps, and an example appears
below with a saturated ramp input source.
In general, the circuit response cannot be
found in closed form. The results of this paper can
be used to calculate upper and lower bounds to the
delay that are very tight in the case where most of
the resistance is in the pullup. The theory as
presented here does not explicitly deal with
nonlinearities and therefore does not apply to signal
propagation through pass transistors unless they are
modelled as linear resistors. A more complete
discussion of this theory will appear elsewhere [1],
[2].
Analysis
An RC tree is defined as follows. Consider any
resistor tree with no node at ground. From each
node in this tree a capacitor to ground may be
added, and any resistor may be replaced by a
distributed RC line. Although nonuniform RC lines
may appear in an RC tree, for simplicity the
18th Design Automation Conference Paper 30.2
0146-7123/81/0000-0613500.75 © 1981 IEEE 613

examples in this paper involve only lumped resistors
and capacitors and uniform RC lines. An RC tree has
one input and any number of outputs. Side branches
may or may not end in a node that is considered as
an output; in fact, outputs may be taken anywhere in
the tree. Nonuniform RC lines are special cases of
RC trees, without any side branches. An important
property of RC trees is that there is a unique path
from any point in the tree to the input.
The tree representing the signal path is driven
at the input with a unit step voltage. (Below, this
result is generalized to other driving voltages.)
Gradually the voltages at all other nodes, and in
particular at all the outputs, rise from 0 to 1
volt. It is assumed that the output voltages cannot
be calculated easily. The problem is to find simple
upper and lower bounds for the output voltages, or,
equivalently, to find upper and lower bounds for the
delay associated with each output.
Consider any output node e, and any lumped
capacitor at node k with capacitance Ck. For the
moment consider only lumped capacitors; the theory
is similar if the distributed lines are considered
also. One may think of many-stage approximations
for the distributed lines, or one may convert some
summations in the formulas below to a form including
both summations over lumped capacitors and integrals
over distributed ones.
The resistance Rke is defined as the
resistance of the portion of the (unique) path
between the input and e, that is common with the
(unique) path between the input and node k. In
particular, Ree is the resistance between input and
output e and Rkk is the resistance between the
input and node k. Thus Rke ≤ Rkk and Rke ≤ Ree.
For an illustration, see Figure 3.
The sum (over all the capacitors in the
network)
$$T_{De} = \sum_k R_{ke} C_k \tag{1}$$
has the dimensions of time, and is equal to the
first-order moment of the impulse response, which
has been called "delay" by Elmore [3]. Next, define
for each output e two quantities that also have
the dimensions of time,
$$T_P = \sum_k R_{kk} C_k \tag{2}$$
$$T_{Re} = \Bigl(\sum_k R_{ke}^2 C_k\Bigr) / R_{ee}. \tag{3}$$
All three summations extend over all the capacitors
of the network. Each of these three quantities
plays a role in the final delay formulas, but none
of them is equal to the delay. Each can be computed
easily, even in the presence of distributed lines,
and while TRe is in general different for
different output nodes, Tp is the same for all
outputs. It is easily seen that
$$T_{Re} \le T_{De} \le T_P. \tag{4}$$
For nonuniform RC lines (i.e., RC trees without side
branches) TDe = Tp. For a single uniform RC line,
Tp = TDe = RC/2, and TRe = RC/3.
A detailed derivation [1] leads to the upper
bounds for the unit step response Ve(t)
$$V_e(t) \le 1 - \frac{T_{De} - t}{T_P} \tag{5}$$
$$V_e(t) \le 1 - \frac{T_{De}}{T_P}\, e^{-t/T_{Re}} \tag{6}$$
and lower bounds for the unit step response Ve(t)
$$V_e(t) \ge 0 \tag{7}$$
$$V_e(t) \ge 1 - \frac{T_{De}}{t + T_{Re}} \tag{8}$$
$$V_e(t) \ge 1 - \frac{T_{De}}{T_P}\, e^{(T_P - T_{Re})/T_P}\, e^{-t/T_P} \tag{9}$$
where (9) applies if t ≥ Tp - TRe. The tightest
upper bounds are (5) for small t and (6) for
large t. The tightest lower bounds are (7) for
t ≤ TDe - TRe, (8) for TDe - TRe ≤ t ≤ Tp - TRe,
and (9) for Tp - TRe ≤ t.
Bounds for the time, given the unit step
response voltage, are possible because the voltage
is a monotonic function of time (a fact proven in
[1]). Of course
$$t \ge 0 \tag{10}$$
and in addition, (5) and (6) can be inverted to
yield
$$t \ge T_{De} - T_P\,[1 - V_e(t)] \tag{11}$$
$$t \ge T_{Re} \ln \frac{T_{De}}{T_P\,[1 - V_e(t)]} \tag{12}$$
and (8) and (9) yield
$$t \le \frac{T_{De}}{1 - V_e(t)} - T_{Re} \tag{13}$$
$$t \le T_P - T_{Re} + T_P \ln \frac{T_{De}}{T_P\,[1 - V_e(t)]} \tag{14}$$
Figure 1. Typical MOS signal-distribution network. The inverter is
shown driving three gates.
Figure 2. Linear-circuit model for the network of Figure 1. The
voltage source is a step at time t = 0.
Figure 3. Illustration of resistance terms. For this network,
Rke = R1 + R2, Rkk = R1 + R2 + R3, and Ree = R1 + R2 + R5.
Figure 4. Form of the bounds, with the distances
from the exact solution exaggerated for clarity.
Figure 5. Example network. Parameter values
are in ohms and farads.
Figure 6. Upper and lower bounds for the network in Figure 5, with a saturated ramp input.
The exact solution, found from circuit simulation, is shown also.

where (14) applies only if Ve(t) ≥ 1 - TDe/Tp. The
general form of all these bounds is illustrated in
Figure 4.
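As a rough illustration of how the voltage bounds (5)-(9) and the time bounds (10)-(14) might be evaluated, the following Python sketch combines them by taking the tightest applicable bound at each point. The function names and the sample values (a single uniform RC line with RC = 1, so Tp = TDe = 1/2 and TRe = 1/3) are assumptions chosen for the example, not part of the paper.

```python
import math

def step_bounds(t, t_p, t_de, t_re):
    """(lower, upper) bounds on the unit step response at time t >= 0."""
    upper = min(1.0 - (t_de - t) / t_p,                      # (5)
                1.0 - (t_de / t_p) * math.exp(-t / t_re))    # (6)
    lower = max(0.0, 1.0 - t_de / (t + t_re))                # (7), (8)
    if t >= t_p - t_re:                                      # (9) applies only here
        lower = max(lower, 1.0 - (t_de / t_p)
                    * math.exp((t_p - t_re) / t_p - t / t_p))
    return lower, upper

def delay_bounds(v, t_p, t_de, t_re):
    """(lower, upper) bounds on the time at which the step response reaches v, 0 <= v < 1."""
    x = t_p * (1.0 - v)
    lower = max(0.0, t_de - x, t_re * math.log(t_de / x))    # (10), (11), (12)
    upper = t_de / (1.0 - v) - t_re                          # (13)
    if v >= 1.0 - t_de / t_p:                                # condition for (14)
        upper = min(upper, t_p - t_re + t_p * math.log(t_de / x))
    return lower, upper

# Assumed example: single uniform RC line with RC = 1.
T_P, T_DE, T_RE = 0.5, 0.5, 1.0 / 3.0
t_lo, t_hi = delay_bounds(0.5, T_P, T_DE, T_RE)
```

A circuit could then be certified as "fast enough" for a given threshold and deadline by checking that the upper time bound does not exceed the deadline.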
Arbitrary Input Waveforms
Bounds for the response Ye(t) of an RC tree
to an arbitrary excitation x(t) can be obtained
from the bounds Vue(t) and Vle(t) just derived
for the unit step response Ve(t).
First, the superposition integral can be used
to obtain Ye(t) as
$$Y_e(t) = \int_0^t V_e(t - t')\, \frac{dx(t')}{dt'}\, dt' = V_e(t) * \frac{dx}{dt} \tag{15}$$
where * denotes time convolution. From
$$V_{le}(t) \le V_e(t) \le V_{ue}(t) \tag{16}$$
one obtains, if dx/dt ≥ 0,
$$V_{le}(t) * \frac{dx}{dt} \le Y_e(t) \le V_{ue}(t) * \frac{dx}{dt} \tag{17}$$
or if dx/dt ≤ 0,
$$V_{ue}(t) * \frac{dx}{dt} \le Y_e(t) \le V_{le}(t) * \frac{dx}{dt} \tag{18}$$
where Vue(t) and Vle(t) are known analytically.
From (17) it can be seen that bounds for the ramp
response can be obtained simply by integrating the
unit step bounds. Equations (17) and (18) apply for
monotonic inputs.
The general case, where the excitation x(t)
has both positive and negative slopes, is treated
elsewhere [1].
As an illustration of the use of these
relations, consider the network of Figure 5, excited
with a saturated ramp. The actual response
(calculated from an expensive simulation) is shown
along with the upper and lower bounds, from (17), in
Figure 6.
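To make the use of (17) concrete, here is a minimal numerical sketch for a saturated ramp that rises linearly from 0 to 1 over a rise time t_rise, so that dx/dt = 1/t_rise on that interval and zero afterwards. The convolution then reduces to an average of the step bounds over the window from max(0, t - t_rise) to t. The trapezoidal integration, the function names, and the sample characteristic times are assumptions for illustration; the element values of Figure 5 are not used.

```python
import math

def step_bounds(t, t_p, t_de, t_re):
    """(lower, upper) bounds on the unit step response, from (5)-(9)."""
    upper = min(1.0 - (t_de - t) / t_p,
                1.0 - (t_de / t_p) * math.exp(-t / t_re))
    lower = max(0.0, 1.0 - t_de / (t + t_re))
    if t >= t_p - t_re:
        lower = max(lower, 1.0 - (t_de / t_p)
                    * math.exp((t_p - t_re) / t_p - t / t_p))
    return lower, upper

def ramp_bounds(t, t_rise, t_p, t_de, t_re, steps=2000):
    """(lower, upper) bounds on the saturated-ramp response at time t, via (17)."""
    a = max(0.0, t - t_rise)          # start of the integration window
    if t <= a:
        return 0.0, 0.0
    h = (t - a) / steps
    lo = hi = 0.0
    for i in range(steps + 1):        # trapezoidal rule on both bounds at once
        w = 0.5 if i in (0, steps) else 1.0
        l, u = step_bounds(a + i * h, t_p, t_de, t_re)
        lo += w * l
        hi += w * u
    return lo * h / t_rise, hi * h / t_rise

# Assumed characteristic times (uniform RC line with RC = 1) and a unit rise time.
T_P, T_DE, T_RE = 0.5, 0.5, 1.0 / 3.0
early = ramp_bounds(0.0, 1.0, T_P, T_DE, T_RE)
late = ramp_bounds(50.0, 1.0, T_P, T_DE, T_RE)
```

As the rise time shrinks toward zero, these bounds approach the step bounds at the same instant, which gives a quick sanity check on the integration.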
Practical Algorithms
One way to use the inequalities of the previous
sections is to consider the overall RC tree, and
compute for each capacitor the appropriate Rke and
Rkk so that Tp, TDe , and TRe for each output
can be found. Of course for distributed lines the
sums are replaced by appropriate integrals. In this
approach, the calculations necessary for each output
require time proportional to the square of the
number of elements.
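The direct approach described above can be sketched as follows: for one output e, walk the path to the input for every capacitor and accumulate the sums (1)-(3). The data layout (dictionaries `parent`, `res`, `cap`) and the helper `lumped_line`, which approximates a uniform RC line by n equal lumped sections, are assumptions made for this example. Because each capacitor triggers its own path walk, the cost can grow with the square of the number of elements, in line with the remark in the text.

```python
def characteristic_times(parent, res, cap, e):
    """Return (T_P, T_De, T_Re) for output e by direct evaluation of (1)-(3).

    parent[n] is the node one step closer to the input, res[n] the resistance
    between n and parent[n], cap[n] the capacitance from n to ground.
    """
    def path(n):
        nodes = []
        while n in parent:            # stop at the input, which has no parent
            nodes.append(n)
            n = parent[n]
        return nodes

    path_e = set(path(e))
    r_ee = sum(res[n] for n in path_e)
    t_p = t_de = sq = 0.0
    for k, c in cap.items():
        pk = path(k)
        r_kk = sum(res[n] for n in pk)
        r_ke = sum(res[n] for n in pk if n in path_e)
        t_p += r_kk * c
        t_de += r_ke * c
        sq += r_ke ** 2 * c
    return t_p, t_de, sq / r_ee

def lumped_line(r_total, c_total, n):
    """Hypothetical n-section lumped approximation of a uniform RC line (node 0 = input)."""
    parent = {i: i - 1 for i in range(1, n + 1)}
    res = {i: r_total / n for i in range(1, n + 1)}
    cap = {i: c_total / n for i in range(1, n + 1)}
    return parent, res, cap

t_p, t_de, t_re = characteristic_times(*lumped_line(1.0, 1.0, 1000), e=1000)
```

With many sections the results approach the uniform-line values quoted earlier, Tp = TDe = RC/2 and TRe = RC/3.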
An alternate approach is to build up the
network by construction, and calculate independently
for each of the partially constructed networks
enough information to permit the final calculation
of Tp, TDe , and TRe. The computation time for
each output is then proportional to the number of
elements, rather than the square of the number.
Programs that implement this approach appear
elsewhere, in both a restricted form [2] and a more
general form [1].
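The programs of [1] and [2] are not reproduced here. As a stand-in, the sketch below shows one simple way to reach linear time per output with a single walk from the input: carry the resistance from the input (Rkk) down the tree, and let Rke follow Rkk while the walk stays on the path to e and freeze once it leaves that path. This is a plain tree traversal chosen for illustration, not the construction-based method the paper refers to, and the data layout and example values are assumptions.

```python
def times_linear(parent, res, cap, e):
    """Return (T_P, T_De, T_Re) for output e, visiting each node exactly once."""
    children = {}
    for n, p in parent.items():
        children.setdefault(p, []).append(n)
    root = next(p for p in parent.values() if p not in parent)   # the input node

    on_path = set()                   # nodes on the unique path from e to the input
    n = e
    while n in parent:
        on_path.add(n)
        n = parent[n]

    r_from_input = {root: 0.0}        # R_kk for each node k
    r_common = {root: 0.0}            # R_ke for each node k
    t_p = t_de = sq = 0.0
    stack = [root]
    while stack:
        n = stack.pop()
        if n in parent:
            p = parent[n]
            r_from_input[n] = r_from_input[p] + res[n]
            # On the path to e the shared resistance keeps growing; off it, it is frozen.
            r_common[n] = r_from_input[n] if n in on_path else r_common[p]
        c = cap.get(n, 0.0)
        t_p += r_from_input[n] * c
        t_de += r_common[n] * c
        sq += r_common[n] ** 2 * c
        stack.extend(children.get(n, []))
    return t_p, t_de, sq / r_from_input[e]

# Assumed example tree: input -1- a -2- b, then b -3- k -4- m and b -5- e, unit capacitors.
parent = {"a": "in", "b": "a", "k": "b", "m": "k", "e": "b"}
res = {"a": 1.0, "b": 2.0, "k": 3.0, "m": 4.0, "e": 5.0}
cap = {n: 1.0 for n in parent}
result = times_linear(parent, res, cap, "e")
```

For this example the sums can be checked by hand: Rkk takes the values 1, 3, 6, 10, 8 and Rke the values 1, 3, 3, 3, 8, giving Tp = 28, TDe = 18, and TRe = 92/8 = 11.5.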
Conclusions
A computationally efficient method for
calculating the signal delay through MOS
interconnect lines with fanout has been described.
Tight upper and lower bounds for the step response
of RC trees have been presented. Linear-time
algorithms exist for calculating these bounds from
an algebraic description of the tree. Substantial
computational simplicity is achieved even in the
presence of RC distributed lines by representing
the RC tree by a small set of suitably defined
characteristic times, which can be calculated by
inspection and used to generate the bounds.
Acknowledgements
The authors are pleased to acknowledge
discussions with Steven Greenberg, Llanda Richardson, and
Lance Glasser, and help from Barbara Lory in
manuscript preparation.
References
[1] J. Rubinstein and P. Penfield, Jr.; to be
published.
[2] P. Penfield, Jr., and J. Rubinstein, "Signal
Delay in RC Tree Networks," to appear in Proceedings
of the Second Caltech Conference on VLSI, Pasadena,
CA; January 19-21, 1981.
[3] W. C. Elmore, "The Transient Response of Damped
Linear Networks with Particular Regard to Wide-Band
Amplifiers," Journal of Applied Physics, vol. 19,
no. 1, pp. 55-63; January 1948.