## A-Priori Wirelength and Interconnect Estimation Based on Circuit Characteristics

## Shankar Balachandran <br> Dinesh Bhatia

Center for Integrated Circuits and Systems
University of Texas at Dallas
shankars@utdallas.edu dinesh@utdallas.edu

## Outline

- Introduction
- Motivation and Prior Work
- Definitions
- Wirelength Prediction
- Routing Demand Prediction
- Conclusion

## Introduction

- Interconnect Prediction
- breaks repetitive design convergence loop
- helps in performing early feasibility studies
- A-Priori Prediction
- is done before placement stage
- is used to provide congestion map, wirelength metrics
- can be used for architecture evaluation


## Scope of This Work

- A-Priori wirelength and interconnect prediction for island-style FPGAs
- Bounding box prediction for all wires
- Identifying important circuit characteristics which constrain placement
- Assumes that wirelength is minimized during placement
- Routing demand estimation
- Channel Width calculation
- No prior characterization of placement/router


## Prior Work

- Rent's Rule [TVLSI 2000, Bakoglu]
- Interconnect prediction builds models for architecture, circuit and placement
- Can calculate average/total wirelength, congestion etc.
- Sechen [ICCAD 87]
- Average wirelength of optimized placements
- For all possible bounding boxes, enumerate all possible positions for sources and sinks to calculate average wirelength of the whole netlist
- Hamada et al. [DAC 92]
- Break down nets into cliques and perform neighborhood analysis on them
- Placement is considered a stochastic process
- Wirelength distribution is calculated
- Bodapati et al. [SLIP 00]
- Bounding box estimates using structural analysis
- Needs calibration of placement/router


## Motivation

- Average wirelength is not sufficient
- Rent's rule, Sechen and Hamada et al. report only these figures
- Individual wirelength is useful
- Logic synthesis, floorplanners

- Congestion metrics should be quantifiable

- Very important for FPGAs
- Channel Width requirements for routers
- To avoid the chicken and egg problem


## Island Style FPGA



## Methodology



## Definitions [CIRC - TCAD 98, TVLSI 02]

- Combinational level $c(x)$ of a node is:
$c(x)=\max \left(c\left(x^{\prime}\right) \mid x^{\prime} \in \operatorname{fanin}(x)\right)+1$
- $\operatorname{cmax}=\max (c(x))$

- Sequential level $s(x)$ of a node is:
$s(x)=\begin{cases}0 & ; x \text{ is a PI-node} \\ s\left(x^{\prime}\right)+1 & ; x \text{ is a FF-node with input } x^{\prime} \\ \min \left(s\left(x^{\prime}\right) \mid x^{\prime} \in \operatorname{fanin}(x)\right) & ; \text{otherwise}\end{cases}$
- $\operatorname{smax}=\max (s(x))$

- Shape: a vector
- $\text{Shape}=\left[c_{0}, \ldots, c_{\operatorname{cmax}}\right]$; $c_{i}=$ number of nodes in level $i$
- For sequential circuits, the combinational shape vectors of the sequential levels are concatenated back to back
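
The level and shape definitions above can be sketched in a few lines of Python. This is only an illustration: the slide gives no base case for $c(x)$, so the sketch assumes nodes without fanin sit at level 0, and the seven-node circuit and the function names are hypothetical.

```python
from collections import Counter


def combinational_levels(fanin):
    """Return {node: c(node)} for a DAG given as {node: [fanin nodes]}."""
    levels = {}

    def level(x):
        if x not in levels:
            preds = fanin.get(x, [])
            # Assumption: nodes without fanin (primary inputs) are at level 0.
            levels[x] = 0 if not preds else 1 + max(level(p) for p in preds)
        return levels[x]

    for node in fanin:
        level(node)
    return levels


def shape_vector(levels):
    """Shape[i] = number of nodes whose combinational level is i."""
    counts = Counter(levels.values())
    return [counts.get(i, 0) for i in range(max(levels.values()) + 1)]


# Hypothetical combinational circuit, listed as node -> fanin nodes.
fanin = {
    "A": [],
    "B": ["A"],
    "F": ["A"],
    "C": ["B"],
    "D": ["B"],
    "E": ["C", "D"],
    "G": ["E", "F"],
}
levels = combinational_levels(fanin)
shape = shape_vector(levels)
```

For this circuit the peak of the shape vector is shared by levels 1 and 2, each holding two nodes.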


## Definitions(2)

- Reconvergence $R_{xy}$
- Has multiple paths from $x$ to $y$
- $x$ is the origin of reconvergence
- $y$ is the destination of reconvergence
- Always contained within one sequential level
- Number of reconvergences $RN_{xy}=$ number of paths from $x$ to $y$
- Length of reconvergence $RO_{xy}(x)=RI_{xy}(y)=\frac{1}{RN_{xy}} \sum_{p \in P_{xy}} l(p)$
- $P_{xy}$ is the path-set from $x$ to $y$
- $l(p)$ is the length of path $p$

$\mathrm{R}_{\mathrm{BE}}: \mathrm{RN}=2$

$$
\begin{aligned}
& l\left(p_{1}=B \rightarrow C \rightarrow E\right)=2 \\
& l\left(p_{2}=B \rightarrow D \rightarrow E\right)=2 \\
& R O_{B E}(B)=R I_{B E}(E)=2
\end{aligned}
$$

$\mathrm{R}_{\mathrm{AG}}: R N=2$

$$
\begin{aligned}
& l\left(p_{1}=A \rightarrow(B E) \rightarrow G\right)=3 \\
& l\left(p_{2}=A \rightarrow F \rightarrow G\right)=2
\end{aligned}
$$

$$
R O_{A G}(A)=R I_{A G}(G)=2.5
$$
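
The two worked reconvergences can be reproduced with a small path enumeration. This is a sketch under stated assumptions: the edge list is inferred from the paths named on the slide, and the notation $A \rightarrow (BE) \rightarrow G$ is read as collapsing the nested reconvergence $R_{BE}$ into a single hop from $B$ to $E$. The function names are hypothetical.

```python
def all_paths(fanout, x, y):
    """Enumerate every path from x to y in a DAG given as {node: [successors]}."""
    if x == y:
        return [[y]]
    paths = []
    for nxt in fanout.get(x, []):
        for tail in all_paths(fanout, nxt, y):
            paths.append([x] + tail)
    return paths


def reconvergence(fanout, x, y):
    """Return (RN_xy, average path length), or None if fewer than two paths exist."""
    lengths = [len(p) - 1 for p in all_paths(fanout, x, y)]
    if len(lengths) < 2:
        return None
    return len(lengths), sum(lengths) / len(lengths)


# Assumed edges of the example circuit.
full = {"A": ["B", "F"], "B": ["C", "D"], "C": ["E"], "D": ["E"], "E": ["G"], "F": ["G"]}
# Same circuit with R_BE collapsed into one hop B -> E (assumed reading of "(BE)").
collapsed = {"A": ["B", "F"], "B": ["E"], "E": ["G"], "F": ["G"]}

r_be = reconvergence(full, "B", "E")
r_ag = reconvergence(collapsed, "A", "G")
```

Under these assumptions `r_be` carries two paths of length 2 and `r_ag` carries two paths of lengths 3 and 2, matching the values on the slide.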

## Overview of Our Methodology

(Flow diagram: Reconvergence Analysis produces Reconvergence Weights; Bounding Box Estimation produces Bounding Box Estimates; Track Width Estimation produces Track Width Estimates.)

## Wirelength Estimation

- Wirelength of a circuit depends on
- Structural properties of the circuit
- Placement of the circuit
- Nets from input pads usually feed more nodes than other nets
- Hence, classify the nets as logic nets and IO nets and treat them separately
- Wirelength of an individual net $N$ will depend on
- Number of terminals - $t_{N}$
- Interaction with other nets


## Phase 1 - Minimum Span for Logic Nets

- Assume a net $N$ is tightly placed. Wirelength is optimal when
- Source is placed in the center
- All sinks are tightly packed around the source
- Minimum span $L$ depends only on $t_{N}$



## Phase 1 - Minimum Span for IO-Nets

- Input pads are placed along the periphery
- IO-Nets usually have many sinks, and the location of the input pads cannot be guessed
- Assume the input pad is in the corner of the FPGA
- Worst-case calculation of wirelength
- Tight placement of sinks around the pad

(Figure: the source pad sits in the corner and the sinks are packed around it; panels are labeled $L=1, N=2$; $L=2, N=5$; $L=3, N=9$.)

$$
\begin{aligned}
& 2+3+4+\cdots+L \geq t_{N} \\
& L=\sqrt{2 \cdot t_{N}+9 / 4}-1 / 2
\end{aligned}
$$
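
As a quick numeric check, the closed form for the IO-net minimum span can be evaluated directly. The function name is hypothetical. For $t_N = 2, 5, 9$ the closed form evaluates to 2, 3, 4, which is one more than the $L$ labels in the figure, so the figure and the formula may count the span from different origins. This sketch follows the formula.

```python
import math


def io_net_min_span(t_n):
    """Closed form L = sqrt(2*t_N + 9/4) - 1/2 for a corner-placed input pad."""
    return math.sqrt(2 * t_n + 9 / 4) - 1 / 2


# Terminal counts taken from the figure labels.
spans = [io_net_min_span(t) for t in (2, 5, 9)]
```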

## Phase 2 - Dilation of Nets

- Tight placement is not always possible
- The tight-placement assumption ignores push and pull from other nets
- Net $N$ has no incident reconvergences $\Rightarrow$ no dilation
- $N$ has incident reconvergences $\Rightarrow$ other nets pull the cells away $\Rightarrow$ dilation
- Net dilation is based on reconvergences in its immediate neighborhood: source, sinks, fanin
- For any node $x$ that is an origin of any reconvergence $R_{x*}$, let the outweight be

$$
RO(x)=\frac{\sum RO_{x*}}{\sum RN_{x*}}
$$

the average length of all out-bound reconvergences

- For any node $y$ that is a destination of any reconvergence $R_{*y}$, let the inweight be

$$
RI(y)=\frac{\sum RI_{*y}}{\sum RN_{*y}}
$$

the average length of all in-bound reconvergences

## Dilation Factor

- Raw weight of a node $x$ is $RW^{\prime}(x)=RI(x)+RO(x)$
- Flip-flops have many incident reconvergences, hence an adjustment with respect to the LUT size $k$

$$
RW(x)= \begin{cases}\log _{k} RW^{\prime}(x) & ; \text{if } x \text{ is a FF-node} \\ RW^{\prime}(x) & ; \text{otherwise}\end{cases}
$$

- Dilation on a net $N$ with $v_{N}$ as its source is denoted $R^{\prime}(N)$
- Similar to flip-flops, IO-Nets have large weights, hence an empirical value is used for IO-Nets

$$
R(N)= \begin{cases}3 / \sqrt{2} & ; \text{if } N \text{ is an IO-Net} \\ R^{\prime}(N) & ; \text{otherwise}\end{cases}
$$
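
The two case distinctions on this slide can be sketched as follows. The function names are hypothetical, the per-node weights $RI(x)$ and $RO(x)$ and the per-net value $R'(N)$ are taken as given inputs, and the logarithm assumes a positive raw weight.

```python
import math


def node_weight(ri, ro, is_ff, k):
    """RW(x): raw weight RI + RO, damped by log base k for flip-flop nodes."""
    raw = ri + ro
    if is_ff:
        # Assumption: raw > 0, otherwise the logarithm is undefined.
        return math.log(raw, k)
    return raw


def net_dilation(r_prime, is_io_net):
    """R(N): empirical constant 3/sqrt(2) for IO-Nets, R'(N) otherwise."""
    return 3 / math.sqrt(2) if is_io_net else r_prime


lut_weight = node_weight(2.0, 2.0, is_ff=False, k=4)
ff_weight = node_weight(8.0, 8.0, is_ff=True, k=4)
io_dilation = net_dilation(1.7, is_io_net=True)
```

With a 4-input LUT, a flip-flop whose raw weight is 16 is reduced to a weight of about 2, while a LUT node keeps its raw weight unchanged.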

## Phase 3 - Uniform Distribution

- Let $p$ be the position in which Shape vector has the maximum value
- The nodes in this level are so many in number that they are expected to be uniformly distributed in the layout
- The nodes in this level are not connected to each other
- They are however strongly connected to the other nodes
- If a net has more than one such cell, the net will dilate more


## Spread Due To Uniform Distribution

- $SP=$ set of nodes in the peak level
- $\mathrm{N}=$ minimum required FPGA size
- Construct a hypothetical grid in which each tile holds only one cell from $SP$
- The hypothetical grid size is related to the FPGA dimension as

$$
G=\mathrm{N} / \sqrt{|SP|}
$$

- If a net $N$ has some nodes in $SP$, then its span must respect the uniform distribution assumption
- The uniformity factor is calculated as

$$
U=\sqrt{\left|SP \cap \operatorname{fanout}\left(v_{N}\right)\right|} \cdot G
$$
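
A minimal sketch of the uniformity factor, assuming the peak-level set and the fanout of the net's source are available as collections of node names. The function name and the example numbers are hypothetical.

```python
import math


def uniformity_factor(fpga_size, peak_nodes, source_fanout):
    """U = sqrt(|SP intersect fanout(v_N)|) * G, with G = N / sqrt(|SP|)."""
    grid = fpga_size / math.sqrt(len(peak_nodes))
    shared = len(set(peak_nodes) & set(source_fanout))
    return math.sqrt(shared) * grid


# Hypothetical case: a 10 x 10 device with 25 peak-level nodes,
# and a net whose source drives four of them plus one other node.
peak = {f"p{i}" for i in range(25)}
u = uniformity_factor(10, peak, ["p0", "p1", "p2", "p3", "x"])
```

Here $G = 10/\sqrt{25} = 2$ and four sinks fall in the peak level, so $U = \sqrt{4} \cdot 2 = 4$. A net with no sinks in the peak level gets $U = 0$ and is unaffected by this phase.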

## Bounding Box Span of a Net

- The bounding box span of the net $N$ depends on
- $L$ - the minimum span of the net
- $R(N)$ - the dilation of the net due to reconvergences
- $U$ - the uniformity factor
- The horizontal span of the net $N$ is

$$
\operatorname{HSpan}(N)= \begin{cases}\max (L, U) & ; \text{if } R(N) \leq 1 \\ \max (L \times R(N), U) & ; \text{if } R(N)>1\end{cases}
$$

- The vertical span of the net is the same as $\operatorname{HSpan}(N)$

- The total span of the net is

$$
\operatorname{Span}(N)=H \operatorname{Span}(N)+V \operatorname{Span}(N)=2 \cdot H \operatorname{Span}(N)
$$
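
The span rule combines the three phases and can be written directly as code. The function names are hypothetical. A dilation of exactly 1 gives the same result in either branch, since $L \times 1 = L$.

```python
def hspan(min_span, dilation, uniformity):
    """Horizontal span: dilate L only when R(N) > 1, then respect U."""
    if dilation > 1:
        return max(min_span * dilation, uniformity)
    return max(min_span, uniformity)


def total_span(min_span, dilation, uniformity):
    """Span(N) = HSpan(N) + VSpan(N), with VSpan equal to HSpan."""
    return 2 * hspan(min_span, dilation, uniformity)


undilated = hspan(3, 0.5, 2)       # no dilation, L dominates
dilated = hspan(3, 2.0, 4)         # L * R(N) dominates
spread = hspan(3, 2.0, 8)          # uniformity factor dominates
span = total_span(3, 2.0, 4)
```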

## Results for Wirelength Estimation

| Circuit | Total Error (%) | I/O Nets Error (%) | #Nets Err < N/4 | #Nets Err > N/4 | Total Error w/o $R_{N}$ (%) | I/O Error w/o $R_{N}$ (%) |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| alu4 | 2.95 | -2.24 | 1299 | 223 | 48.45 | 48.48 |
| apex2 | -19.68 | 29.10 | 1616 | 262 | 53.15 | 66.58 |
| bigkey | -21.45 | 34.37 | 1699 | 8 | -21.42 | 47.48 |
| dsip | 11.08 | -4.07 | 1367 | 3 | 10.94 | -3.46 |
| misex3 | 4.49 | 6.77 | 1170 | 227 | 54.11 | 55.69 |
| pdc | 13.44 | -14.7 | 4051 | 524 | 65.01 | 45.93 |
| s298 | -5.15 | 0 | 1837 | 94 | 20.86 | 38.16 |
| s38417 | -32.42 | -4.84 | 5955 | 451 | 37.82 | 50.57 |
| seq | 4.66 | 13.24 | 1522 | 228 | 56.11 | 58.8 |
| spla | -0.68 | -14.34 | 3329 | 361 | 58.29 | 46.09 |
| Totals | - | - | 23845 | 2194 | - | - |
| Avg. | 11.6 | 12.4 | - | - | 42.61 | 46.13 |
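
The Avg. row appears to be the mean of the absolute per-circuit errors, not the signed mean. The short check below, with the values copied from the first two error columns, reproduces the reported 11.6% and 12.4% under that reading. The helper name is hypothetical.

```python
def mean_abs(values):
    """Mean of absolute values, so positive and negative errors do not cancel."""
    return sum(abs(v) for v in values) / len(values)


# Per-circuit errors (%) copied from the table, alu4 through spla.
total_err = [2.95, -19.68, -21.45, 11.08, 4.49, 13.44, -5.15, -32.42, 4.66, -0.68]
io_err = [-2.24, 29.10, 34.37, -4.07, 6.77, -14.7, 0, -4.84, 13.24, -14.34]

avg_total = mean_abs(total_err)
avg_io = mean_abs(io_err)
```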

## Individual Wirelengths



## Error in Span Vs Number of Nets



## Routing Demand Estimation

- We use RISA to calculate number of routing elements needed
- An empirical technique based on wirelength of nets with various terminal sizes
- The routing demand is based on two factors
- $q$ - an empirical factor dependent on $t_{N}$
- Bounding Box Sizes
- The actual routing demand for a net N is calculated as

$$
D_{h}^{N}=q \times \frac{1}{\operatorname{HSpan}(N)} ; D_{v}^{N}=q \times \frac{1}{\operatorname{VSpan}(N)}
$$

## Definitions

- $C=$ number of logic blocks
- $nIO=$ number of I/O blocks
- If the circuit is placed in the smallest possible device, its width (also height) is given as

$$
\mathrm{N}=\max (nIO / 4, \sqrt{C})
$$

- $TD=$ total number of routing elements needed

$$
TD=\sum_{N}\left(D_{h}^{N}+D_{v}^{N}\right)
$$

## Channel Width Estimation for Pad Unconstrained Circuits

- Pad-Unconstrained Circuits
- $\mathrm{N}=\sqrt{C}$
- $TD$ routing elements are uniformly distributed across the device
- Channel width $W$ is calculated as $W=\frac{TD}{C}=\frac{TD}{\mathrm{N} \times \mathrm{N}}$


## Channel Width Estimation for Pad Constrained Circuits

- Pad-Constrained Circuits
- $\mathrm{N}=nIO / 4$
- Assume that all the logic blocks are placed in the center, consistent with modern placers
- However, $TD$ routing elements should be distributed across the whole device
- Channel width $W$ is calculated as $W=\frac{TD}{C} \times \frac{\sqrt{C}}{\mathrm{N}}=\frac{TD}{\sqrt{C} \times \mathrm{N}}$
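
Both channel-width cases can be covered by one function, because the pad-constrained expression $TD / (\sqrt{C} \times \mathrm{N})$ reduces to $TD / C$ when $\mathrm{N} = \sqrt{C}$. The sketch below assumes the total demand $TD$ has already been computed. The function names and the example numbers are hypothetical.

```python
import math


def device_size(num_logic_blocks, num_io_blocks):
    """N = max(nIO / 4, sqrt(C)): side length of the smallest device that fits."""
    return max(num_io_blocks / 4, math.sqrt(num_logic_blocks))


def channel_width(total_demand, num_logic_blocks, num_io_blocks):
    """W = TD / (sqrt(C) * N), which equals TD / C for pad-unconstrained circuits."""
    n = device_size(num_logic_blocks, num_io_blocks)
    return total_demand / (math.sqrt(num_logic_blocks) * n)


# Pad-unconstrained: sqrt(1600) = 40 exceeds 80 / 4 = 20, so N = 40.
w_logic_bound = channel_width(16000, 1600, 80)
# Pad-constrained: 160 / 4 = 40 exceeds sqrt(400) = 20, so N = 40.
w_pad_bound = channel_width(8000, 400, 160)
```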


## Experimentation - Other Methods Compared

- RISA [ICCAD 94, DAC 2002]
- Post-placement technique
- Add up demands for different sites in the layout and find the maximum channel width
- Yang et al. [ISPD 2001]
- Rentian Method
- Extended for FPGAs in [8]
- Recursive partitioning of circuit and layout
- Worst-case congestion analysis on the boundaries


## Results for Channel Width Estimation

| Circuit | $W_{\text {VPR }}$ | $W^{\prime}$ | $T$ | $W_{\text {RISA }}$ | $T_{\text {RISA }}$ | $W_{\text {RENT }}$ | $T_{\text {RENT }}$ |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| alu4 | 11 | 11.322 | 0.139 | 13.506 | 0.012 | 10.717 | 1.54 |
| apex2 | 12 | 12.981 | 0.234 | 14.911 | 0.022 | 21.322 | 2.49 |
| bigkey | 9 | 5.7 | 0.385 | 12.105 | 0.027 | 4.761 | 2.6 |
| dsip | 7 | 6.452 | 0.298 | 9.699 | 0.019 | 4.176 | 1.97 |
| misex3 | 11 | 11.252 | 0.156 | 13.649 | 0.015 | 11.682 | 1.37 |
| pdc | 16 | 11.991 | 5.034 | 20.418 | 0.103 | 19.067 | 14.61 |
| s298 | 8 | 8.27 | 1.06 | 9.963 | 0.010 | 14.15 | 2.33 |
| s38417 | 8 | 10.544 | 8.712 | 13.192 | 0.035 | 14.963 | 21.43 |
| seq | 12 | 11.826 | 0.208 | 14.53 | 0.019 | 13.468 | 2.07 |
| spla | 15 | 11.593 | 3.148 | 17.663 | 0.06 | 35.149 | 9.42 |
| Total | 109 | 102.29 | 19.38 | 139.63 | 0.324 | 149.455 | 59.86 |
| Error | - | $6.1 \%$ | - | $28.1 \%$ | - | $37.1 \%$ | - |

## Summary

- Identified some important circuit characteristics which dictate placement
- Push and Pull from reconvergences stretch wires
- Reconvergences capture more than the local neighborhood of cells
- $30 \%$ more accuracy with reconvergences factored in
- Bounding box prediction is accurate within $11.6 \%$ of post-placement lengths
- Channel widths are predicted within $6 \%$ of post-route results



## Illustration of Bounding Box Calculation

- Phase 1: Sinks are tightly placed around the source node
- Phase 2: Node A has a high reconvergence weight and is pulled away from the net
- Phase 3: Node D and Node C are in the peak level and hence should spread out

## Overview of Our Methodology

## Reconvergence Analysis

- Perform reconvergence analysis within different sequential levels
- Assign weights to nodes based on reconvergences

For every net

- Calculate the minimum possible bounding box
- Find dilation factor using reconvergence weights
- Uniformly distribute peak nodes
- Calculate the actual span using these 3 factors
- Calculate the number of routing elements required using RISA for every net
- Calculate the total number of routing elements
- Distribute this routing demand evenly in the layout to obtain maximum channel width

