# Scan Chain Design for Three-dimensional Integrated Circuits (3D ICs)

Xiaoxia Wu, Paul Falkenstern, Yuan Xie. Computer Science and Engineering Department, The Pennsylvania State University, University Park, PA 16802. Email: {xwu,falkenst,yuanxie}@cse.psu.edu

## Abstract

Scan chains are widely used to improve the testability of IC designs. In traditional 2D IC designs, various techniques for constructing scan chains have been proposed to facilitate DFT (Design-For-Test). Recently, three-dimensional (3D) technologies have been proposed as a promising solution to continue technology scaling. In this paper, we study scan chain construction for 3D ICs, examining the impact of 3D technologies on scan chain ordering. Three different 3D scan chain design approaches (namely, VIA3D, MAP3D, and OPT3D) are proposed and compared, with experimental results for ISCAS89 benchmark circuits. The advantages and disadvantages of each approach are discussed. The results show that both the MAP3D and VIA3D approaches require no changes to 2D scan chain algorithms, but OPT3D achieves the best wire length reduction for the scan chain design. Averaged over six ISCAS89 benchmarks, the scan chain wire length obtained from OPT3D is 46.0% shorter than that of the 2D scan chain design. To the best of our knowledge, this is the first study on scan chain design for 3D integrated circuits.

# 1 Introduction

In VLSI circuit design, scan chains are introduced to improve the testability of integrated circuits [14]. After logic synthesis, all flip-flops in the circuit are replaced with scan flip-flops. These scan flip-flops are connected sequentially to form a scan chain (or multiple scan chains) in a single chip. Each scan flip-flop in the scan chain has two input sources: the output of the previous flip-flop in the scan chain and the output of the combinational circuits. During normal operation, the response at the state outputs is captured in the flip-flop. In testing mode, test vectors are shifted into the registers through the primary input pads and the test output values are shifted out through the primary output pads. The output values are compared with the expected values to determine whether the circuit is working correctly. Fig. 1 shows a conceptual example of a scan chain. When the Test signal is low, the circuit is in normal mode (the solid paths) and the D1 input of each flip-flop is valid. When the Test signal is high, the circuit is in test mode (the dotted paths) and the D2 input of each flip-flop is valid.

Although the scan chain technique offers testing convenience, it adds area overhead from both the multiplexed-data flip-flops and the routing of the stitching wires. Long stitching wires connecting the output of each flip-flop to the input of the next flip-flop increase the area of the circuit, make routing difficult, and degrade test performance. Since one of the main objectives in design for testability is to minimize the impact of test circuitry on chip performance and cost, it is essential to minimize the wire length of a scan chain. Scan chain ordering techniques are commonly used in chip design to reduce wire length and circuit area [7, 8, 14].



Figure 1. A conceptual example of a scan chain.

As technology scales, interconnect becomes the dominant source of delay and power consumption. Reducing interconnect delay and power consumption has become a major concern in deep submicron designs. Three-dimensional (3D) technologies are proposed as a promising solution to mitigate interconnect problems [1, 5, 11, 18]. In 3D chips, multiple active device layers are stacked together with direct vertical interconnects, which are called through-silicon vias (TSVs) or inter-wafer vias. 3D ICs offer several potential benefits over traditional two-dimensional (2D) designs [18]: shorter global interconnect, because the vertical distance between two layers (the length of a TSV) is usually in the range of 10 $\mu m$ to 100 $\mu m$ [18], depending on the manufacturer; higher performance, because of the reduction in average interconnect length as well as the bandwidth improvement due to die stacking; lower interconnect power consumption, due to wiring length reduction (reduced capacitance); higher packing density and smaller footprint; and support for mixed-technology chips.

The fabrication of 3D ICs has already become viable. For example, IBM announced the breakthrough which enables the move from horizontal 2-D chip layouts to 3-D chip stacking in early 2007. To efficiently exploit the benefits of 3D technologies, it is essential to develop related 3D CAD tools for designers to explore 3D IC design space.

Even though scan chain designs for traditional 2D chips have been intensively studied, to the best of our knowledge, there is no prior work on the study of the construction of scan chains for 3D chips. In this paper, we investigate the scan chain construction for 3D ICs, examining the impact of 3D technologies on the scan chain ordering. Different 3D scan chain design approaches are investigated and compared, and the advantages as well as disadvantages for each approach are discussed.

The rest of the paper is organized as follows: Section 2 presents related work on 3D technologies and 2D scan chain ordering; Section 3 describes the design methodology and possible methods for 3D scan chain construction; Section 4 uses a genetic algorithm based approach to evaluate different methods under various constraints; Section 5 presents the experimental results on ISCAS89 benchmark circuits. Finally, the conclusion is provided in Section 6.

# 2 Related Work

3D technologies have recently attracted considerable attention from industry and academia, spanning from 3D fabrication techniques [13, 15] to 3D microarchitecture designs [2, 12]. In the 3D EDA field, early design analysis tools and physical design tools for 3D have been developed over the last several years [3, 4, 10, 16].

In 2D IC designs, scan chain ordering techniques have been proposed to reduce wire length and power consumption and to improve fault coverage [7–9, 14, 17]. Minimizing the stitch wire length of a scan chain is similar to the traveling salesman problem (TSP), which is NP-hard. Makar [14] proposed a layout-based approach, in which a scan chain is left unstitched during the scan chain insertion process and is reordered and connected after placement. Hirech et al. [8] integrated the scan chain ordering process into synthesis-based design optimization, performing it after floorplanning or place-and-route. At the stage after placement, physical design information is available and the cell locations give the scan chain model more realistic values. Both of these approaches are based on cell-to-cell Manhattan distance, which yields a symmetric TSP. Gupta et al. [7] instead treated scan chain ordering as an asymmetric TSP, based on pin-to-pin Manhattan distance.

Figure 2. 2D scan chain design flow

Until now, all scan chain ordering techniques have been studied in the 2D design space, and there is no prior work on the construction of scan chains for 3D ICs. In this paper, we study scan chain construction for 3D ICs, examining the impact of 3D technologies on scan chain ordering techniques.

# 3 3D Scan Chain Ordering Methodologies

In this section, we first discuss the design flow changes when moving from 2D to 3D for scan chain construction, and then present three different approaches to construct 3D scan chains.

#### 3.1 Design flow for scan chain insertion

One typical 2D design methodology with scan chain design is shown in Fig.2. After design synthesis, the scan chain insertion tool is used on the generated gate level netlist, leaving the scan chain unstitched. After a 2D placement is performed on the netlist, the scan chain ordering procedure is run, producing a reordered scan chain. In the 2D methodology, commercial tools such as Synopsys Design Compiler and Cadence First Encounter can be used in the flow. One key difference between a 2D IC design and a 3D IC design is that the scan cells are now placed in three dimensions, rather than on a 2D plane. In this paper, we evaluate the scan chain construction with minimal changes to the typical 2D design flow. For example, Fig. 3 shows the 3D scan chain ordering methodology we follow. To facilitate placement, a 3D placement and routing tool named PR3D [4] is used in the flow. PR3D partitions the 2D netlist, which is obtained from Cadence First Encounter, into different layers and generates the corresponding DEF file for each layer. The number of layers can be defined in the command line when running PR3D tool. Cell locations are indicated in each DEF file, which are extracted by our 3D scan chain ordering procedure.



Figure 3. 3D scan chain design flow
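The extraction step at the end of this flow can be sketched as follows. This is only an illustration under stated assumptions: the paper does not show PR3D's DEF output, so the sketch assumes each component of the standard DEF `COMPONENTS` section sits on one line, and the function name `scan_cells_from_def`, the `is_scan_cell` predicate, and the `SDFF` cell-name prefix are hypothetical. Coordinates are returned in DEF database units.

```python
import re

# Matches one placed component on a single line, for example:
#   - ff_12 SDFFX1 + PLACED ( 1200 3400 ) N ;
COMPONENT_RE = re.compile(
    r"-\s+(?P<inst>\S+)\s+(?P<cell>\S+)\b.*?"
    r"\+\s*(?:PLACED|FIXED)\s*\(\s*(?P<x>-?\d+)\s+(?P<y>-?\d+)\s*\)"
)


def scan_cells_from_def(def_text, layer, is_scan_cell):
    """Return {instance: (x, y, layer)} for the scan cells in one layer's DEF text.

    Assumption: one component per line inside the COMPONENTS section.
    `is_scan_cell` is a hypothetical predicate on the library cell name.
    """
    cells = {}
    in_components = False
    for line in def_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("END COMPONENTS"):
            break
        if stripped.startswith("COMPONENTS"):
            in_components = True
            continue
        if in_components:
            match = COMPONENT_RE.match(stripped)
            if match and is_scan_cell(match.group("cell")):
                cells[match.group("inst")] = (
                    int(match.group("x")),
                    int(match.group("y")),
                    layer,
                )
    return cells


# Hypothetical DEF fragment for one layer.
SAMPLE_DEF = """\
COMPONENTS 3 ;
- ff_1 SDFFX1 + PLACED ( 1000 2000 ) N ;
- u7 NAND2X1 + PLACED ( 1500 2000 ) N ;
- ff_2 SDFFX1 + PLACED ( 4000 500 ) FS ;
END COMPONENTS
"""

layer1_cells = scan_cells_from_def(SAMPLE_DEF, 1, lambda cell: cell.startswith("SDFF"))
print(layer1_cells)
```

Calling the function once per layer and merging the dictionaries gives the $(x_i, y_i, L_i)$ locations used in the rest of this section.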

#### 3.2 3D scan chain design methodologies

When moving from the 2D IC design to the 3D IC design domain, there are several possible methods to connect the scan chain. This section describes the approaches to design 3D scan chains, and discusses the advantages as well as the disadvantages for each approach, using a simple example in Fig. 4 to illustrate the difference. In Fig. 4, the design has two layers, each of which has 3 scan cells to be connected. Note that in a 2D design, the location for each scan cell can be represented as  $(x_i, y_i)$ , but in 3D design the location for each scan cell will be represented as  $(x_i, y_i, L_i)$ , in which  $L_i$  indicates the layer at which the scan cell is located. Also, as mentioned in Section 1, during 3D integration, each wafer (die) is thinned and bonded together, and the length of through-silicon-via (TSV) is usually in the range of  $10\mu m$  to  $100\mu m$ , depending on various 3D integration technologies.

Figure 4. A conceptual example of 3D IC design with 2 layers, each of which has 3 scan cells to be connected.

Figure 5. Approach 1 (VIA3D): Each layer is treated independently, with a 2D scan chain ordering method. Each scan chain is connected to the scan chain in a different layer with a single TSV. This approach results in the minimal number of TSVs.

• Approach 1 (VIA3D). The simplest way is to perform 2D scan chain insertion and ordering for each layer separately, and then connect the N scan chains (N is the number of layers) into one single scan chain by using N - 1 through-silicon vias (TSVs). Fig. 5 illustrates such an approach: Nodes 1, 2, and 3 are connected to form a scan chain in layer 1; Nodes 4, 5, and 6 are connected to form a scan chain in layer 2. A TSV (the solid line in the figure) is then used to connect these two chains into a single chain.

- *Advantage*: Such an approach requires no change to the scan chain ordering algorithm: each layer is processed independently, with a 2D scan chain ordering algorithm. The resultant TSV number is minimized (N - 1 TSVs for N layers).
- *Disadvantage*: Because it is a *locally optimized* approach, it may result in the shortest scan chain for each layer, but the total scan chain length may not be globally optimized.

We call this method **VIA3D** scan chain ordering since the number of through-silicon vias is minimized.

Figure 6. Approach 2 (MAP3D): (a) All scan cells are mapped to 2D space (i.e., $(x_i, y_i, L_i)$ is mapped to $(x_i, y_i)$). A 2D scan chain ordering method is then applied to the design. (b) Such an approach ignores the TSV length, and may end up with many TSVs (the solid lines in the figure).

• Approach 2 (MAP3D). Since the vertical distance between layers is small (in the range of 10 $\mu m$ to 100 $\mu m$), the second method is to transform a 3D scan chain ordering problem into a 2D ordering problem, by mapping the nodes from several layers into one single layer (i.e., $(x_i, y_i, L_i)$ is mapped to $(x_i, y_i)$). A 2D scan chain ordering method is then applied to the design. Fig. 6 illustrates such an approach. After mapping the top layer nodes (Nodes 1, 2, and 3) onto the bottom layer and performing 2D scan chain ordering, the scan chain order is 4-1-5-2-6-3. Based on this ordering, in the 3D design, a through-silicon via (TSV) is used wherever two connected nodes are in different layers. In this example, there are 5 TSVs (the solid lines in the figure).

- *Advantage*: Such an approach requires no change to the scan chain ordering algorithm: after mapping all the nodes to a 2D plane, a 2D scan chain ordering algorithm is applied. It is a *global* optimization method.
- *Disadvantage*: The vertical distance between layers is ignored. It may end up using many TSVs going back and forth between layers.

We call this method the **MAP3D** approach, because a 3D scan chain ordering problem is mapped to a 2D scan chain ordering problem.

Figure 7. Approach 3 (OPT3D): A true 3D scan chain ordering method has to be developed to consider the whole design space. Such an approach takes into account the TSV length. A constraint on the number of TSVs can also be applied.

• Approach 3 (OPT3D). The third approach is optimal (OPT) 3D ordering, in which we try to find the ordering that minimizes the wire length of the scan chain. In this approach, the distance function includes the horizontal cell-to-cell Manhattan distance between cells as well as the vertical distance between two layers. In this case, we cannot apply a 2D scan chain ordering algorithm directly. The data structure (for example, the coordinates of a scan cell) may need to be modified. However, we take into account the 3D TSV effect (the length of TSVs and the number of TSVs) in the optimization, and can have full control of the optimization process: for example, we may apply constraints on how many TSVs can be used during scan chain ordering. Fig. 7 illustrates such an approach.

- *Advantage*: Such an approach is a true 3D scan chain ordering optimization: the length of TSVs and the number of TSVs are considered during optimization. Users have full control of the optimization process. It is a *global* optimization method.
- *Disadvantage*: Modifications to 2D scan chain ordering algorithms are needed before they can be applied.

We call this method the **OPT3D** approach, because it is a true 3D scan chain ordering optimization approach.

During 3D design, one of these methods can be chosen according to the requirements, such as limits on the number of vias and ease of implementation. For example, one may want to reserve as many TSVs as possible for signal routing or for thermal conduction, and therefore choose the *VIA3D* approach. On the other hand, if minimizing scan chain length is more important and one does not want to modify the 2D scan chain algorithm, the *MAP3D* approach can be adopted.
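The trade-off between the approaches can be made concrete with a small sketch that evaluates a given scan chain order in 3D. The coordinates below are hypothetical values chosen to resemble Fig. 4 (they are not taken from the paper), the 20 $\mu m$ layer gap is an arbitrary value within the 10 $\mu m$ to 100 $\mu m$ range, and the helper names `chain_length` and `tsv_count` are illustrative. The vertical term is assumed to be the layer gap times the number of layers crossed, and one TSV is counted per pair of consecutive cells on different layers.

```python
def chain_length(order, cells, layer_gap):
    """Stitch wire length of `order`: Manhattan distance plus vertical distance."""
    total = 0.0
    for a, b in zip(order, order[1:]):
        xa, ya, la = cells[a]
        xb, yb, lb = cells[b]
        total += abs(xa - xb) + abs(ya - yb) + layer_gap * abs(la - lb)
    return total


def tsv_count(order, cells):
    """Number of consecutive cell pairs that sit on different layers."""
    return sum(1 for a, b in zip(order, order[1:]) if cells[a][2] != cells[b][2])


# Hypothetical placement: node -> (x, y, layer), coordinates in micrometers.
cells = {
    1: (10, 0, 1), 2: (30, 0, 1), 3: (50, 0, 1),
    4: (0, 0, 2), 5: (20, 0, 2), 6: (40, 0, 2),
}
LAYER_GAP = 20  # hypothetical TSV length in micrometers

map3d_order = [4, 1, 5, 2, 6, 3]  # order from the MAP3D example (Fig. 6)
via3d_order = [1, 2, 3, 6, 5, 4]  # one possible VIA3D order (one chain per layer)

for name, order in [("MAP3D", map3d_order), ("VIA3D", via3d_order)]:
    print(name, chain_length(order, cells, LAYER_GAP), tsv_count(order, cells))
# MAP3D 150.0 5
# VIA3D 110.0 1
```

With this placement the MAP3D order is shortest when projected to 2D (`chain_length(map3d_order, cells, 0)` is 50), but its five TSVs make it longer than the VIA3D order once the vertical distance is counted. A different placement or layer gap can reverse the comparison.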

# 4 3D Scan Chain Ordering Algorithm

The previous section describes the methodology of constructing a 3D scan chain, and discusses the pros and cons of three different approaches (namely VIA3D, MAP3D, and OPT3D). However, the approaches are generic, and are not limited to any specific scan chain ordering algorithm. In this section, we use a specific scan chain ordering algorithm based on Genetic Algorithms, to evaluate these different approaches.

As mentioned in Section 2, scan chain ordering for minimum wire length is similar to the traveling salesman problem (TSP), which is NP-hard. We formulate 3D scan chain ordering as a symmetric TSP solved with a genetic algorithm, and define the problem as follows: given the location $(x_i, y_i)$ and layer number $L_i$ of each flip-flop cell in a 3D circuit, find a scan chain connecting all the flip-flop cells with minimum stitch wire length. The algorithm takes N DEF files as inputs (N is the number of layers), extracts the locations of the flip-flop cells, and outputs the scan chain order together with its wire length and its number of through-silicon vias.
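One way to write this objective is given below. The symbols $W$, $\pi$, $n$, and $d$ are introduced here for illustration and do not appear in the original formulation: $\pi$ is the scan chain order over the $n$ flip-flop cells and $d$ is the distance between adjacent layers. The vertical term assumes that a connection crossing several layers costs $d$ per layer crossed.

$$
W(\pi) = \sum_{k=1}^{n-1} \left( \left| x_{\pi_k} - x_{\pi_{k+1}} \right| + \left| y_{\pi_k} - y_{\pi_{k+1}} \right| + d \left| L_{\pi_k} - L_{\pi_{k+1}} \right| \right)
$$

Under this notation, MAP3D corresponds to choosing the order with $d = 0$ and evaluating the result afterwards, while OPT3D minimizes $W(\pi)$ directly, optionally subject to a limit on the number of indices $k$ with $L_{\pi_k} \neq L_{\pi_{k+1}}$.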

#### 4.1 Genetic Algorithm

A genetic algorithm (GA) [6] is a search and optimization method that mimics the evolutionary principles of natural selection. Fig. 8 shows an example of a genetic algorithm flow. The solution is usually encoded into a string called a chromosome (in Fig. 8, the chromosome is encoded as a binary string). Instead of working with a single solution, the search begins with a random set of chromosomes called the initial population. Each chromosome is assigned a fitness score that is directly related to the objective function of the optimization problem. The population of chromosomes is modified into a new generation by applying three operators similar to natural selection operators: reproduction, crossover, and mutation. Reproduction selects good chromosomes based on the fitness function and duplicates them. Crossover picks two chromosomes randomly and exchanges some portions of them with a probability $P_c$. Finally, the mutation operator changes a "1" to a "0" and vice versa with a small mutation probability $P_m$. A genetic algorithm successively applies these three operators in each generation until a termination criterion is met. It can very effectively search a large solution space while ignoring regions of the space that are not useful. In general, a genetic algorithm has the following steps: generation of the initial population; fitness function evaluation; selection of chromosomes; and the reproduction, crossover, and mutation operations.

#### 4.2 GA-based scan chain ordering framework

In our GA-based scan chain ordering framework, each flip-flop cell in the circuit is given a unique identification number. A possible solution, which is called the chromosome, is a scan chain path represented by an ordered list of numbers corresponding to the nodes, such that every node is visited exactly once.

Figure 8. Genetic algorithm flow

The fitness function, which decides the survival chance of a chromosome (a scan chain path), is the wire length of this path. In the fitness evaluation stage, the fitness of every path is calculated. The path with the lowest score has the least wire length and is thus the best option in the population.

In reproduction, a tournament selection is held in which paths with a lower fitness score beat paths with higher scores. The winners of the tournament are selected to be in the next generation's population. In the crossover stage, a segment of one path is chosen and inserted at the same position into another path. However, since the second path still contains its original nodes, it then contains the nodes from the segment twice (once from the original path and once from the inserted segment). The original occurrences of the segment's nodes are therefore deleted from the second path.
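The crossover described above can be sketched as follows. This is one possible reading of the description rather than the authors' implementation: the function name `segment_crossover` and the half-open index convention are assumptions, and the duplicates are removed before the segment is spliced in so that the segment ends up at the same indices it had in the first path.

```python
def segment_crossover(donor, receiver, start, end):
    """Insert donor[start:end] into receiver at the same position.

    The receiver's original occurrences of the segment's nodes are removed,
    so the child still visits every node exactly once.
    """
    segment = donor[start:end]
    in_segment = set(segment)
    # Drop the receiver's own copies of the nodes that the segment brings in.
    remainder = [node for node in receiver if node not in in_segment]
    # Splice the segment in so it occupies indices start..end-1 of the child.
    return remainder[:start] + segment + remainder[start:]


parent_a = [1, 2, 3, 4, 5, 6]
parent_b = [4, 1, 5, 2, 6, 3]
child = segment_crossover(parent_a, parent_b, 2, 4)
print(child)  # [1, 5, 3, 4, 2, 6]
```

The child keeps the segment `[3, 4]` from the first parent at indices 2 and 3 and preserves the relative order of the remaining nodes from the second parent.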

Instead of the classical approach to mutation, where every chromosome in the resulting population has a very small chance of mutating, in our algorithm the population resulting from reproduction and crossover is copied, and mutation operates on the copies. Each copied path has a probability of mutating equal to the mutation rate. The next generation's population consists of the winners of the tournament, the children of the crossover, and the result of mutation on their copies. The mutation operator in our algorithm swaps two nodes (cities, in TSP terms) in the path with a 25% chance and reverses the segment between two nodes in the path with a 75% chance.
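A minimal sketch of this mutation scheme is shown below. The 25%/75% split comes from the text. The function names, the explicit `rng` argument, and the choice to reverse the segment inclusive of both end nodes are assumptions, and the mutation rate value used in the example is hypothetical since the paper does not state it.

```python
import random


def mutate(path, rng, swap_prob=0.25):
    """Return a mutated copy of `path`: swap two nodes (25%) or reverse a segment (75%)."""
    child = list(path)
    i, j = sorted(rng.sample(range(len(child)), 2))
    if rng.random() < swap_prob:
        child[i], child[j] = child[j], child[i]
    else:
        # Reverse the stretch between the two chosen nodes (inclusive).
        child[i:j + 1] = reversed(child[i:j + 1])
    return child


def mutated_copies(population, rng, mutation_rate):
    """Copy every path; each copy mutates with probability `mutation_rate`."""
    return [
        mutate(path, rng) if rng.random() < mutation_rate else list(path)
        for path in population
    ]


rng = random.Random(0)
population = [[1, 2, 3, 4, 5, 6], [4, 1, 5, 2, 6, 3]]
copies = mutated_copies(population, rng, mutation_rate=0.5)  # hypothetical rate
next_generation = population + copies
print(len(next_generation))  # 4
```

Because mutation acts on copies, the unmutated winners and children always survive into the next generation, which is the difference from the classical scheme that the text points out.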

The fitness evaluation, reproduction, crossover, and mutation steps produce the population for the next generation. These steps are repeated until a set number of iterations is reached or the termination criterion is met. The termination criterion is based on the stability of the best fitness score: if the best fitness score has not improved by more than 0.01% over the last 1000 generations, the algorithm terminates.
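The stability check can be sketched as follows, assuming the best fitness score (wire length) of each generation is appended to a list. The function name `has_converged`, the `best_history` list, and the choice to measure improvement relative to the score from 1000 generations earlier are assumptions; the 0.01% threshold and the 1000-generation window come from the text.

```python
def has_converged(best_history, window=1000, rel_tol=1e-4):
    """True if the best score improved by at most 0.01% over the last `window` generations.

    `best_history[g]` is the best (lowest) wire length seen in generation g.
    """
    if len(best_history) <= window:
        return False  # not enough generations to compare yet
    old = best_history[-window - 1]
    new = best_history[-1]
    return (old - new) <= rel_tol * old


stalled = [1000.0] * 1001                               # no improvement at all
improving = [1000.0 - 0.5 * g for g in range(1001)]    # steady improvement
print(has_converged(stalled), has_converged(improving))  # True False
```

In a full GA loop this check would be evaluated once per generation, alongside the cap on the total number of iterations.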

# 5 Experiments

To evaluate our genetic algorithm based 3D scan chain ordering approaches, we implemented the 3D scan chain ordering algorithm and conducted experiments on a set of ISCAS89 benchmark circuits. All experiments were performed on a dual Intel Xeon processor (3.2GHz, 2GB RAM) Linux machine. We used MIT Lincoln Lab's 180nm 3D library to perform the synthesis, placement, and routing.

In Table 1, we summarize the wire length comparison among 2D scan chain ordering, VIA3D, MAP3D, and OPT3D approaches. The distance between two layers is set to be  $10\mu m$ .

The first column gives the circuit names selected from the ISCAS89 benchmarks and the number of flip-flop cells (in parentheses). The number of flip-flop cells ranges from 74 in s1423 to 1728 in s35932. The second column provides the wire length obtained from 2D scan chain ordering, which is also based on a genetic algorithm for the symmetric TSP. The third column is the number of layers in the 3D circuit, ranging from 2 to 4. The fourth to ninth columns show the wire length and via number resulting from VIA3D, MAP3D, and OPT3D; the unit of wire length is $\mu m$. The last three columns provide the wire length reduction of the VIA3D, MAP3D, and OPT3D approaches over 2D ordering.

From the table, one can observe that OPT3D achieves the best wire length reduction for the scan chain design. The average reduction from 2D to OPT3D is 46.0% and the maximum reduction is 57.1%, for s1423. The average reductions for the VIA3D and MAP3D approaches are 16.0% and 35.2%, respectively.
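As a quick check of how the reduction columns can be read, the sketch below assumes the reduction is computed as (2D length − 3D length) / 2D length, which reproduces the OPT3D entries for s1423 in Table 1. The function name `reduction_over_2d` is illustrative.

```python
def reduction_over_2d(length_2d, length_3d):
    """Wire length reduction of a 3D ordering relative to 2D ordering, in percent."""
    return 100.0 * (length_2d - length_3d) / length_2d


S1423_2D = 1954                              # 2D wire length for s1423 (Table 1)
S1423_OPT3D = {2: 1265, 3: 949, 4: 839}      # layers -> OPT3D wire length (Table 1)

for layers, length in S1423_OPT3D.items():
    print(layers, f"{reduction_over_2d(S1423_2D, length):.1f}%")
# 2 35.3%
# 3 51.4%
# 4 57.1%
```

A negative value under this definition means the 3D chain is longer than the 2D chain.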

When the number of layers increases, the scan chain length of MAP3D and VIA3D sometimes increases, contrary to expectation. This happens for MAP3D when going from 3 to 4 layers for all circuits except s35932, and occurs for VIA3D when increasing to 4 layers in several circuits, as well as for all layer counts in circuit s35932. The increase of the scan chain length for MAP3D can be attributed to the large via count in the scan chain: the distance traveled between layers accumulates with a large via count, thus increasing the scan chain length. The increase in scan chain length for VIA3D is caused by the need to connect the per-layer scan chains to each other. Although adding another layer shortens the scan chain on each layer, it increases the number of scan chains that must be connected. The total length of the larger number of shorter chains can therefore surpass that of the smaller number of longer chains.

Table 2 gives the wire length result with via number limits in the OPT3D approach. The via number limit is set to be 20 for the smallest three circuits and 100 for the three largest circuits. It shows that limiting the number of vias increases the wire length but provides a means to control the routing congestion caused by vias. It also indicates that users can have control of the optimization process according to the different requirements.

| Circuits (scan cells) | Layer number | Wire length with limit | Via number with limit | Wire length w/o limit | Via number w/o limit |
|-----------------------|--------------|------------------------|-----------------------|-----------------------|----------------------|
| s1423 (74)            | 2            | 1289                   | 20                    | 1265                  | 18                   |
|                       | 3            | 962                    | 20                    | 949                   | 22                   |
|                       | 4            | 926                    | 20                    | 839                   | 27                   |
| s5378 (179)           | 2            | 3237                   | 20                    | 3036                  | 46                   |
|                       | 3            | 2701                   | 19                    | 2386                  | 50                   |
|                       | 4            | 2551                   | 20                    | 2172                  | 57                   |
| s9234 (211)           | 2            | 4128                   | 20                    | 3517                  | 46                   |
|                       | 3            | 3028                   | 19                    | 2862                  | 56                   |
|                       | 4            | 2754                   | 20                    | 2544                  | 68                   |
| s13207 (638)          | 2            | 11062                  | 100                   | 10356                 | 145                  |
|                       | 3            | 9506                   | 100                   | 8461                  | 168                  |
|                       | 4            | 8661                   | 100                   | 7486                  | 183                  |
| s15850 (534)          | 2            | 9688                   | 100                   | 8737                  | 113                  |
|                       | 3            | 7743                   | 100                   | 6881                  | 129                  |
|                       | 4            | 7062                   | 100                   | 6528                  | 169                  |
| s35932 (1728)         | 2            | 50772                  | 100                   | 37748                 | 480                  |
|                       | 3            | 37024                  | 100                   | 31666                 | 618                  |
|                       | 4            | 40225                  | 100                   | 30814                 | 706                  |

Table 2. OPT3D wire length results with via number constraint

| Circuits (scan cells) | 2D wire length ($\mu m$) | Layer number | VIA3D wire length ($\mu m$) | VIA3D via number | MAP3D wire length ($\mu m$) | MAP3D via number | OPT3D wire length ($\mu m$) | OPT3D via number | VIA3D reduction over 2D | MAP3D reduction over 2D | OPT3D reduction over 2D |
|---------------|-------|---|-------|---|-------|------|-------|-----|--------|-------|-------|
| s1423 (74)    | 1954  | 2 | 1538  | 1 | 1365  | 36   | 1265  | 18  | 21.5%  | 30.1% | 35.3% |
|               |       | 3 | 1102  | 2 | 1111  | 53   | 949   | 22  | 43.6%  | 43.2% | 51.4% |
|               |       | 4 | 989   | 3 | 1148  | 77   | 839   | 27  | 49.4%  | 41.3% | 57.1% |
| s5378 (179)   | 4529  | 2 | 3536  | 1 | 3249  | 101  | 3036  | 46  | 21.9%  | 28.3% | 33.0% |
|               |       | 3 | 2961  | 2 | 2832  | 140  | 2386  | 50  | 34.6%  | 37.5% | 47.3% |
|               |       | 4 | 3337  | 3 | 3148  | 213  | 2172  | 57  | 26.3%  | 30.5% | 52.0% |
| s9234 (211)   | 5505  | 2 | 4301  | 1 | 3923  | 107  | 3517  | 46  | 21.9%  | 28.7% | 36.1% |
|               |       | 3 | 3415  | 2 | 3128  | 139  | 2862  | 56  | 38.0%  | 43.2% | 48.0% |
|               |       | 4 | 3772  | 3 | 3644  | 253  | 2544  | 68  | 31.5%  | 33.8% | 53.8% |
| s13207 (638)  | 16254 | 2 | 14388 | 1 | 11664 | 362  | 10359 | 145 | 11.5%  | 28.4% | 36.3% |
|               |       | 3 | 14438 | 2 | 10738 | 605  | 8461  | 168 | 11.1%  | 34.0% | 48.0% |
|               |       | 4 | 16341 | 3 | 11228 | 793  | 7486  | 183 | -0.5%  | 31.0% | 54.0% |
| s15850 (534)  | 13950 | 2 | 11930 | 1 | 9459  | 239  | 8737  | 113 | 14.5%  | 32.2% | 37.4% |
|               |       | 3 | 11156 | 2 | 8520  | 443  | 6881  | 129 | 20.0%  | 38.9% | 50.7% |
|               |       | 4 | 12838 | 3 | 8954  | 604  | 6528  | 169 | 8.0%   | 35.8% | 53.2% |
| s35932 (1728) | 56358 | 2 | 63673 | 1 | 39449 | 866  | 37748 | 480 | -4.9%  | 35.0% | 37.8% |
|               |       | 3 | 77037 | 2 | 35259 | 1466 | 31666 | 618 | -26.9% | 41.9% | 47.8% |
|               |       | 4 | 80966 | 3 | 36788 | 2090 | 30814 | 706 | -33.4% | 39.4% | 49.2% |
| average       |       |   |       |   |       |      |       |     | 16.0%  | 35.2% | 46.0% |

Table 1. 2D, VIA3D, MAP3D, and OPT3D scan chain ordering comparison under single clock domain

In the first column, the number of scan cells in each circuit is indicated in parentheses.

# 6 Conclusion

In traditional 2D IC design, scan chains are widely used to improve circuit testability. In this paper, for the first time, we study scan chain construction for 3D ICs. Different 3D scan chain design approaches are investigated and compared, with the design goal of minimizing the stitching wire length. The experimental results show that both the MAP3D and VIA3D approaches require no change to 2D scan chain algorithms, but OPT3D achieves the best wire length reduction for the scan chain design. Averaged over six ISCAS89 benchmarks, the wire length obtained from OPT3D is 46.0% shorter than that of the 2D scan chain design. To the best of our knowledge, this is the first study on scan chain design for 3D integrated circuits.

# 7 Acknowledgement

This research is partially supported by NSF CAREER 0643902 and NSF CCF 0702617. The authors acknowledge IBM's Kerry Bernstein and Albert Young for their invaluable help with the understanding of the 3D fabrication process. The authors would like to thank Prof. Krishnendu Chakrabarty from Duke University for the discussion on circuit testability, and Prof. Rhett Davis from NCSU for the feedback.

# References

- [1] C. Ababei, Y. Feng, B. Goplen, H. Mogal, T. Zhang, K. Bazargan, and S. S. Sapatnekar. Placement and routing in 3d integrated circuits. *IEEE Design and Test of Computers*, 22(6):520–531, 2005.
- [2] B. Black, D. W. Nelson, C. Webb, and N. Samra. 3d processing technology and its impact on ia32 microprocessors. In *ICCD*, pages 316–318, 2004.
- [3] J. Cong, W. Jie, and Z. Yan. A thermal-driven floorplanning algorithm for 3d ics. In *International Conference on Computer Aided Design*, pages 306–313, 2004.
- [4] S. Das, A. Chandrakasan, and R. Reif. Design tools for 3-d integrated circuits. In *Design Automation Conference*, 2003. *Proceedings of the ASP-DAC 2003. Asia and South Pacific*, pages 53–56, 2003.
- [5] W. R. Davis, J. Wilson, S. Mick, J. Xu, H. Hua, C. Mineo, A. M. Sule, M. Steer, and P. D. Franzon. Demystifying 3d ics: the pros and cons of going vertical. *IEEE Design and Test of Computers*, 22(6):498–510, 2005.
- [6] D. Goldberg. *Genetic Algorithms in Search, Optimization, and Machine Learning.* Addison-Wesley, New York, 1989.
- [7] P. Gupta, A. B. Kahng, and S. Mantik. Routing-aware scan chain ordering. pages 857–862, 2003.
- [8] M. Hirech, J. Beausang, and G. Xinli. A new approach to scan chain reordering using physical design information. In *International Test Conference*, pages 348–355, 1998.
- [9] X. L. Huang and J. Huang. A routability constrained scan chain ordering technique for test power reduction. In *Asia and South Pacific Conference on Design Automation*, 5 pp., 2006.

- [10] W. L. Hung, G. M. Link, Y. Xie, N. Vijaykrishnan, and M. J. Irwin. Interconnect and thermal-aware floorplanning for 3d microprocessors. In *International Symposium on Quality Electronic Design*, 2006.
- [11] J. W. Joyner and J. D. Meindl. Opportunities for reduced power dissipation using three-dimensional integration. In *Interconnect Technology Conference*, 2002. Proceedings of the IEEE 2002 International, pages 148–150, 2002.
- [12] J. Kim, C. Nicopoulos, D. Park, R. Das, Y. Xie, N. Vijaykrishnan, and C. Das. A novel dimensionally-decomposed router for on-chip communication in 3d architectures. In *ISCA*, 2007.
- [13] K. W. Lee, T. Nakamura, T. Ono, Y. Yamada, T. Mizukusa, H. Hashimoto, K. T. Park, H. Kurino, and M. Koyanagi. Three-dimensional shared memory fabricated using wafer stacking technology. In *International Electron Devices Meeting*, pages 165–168, 2000.
- [14] S. Makar. A layout-based approach for ordering scan chain flip-flops. In *International Test Conference*, pages 341–347, 1998.
- [15] R. Reif, A. Fan, C. Kuan-Neng, and S. Das. Fabrication technologies for three-dimensional integrated circuits. In *International Symposium on Quality Electronic Design*, pages 33–37, 2002.
- [16] Y. Tsai, Y. Xie, N. Vijaykrishnan, and M. J. Irwin. Three-dimensional cache design exploration using 3dcacti. In *International Conference on Computer Design*, pages 519–524, 2005.
- [17] L. Wei, W. Seongmoon, S. T. Chakradhar, and S. M. Reddy. Distance restricted scan chain reordering to enhance delay fault coverage. In *International Conference on VLSI Design*, pages 471–478, 2005.
- [18] Y. Xie, G. H. Loh, B. Black, and K. Bernstein. Design space exploration for 3d architectures. J. Emerg. Technol. Comput. Syst., 2(2):65–103, 2006.