### LSI PRODUCT QUALITY AND FAULT COVERAGE

Vishwani D. Agrawal, Bell Laboratories, Murray Hill, New Jersey 07974

Sharad C. Seth, Dept. of Computer Science, University of Nebraska, Lincoln, Nebraska 68588 (seth@cse.unl.edu)

Prathima Agrawal, Bell Laboratories, Murray Hill, New Jersey 07974

Agrawal, Vishwani D.; Seth, Sharad C.; and Agrawal, Prathima, "LSI Product Quality and Fault Coverage" (1981). CSE Conference and Workshop Papers. 44. https://digitalcommons.unl.edu/cseconfwork/44

### **ABSTRACT**

At present, the relationship between fault coverage of LSI circuit tests and the tested product quality is not satisfactorily understood. Reported work on integrated circuits predicts, for an acceptable field reject rate, a fault coverage that is too high (99 percent or higher). This fault coverage is difficult to achieve for LSI circuits. This paper proposes a model of fault distribution for a chip. The number of faults on a defective chip is assumed to have a Poisson density for which the average value is determined through experiment on actual chips.
The procedure, which relates the model to the chip being studied, is simple; one or more fabricated chip lots must be tested by a few preliminary test patterns. Once the model is characterized, the required value of fault coverage can be easily determined for any given field reject rate. The main advantage of such a model is that it adapts itself to the various characteristics of the chip (technology, feature size, manufacturing environment, etc.) and the fault model (e.g., stuck-type faults). As an example, the technique was applied to an LSI circuit; realistic results were obtained.

### 1. INTRODUCTION

Tests for a large LSI circuit consist of patterns, applied to the input pins, that exercise either some or all circuit functions. Certain functions have faults that may be sensitive to data patterns, but in most cases, for practical reasons, tests cannot use exhaustive data patterns. Therefore, even though the circuit passes the tests, there is no guarantee that the circuit is fault-free. Thus, there is a need for determining how well a test can isolate a faulty circuit.

Fault coverage obtained from a fault simulator is a commonly used criterion for judging the quality of tests. Since most present-day fault simulators can simulate only the single stuck-type faults, fault coverage usually refers to the percentage of these faults detected by the tests. Faults on an actual LSI chip, however, are caused by physical defects, such as shorts or breaks in metallization or diffusion runs, shorting of the substrate with metallization or diffusion, etc. [1]. Although many of these physical faults can be detected by tests that detect the single stuck-type faults, it is difficult to determine which faults may have been left undetected [1-3]. Also, the detection of multiple stuck-type faults is not assured by the single-fault tests [4, p. 21].
Since stuck-type faults represent only a portion of all possible faults, the coverage of stuck-type faults can only be regarded as a figure of merit for the tests. In this paper, we try to answer the question: how is this figure of merit related to the quality of the tested product?

The desired value of the single stuck-type fault coverage would depend, of course, on circuit implementation, technology, manufacturing environment, and the required quality level of the tested product. As a rule, test designers attempt to provide as close to 100-percent fault coverage as possible. However, test development and test application costs increase very rapidly as we approach this goal. In reality, a large circuit may have redundancies that make the testing of certain faults impossible or irrelevant. Locating redundancies is a tedious process for which no automatic method is available. If complete design verification could be achieved, the undetected faults could be ignored as redundant. However, no satisfactory method of generating such tests is known.

Not all faults occur with the same frequency. The relative frequency of occurrence depends on the technology, design rules, production environment, etc. The evaluation of tests, therefore, should consider these factors.

The method described in this paper is based on a model of fault distribution for the chip. Distribution parameters are determined experimentally by examining an actual chip production lot. An analysis then gives the value of fault coverage required for a given quality (field reject rate [5]) of the tested chips. A previous attempt [5] was based on a more restrictive model for fault distribution. It produced satisfactory results for chips with high yield (typically SSI and MSI), but fault-coverage values for larger chips with lower yield were too pessimistic. Our analysis is not restricted to chips of any particular type or size, and can be applied to all scales of integration.

### 2. DEFINITIONS

- $A$: Chip area
- $D_0$: Defect density
- $f$: Fault coverage
- $m$: Number of faults covered by tests
- $n$: Number of faults present on a chip
- $n_0$: Average number of faults on a defective chip
- $n_{av}$: Average number of faults on a chip
- $N$: Total number of possible faults on a chip
- $p(n)$: Probability of exactly $n$ faults being present on a chip
- $P(f)$: Probability of a chip being found faulty when tested to a fault coverage $f$
- $q_k(n)$: Probability of detecting exactly $k$ faults when the chip has $n$ faults present
- $r(f)$: Field reject rate for fault coverage $f$
- $y$: Yield of chips (probability of a manufactured chip being good)
- $Y_{bg}(f)$: Probability of a faulty chip being tested as good when the fault coverage of tests is $f$
- $\lambda$: A parameter depending on the variance of $D_0$

### 3. STATISTICAL MODEL

Assume that an integrated circuit chip has $n$ faults. Although a chip can have several types of faults, we assume that they are equivalent to $n$ single stuck-type faults. In other words, the faults present on the chip can be detected by tests that detect $n$ stuck-type faults. Further assume that the yield of good chips is $y$, and that the number of faults, $n$, on a faulty chip has a Poisson distribution ([6], p. 156):

$$\begin{aligned} p(n) &= (1-y)\,\frac{(n_0-1)^{n-1}}{(n-1)!}\,e^{-(n_0-1)}, \qquad n = 1, 2, 3, \ldots \\ p(0) &= y, \end{aligned} \tag{1}$$

where $n_0$ is the average number\* of faults on a faulty chip. In the above expression, the Poisson density function was shifted to the right by one unit, since it was used for the probability of the number of faults on a defective chip, i.e., $n \neq 0$, $n = 1, 2, 3, \ldots$. From (1), the average number of faults is obtained as

$$n_{av} = \sum_{n=0}^{\infty} n\,p(n) = (1-y)\,n_0. \tag{2}$$

Strictly speaking, the summation should extend only to the maximum number of faults, $N$.
In practice, however, the value of $n_0$ is much smaller than the maximum number of faults, and the use of the infinite sum, which allows a simple result, is numerically quite accurate.

The distribution of faults, as given by (1), is characterized by the two parameters, $y$ and $n_0$. Further, we assume that the yield $y$ of the chip is known, at least approximately. In fact, yield of integrated circuits has been widely studied in the past [7-12]. The following formula is often used for calculating chip yield [11,12]:

$$y = (1 + \lambda D_0 A)^{-\frac{1}{\lambda}}, \tag{3}$$

where $A$ = chip area, $D_0$ = average number of defects per unit area, and $D_0^2 \lambda$ = variance of $D_0$. The parameters $D_0$ and $\lambda$ can be determined either experimentally [10], or from results on chips manufactured by the same processing system. The estimation of the remaining parameter, $n_0$, will be discussed later.

\* Note that the parameter $n_0$ is different from the average number of physical defects ($D_0 A$), which is used for calculating chip yield. In a high-density circuit, a physical defect can produce several logical faults.

### 4. PROBABILITY OF ISOLATING A FAULTY CHIP

Assume that the total number of possible faults on a chip is $N$, where $N \gg n_0$. We test these chips by the tests that detect $m$ faults. Fault coverage is then $f = m/N$. Let $q_k(n)$ be the probability of detecting exactly $k$ faults when a chip has $n$ faults present on it.

An expression for $q_k(n)$ may be obtained by an analogy to the statistician's game of selecting a ball from an urn. Visualize $N$ balls, one corresponding to each possible fault. Of these, exactly $n$ are black, representing the actual faults on the chip. The remaining $N-n$ balls are white and simply represent the sites of faults that are not present. Each fault covered by the tests is viewed as one ball selected without replacement from the urn.
Then $q_k(n)$ is the probability of drawing exactly $k$ black balls in $m$ selections and is given by the hypergeometric density function ([6], pp. 43-44):

$$q_k(n) = \frac{\binom{n}{k} \binom{N-n}{m-k}}{\binom{N}{m}}. \tag{4}$$

The probability of passing the chip, having $n$ faults, as good, is

$$q_0(n) = \frac{\binom{N-n}{m}}{\binom{N}{m}} \simeq (1-f)^n, \tag{5}$$

where $f = m/N$ is the fault coverage of tests. The above approximation is quite accurate for $n \ll \sqrt{N(1-f)/f}$, and it will be used in the following analysis. For larger values of $n$, a better closed-form expression is derived in the Appendix, where the accuracy of (5) is also discussed.

Since the number of faults $n$ on a bad chip is a random number, the probability (or yield) of a bad chip being tested good is given by

$$Y_{bg}(f) = \sum_{n=1}^{N} q_0(n)\,p(n). \tag{6}$$

Substituting from (1) and (5), and simplifying, we get

$$Y_{bg}(f) \simeq (1-f)(1-y)\,e^{-(n_0-1)f}. \tag{7}$$

The field reject rate $r(f)$ is defined as the ratio of the number of bad chips tested good to the number of all chips that are tested good [5]. Therefore,

$$r(f) = \frac{Y_{bg}(f)}{y + Y_{bg}(f)},$$

and by substituting from (7), we obtain

$$r(f) = \frac{(1-f)(1-y)\,e^{-(n_0-1)f}}{y + (1-f)(1-y)\,e^{-(n_0-1)f}}. \tag{8}$$

Figure 1 shows a plot of (8) for two different yields, $y = 0.80$ and $0.20$. In each case two curves corresponding to $n_0 = 2$ and $10$ are drawn. The graph illustrates the dependence of test results on the parameter $n_0$.

Fig. 1 Field reject rate for two chips with yields of 80 percent and 20 percent.

Consider a yield of 80 percent, say, for an MSI chip. If we wish to test the chip for a field reject rate below 0.5 percent, the fault coverage should be 95 percent for $n_0 = 2$ or 38 percent for $n_0 = 10$. Similarly, for a yield of 20 percent (which is closer to LSI), one would require a fault coverage of 99 percent or 63 percent depending on whether $n_0$ is taken as 2 or 10.
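
The coverage values quoted above can be checked numerically. The following Python sketch evaluates (8) and inverts it for $f$ by bisection. The function names `field_reject_rate` and `required_coverage` and the tolerance `tol` are illustrative assumptions, not part of the original analysis, and the sketch inherits the approximation (7). It also assumes $0 < y < 1$ and $n_0 \geq 1$.

```python
import math


def field_reject_rate(f, y, n0):
    """Field reject rate r(f) from equation (8)."""
    # Probability that a faulty chip passes tests of coverage f, equation (7).
    y_bg = (1.0 - f) * (1.0 - y) * math.exp(-(n0 - 1.0) * f)
    return y_bg / (y + y_bg)


def required_coverage(y, n0, r_target, tol=1e-6):
    """Smallest fault coverage f with r(f) <= r_target, found by bisection.

    Assumes n0 >= 1, so that r(f) decreases monotonically from 1 - y at
    f = 0 to 0 at f = 1.
    """
    if field_reject_rate(0.0, y, n0) <= r_target:
        return 0.0
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if field_reject_rate(mid, y, n0) > r_target:
            lo = mid  # reject rate still too high: more coverage needed
        else:
            hi = mid
    return hi


if __name__ == "__main__":
    # The four cases discussed for Fig. 1, with a 0.5 percent reject rate.
    for y in (0.80, 0.20):
        for n0 in (2, 10):
            f = required_coverage(y, n0, r_target=0.005)
            print(f"y={y:.2f}  n0={n0:2d}  required f = {f:.3f}")
```

For a field reject rate of 0.5 percent, the computed coverages come out close to the values read from Fig. 1: roughly 0.95 and 0.38 for $y = 0.80$, and above 0.99 and roughly 0.63 for $y = 0.20$.
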

Intuitively, we would expect a smaller chip to have fewer faults than a larger chip. Thus one might have a smaller value of $n_0$ for MSI chips and a higher value for LSI chips. A higher value of $n_0$, however, requires a lower fault coverage for a given field reject rate, indicating that for LSI chips, a relatively lower fault coverage might be adequate. As pointed out earlier, the parameter $n_0$ not only depends on the chip size, but may also be a function of technology, design rules, processing environment, etc. We will, therefore, use an experimental procedure for determining this parameter.

### 5. DETERMINATION OF $n_0$

Consider the fraction of chips rejected by tests having a fault coverage $f$. This fraction is equal to the following probability:

$$P(f) = 1 - y - Y_{bg}(f).$$

Substituting from (7), we get

$$P(f) = (1-y) \left[ 1 - (1-f)\,e^{-(n_0 - 1)f} \right]. \tag{9}$$

For a given chip, the yield $y$ can be calculated from (3). To determine $n_0$, we start with a set of test patterns that need not have a high fault coverage. These patterns are evaluated on a fault simulator in the same order as they would be applied to the chip. A cumulative fault coverage as a function of the number of test patterns is obtained. Next, the patterns are used for testing chips being produced in the processing line. A chip is rejected at the first pattern it fails. A sufficiently large number of chips (say 100 to 200) are tested so that the cumulative fraction of rejected chips can be plotted as a function of the fault coverage. The calculated value of $P(f)$, as computed from (9), is also plotted on the same graph for various values of $n_0$. The value of $n_0$ whose curve lies closest to the experimental points is selected for use in the calculation of the required fault coverage.

Experience has shown that in LSI testing, a large proportion of chips is rejected by the first few test patterns.
Thus, a graph of the fraction of rejected chips and $P(f)$ exhibits a steeply rising straight-line behavior near the origin. The experimental value of this slope can also be used for determining $n_0$, since from (9)

$$P'(f) = \frac{dP(f)}{df} = (1-y)\left[1 + (1-f)(n_0-1)\right] e^{-(n_0-1)f}$$

and

$$P'(0) = (1-y)\,n_0. \tag{10}$$

Notice that the slope $P'(0)$ is equal to the average number ($n_{av}$) of faults as given by (2). One can determine an experimental value of $P'(0)$ by applying a relatively small number of test patterns to the chips. Also, when the yield is not known, $n_0 \simeq P'(0)$ can be used as an estimate. Notice that $P'(0)$ will be a close approximation for $n_0$ for low-yield chips. Since, for a nonzero yield, $P'(0) < n_0$, using $P'(0)$ in place of $n_0$ will give a pessimistic (or safe) value of fault coverage. In Fig. 1, a lower value of $n_0$ means a higher fault coverage for a given field reject rate. An example using the procedures for determining $n_0$ as outlined here will be given in a later section.

### 6. DETERMINATION OF THE REQUIRED FAULT COVERAGE

Once $n_0$ has been evaluated for a chip, the required fault coverage for any specified field reject rate can be computed from (8). It is, however, not very convenient to solve (8) for $f$. If the required field reject rate is $r$, then from (8), we get

$$y = \frac{(1-r)(1-f)\,e^{-(n_0-1)f}}{r + (1-r)(1-f)\,e^{-(n_0-1)f}}. \tag{11}$$

The result is plotted in Figs. 2, 3, and 4 for $r = 0.01$, $0.005$, and $0.001$, respectively. Fault coverage can be easily obtained from these graphs. For example, if the field reject rate was specified as one in a thousand, i.e., $r = 0.001$, then from Fig. 4, for yield $y = 0.3$ and $n_0 = 8$, the fault coverage should be about 85 percent.

### 7. EXAMPLE

As an example, consider an LSI chip containing about 25,000 transistors for which test patterns had been evaluated on the LAMP fault simulator [13]. Results used here were obtained from testing wafers on the Fairchild Sentry test system [14].
Yield for this chip was estimated to be about 7 percent. The test pattern number, on which the chip first failed, was recorded. The cumulative number of failing chips as a function of the fault coverage is shown in Table 1. The procedure for obtaining the entries in this table can be understood by examining the first line. After the initialization sequence, on the first pattern at which the tester strobed the chip output, 113 of 277 (i.e., 41 percent) chips failed. From fault simulation, the fault coverage on this pattern was obtained as 5 percent.

The results of Table 1 are plotted in Fig. 5, where a family of curves, $P(f)$ versus $f$ for $n_0 = 1$ through 12, is also plotted. The experimental points closely match the curve corresponding to $n_0 = 8$. Also, if we approximate the slope of $P(f)$ at the origin from the data in the first line in Table 1, we get $P'(0) = 0.41/0.05 = 8.2$. From (10), $n_0 = 8.2/0.93 = 8.8$.

Fig. 2 Fault coverage required for a field reject rate of 1-in-100.

Fig. 3 Fault coverage required for a field reject rate of 1-in-200.

Fig. 4 Fault coverage required for a field reject rate of 1-in-1000.

TABLE 1. Result of Chip Test. Yield ≈ 0.07; total number of chips = 277.

| Fault Coverage (percent) | Cumulative Number of Chips Failed | Cumulative Fraction of Chips Failed |
|--------------------------|-----------------------------------|-------------------------------------|
| 5 | 113 | 0.41 |
| 8 | 134 | 0.48 |
| 10 | 144 | 0.52 |
| 15 | 186 | 0.67 |
| 20 | 209 | 0.75 |
| 30 | 226 | 0.82 |
| 36 | 242 | 0.87 |
| 45 | 251 | 0.91 |
| 50 | 256 | 0.92 |
| 65 | 257 | 0.93 |

Taking $n_0 = 8$, we notice from Fig. 2 that for a 1 percent field reject rate, the fault coverage should be about 80 percent. As Fig. 4 indicates, the fault coverage should be improved to 95 percent in order to achieve a field reject rate of 1-in-1000.
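
The two ways of estimating $n_0$ from Table 1 can be sketched in Python. In the paper the curve is selected by visual comparison in Fig. 5; the least-squares criterion over integer candidates used below is an assumed stand-in for that comparison, and the function names are illustrative.

```python
import math

# Table 1: (fault coverage f, cumulative fraction of chips failed).
TABLE_1 = [
    (0.05, 0.41), (0.08, 0.48), (0.10, 0.52), (0.15, 0.67), (0.20, 0.75),
    (0.30, 0.82), (0.36, 0.87), (0.45, 0.91), (0.50, 0.92), (0.65, 0.93),
]
YIELD = 0.07  # estimated yield of the example chip


def reject_fraction(f, y, n0):
    """P(f) from equation (9): fraction of chips rejected at coverage f."""
    return (1.0 - y) * (1.0 - (1.0 - f) * math.exp(-(n0 - 1.0) * f))


def sum_squared_error(n0, data=TABLE_1, y=YIELD):
    """Squared distance between the model curve and the measured points."""
    return sum((reject_fraction(f, y, n0) - p) ** 2 for f, p in data)


def fit_n0(candidates=range(1, 13)):
    """Pick the candidate n0 whose P(f) curve is closest to the data.

    Least squares is an assumption made for this sketch; the paper
    matches the curves by eye.
    """
    return min(candidates, key=sum_squared_error)


def slope_estimate(data=TABLE_1, y=YIELD):
    """n0 from equation (10), using the first data point as the slope P'(0)."""
    f1, p1 = data[0]
    return (p1 / f1) / (1.0 - y)


if __name__ == "__main__":
    print("least-squares n0:", fit_n0())
    print("slope-based n0:   %.1f" % slope_estimate())
```

A least-squares criterion weights the points differently from a visual match, so the integer it selects may differ by one from the value read off Fig. 5. The slope-based estimate reproduces the value of 8.8 computed above.
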

The above conclusions differ significantly from those obtained in [5], where the field reject rate was obtained as

$$r = (1-y)(1-f).$$

From this formula, for $r = 0.01$ and $y = 0.07$, we get $f$ = 99 percent, and for $r = 0.001$, $f$ = 99.9 percent. These fault coverages are significantly higher than those obtained by the analysis presented here and, in fact, represent almost unachievable goals for LSI circuits. Our analysis would have given similar results for $n_0 = 3$ or 4. But $n_0 = 3$ or 4 produces a $P(f)$ versus $f$ curve that disagrees significantly with the experimental result (Fig. 5).

Fig. 5 Determination of $n_0$ from experimental data.

If a large chip can be considered to be composed of several smaller chips, the average number of faults on a large faulty chip would be higher. Also, for a given chip area, one would expect the average number of logical faults to be higher for greater circuit density (e.g., in case of fine-line technology). The strength of our model lies in the experimental process by which the model parameter ($n_0$) is determined for the actual chip being studied.

The fault model used in determining the fault coverage (e.g., stuck-type faults) also influences the value of $n_0$. For instance, let us assume that the tests that detect stuck-type faults detect only a few actual fault modes of the chip. As the tests are applied, the chips are rejected at a slower rate (Fig. 5) and we get a smaller value of $n_0$. This means (Figs. 2, 3, 4) that the fault coverage (as measured in terms of stuck-type faults) should be higher.

### 8. CONCLUDING REMARKS

In addition to determining fault-coverage requirements for a chip-processing line, the technique presented here has other applications, such as the prediction of the influence of fine-line technology on the testing problem. A given circuit, when implemented with finer design rules, occupies a smaller area. The yield, largely dependent on chip area, would be higher. In Figs. 2, 3, and 4, a higher yield indicates a lower fault-coverage requirement if $n_0$ remains fixed. However, when the circuit is shrunk into finer features, one expects many logical faults to be produced by a physical defect. This phenomenon could result in a higher value of $n_0$, thereby further reducing the fault-coverage requirement.

In our theory, we have introduced a new parameter, $n_0$, the average number of faults on a defective chip. No attempt has been made to relate $n_0$ to the yield. Yield, which has been extensively studied in the past, is known to depend on chip area and defect density. The average number of faults also depends on chip area and defect density. Further work should establish at least an empirical relationship between yield and average number of faults.

Since completion of this work, we have learned of similar work being pursued elsewhere [15].

### **ACKNOWLEDGEMENT**

We appreciate the cooperation of J. Kumar, J. H. Lee, and R. D. Taft in providing Sentry test data, and the assistance of R. Seth in compiling statistical data and E. Edelman and A. Gilmour in preparing this manuscript.
### **APPENDIX**

### Approximations for $q_0(n)$

Starting with equation (5),

$$\begin{aligned} q_0(n) &= \binom{N-n}{m} \Big/ \binom{N}{m} \\ &= \frac{(N-m)(N-m-1)\cdots(N-m-n+1)}{N(N-1)\cdots(N-n+1)} \\ &= \left[1 - \frac{m}{N}\right]^n \frac{\left[1 - \frac{1}{N-m}\right] \cdots \left[1 - \frac{n-1}{N-m}\right]}{\left[1 - \frac{1}{N}\right] \cdots \left[1 - \frac{n-1}{N}\right]} \end{aligned} \tag{A.1}$$

$$\begin{aligned} q_0(n) &= \left[1 - \frac{m}{N}\right]^n \left[1 - \frac{1}{N-m}\right] \left\{1 + \frac{1}{N} + \frac{1}{N^2} + \cdots\right\} \cdots \left[1 - \frac{n-1}{N-m}\right] \left\{1 + \frac{n-1}{N} + \left[\frac{n-1}{N}\right]^2 + \cdots\right\} \\ &\simeq \left[1 - \frac{m}{N}\right]^n \left\{1 - \frac{m+1}{N(N-m)} + \frac{1}{N^2}\right\} \cdots \left\{1 - \frac{(n-1)(m+n-1)}{N(N-m)} + \left[\frac{n-1}{N}\right]^2\right\}. \end{aligned}$$

Since $\lim_{N\to\infty} \left[1-\frac{x}{N}\right]^N = \exp(-x)$, for large $N$, we have

$$\begin{aligned} q_0(n) &\simeq \left[1 - \frac{m}{N}\right]^n \exp\left\{-\frac{m+1}{N(N-m)} + \frac{1}{N^2}\right\} \cdots \exp\left\{-\frac{(n-1)(m+n-1)}{N(N-m)} + \left[\frac{n-1}{N}\right]^2\right\} \\ &= \left[1 - \frac{m}{N}\right]^n \exp\left\{-\frac{(m+1) + 2(m+2) + \cdots + (n-1)(m+n-1)}{N(N-m)} + \frac{1^2 + 2^2 + \cdots + (n-1)^2}{N^2}\right\} \\ &= \left[1 - \frac{m}{N}\right]^n \exp\left\{-\frac{m\left(1 + 2 + \cdots + (n-1)\right)}{N(N-m)} - \left(1^2 + 2^2 + \cdots + (n-1)^2\right)\left[\frac{1}{N(N-m)} - \frac{1}{N^2}\right]\right\}. \end{aligned}$$

From [16, p. 55],

$$\begin{aligned} q_0(n) &= \left[1 - \frac{m}{N}\right]^n \exp\left\{-\frac{mn(n-1)}{2N(N-m)} - \frac{\frac{1}{3}\,mn(n-1)\left(n - \frac{1}{2}\right)}{N^2(N-m)}\right\} \\ &= \left[1 - \frac{m}{N}\right]^n \exp\left\{-\frac{mn(n-1)}{2N(N-m)}\left[1 + \frac{2n-1}{3N}\right]\right\}. \end{aligned}$$

Substituting $f = \frac{m}{N}$, and for large $N$,

$$q_0(n) \simeq (1-f)^n \exp\left\{-\frac{fn(n-1)}{2N(1-f)}\right\}. \tag{A.2}$$

Also,

$$q_0(n) \simeq (1-f)^n, \tag{A.3}$$

where the condition for the last approximation is $n^2 \ll N(1-f)/f$.
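
The relative accuracy of the three expressions can be explored with a short Python sketch. The values $N = 1000$ and $m = 500$ below are hypothetical and are not claimed to be the parameters behind Fig. 6; the function names are likewise illustrative.

```python
import math


def q0_exact(n, N, m):
    """Equation (A.1): exact probability that none of the n faults is detected."""
    return math.comb(N - n, m) / math.comb(N, m)


def q0_a2(n, N, m):
    """Equation (A.2): (1 - f)^n with the exponential correction factor."""
    f = m / N
    return (1.0 - f) ** n * math.exp(-f * n * (n - 1) / (2.0 * N * (1.0 - f)))


def q0_a3(n, N, m):
    """Equation (A.3): the simple approximation (1 - f)^n."""
    return (1.0 - m / N) ** n


if __name__ == "__main__":
    N, m = 1000, 500  # hypothetical: 1000 possible faults, 50 percent coverage
    print(" n   exact (A.1)   (A.2)        (A.3)")
    for n in (1, 2, 4, 8, 16, 32):
        print(f"{n:2d}   {q0_exact(n, N, m):.4e}   "
              f"{q0_a2(n, N, m):.4e}   {q0_a3(n, N, m):.4e}")
```

With these assumed values, $n^2$ stays well below $N(1-f)/f = 1000$ only for the first few rows, so the gap between (A.3) and the exact value should become visible as $n$ grows, while (A.2) should track (A.1) more closely.
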

The values of $q_0(n)$, as computed from (A.1), (A.2), and (A.3), are plotted in Fig. 6. For $n \leq 4$, all three values are the same. For larger $n$, the approximation (A.2) still coincides with the exact value (A.1). The error of (A.3) is small but can be noticed.

Fig. 6 Approximation for $q_0(n)$.

### REFERENCES

- [1] J. Galiay, Y. Crouzet, and M. Vergniault, "Physical versus logical fault models in MOS LSI circuits. Impact on their testability," Ninth International Symposium on Fault Tolerant Computing, Madison, Wisconsin, June 20-22, 1979, Digest of Papers, pp. 195-202.
- [2] G. R. Case, "Analysis of Actual Fault Mechanisms in CMOS Logic Gates," Proceedings of 13th Design Automation Conference, San Francisco, CA, June 28-30, 1976, pp. 265-270.
- [3] R. L. Wadsack, "Fault Modeling and Logic Simulation of CMOS and MOS Integrated Circuits," Bell System Technical Journal, Vol. 57, May-June 1978, pp. 1449-1474.
- [4] M. A. Breuer and A. D. Friedman, Diagnosis and Reliable Design of Digital Systems, Computer Science Press, Potomac, MD, 1976.
- [5] R. L. Wadsack, "Fault Coverage in Digital Integrated Circuits," Bell System Technical Journal, Vol. 57, May-June 1978, pp. 1475-1488.
- [6] W. Feller, An Introduction to Probability Theory and Its Applications, Vol. I, Wiley, NY, 1968 (Third Edition).
- [7] B. T. Murphy, "Cost-Size Optima of Monolithic Integrated Circuits," Proceedings of the IEEE, Vol. 52, December 1964, pp. 1537-1545.
- [8] R. B. Seeds, "Yield, Economic and Logistic Models for Complex Digital Arrays," 1967 IEEE International Convention Record, Part 6, pp. 60-61.
- [9] J. E. Price, "A New Look at Yield of Integrated Circuits," Proceedings of the IEEE, Vol. 58, August 1970, pp. 1290-1291.
- [10] C. H. Stapper, "Defect Density Distribution for LSI Yield Calculations," IEEE Transactions on Electron Devices, Vol. ED-20, July 1973, pp. 655-657.
- [11] J. Sredni, "Use of Power Transformation to Model the Yield of ICs as a Function of Active Circuit Area," Proceedings of International Electron Device Meeting, Washington, D. C., December 1975, pp. 123-125.
- [12] C. H. Stapper, "On a Composite Model to the IC Yield Problem," IEEE Journal of Solid-State Circuits, Vol. SC-10, December 1975, pp. 537-539.
- [13] H. Y. Chang, G. W. Smith, Jr., and R. B. Walford, "LAMP: System Description," Bell System Technical Journal, Vol. 53, October 1974, pp. 1431-1499.
- [14] Sentry 600 User's Manual, Part No. 67095498, Fairchild Systems Technology, Palo Alto, CA, December 1973.
- [15] D. Griffin, "Estimation of DC Stuck-Fault Quality Levels Through Application of a mixed Poisson model," Proceedings of International Conference on Circuits and Computers (ICCC), Port Chester, NY, October 1-3, 1980, pp. 1099-1102.
- [16] D. E. Knuth, The Art of Computer Programming, Vol. 1, Addison-Wesley, Reading, MA, 1975.