
Showing papers on "Sample size determination" published in 1981


Book
01 Jan 1981
TL;DR: This book presents statistical methods for rates and proportions, covering inference for a single proportion, the sample sizes needed to detect a difference between two proportions, randomization, regression models for binary and count data, and, in an appendix, the basic theory of maximum likelihood estimation.
Abstract: Preface. Preface to the Second Edition. Preface to the First Edition. 1. An Introduction to Applied Probability. 2. Statistical Inference for a Single Proportion. 3. Assessing Significance in a Fourfold Table. 4. Determining Sample Sizes Needed to Detect a Difference Between Two Proportions. 5. How to Randomize. 6. Comparative Studies: Cross-Sectional, Naturalistic, or Multinomial Sampling. 7. Comparative Studies: Prospective and Retrospective Sampling. 8. Randomized Controlled Trials. 9. The Comparison of Proportions from Several Independent Samples. 10. Combining Evidence from Fourfold Tables. 11. Logistic Regression. 12. Poisson Regression. 13. Analysis of Data from Matched Samples. 14. Regression Models for Matched Samples. 15. Analysis of Correlated Binary Data. 16. Missing Data. 17. Misclassification Errors: Effects, Control, and Adjustment. 18. The Measurement of Interrater Agreement. 19. The Standardization of Rates. Appendix A. Numerical Tables. Appendix B. The Basic Theory of Maximum Likelihood Estimation. Appendix C. Answers to Selected Problems. Author Index. Subject Index.
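Chapter 4's topic is easy to illustrate with a small worked example. The sketch below uses the generic normal-approximation formula for the per-group sample size needed to detect a difference between two proportions; it omits the continuity correction and other refinements a careful application of the book's methods would include, and the proportions chosen are purely illustrative.

```python
# Sketch: per-group sample size to detect p1 vs. p2 with a two-sided z-test.
# Standard normal-approximation formula (no continuity correction); shown only
# to illustrate the kind of calculation treated in Chapter 4.
from scipy.stats import norm

def n_per_group(p1, p2, alpha=0.05, power=0.80):
    z_alpha = norm.ppf(1 - alpha / 2)          # two-sided critical value
    z_beta = norm.ppf(power)                   # quantile for the desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)   # sum of binomial variances
    return (z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2

# Example: detect a rise from 20% to 35% with 80% power at alpha = 0.05.
print(round(n_per_group(0.20, 0.35)))          # about 135 subjects per group
```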

16,435 citations


Journal ArticleDOI
TL;DR: It is recommended that the Morisita index be used whenever possible to avoid the complex dealings with effects of sample size and diversity; however, when previous logarithmic transformation of the data is required, the Morisita-Horn or the Renkonen indices are recommended.
Abstract: The effect of sample size and species diversity on a variety of similarity indices is explored. Real values of a similarity index must be evaluated relative to the expected maximum value of that index, which is the value obtained for samples randomly drawn from the same universe, with the diversity and sample sizes of the real samples. It is shown that these expected maxima differ from the theoretical maxima, the values obtained for two identical samples, and that the relationship between expected and theoretical maxima depends on sample size and on species diversity in all cases, without exception. In all cases but one (the Morisita index) the expected maxima depend strongly to fairly strongly on sample size and diversity. For some of the more useful indices empirical equations are given to calculate the expected maximum value of the indices to which the observed values can be related at any combination of sample sizes. It is recommended that the Morisita index be used whenever possible to avoid the complex dealings with effects of sample size and diversity; however, when previous logarithmic transformation of the data is required, which often may be the case, the Morisita-Horn or the Renkonen indices are recommended.
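As a point of reference for the indices being compared, here is a minimal sketch of the Morisita and Morisita-Horn similarity indices in their commonly cited forms; the expected-maximum corrections developed in the paper are not implemented, and the species counts are made up.

```python
# Minimal sketch of the Morisita and Morisita-Horn similarity indices for two
# samples of species counts, as commonly defined; the expected-maximum
# corrections discussed in the paper are not implemented here.
import numpy as np

def morisita(x, y):
    x, y = np.asarray(x, float), np.asarray(y, float)
    X, Y = x.sum(), y.sum()
    lam_x = (x * (x - 1)).sum() / (X * (X - 1))   # unbiased Simpson index, sample 1
    lam_y = (y * (y - 1)).sum() / (Y * (Y - 1))   # unbiased Simpson index, sample 2
    return 2 * (x * y).sum() / ((lam_x + lam_y) * X * Y)

def morisita_horn(x, y):
    x, y = np.asarray(x, float), np.asarray(y, float)
    X, Y = x.sum(), y.sum()
    d_x = (x ** 2).sum() / X ** 2
    d_y = (y ** 2).sum() / Y ** 2
    return 2 * (x * y).sum() / ((d_x + d_y) * X * Y)

counts_a = [30, 12, 7, 1, 0]   # hypothetical species counts, sample A
counts_b = [25, 15, 5, 0, 2]   # hypothetical species counts, sample B
print(morisita(counts_a, counts_b), morisita_horn(counts_a, counts_b))
```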

1,048 citations


Journal ArticleDOI
TL;DR: The importance of sample size evaluation in clinical trials is reviewed and a general method is presented from which specific equations are derived for sample size determination or the analysis of power for a wide variety of statistical procedures.

936 citations


Journal ArticleDOI
TL;DR: Test statistics, confidence intervals, and sample size calculations are discussed, and the required sample size may be larger under one null hypothesis formulation than under the other, depending on the specific assumptions made.

820 citations


Journal ArticleDOI
TL;DR: In this article, an asymptotically optimal selection of regression variables is proposed, where the key assumption is that the number of control variables is infinite or increases with the sample size.
Abstract: An asymptotically optimal selection of regression variables is proposed. The key assumption is that the number of control variables is infinite or increases with the sample size. It is also shown that Mallows's Cp, Akaike's FPE, and AIC methods are all asymptotically equivalent to this method.
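The criteria named above can be computed side by side on a toy regression, as sketched below using their usual Gaussian-regression forms (Cp from the full-model error variance, FPE, and AIC up to an additive constant); this only illustrates the quantities, not the paper's asymptotic argument.

```python
# Sketch: compute Mallows's Cp, Akaike's FPE, and AIC for a sequence of nested
# least-squares models; the usual Gaussian-regression forms of the criteria are
# used (the paper's asymptotic analysis is not reproduced here).
import numpy as np

rng = np.random.default_rng(1)
n, m = 200, 12                                 # sample size, candidate regressors
X = rng.normal(size=(n, m))
beta = np.r_[1.0, 0.7, 0.4, np.zeros(m - 3)]   # only the first 3 regressors matter
y = X @ beta + rng.normal(size=n)

def rss(k):
    """Residual sum of squares using an intercept plus the first k regressors."""
    Z = np.column_stack([np.ones(n), X[:, :k]])
    resid = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    return resid @ resid

sigma2_full = rss(m) / (n - m - 1)             # error variance from the full model
for k in range(m + 1):
    p = k + 1                                  # fitted coefficients (incl. intercept)
    r = rss(k)
    cp = r / sigma2_full - n + 2 * p
    fpe = (r / n) * (n + p) / (n - p)
    aic = n * np.log(r / n) + 2 * p
    print(k, round(cp, 1), round(fpe, 3), round(aic, 1))
```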

542 citations


Journal ArticleDOI
01 Jul 1981-Genetics
TL;DR: Application of the formulae to data from an isolated population of Dacus oleae has shown that the effective size of this population is about one tenth of the minimum census size, though there was a possibility that the procedure of sampling genes was improper.
Abstract: The statistical properties of the standardized variance of gene frequency changes (a quantity equivalent to Wright's inbreeding coefficient) in a random mating population are studied, and new formulae for estimating the effective population size are developed. The accuracy of the formulae depends on the ratio of sample size to effective size, the number of generations involved (t), and the number of loci or alleles used. It is shown that the standardized variance approximately follows the χ² distribution unless t is very large, and the confidence interval of the estimate of effective size can be obtained by using this property. Application of the formulae to data from an isolated population of Dacus oleae has shown that the effective size of this population is about one tenth of the minimum census size, though there was a possibility that the procedure of sampling genes was improper.

497 citations


Journal ArticleDOI
TL;DR: In this paper, an acceptably accurate approximation for the sampling distribution of the angle between two sample mean directions, conditional on the observed lengths of the vector resultants, is derived for samples drawn from Fisher populations sharing a common true mean direction.
Abstract: Summary An acceptably accurate approximation for the sampling distribution of the angle between two sample mean directions, conditional on the observed lengths of the vector resultants, is derived for samples drawn from Fisher populations sharing a common true mean direction. From this a test is given for the null hypothesis that two populations (with a common precision parameter) share a common true mean direction. This test is then compared with the unconditional test derived by Watson. The conditional test is then extended to an approximate test for the case where the two populations do not share a common precision parameter. The conditional test for populations with a common precision parameter is then extended to the case where it is desired to test simultaneously whether several samples could have been drawn from populations sharing a common true mean direction. The pooled, unbiased estimate for the inverse of the precision parameter is determined. From this a test for homogeneity of the precision parameter is derived for the case of several samples having unequal sample sizes.

417 citations


Journal ArticleDOI
TL;DR: In this article, a regression method is introduced that uses data from well-sampled individuals, whose true home ranges are assumed to be approximately known, to predict home-range areas for less well sampled individuals; the home-range sizes estimated by the regression method are half, or less than half, the sizes estimated by previous methods, in which utilization distributions are all assumed to be of a particular statistical type.

362 citations


Journal ArticleDOI
TL;DR: Chan, Hayya, and Ord, as discussed by the authors, showed that the residuals from linear regression of a realization of a random walk (the summation of a purely random series) on time have autocovariances which for given lag are a function of time, and therefore that the residuals are not stationary.
Abstract: Econometric analysis of time series data is frequently preceded by regression on time to remove a trend component in the data. The resulting residuals are then treated as a stationary series to which procedures requiring stationarity, such as spectral analysis, can be applied. The objective is often to investigate the dynamics of transitory movements in the system, for example, in econometric models of the business cycle. When the data does consist of a deterministic function of time plus a stationary error, then the regression residuals will clearly be unbiased estimates of the stationary component. However, if the data is generated by (possibly repeated) summation of a stationary and invertible process, then the series cannot be expressed as a deterministic function of time plus a stationary deviation, even though a least squares trend line and the associated residuals can always be calculated for any given finite sample. In a recent paper, Chan, Hayya, and Ord (1977; hereafter CHO) were able to show that the residuals from linear regression of a realization of a random walk (the summation of a purely random series) on time have autocovariances which for given lag are a function of time, and therefore that the residuals are not stationary. Further, CHO established that the expected sample autocovariance function (the expected autocovariances for given lag averaged over the time interval of the sample) is a function of sample size as well as lag and therefore an artifact of the detrending procedure. This function is characterized by CHO in their figure 1 as being effectively linear in lag (although the exact function is a fifth degree polynomial), with the rate of decay from unity at the origin depending inversely on sample size.
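The CHO artifact is easy to reproduce by simulation: the sketch below detrends simulated random walks by least-squares regression on time and shows that the average residual autocorrelation at a fixed lag depends on the sample size. The exact polynomial expression derived by CHO is not reproduced.

```python
# Sketch: regress simulated random walks on time and show that the average
# residual autocorrelation at a fixed lag depends on the sample size, as CHO
# argue; their exact polynomial expression is not reproduced here.
import numpy as np

rng = np.random.default_rng(0)

def mean_residual_autocorr(n, lag, reps=2000):
    acs = []
    t = np.column_stack([np.ones(n), np.arange(n)])        # intercept and time trend
    for _ in range(reps):
        walk = np.cumsum(rng.normal(size=n))                # random walk
        resid = walk - t @ np.linalg.lstsq(t, walk, rcond=None)[0]   # detrended
        resid -= resid.mean()
        acs.append(np.dot(resid[:-lag], resid[lag:]) / np.dot(resid, resid))
    return np.mean(acs)

for n in (50, 100, 200, 400):
    print(n, round(mean_residual_autocorr(n, lag=10), 3))
# The average lag-10 autocorrelation of the detrended residuals rises toward 1
# as the sample size grows, i.e. it is an artifact of the detrending.
```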

331 citations


Journal ArticleDOI
TL;DR: In this paper, the authors showed that the weighted kappa statistic, employing a standard error developed by Fleiss, Cohen, and Everitt (1969), holds for a large number of k categories of classification (e.g., 8 < k ≤ 10).
Abstract: The results of this computer simulation study indicate that the weighted kappa statistic, employing a standard error developed by Fleiss, Cohen, and Everitt (1969), holds for a large number of k categories of classification (e.g., 8 < k ≤ 10). These data are entirely consistent with an earlier study (Cicchetti & Fleiss, 1977), which showed the same results for 3 ≤ k ≤ 7. The two studies also indicate that the minimal N required for the valid application of weighted kappa can be easily approximated by the simple formula 2k². This produces sample sizes that vary between a low of about 20 (when k = 3) to a high of about 200 (when k = 10). Finally, the range 3 ≤ k ≤ 10 should encompass most extant clinical scales of classification.
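The 2k² rule of thumb is trivial to tabulate:

```python
# The paper's rule of thumb: minimal N for valid use of weighted kappa is
# roughly 2 * k**2 for k categories (about 20 at k = 3, about 200 at k = 10).
for k in range(3, 11):
    print(k, 2 * k ** 2)
```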

180 citations


Journal ArticleDOI
TL;DR: In this article, a class of new non-parametric test statistics is proposed for goodness-of-fit or two-sample hypothesis testing problems when dealing with randomly right censored survival data.
Abstract: This paper proposes a class of new non-parametric test statistics useful for goodness-of-fit or two-sample hypothesis testing problems when dealing with randomly right censored survival data. The procedures are especially useful when one desires sensitivity to differences in survival distributions that are particularly evident at at least one point in time. This class is also sufficiently rich to allow certain statistics to be chosen which are very sensitive to survival differences occurring over a specified period of interest. The asymptotic distribution of each test statistic is obtained and then employed in the formulation of the corresponding test procedure. Size and power of the new procedures are evaluated for small and moderate sample sizes using Monte Carlo simulations. The simulations, generated in the two sample situation, also allow comparisons to be made with the behavior of the Gehan-Wilcoxon and log-rank test procedures.

Journal ArticleDOI
TL;DR: In this paper, the Fisher information matrix for the estimated parameters in a multiple logistic regression can be approximated by the augmented Hessian matrix of the moment-generating function for the covariates.
Abstract: The Fisher information matrix for the estimated parameters in a multiple logistic regression can be approximated by the augmented Hessian matrix of the moment-generating function for the covariates. The approximation is valid when the probability of response is small. With its use one can obtain a simple closed-form estimate of the asymptotic covariance matrix of the maximum likelihood parameter estimates, and thus approximate sample sizes needed to test hypotheses about the parameters. The method is developed for selected distributions of a single covariate and for a class of exponential-type distributions of several covariates. It is illustrated with an example concerning risk factors for coronary heart disease.
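The paper's closed-form approximation is not reproduced here, but its sample-size answers can always be checked by brute force. The sketch below is such a check, not the paper's method: it simulates a single normally distributed covariate with a small response probability and estimates the power of the Wald test for the slope in a fitted logistic regression (statsmodels is assumed to be available; all parameter values are illustrative).

```python
# Simulation check (not the paper's closed-form method): power of the Wald test
# for the slope in a logistic regression with one normal covariate and a rare
# response. Assumes statsmodels is available; parameter values are illustrative.
import numpy as np
import statsmodels.api as sm

def simulated_power(n, beta0=-3.0, beta1=0.5, reps=500, alpha=0.05, seed=0):
    rng = np.random.default_rng(seed)
    rejections, used = 0, 0
    for _ in range(reps):
        x = rng.normal(size=n)
        p = 1.0 / (1.0 + np.exp(-(beta0 + beta1 * x)))   # small response probability
        y = rng.binomial(1, p)
        try:
            fit = sm.Logit(y, sm.add_constant(x)).fit(disp=0)
        except Exception:
            continue                  # skip rare non-converging replicates
        used += 1
        rejections += fit.pvalues[1] < alpha
    return rejections / used

for n in (200, 400, 800):
    print(n, round(simulated_power(n), 2))
```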

Book ChapterDOI
01 Jun 1981
TL;DR: In this article, the authors investigate the limits of prior distributions as the parameter a tends to various values, and show that very small values of a(X) actually mean that the prior has a lot of information concerning the unknown true distribution and is of a form that would be generally unacceptable to a statistician.
Abstract: The form of the Bayes estimate of the population mean with respect to a Dirichlet prior with parameter a has given rise to the interpretation that a(X) is the prior sample size. Furthermore, if a(X) is made to tend to zero, then the Bayes estimate mathematically converges to the classical estimator, that is, the sample mean. This has further given rise to the general feeling that allowing a(X) to become small not only makes the prior sample size small but also that it corresponds to no prior information. By investigating the limits of prior distributions as the parameter a tends to various values, it is shown that it is misleading to think of a(X) as the prior sample size and of the smallness of a(X) as indicating no prior information. In fact, very small values of a(X) actually mean that the prior has a lot of information concerning the unknown true distribution and is of a form that would be generally unacceptable to a statistician.

Journal ArticleDOI
TL;DR: In this article, the effect of sample size on the precision of kriging estimators has been investigated and it was shown that for samples of size less than approximately 50, kriging offered no clear advantage over least squares in a Bayesian sense.
Abstract: Kriging, a technique for interpolating nonstationary spatial phenomena, has recently been applied to such diverse hydrologic problems as interpolation of piezometric heads and transmissivities estimated from hydrogeologic surveys and estimation of mean areal precipitation accumulations. An important concern for users of this technique is the effect of sample size on the precision of estimates obtained. Comparisons made between conventional least squares and kriging estimators indicate that for samples of size less than approximately 50, kriging offered no clear advantage over least squares in a Bayesian sense, although kriging may be preferable from the minimax viewpoint. A network design algorithm was also developed; tests performed using the algorithm indicated that the information content of identified networks was relatively insensitive to the size of the pilot network. These results suggest that within the range of sample sizes typically of hydrologic interest, kriging may hold more potential for network design than for data analysis.

Journal ArticleDOI
TL;DR: In this article, it is shown that the optimum property of Wald's SPRT for testing simple hypotheses based on i.i.d. observations can be extended to invariant SPRTs like the sequential $t$-test, the Savage-Sethuraman sequential rank-order test, etc.
Abstract: It is well known that Wald's SPRT for testing simple hypotheses based on i.i.d. observations minimizes the expected sample size both under the null and under the alternative hypotheses among all tests with the same or smaller error probabilities and with finite expected sample sizes under the two hypotheses. In this paper it is shown that this optimum property can be extended, at least asymptotically as the error probabilities tend to 0, to invariant SPRTs like the sequential $t$-test, the Savage-Sethuraman sequential rank-order test, etc. In fact, not only do these invariant SPRTs asymptotically minimize the expected sample size, but they also asymptotically minimize all the moments of the sample size distribution among all invariant tests with the same or smaller error probabilities. Modifications of these invariant SPRTs to asymptotically minimize the moments of the sample size at an intermediate parameter are also considered.
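For orientation, here is a minimal sketch of Wald's SPRT for two simple hypotheses about a Bernoulli parameter, the i.i.d. case whose optimality the paper extends to invariant SPRTs; the boundaries use Wald's usual approximations and the parameter values are illustrative.

```python
# Minimal sketch of Wald's SPRT for H0: p = p0 vs. H1: p = p1 with Bernoulli
# data, using Wald's approximate boundaries A = (1 - beta) / alpha and
# B = beta / (1 - alpha). This is the i.i.d. case the paper starts from.
import math
import random

def sprt(p0, p1, alpha=0.05, beta=0.05, true_p=0.6, seed=1, max_n=10_000):
    random.seed(seed)
    upper = math.log((1 - beta) / alpha)     # accept H1 when crossed
    lower = math.log(beta / (1 - alpha))     # accept H0 when crossed
    llr = 0.0
    for n in range(1, max_n + 1):
        x = 1 if random.random() < true_p else 0
        llr += x * math.log(p1 / p0) + (1 - x) * math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "accept H1", n
        if llr <= lower:
            return "accept H0", n
    return "no decision", max_n

print(sprt(p0=0.5, p1=0.6))   # decision and the (random) sample size at stopping
```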

Journal ArticleDOI
TL;DR: The results show that the proposed sampling distribution of the test appears to be appropriate only for sample sizes above fifty, and for data where the sample size is ten times the number of variables.
Abstract: A likelihood ratio test to determine whether data arises from a single or a mixture of two normal distributions is investigated by Monte Carlo methods. The results show that the proposed sampling distribution of the test appears to be appropriate only for sample sizes above fifty, and for data where the sample size is ten times the number of variables. For such cases the power of the test is considered and found to be fairly low unless the generalized distance between the components is greater than 2.0.

Journal ArticleDOI
TL;DR: In this paper, the authors considered from a Bayesian viewpoint inferences about the size of a closed animal population from data obtained by a multiple-recapture sampling scheme and showed that strong prior knowledge about the catch probabilities can greatly affect inference about the population size, and it is in this respect that the greatest difference from previous approaches lies.
Abstract: This paper considers from a Bayesian viewpoint inferences about the size of a closed animal population from data obtained by a multiple-recapture sampling scheme. The method developed enables prior information about the population size and the catch probabilities to be utilized to produce considerable improvements in certain cases on ordinary maximum likelihood methods. Several ways of expressing such prior information are explored and a practical example of the uses of these ways is given. The main result of the paper is an approximation to the posterior distribution of the population size that exhibits the contributions made by the likelihood and the prior ideas. The multiple-recapture sampling scheme involves taking samples from a population of animals, at each stage counting the number of marked animals in the sample, marking the previously unmarked animals and returning the sample to the population. The literature is reviewed by Cormack (1968) and the papers most relevant to our problem are those of Chapman (1952) and Darroch (1958). In these papers, maximum likelihood estimates for the population parameters and variances for these estimates are given. Our approach differs from previous ones by introducing prior information about the population size, and about the propensities of the animals to be captured, called 'catch probabilities' by some previous authors. The incorporation of these propensities is a feature not shared by many previous approaches. Darroch (1958) found their introduction into his model does not change the maximum likelihood estimates of the population size. Our approach will show that strong prior knowledge about the catch probabilities can greatly affect inference about the population size, and it is in this respect that the greatest difference from previous approaches lies. Although it is recognized that models of an open population incorporating death and immigration such as those of Jolly (1965) and Seber (1965) are more practically realistic, we feel that our method, which deals only with a closed population, is worth investigating as a step towards providing a Bayesian treatment of the open population problem.

Journal ArticleDOI
TL;DR: In this article, the powers of several nonparametric tests for the two-sample problem with censored data are compared by simulation and the test that performed the best overall was the Peto-Prentice generalized Wilcoxon statistic with an asymptotic variance estimate.
Abstract: The powers of several nonparametric tests for the two-sample problem with censored data are compared by simulation. The tests studied include Gehan's, Efron's, and Peto and Prentice's generalized Wilcoxon tests, along with the logrank test. The test with the greatest power changes with the sample sizes, censoring mechanism, and distribution of the random variables. The test that performed the best overall was the Peto-Prentice generalized Wilcoxon statistic with an asymptotic variance estimate.

Journal ArticleDOI
TL;DR: In this paper, a Monte Carlo simulation is used to estimate the upper percentage points of the null distribution of the sample squared multiple correlation coefficient (R²) when the number of predictors selected is determined by a stopping rule.
Abstract: A Monte Carlo simulation is used to estimate the upper percentage points of the null distribution of the sample squared multiple correlation coefficient (R²) when the number of predictors selected is determined by a stopping rule. In the study, the sample size n and the number of candidate predictors m satisfy 2 ≤ m ≤ 20 and 10 ≤ n – m – 1 ≤ 200, while the F threshold ranges from two to four. Tables of the upper five percent and upper one percent sample R² values are presented and an example is given to illustrate the use of the tables.
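The simulation is easy to re-create in outline. The sketch below runs forward selection with an F-to-enter threshold on pure-noise predictors and records the resulting R²; the upper percentiles of these values are what the paper tabulates. It is a rough re-creation under illustrative settings, not the paper's exact design.

```python
# Rough re-creation of the simulation: forward selection with an F-to-enter
# threshold applied to pure-noise predictors, recording the final R^2.
import numpy as np

rng = np.random.default_rng(0)

def rss(X, y):
    """Residual sum of squares of y regressed on an intercept plus the columns of X."""
    Z = np.column_stack([np.ones(len(y)), X])
    resid = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    return resid @ resid

def forward_r2(n, m, f_enter):
    """R^2 after forward selection with an F-to-enter rule, under the null."""
    X = rng.normal(size=(n, m))
    y = rng.normal(size=n)                      # no true relationship with X
    total = np.sum((y - y.mean()) ** 2)
    chosen, rss_cur = [], total
    while len(chosen) < m:
        best = None
        for j in set(range(m)) - set(chosen):
            r = rss(X[:, chosen + [j]], y)
            f = (rss_cur - r) / (r / (n - len(chosen) - 2))   # F-to-enter for predictor j
            if best is None or f > best[0]:
                best = (f, j, r)
        if best[0] < f_enter:
            break
        chosen.append(best[1])
        rss_cur = best[2]
    return 1 - rss_cur / total

r2 = [forward_r2(n=40, m=10, f_enter=3.0) for _ in range(500)]
print(np.percentile(r2, [95, 99]))              # null upper percentage points of R^2
```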

Journal ArticleDOI
TL;DR: In this paper, the authors present a review of Bayesian procedures and results for analyzing sharp and non-sharp hypotheses with explicit use of prior information, including the use of power functions in practice.


Journal ArticleDOI
TL;DR: In this article, the authors show that taking multiple independent systematic samples leads to a loss in efficiency when compared to a single systematic sample with the same overall sample size, and also that there is no unbiased estimate of variance for systematic sampling.
Abstract: A commonly-used method of estimating the area of a region on a map or aerial photograph is the dot-grid method. The method may be described in general as follows. Consider a map of width M and length N. Divide the map by a grid system into mn rectangles each of area kl, to yield m rows of rectangles and n columns. Each rectangle is of width k and length l, so M = mk and N = nl. To obtain the sample dots, first choose a dot uniformly distributed on the lower left rectangle of the map. Denote this point by (u, v). The sample dots, one in each rectangle, are located at the set of points {(u + rk, v + sl): r = 0,...,m-1; s = 0,...,n-1}. This sampling method is, in fact, the aligned systematic sampling method of Quenouille (1949) and the systematic sampling method of Das (1950). Figure 1 illustrates the method. This sampling method may be accomplished easily by using a transparency with systematically located dots on it. The dots are at distances k apart in one direction and l apart in the other. The transparency is laid over the map so that in any rectangular region of area kl on the map one of the transparency dots is uniformly distributed on that area. An estimate of the area of the region of interest is provided by the area of the transparency multiplied by the proportion of dots falling in the region. Because of its simplicity of implementation, systematic sampling is the usual method employed. However, systematic sampling is not necessarily an efficient method for area estimation, in the sense of minimizing variance. We show that a stratified sampling design, although less convenient to implement than systematic sampling, is usually more efficient. Also there is no unbiased estimate of variance for systematic sampling. We use 'unbiased' in the sense of 'unbiased with respect to the sampling design'. An unbiased estimate of variance can be obtained, with little loss in convenience, by taking a number of independent repetitions of systematic samples. We show that this procedure leads to a loss in efficiency when compared to a single systematic sample with the same overall sample size. The efficiency comparisons are made using a superpopulation model. The finite population variances for stratified sampling, and for systematic sampling with either a single or multiple random starts, are averaged over the model and compared. We will refer to these finite population variances averaged over the superpopulation as 'average variances'.
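The dot-grid estimator itself is simple enough to sketch directly. The example below draws a single aligned systematic (dot-grid) sample with a random start and estimates the area of a hypothetical circular region; repeating it with independent random starts gives the multiple-start variant discussed above.

```python
# Sketch of the dot-grid (aligned systematic sampling) area estimator: area of
# the region ~ map area * proportion of grid dots falling inside the region.
# The region here is a hypothetical circle on a 100 x 80 map.
import numpy as np

rng = np.random.default_rng(0)
M, N = 100.0, 80.0          # map width and length
k, l = 5.0, 5.0             # rectangle width and length, so m*k = M and n*l = N
m, n = int(M / k), int(N / l)

u, v = rng.uniform(0, k), rng.uniform(0, l)          # random start in first rectangle
xs = u + k * np.arange(m)                            # dot coordinates, one per rectangle
ys = v + l * np.arange(n)
gx, gy = np.meshgrid(xs, ys)

inside = (gx - 50.0) ** 2 + (gy - 40.0) ** 2 <= 25.0 ** 2   # circle of radius 25
estimate = M * N * inside.mean()
print(estimate, np.pi * 25.0 ** 2)                   # estimate vs. true area
```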

Journal ArticleDOI
TL;DR: In this article, the authors discuss finite sample properties of the maximum likelihood estimator θ̂ in the first-order moving average model and give a theoretical explanation for the concentration of θ̂ values at the invertibility boundary.
Abstract: In this paper we discuss finite sample properties of the maximum likelihood estimator θ̂ in the first-order moving average model. We give a theoretical explanation for the concentration of θ̂ values at the invertibility boundary. We derive the exact distribution of θ̂ for sample size n = 2, which is found to be of mixed type. For general n we give approximations for pr(θ̂ = 1).
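The boundary pile-up is easy to see numerically. The sketch below maximizes the exact Gaussian MA(1) likelihood over a grid of θ values for many short simulated series and reports how often the maximizer lands on the invertibility boundary |θ| = 1; this is a numerical illustration, not the paper's analytical treatment.

```python
# Numerical illustration of the pile-up of the MA(1) MLE at the invertibility
# boundary: maximize the exact Gaussian likelihood over a grid of theta values
# for many short simulated series and count how often |theta_hat| = 1.
import numpy as np

rng = np.random.default_rng(0)

def concentrated_loglik(theta, y):
    n = len(y)
    cov = np.zeros((n, n))
    np.fill_diagonal(cov, 1 + theta ** 2)           # MA(1) covariance, unit innovations
    idx = np.arange(n - 1)
    cov[idx, idx + 1] = cov[idx + 1, idx] = theta
    chol = np.linalg.cholesky(cov)
    z = np.linalg.solve(chol, y)
    sigma2 = z @ z / n                              # innovation variance concentrated out
    logdet = 2 * np.log(np.diag(chol)).sum()
    return -0.5 * (n * np.log(sigma2) + logdet + n)

grid = np.linspace(-1.0, 1.0, 201)
theta_true, n, reps = 0.8, 25, 300
at_boundary = 0
for _ in range(reps):
    e = rng.normal(size=n + 1)
    y = e[1:] + theta_true * e[:-1]                 # simulate a short MA(1) series
    loglik = [concentrated_loglik(t, y) for t in grid]
    theta_hat = grid[int(np.argmax(loglik))]
    at_boundary += abs(theta_hat) == 1.0
print(at_boundary / reps)                           # estimated pr(|theta_hat| = 1)
```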

Journal ArticleDOI
TL;DR: A comparison of the relative efficiencies of the specific-locus test (for gene mutations and small deficiencies) and the heritable-translocation test (for transmissible chromosome rearrangements), in detecting the same proportional increases over the spontaneous frequencies of their respective types of genetic damage, shows that less work is involved in reaching a conclusive result in the specific-locus test.
Abstract: The binomial approximation of the UMPU (uniformly most powerful unbiased) test for the equality of 2 binomial proportions is shown to be a highly accurate and easily applied method for testing the hypothesis that a given mouse specific-locus mutation frequency is not higher than the spontaneous mutation frequency (43 mutations in 801 406 offspring, for males). Critical sample sizes have been calculated that show at a glance whether P The first hypothesis that the mutation frequency (induced + spontaneous) of treated mice is not higher than the spontaneous mutation frequency is combined with the second hypothesis that the induced mutation frequency of treated mice is no less than 4 times the historical-control mutation frequency to produce a multiple decision procedure with 4 possible decisions: inconclusive result, negative result, positive result, and weak mutagen. Critical sample sizes for the second hypothesis, also with P Positive results can become apparent in relatively small samples. Larger samples, of at least 11 166 offspring, are required to obtain a negative result. If samples of 18 000 are routinely collected (unless positive results are found earlier), 75% of tests of chemicals that are non-mutagens will give a negative result. If the question being asked is not whether a chemical induces gene mutations but, rather, whether the exposure received by humans causes any important risk from gene mutations, a much smaller sample size may be acceptable, under certain conditions. A comparison of the relative efficiencies of the specific-locus test (for gene mutations and small deficiencies) and the heritable-translocation test (for transmissible chromosome rearrangements), in detecting the same proportional increases over the spontaneous frequencies of their respective types of genetic damage, shows that less work is involved in reaching a conclusive result in the specific-locus test. Proposed specific-locus tests using biochemical markers are at a considerable statistical disadvantage compared with the standard test (using 7 visible markers) for which there is available a very large historical control showing a very low mutation rate.
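A simpler, approximate version of the first hypothesis test is sketched below: it treats the large historical control as a fixed spontaneous rate and applies a one-sided binomial test to the treated sample. This is not the exact UMPU procedure of the paper, and the treated-group numbers are hypothetical.

```python
# Hedged approximation, not the paper's exact UMPU procedure: treat the large
# historical control (43 mutations in 801 406 offspring) as a fixed spontaneous
# rate and apply a one-sided binomial test to the treated sample.
from scipy.stats import binomtest

spontaneous_rate = 43 / 801406
mutants, offspring = 4, 18000          # hypothetical treated-group outcome
result = binomtest(mutants, offspring, spontaneous_rate, alternative="greater")
print(result.pvalue)
```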

Journal Article
TL;DR: It is inferred that the distribution of the tumor area ratio obeys a log-normal distribution for advanced gastric cancer and that this result may hold for many advanced measurable cancer studies.
Abstract: To date, in cancer clinical trials, treatment programs have been evaluated using objective tumor response as the primary means to demonstrate antitumor activity. This measure, which is based upon a dichotomous outcome (whether or not a 50% decrease in tumor area has occurred), is complemented by the alternative measure of the distribution of the ratio of the tumor area taken at a fixed time point compared to the tumor area at the start of protocol treatment. It is inferred that the distribution of the tumor area ratio obeys a log-normal distribution for advanced gastric cancer and that this result may hold for many advanced measurable cancer studies. The identification of this distribution allows for the evaluation of treatment programs using parametric tests. In situations where log-normality does not apply, a Normal Scores test is recommended. This concept may be applied to completed studies to gain additional perspective regarding antitumor activity. A marked reduction in sample size requirements can be achieved when the tumor area ratio is used as a study design criterion. This approach is especially recommended in the phase II setting.

Journal ArticleDOI
TL;DR: Formulas that yield the minimum sample size for standard T tests, involving only standard normal quantiles, are presented; although the results are approximations, they usually yield the exact solution.
Abstract: Formulas that yield minimum sample size for standard T tests are presented. Although the results are approximations, they usually yield the exact solution. Involving only standard normal quantiles, they could be used in an elementary course.
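The flavor of such formulas is captured by the familiar normal-quantile approximation below; it is the generic version for a one-sample, two-sided test, not necessarily the article's exact expressions.

```python
# Generic normal-quantile approximation for the minimum sample size of a
# one-sample, two-sided t test with standardized effect size d = delta / sigma.
# This conveys the flavor of such formulas; it is not necessarily the article's
# exact expression.
import math
from scipy.stats import norm

def min_n(d, alpha=0.05, power=0.80):
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    return math.ceil((z / d) ** 2)

print(min_n(0.5))   # roughly 32 observations for a medium effect size
```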

Journal ArticleDOI
TL;DR: In this article, an economic model is developed which determines the design of joint X̄- and R-control charts to minimize costs, based on Duncan's approach to the economic design of control charts.
Abstract: An economic model is developed which determines the design of joint X̄- and R-control charts to minimize costs. The design parameters are sample size, width of the X̄-control chart limits, width of the R-control chart limits, and sampling interval in hours. Duncan's approach to the economic design of control charts is used. The control states permitted are the mean and variance in control, mean only out of control, variance only out of control, and both mean and variance out of control. The model developed considers the situation in which a second process parameter can go out of control after the first process parameter has gone out of control. Central composite experimental designs and a pattern search technique are used to optimize the model. Experience with the model indicates that the cost of a traditional design of an X̄- and R-control chart can be considerably higher than the cost of the optimum design. An example is presented.

Journal ArticleDOI
TL;DR: In this article, the authors present statistical concepts that should be incorporated in the initial planning of a ground-water quality monitoring program, including simple random sampling, systematic sampling, and stratified random sampling.
Abstract: Recent emphasis on the need to protect ground-water quality has resulted in an increased interest in ground-water quality monitoring, particularly that monitoring performed in support of a regulatory ground-water quality management program. Such monitoring must involve intensive surveys or special studies, as well as routine trend types of sampling. In both cases, adequate monitoring strategies require careful consideration of the statistical aspects of sampling theory. The purpose of this paper is to present statistical concepts that should be incorporated in the initial planning of a ground-water quality monitoring program. The means of incorporating statistical theory into ground-water quality monitoring suggested in this paper involves selecting the number of samples required based on a specified confidence interval about the mean of the variable under consideration. This approach requires that the variance of the sample mean be known. The expression for Var(x̄) will depend on the correlation structure of the population in question. If the observations taken can be assumed independent in both space and time (i.e., no spatial or serial correlation exists), then the number of samples required can be determined in a very straightforward manner. However, if the samples are correlated, part of the information contained in one observation will be contained in other observations as well. As a result, the sample size must be increased in order to achieve the same level of information that would be obtained in uncorrelated observations. Various sampling techniques can be employed in a ground-water monitoring plan, including simple random sampling, systematic sampling, and stratified random sampling. Each technique has certain advantages and disadvantages with regard to ground-water monitoring. An overall monitoring program should incorporate the most effective sampling techniques in order to achieve optimum information content from a minimum number of samples.
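In the uncorrelated case, the confidence-interval approach described above reduces to the familiar calculation sketched below; the correlated case is represented only by a crude variance-inflation factor, which is an illustrative assumption rather than the paper's expression for the variance of the sample mean.

```python
# Sample size for estimating a mean to within +/- d with confidence 1 - alpha,
# assuming independent observations: n = (z * sigma / d) ** 2. The correlated
# case is handled here only by a crude variance-inflation factor, which is an
# illustrative assumption rather than the paper's expression for Var(x-bar).
import math
from scipy.stats import norm

def n_required(sigma, d, alpha=0.05, inflation=1.0):
    z = norm.ppf(1 - alpha / 2)
    return math.ceil(inflation * (z * sigma / d) ** 2)

print(n_required(sigma=12.0, d=5.0))                  # independent samples
print(n_required(sigma=12.0, d=5.0, inflation=2.0))   # correlated, roughly doubled
```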

Journal ArticleDOI
Philip Heidelberger, Peter D. Welch
TL;DR: In this paper, a method of confidence interval generation based on the estimation of p(0) through the least squares fit of a quadratic to the logarithm of the periodogram is presented.
Abstract: This paper addresses two central problems in simulation methodology: the generation of confidence intervals for the steady state means of the output sequences and the sequential use of these confidence intervals to control the run length. The variance of the sample mean of a covariance stationary process is given approximately by p(0)/N, where p(f) is the spectral density at frequency f and N is the sample size. In an earlier paper we developed a method of confidence interval generation based on the estimation of p(0) through the least squares fit of a quadratic to the logarithm of the periodogram. This method was applied in a run length control procedure to a sequence of batched means. As the run length increased the batch means were rebatched into larger batch sizes so as to limit storage requirements. In this rebatching the shape of the spectral density changes, gradually becoming flat as N increases. Quadratics were chosen as a compromise between small sample bias and large sample stability. In this paper we consider smoothing techniques which adapt to the changing spectral shape in an attempt to improve both the small and large sample behavior of the method. The techniques considered are polynomial smoothing with the degree selected sequentially using standard regression statistics, polynomial smoothing with the degree selected by cross validation, and smoothing splines with the amount of smoothing determined by cross validation. These techniques were empirically evaluated both for fixed sample sizes and when incorporated into the sequential run length control procedure. For fixed sample sizes they did not improve the small sample behavior and only marginally improved the large sample behavior when compared with the quadratic method. Their performance in the sequential procedure was unsatisfactory. Hence, the straightforward quadratic technique recommended in the earlier paper is still recommended as an effective, practical technique for simulation confidence interval generation and run length control.
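A stripped-down version of the underlying estimate is sketched below: compute the periodogram of the output sequence, fit a quadratic to the log periodogram at the lowest frequencies, and read off p(0) to form an approximate confidence interval for the mean. The bias corrections, batching rules, and sequential run-length control of the paper are all omitted, and the AR(1) test sequence is only a stand-in for simulation output.

```python
# Stripped-down sketch of the quadratic log-periodogram estimate of p(0): the
# paper's bias corrections, batching rules, and sequential run-length control
# are all omitted, so treat this only as the general shape of the method.
import numpy as np

rng = np.random.default_rng(0)

# A correlated output sequence standing in for simulation output: an AR(1).
n, phi = 4096, 0.7
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + rng.normal()

centered = x - x.mean()
freqs = np.fft.rfftfreq(n)                        # frequencies in cycles per step
periodogram = np.abs(np.fft.rfft(centered)) ** 2 / n

k = 25                                            # lowest nonzero frequencies used
coeffs = np.polyfit(freqs[1:k + 1], np.log(periodogram[1:k + 1]), deg=2)
p0 = np.exp(np.polyval(coeffs, 0.0))              # estimated spectral density at 0
half_width = 1.96 * np.sqrt(p0 / n)
print(x.mean(), "+/-", half_width)                # rough 95% CI for the steady-state mean
```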

Journal ArticleDOI
TL;DR: This correspondence shows how correct confidence limits and maximum likelihood estimates can be obtained from the test results.
Abstract: Mills' capture-recapture sampling method allows the estimation of the number of errors in a program by randomly inserting known errors and then testing the program for both inserted and indigenous errors. This correspondence shows how correct confidence limits and maximum likelihood estimates can be obtained from the test results. Both fixed sample size testing and sequential testing are considered.
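The point estimate underlying the error-seeding approach is the familiar Lincoln-Petersen form, sketched below with hypothetical counts; the correspondence's exact confidence limits and sequential procedures are not reproduced.

```python
# Lincoln-Petersen-type point estimate underlying the error-seeding method:
# if s errors are seeded and testing finds s_found of them together with
# n_found indigenous errors, the indigenous total is estimated as
# s * n_found / s_found. The correspondence's exact confidence limits and
# sequential procedure are not reproduced here.
def estimated_indigenous_errors(seeded, seeded_found, indigenous_found):
    if seeded_found == 0:
        raise ValueError("no seeded errors found; estimate is undefined")
    return seeded * indigenous_found / seeded_found

# Hypothetical test outcome: 20 errors seeded, 15 recovered, 9 indigenous found.
print(estimated_indigenous_errors(20, 15, 9))   # about 12 indigenous errors
```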