
Showing papers on "Sample size determination published in 1978"


Journal ArticleDOI
01 Feb 1978-Heredity
TL;DR: In the absence of genotype-environment interactions, distributional skewness and mean-variance regression in DZ twins are found to be more powerful tests of directional dominance than the standard model fitting procedure and these tests may be worthwhile in future studies.
Abstract: SUMMARY A method based on the non-central chi-square distribution is developed for the calculation of sample sizes required to reject, with given probability, models of variation when they are "wrong". The method is illustrated with reference to simple alternative models of variation in MZ and DZ twins reared together. Simulation of twin experiments finds the empirical power in good agreement with that predicted by the method. Tables are produced showing the sample sizes required for 95 per cent rejection, at the 5 per cent level, of inappropriate models of variation. For equivalent cases it is always found easier to reject an inappropriate simple genetical model of variation than an inappropriate simple environmental model. For several frequently encountered cases, more than 600 pairs of twins would be required to reject inappropriate alternative models. The optimum proportion of MZ and DZ twins in a sample will vary with the "true" model of variation but is most likely to lie between one-half and two-thirds DZ twin pairs. The possibility of detecting genetical non-additivity with the classical twin study is investigated by theoretical power calculations and simulation. In the absence of genotype-environment interactions, distributional skewness and mean-variance regression in DZ twins are found to be more powerful tests of directional dominance (or unequal gene frequencies) than the standard model fitting procedure and these tests may be worthwhile in future studies.
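The power calculation described above reduces to finding the smallest sample size whose non-central chi-square distribution exceeds the central critical value with the required probability. The following is a minimal sketch of that search, assuming (hypothetically) that the noncentrality grows linearly with the number of twin pairs; the per-pair noncentrality `lam1`, the degrees of freedom and the numeric inputs are illustrative, not taken from the paper.

```python
# Hedged sketch of a non-central chi-square power/sample-size search
# (not the authors' code).  Assumes noncentrality nc = N * lam1, where
# lam1 is a hypothetical per-pair contribution under the "wrong" model.
from scipy.stats import chi2, ncx2

def pairs_needed(lam1, df, alpha=0.05, power=0.95):
    """Smallest N such that P(reject the wrong model) >= power at level alpha."""
    crit = chi2.ppf(1 - alpha, df)            # critical value of the model-fit test
    n = 1
    while ncx2.sf(crit, df, n * lam1) < power:
        n += 1
    return n

print(pairs_needed(lam1=0.02, df=2))          # illustrative inputs only
```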

497 citations


Journal ArticleDOI
TL;DR: In this article, a modification of the Dudewicz-Dalal procedure for the problem of selecting the population with the largest mean from k normal populations with unknown variances is discussed.
Abstract: In this paper we discuss a modification of the Dudewicz-Dalal procedure for the problem of selecting the population with the largest mean from k normal populations with unknown variances. We derive some inequalities and use them to lower-bound the probability of correct selection. These bounds are applied to the determination of the second-stage sample size which is required in order to achieve a prescribed probability of correct selection. We discuss the resulting procedure and compare it to that of Dudewicz and Dalal (1975).

480 citations


Journal ArticleDOI
TL;DR: Cutler and Ederer illustrated how the life table approach could provide considerably more information on 5-yr survival than would be available from only those subjects followed for at least 5 yr.

290 citations


Journal Article
TL;DR: In this article, the authors provide formulas which prescribe the sample size necessary to meet certain criteria specified by the investigator for this alternative type of clinical trial, and the percent increase in total sample size is described when more patients are allocated to one treatment than the other.
Abstract: Determination of an adequate sample size for a clinical trial has traditionally involved the specification of type I (false positive) and type II (false negative) error rates, and a difference that one wishes to detect. Because newer therapy has generally been more invasive or more toxic, it is conventional for the type I error to be 0.05 in order that new therapy not be accepted as superior unless its advantages are definitively established. Recently, many new trials have been directed toward showing that a more conservative treatment is equivalent in efficacy to a standard intensive therapy. In this paper, we provide formulas which prescribe the sample size necessary to meet certain criteria specified by the investigator for this alternative type of clinical trial. In addition, the percent increase in total sample size is described when more patients are allocated to one treatment than the other.
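As a rough illustration of the kind of formula involved, here is a generic normal-approximation sample-size calculation for showing that a conservative treatment is within a margin delta of a standard treatment. The function name, the one-sided framing and the numeric inputs are assumptions for illustration; the paper's exact formulas may differ.

```python
# Sketch only: normal-approximation sample size per arm for an equivalence-style
# comparison of two proportions.  Not the paper's formula.
from scipy.stats import norm

def n_per_arm_equivalence(p_std, delta, alpha=0.05, beta=0.10):
    """Patients per arm to rule out a true deficit larger than delta (one-sided)."""
    z_a, z_b = norm.ppf(1 - alpha), norm.ppf(1 - beta)
    variance = 2 * p_std * (1 - p_std)        # both arms assumed to respond at p_std
    return (z_a + z_b) ** 2 * variance / delta ** 2

print(round(n_per_arm_equivalence(p_std=0.80, delta=0.10)))   # illustrative inputs
```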

224 citations


Journal ArticleDOI
TL;DR: In this article, the authors present a method for deriving a confidence interval for a population mean from the output of a simulation run, which groups the observations on a run into batches and uses these batches as the basic data for analysis.
Abstract: This paper presents a method for deriving a confidence interval for a population mean from the output of a simulation run. The method groups the observations on a run into batches and uses these batches as the basic data for analysis. The technique is not new. What is new is the procedure for determining how to group the observations into batches that satisfy certain assumptions necessary for the technique to work correctly. It is inexpensive and requires a moderate knowledge of statistics. The results of testing the method on a single-server queueing model with Poisson arrivals and exponentially distributed service times (M/M/1) indicate that the proposed technique performs as theory suggests for moderate activity levels. However, for higher activity levels performance is below theoretical expectation for small sample sizes n. As n increases, performance converges to expectation. Moreover, two calculations of the sample sizes needed to obtain results with moderate accuracy indicate that these sample sizes are in a range where the procedure is expected to perform with small error.
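The batching idea itself is easy to sketch; what the paper contributes, and what is not reproduced here, is the rule for choosing a batch size under which the batch means are approximately independent and normal. A minimal sketch, with the batch count fixed arbitrarily:

```python
# Minimal batch-means confidence interval: split the run into batches and form a
# t-interval from the batch means.  The batch count of 20 is an arbitrary choice.
import numpy as np
from scipy.stats import t

def batch_means_ci(x, n_batches=20, conf=0.95):
    x = np.asarray(x, dtype=float)
    m = len(x) // n_batches                    # observations per batch
    means = x[: m * n_batches].reshape(n_batches, m).mean(axis=1)
    half = t.ppf(0.5 + conf / 2, n_batches - 1) * means.std(ddof=1) / np.sqrt(n_batches)
    return means.mean() - half, means.mean() + half

rng = np.random.default_rng(0)
print(batch_means_ci(rng.exponential(2.0, size=10_000)))       # illustrative data
```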

176 citations


Journal ArticleDOI
TL;DR: In this article, the authors investigated the effect of sample size, actual population difference, and alpha level on the overestimation of experimental effects and concluded that effect size estimation is impractical unless scientific journals drop the consideration of statistical significance as one of the criteria of publication.
Abstract: Experiments that find larger differences between groups than actually exist in the population are more likely to pass stringent tests of significance and be published than experiments that find smaller differences. Published measures of the magnitude of experimental effects will therefore tend to overestimate these effects. This bias was investigated as a function of sample size, actual population difference, and alpha level. The overestimation of experimental effects was found to be quite large with the commonly employed significance levels of 5 per cent and 1 per cent. Further, the recently recommended measure, ω², was found to depend much more heavily on the alpha level employed than on the true population ω² value. Hence, it was concluded that effect size estimation is impractical unless scientific journals drop the consideration of statistical significance as one of the criteria of publication.
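The mechanism is easy to reproduce by simulation: if only significant results are "published", the published effect estimates are biased upward. A small sketch, with a true effect, group size and replication count chosen purely for illustration:

```python
# Simulation in the spirit of the abstract: only "significant" experiments are
# published, so the published effect sizes overestimate the true effect.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(1)
true_d, n, alpha = 0.2, 20, 0.05
published = []
for _ in range(5_000):
    a = rng.normal(true_d, 1.0, n)            # treatment group
    b = rng.normal(0.0, 1.0, n)               # control group
    if ttest_ind(a, b).pvalue < alpha:        # "published" only if significant
        published.append(a.mean() - b.mean())
print(f"true d = {true_d}, mean published d = {np.mean(published):.2f}")
```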

158 citations


Journal ArticleDOI
TL;DR: In this paper, a method for obtaining confidence intervals following sequential tests is described, where an order relation is defined among the points on the stopping boundary of the test and confidence limits are determined by finding those values of the unknown parameter for which the probabilities of more extreme deviations in the order relation than the one observed have prescribed values.
Abstract: SUMMARY A method is given for obtaining confidence intervals following sequential tests. It involves defining an order relation among points on the stopping boundary and computing the probability of a deviation more extreme in this order relation than the observed one. Particular attention is given to the case of a normal mean with known or unknown variance. A comparison with the customary fixed sample size interval based on the same data is given. The purpose of this paper is to describe a method for obtaining confidence intervals following sequential tests. An order relation is defined among the points on the stopping boundary of the test. The confidence limits are determined by finding those values of the unknown parameter for which the probabilities of more extreme deviations in the order relation than the one observed have prescribed values. To facilitate understanding the proposed procedures, most of the paper is restricted to estimating the mean of a normal population with known variance following the class of sequential tests recommended by Armitage (1975) for clinical trials. The case of unknown variance is discussed briefly in §4. It is easy to see that the proposed method is valid more generally, although the probability calculations required to implement it depend on the specific parent distribution and stopping rule. A closely related method was proposed by Armitage (1958), who studied the case of binomial data numerically by enumeration of sample paths. Let x_1, x_2, ... be independent and normally distributed with unknown mean μ and known variance σ². Let s_n = x_1 + ... + x_n, and for given b > 0 consider the stopping rule

151 citations


Journal ArticleDOI
TL;DR: In this article, the accuracy of the large sample standard error of weighted kappa appropriate to the non-null case was studied by computer simulation, and it was shown that only moderate sample sizes are required to test the hypothesis that two independently derived estimates of the same value are equal.
Abstract: The accuracy of the large sample standard error of weighted kappa appropriate to the non-null case was studied by computer simulation. Results indicate that only moderate sample sizes are required to test the hypothesis that two independently derived estimates of weighted kappa are equal. However, in most instances the minimal sample sizes required for setting confidence limits around a single value of weighted kappa are inordinately large. An alternative, but as yet untested procedure for setting confidence limits, is suggested as being potentially more accurate.
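A simulation of the kind referred to above can be sketched by drawing agreement tables from a fixed joint distribution and examining the spread of the resulting weighted kappa values. The joint probabilities, quadratic weights and sample size below are illustrative assumptions, and the large-sample standard error formula studied in the paper is not reproduced.

```python
# Sampling variability of weighted kappa by simulation (illustrative inputs only).
import numpy as np

def weighted_kappa(table, weights):
    p = table / table.sum()
    expected = np.outer(p.sum(axis=1), p.sum(axis=0))
    return 1 - (weights * p).sum() / (weights * expected).sum()

k = 3
cats = np.arange(k)
weights = (cats[:, None] - cats[None, :]) ** 2        # quadratic disagreement weights
joint = np.array([[0.20, 0.05, 0.02],
                  [0.05, 0.25, 0.05],
                  [0.03, 0.05, 0.30]])                # assumed rater agreement structure
probs = joint.ravel() / joint.sum()

rng = np.random.default_rng(2)
n = 200
kappas = [weighted_kappa(rng.multinomial(n, probs).reshape(k, k), weights)
          for _ in range(2_000)]
print(f"mean = {np.mean(kappas):.3f}, empirical SE = {np.std(kappas, ddof=1):.3f}")
```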

136 citations


Journal ArticleDOI
TL;DR: In this article, a method for determining the sample size required for a specified precision simultaneous confidence statement about the parameters of a multinomial population is described, based on a simultaneous confidence interval procedure due to Goodman.
Abstract: A method is described for determining the sample size required for a specified precision simultaneous confidence statement about the parameters of a multinomial population. The method is based on a simultaneous confidence interval procedure due to Goodman, and the results are compared with those obtained by separately considering each cell of the multinomial population as a binomial.
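A sketch of the underlying idea, assuming a Bonferroni-adjusted chi-square bound of the kind associated with Goodman's simultaneous intervals (the paper's exact procedure may differ in detail): choose n large enough that every cell's interval half-width is at most d.

```python
# Hedged sketch of a simultaneous-precision sample-size rule for a multinomial.
from scipy.stats import chi2

def multinomial_n(p, d, alpha=0.05):
    """p: anticipated cell proportions, d: desired half-width for every cell."""
    k = len(p)
    b = chi2.ppf(1 - alpha / k, df=1)         # Bonferroni-style simultaneous bound
    return max(b * pi * (1 - pi) / d ** 2 for pi in p)

print(round(multinomial_n([0.4, 0.3, 0.2, 0.1], d=0.05)))   # illustrative inputs
```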

107 citations


Journal Article
TL;DR: In order to establish a range of reference values for any characteristic one can use Gaussian or nonparametric techniques, whichever are most appropriate, and the same precision can be obtained with smaller sample sizes than using the non parametric techniques.
Abstract: In order to establish a range of reference values for any characteristic one can use Gaussian or nonparametric techniques, whichever are most appropriate. One has the choice of calculating tolerance intervals or percentile intervals. A tolerance interval is said to contain, say, 95% of the population with probability, say, 0.90. A percentile interval simply calculates the values between which 95% of the observations fall. If the data can be said to have a Gaussian distribution, the same precision can be obtained with smaller sample sizes than with the nonparametric techniques. In some cases, data which are not Gaussian can be transformed into a Gaussian form and hence make use of the more efficient Gaussian techniques. In both cases, the data should be checked for outliers or rogue observations, and these should be eliminated if the testing procedure fails to imply that they are an integral part of the data.
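For a quick sense of the Gaussian-versus-nonparametric choice, the sketch below contrasts a Gaussian central 95% reference interval with a simple percentile interval on illustrative data; note that neither of these is a tolerance interval in the strict sense described above.

```python
# Gaussian vs. percentile 95% reference interval on simulated (illustrative) data.
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(100, 10, size=120)                       # assumed reference sample

gauss = (x.mean() - 1.96 * x.std(ddof=1), x.mean() + 1.96 * x.std(ddof=1))
percentile = tuple(np.percentile(x, [2.5, 97.5]))
print("Gaussian interval:  ", [round(v, 1) for v in gauss])
print("Percentile interval:", [round(v, 1) for v in percentile])
```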

105 citations


Journal ArticleDOI
01 Mar 1978
TL;DR: In this paper, the problem of choosing appropriate models for multiply cross-classified data is examined, and a simple procedure for the analysis of multiply crossclassified data, for both normal and non-normal responses, is presented.
Abstract: SUMMARY The problem of choosing appropriate models for multiply cross-classified data is examined. It is shown that a standard method of analysis used in many ANOVA programs, equivalent to Yates's method of weighted squares of means, may lead to inappropriate models. A simultaneous test procedure equivalent to one proposed in regression analysis is used to derive a class of minimal acceptable models. The procedure is applied to an example. This paper presents a simple procedure for the analysis of multiply cross-classified data, for both normal and non-normal responses. The procedure is based on the hierarchical partitioning of the total sum of squares or maximized likelihood, with a simultaneous test procedure, proposed in regression analysis, applied to the hierarchical partition to determine appropriate models for the data. This leads to a class of "minimal adequate" models; there may be several models in this class. The procedure is illustrated by applying it to a set of normal response data. Before discussing the general cross-classification, we consider some difficulties that have arisen in the treatment of the unbalanced two-way classification, beginning with the 2 × 2 case. Consider a 2 × 2 cross-classification with "disproportionate subclass frequencies". We will be concerned with the case in which the disproportionality is considerable, so that approximate methods based on replacing the observed frequencies by proportional "expected" frequencies are inappropriate. Let A1, A2 and B1, B2 denote the levels of the cross-classifying factors; let n_ij, ȳ_ij and μ_ij be the sample size, and the sample and population means, in the (i, j) cell of the classification; and let n_i. and n_.j be the row and column marginal totals.

Journal ArticleDOI
TL;DR: A roving creel survey with non-uniform probability sampling was conducted on West Point Reservoir, Georgia, for 24 months as mentioned in this paper, where the assumption that catch per unit effort (CPE) for incompleted fishing trips is an unbiased estimator of CPE for completed trips is tested and verified.
Abstract: A roving creel survey with nonuniform probability sampling was conducted on West Point Reservoir, Georgia, for 24 months. The sampling design is described in detail. The assumption that catch per unit effort (CPE) for incompleted fishing trips is an unbiased estimator of CPE for completed trips is tested and verified. Coefficients of variation for monthly estimates of catch and effort are used to measure the precision of the sampling design. Precision was relatively high during the summer (April-October), but decreased markedly during the winter (November-March). This change is largely independent of sample size within the range of 5-10 sample days per month leading to the conclusion that sampling effort could be reduced 50% without impairing the precision of the survey. The method appears capable of detecting changes in the quality of fishing small enough for management purposes. The paper is intended to provide guidelines for the implementation, evaluation, and modification of statistically bas...

Journal ArticleDOI
TL;DR: The remote sensing sampling strategy presented has the added advantage that it can easily be adapted for use with most forms of remote sensing imagery, including orbital data, and provides a reliable framework for testing the accuracy of any remote sensing image interpretation — based land use classification using the minimum number of sample points.

Journal ArticleDOI
TL;DR: Gail and Gart as discussed by the authors presented tables showing the required sample size, n, for the Fisher-Irwin exact conditional test for 2 X 2 tables to achieve a power of at least 0.50, 0.80 and 0.90 against one-sided alternatives at nominal.05 and.01 significance levels.
Abstract: Gail and Gart (1973) present tables showing the required sample size, n, for the Fisher-Irwin exact conditional test for 2 X 2 tables to achieve a power of at least 0.50, 0.80 and 0.90 against one-sided alternatives at the nominal .05 and .01 significance levels. However, calculations for n > 35 were based on the Arc sine approximation, which was found to underestimate the actual required sample size by as much as 34%. The purpose of this note is to revise the Gail-Gart tables, giving exact sample sizes in all cases. The problem of nominal versus actual significance levels when the underlying probability of response is low is also briefly discussed.
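Exact power of the one-sided Fisher-Irwin test can be checked by brute force: enumerate the possible 2 × 2 outcomes, test each, and weight by its binomial probability. The sketch below does this for illustrative response probabilities; it is slow for large n and is not the computation used to build the revised tables.

```python
# Brute-force exact power of the one-sided Fisher exact test for n per group.
from scipy.stats import binom, fisher_exact

def fisher_power(n, p1, p2, alpha=0.05):
    power = 0.0
    for x1 in range(n + 1):
        for x2 in range(n + 1):
            table = [[x1, n - x1], [x2, n - x2]]
            _, pval = fisher_exact(table, alternative="greater")
            if pval <= alpha:
                power += binom.pmf(x1, n, p1) * binom.pmf(x2, n, p2)
    return power

print(f"power at n=35 per group: {fisher_power(35, p1=0.50, p2=0.20):.3f}")
```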

Journal ArticleDOI
TL;DR: In this article, it was shown that race as a moderator of test validity is one such illusory moderator, having been created solely by belief in the law of small numbers, and that adequate statistical power in moderator research requires much larger sample sizes than have typically been employed.
Abstract: The thesis of this paper is that many proposed moderators in personnel psychology are probably illusory, having been created solely by belief in the law of small numbers. Evidence is presented that race as a moderator of test validity is one such illusory moderator. In addition, a model for validity generalization is described which, in addition to eliminating the need for criterion-related validity studies under certain circumstances, strongly calls into question the idea that situations moderate test validity, i.e., the traditional doctrine of situational specificity of test validities. Calculations are presented which show that adequate statistical power in moderator research requires much larger sample sizes than have typically been employed. This requirement is illustrated empirically using validity data for the Army Classification Battery for 35 jobs and 21,000 individuals. These analyses show that (1) even when a moderator is generally assumed to be large, large samples are required to gauge its effect reliably and (2) large sample research may show that moderators that appear plausible and important a priori are nonexistent or trivial in magnitude. The practice of pooling across numerous small sample studies to obtain statistical power equivalent to that of large sample studies is recommended. In light of the evidence that many proposed moderators may not exist, the authors hypothesize that the true structure of underlying relationships in personnel psychology is considerably simpler than personnel psychologists have generally imagined it to be.
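The power problem the authors describe can be illustrated with the familiar Fisher z test for a difference between two validity coefficients; the correlations 0.30 versus 0.20 and the subgroup sizes below are illustrative assumptions, not values from the paper.

```python
# Power to detect a difference between two correlations via the Fisher z
# approximation (illustrative of why moderator research needs large samples).
import numpy as np
from scipy.stats import norm

def power_corr_difference(r1, r2, n1, n2, alpha=0.05):
    z1, z2 = np.arctanh(r1), np.arctanh(r2)
    se = np.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    z_crit = norm.ppf(1 - alpha / 2)
    shift = abs(z1 - z2) / se
    return norm.sf(z_crit - shift) + norm.cdf(-z_crit - shift)

for n in (50, 200, 1000):
    print(n, round(power_corr_difference(0.30, 0.20, n, n), 2))
```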


Journal ArticleDOI
TL;DR: In this paper, an ISO FORTRAN subroutine for determining the required sample size or power in a stratified clinical trial is presented, which permits a post-accrual follow up period, censored observations, patient strata or risk groups, and unequal case allocation schemes.
Abstract: This paper presents an ISO FORTRAN subroutine for determining the required sample size or power in a stratified clinical trial. The algorithm permits a post-accrual follow up period, censored observations, patient strata or risk groups, and unequal case allocation schemes. An example which illustrates use of the subroutine is provided.

Journal ArticleDOI
TL;DR: In this article, the robustness of the two-sample t-test over an extensive practical range of distributions was quantified using a Monte Carlo study over the Pearson system of distributions and the details indicate that the results are quite accurate.
Abstract: The present paper has as its objective an accurate quantification of the robustness of the two-sample t-test over an extensive practical range of distributions. The method is that of a major Monte Carlo study over the Pearson system of distributions, and the details indicate that the results are quite accurate. The study was conducted over the range β₁ = 0.0(0.4)2.0 (negative and positive skewness) and β₂ = 1.4(0.4)7.8, with equal sample sizes and for both the one- and two-tail t-tests. The significance level and power levels (for nominal values of 0.05, 0.50, and 0.95, respectively) were evaluated for each underlying distribution and for each sample size, with each probability evaluated from 100,000 generated values of the test statistic. The results precisely quantify the degree of robustness inherent in the two-sample t-test and indicate to a user the degree of confidence one can have in this procedure over various regions of the Pearson system. The results indicate that the equal-sample size two-sample ...
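A scaled-down version of such a study is straightforward to run; the sketch below estimates the empirical size of the two-sample t-test under a skewed gamma parent (rather than sweeping the full Pearson system), with an arbitrary group size and replication count.

```python
# Small Monte Carlo check of t-test size under a skewed parent distribution.
import numpy as np
from scipy.stats import ttest_ind, gamma

rng = np.random.default_rng(4)
n, alpha, reps = 10, 0.05, 20_000
parent = gamma(a=2.0)                          # skewed parent, same for both groups
rejections = 0
for _ in range(reps):
    a = parent.rvs(n, random_state=rng)
    b = parent.rvs(n, random_state=rng)
    if ttest_ind(a, b).pvalue < alpha:
        rejections += 1
print(f"empirical size: {rejections / reps:.3f} (nominal {alpha})")
```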

Journal ArticleDOI
TL;DR: In this paper, a general theorem is proven which describes the asymptotic distribution of maximum likelihood estimates subject to identifiability constraints, and a technique is described for displaying Bayesian conditional credibility regions for any sample size.
Abstract: Techniques are developed for surrounding each of the points in a multidimensional scaling solution with a region which will contain the population point with some level of confidence. Bayesian credibility regions are also discussed. A general theorem is proven which describes the asymptotic distribution of maximum likelihood estimates subject to identifiability constraints. This theorem is applied to a number of models to display asymptotic variance-covariance matrices for coordinate estimates under different rotational constraints. A technique is described for displaying Bayesian conditional credibility regions for any sample size.

Journal ArticleDOI
TL;DR: In this paper, the authors compared Simpson's index Y and the InJoirinational index H with a new measure Q based on the interquartile slope of the cumulative species abundance curve.
Abstract: SUMMARY Two popular diversity indices, Simpson's index Y and the Information index H, are compared with a new measure Q based on the inter-quartile slope of the cumulative species abundance curve. It is assumed that interest extends to characterizing the site environment: the population present at the instant of sampling is considered to be only one of a range of possible populations which the site could support. Expressions are derived for the expectations and variances of the three sample statistics, corresponding to Y, H and Q, when the species abundances are gamma variates. Q is a more informative measure than H or Y. Both H and Y depend greatly on the abundances of the commonest species, which may fluctuate widely from year to year. Q depends on the more stable species with median abundance and discriminates better between sites than H or Y: it can be recommended when sites are to be compared. The expected value of Q is expressed in terms of the parameters of the gamma, lognormal and log-series models and is shown to be much more robust than H or Y to the particular choice of model. For the log-series model, E(Q) is represented by the parameter α, while for the lognormal E(Q) = 0.371 T/σ, where T is the number of species in the population and σ is the standard deviation of logged abundances. The sample estimate of Q may be biased in small samples, though the bias should be fairly small when more than 50% of the species are present in the sample. Absence of small-sample bias should not be an overriding criterion in selecting a diversity index, since small samples will at best only allow the crudest comparisons between communities. The use of a single index to characterise the pattern of the abundances of different species in a community has obvious appeal and several such measures have been formulated. In practice the diversity is measured for a sample drawn from the community, so it is important that any proposed index is independent of sample size, at least for large samples; this is achieved if the index is based on the species relative abundances. We here study the behaviour of the two most popular measures of diversity, Simpson's index and the Information index, and propose an alternative index which provides a better characterisation of the community. To study the behaviour of the diversity sample statistics theoretically, we must make assumptions about the mathematical form of the distribution of species abundances in the community and the nature of the sampling variability. Our assumptions can of course be no more than approximations to reality, but we expect our main conclusions to be fairly robust to deviations from the chosen model. We shall assume the population contains T species (denoted by S* in many ecological ...
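For illustration, the three indices can be computed from a vector of species abundances as sketched below. Simpson's Y and the information index H follow their standard definitions, while the Q statistic is approximated crudely as the number of species lying between the abundance quartiles divided by the log-ratio of the quartiles; this is a rough stand-in and may differ in detail from the paper's definition.

```python
# Illustrative diversity indices from a single vector of species abundances.
import numpy as np

def simpson_shannon_q(abundances):
    x = np.sort(np.asarray(abundances, dtype=float))
    p = x / x.sum()
    simpson = (p ** 2).sum()                   # Simpson's index Y
    shannon = -(p * np.log(p)).sum()           # Information index H
    r1, r2 = np.quantile(x, [0.25, 0.75])      # lower and upper abundance quartiles
    s_mid = np.sum((x >= r1) & (x <= r2))      # species between the quartiles
    q = s_mid / np.log(r2 / r1) if r2 > r1 else np.nan   # crude inter-quartile slope
    return simpson, shannon, q

counts = [120, 85, 60, 41, 33, 20, 14, 9, 6, 4, 3, 2, 2, 1, 1]   # made-up abundances
print(simpson_shannon_q(counts))
```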

Journal ArticleDOI
TL;DR: An analytic formulation of the effect of stratified randomization on the probability of Type I and Type II error is presented, based on calculation of exact binomial probabilities, and results compare favorably with the computer simulations of Feinstein and Landis.

Journal ArticleDOI
TL;DR: In this paper, an expression for the null density of the Bartlett test statistic for testing equality of variances of n normal populations, when random samples not necessarily of the same size are taken, was derived.
Abstract: An expression is derived for the null density of the Bartlett test statistic for testing equality of variances of n normal populations, when random samples not necessarily of the same size are taken. The expression permits computation of exact Bartlett critical values. As a generalization, an expression is derived for the density of the normalized ratio of an arbitrarily weighted geometric mean to the unweighted arithmetic mean, based on a sample of n independently distributed gamma random variables having common scale parameter but different shape parameters.
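In the same spirit, the exact null critical value of the Bartlett statistic for a particular set of unequal sample sizes can be approximated by simulation and compared with the usual chi-square approximation; the sketch below does this for arbitrary group sizes, whereas the paper derives the exact density analytically.

```python
# Simulated null critical value of the Bartlett statistic vs. the chi-square value.
import numpy as np
from scipy.stats import bartlett, chi2

sizes = [5, 8, 12, 20]                         # illustrative unequal group sizes
rng = np.random.default_rng(5)

null_stats = [bartlett(*[rng.normal(size=m) for m in sizes]).statistic
              for _ in range(20_000)]
exact_crit = np.quantile(null_stats, 0.95)
approx_crit = chi2.ppf(0.95, df=len(sizes) - 1)
print(f"simulated 5% critical value: {exact_crit:.3f}, chi-square approximation: {approx_crit:.3f}")
```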

Journal ArticleDOI
A.S. Jordan
TL;DR: A survey of the key statistical properties of the lognormal distribution which are relevant in device engineering can be found in this article, where the confidence limits on the median life and standard deviation of the population as a function of confidence level and sample size are given in novel graphical forms.
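One of the quantities mentioned, a confidence limit on the median life, can be illustrated simply: work on the log scale, form a t-interval for the log-mean, and exponentiate. The lifetimes below are simulated, not device data.

```python
# Confidence limits on the median of lognormally distributed lifetimes (sketch).
import numpy as np
from scipy.stats import t

rng = np.random.default_rng(8)
lifetimes = rng.lognormal(mean=8.0, sigma=0.7, size=25)      # hours, illustrative

logs = np.log(lifetimes)
n = len(logs)
half = t.ppf(0.975, n - 1) * logs.std(ddof=1) / np.sqrt(n)
lo, hi = np.exp(logs.mean() - half), np.exp(logs.mean() + half)
print(f"estimated median life: {np.exp(logs.mean()):.0f} h, 95% CI: ({lo:.0f}, {hi:.0f}) h")
```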

Journal ArticleDOI
TL;DR: In this article, the distribution of the one sample Kolmogorov-Smirnov statistics for truncated or censored samples is presented for sample sizes exceeding 25; they are based on the asymptotic distribution derived by Koziol and Byar and the exact power of these tests for finite sample sizes can be calculated.
Abstract: Formulas and tables of significance points are presented for the distribution of the one sample Kolmogorov-Smirnov statistics for truncated or censored samples as proposed by Barr and Davidson [1]. Approximate formulas for the significance points are given for sample sizes exceeding 25; they are based on the asymptotic distribution derived by Koziol and Byar [2]. In addition, it is shown how the exact power of these tests for finite sample sizes can be calculated.

Journal ArticleDOI
TL;DR: This paper presents an analysis of a longitudinal multi-center clinical trial with missing data that illustrates the application, the appropriateness, and the limitations of a straightforward ratio estimation procedure for dealing with multivariate situations in which missing data occur at random and with small probability.
Abstract: Summary This paper presents an analysis of a longitudinal multi-center clinical trial with missing data. It illustrates the application, the appropriateness, and the limitations of a straightforward ratio estimation procedure for dealing with multivariate situations in which missing data occur at random and with small probability. The parameter estimates are computed via matrix operators such as those used for the generalized least squares analysis of categorical data. Thus, the estimates may be conveniently analyzed by asymptotic regression methods within the same computer program which computes the estimates, provided that the sample size is sufficiently large. Recently a general methodology was proposed by Koch, Landis, Freeman, Freeman and Lehnen (1977) for the analysis of experiments with repeated measurement of categorical data. Their approach involves the formulation of appropriate functions of the data, to which they apply asymptotic regression models and generalized Wald (1943) statistics. In addition to presenting the basic issues involved, they discuss asymptotic theory, large contingency tables, and zero cells. One problem not discussed, but which may frequently face the consultant or statistical analyst, is the occurrence of missing data on one or more dependent variables. This paper presents an analysis of a multi-center clinical trial with repeated measurement of ordinal categorical data, and with the added complication of missing data. The clinical trial tested the efficacy and safety of a new drug for skin conditions. Patients were randomly assigned to one of the two treatments (drug vs. placebo) in each of six clinics and were evaluated prior to treatment to determine the initial severity of the skin condition. At three follow-up visits, patients were evaluated according to a five-point ordinal response scale defining extent of improvement. The data are shown in Table 1. One question of interest is whether there is a significant difference between the patients' responses to the two treatments. This question is dealt with in Section 2, and it is shown that a significant difference does exist when missing data are taken into account. Further questions of interest pertain to the pattern of improvement over time and its relationship to initial disease stage (severity). These questions are investigated in Section 3, where ratio estimates of mean improvement are modeled using asymptotic regression techniques. Section 4 contains a discussion of the appropriateness and limitations of the ratio estimation strategy used in the data analysis.

Journal ArticleDOI
TL;DR: In this paper, a new multinomial classification procedure based on a discrete distributional distance is presented and discussed, and its performance along with other commonly used classification procedures is assessed through Monte Carlo sampling experiments under different population structures.
Abstract: This article presents and discusses a new multinomial classification procedure based on a discrete distributional distance. Its performance along with other commonly used classification procedures is assessed through Monte Carlo sampling experiments under different population structures. In addition to reporting results consistent with the work of Gilbert (1968) and Moore (1973), the article describes sampling experiments which show that the new distance procedure is generally superior, in terms of both the mean actual and mean apparent errors, to the usual full multinomial rule in situations of disproportionate sample sizes.

Book ChapterDOI
01 Jan 1978
TL;DR: In this paper, the spectral analysis has been used for plankton patchiness analysis and it has been shown that the causes of patchiness are likely to be different at different space scales and this effect should be observable as a change in shape of the spectrum at different wavenumbers.
Abstract: Since its first use by Platt (1972) spectral analysis has come to play a central role in the statistical analysis and understanding of plankton patchiness. For a number of reasons it has replaced earlier statistical models that were based on the distribution of counts/sample (Cassie, 1963). Firstly, the statistical distributions used to analyze the distribution of counts/sample (e. g. Neyman type A and Negative Binomial) assumed that the clusters or patches were small compared to the sample size (Skellam, 1958). When this is not true the estimated value of the distribution parameters and the goodness-of-fit will vary with quadrat size (Pielou, 1957). Secondly, the use of counts/sample data alone discards valuable information about the spatial covariance properties of the data being studied. Thirdly, it is fairly well accepted that the causes of patchiness are likely to be different at different space scales and this effect should be observable as a change in shape of the spectrum at different wavenumbers.

Journal ArticleDOI
TL;DR: In this paper, a modified maximum likelihood estimator was developed for estimating the zero class from a truncated Poisson sample when the available sample size itself is a random variable, and a modified estimator appeared to be best with respect to the chosen criteria.
Abstract: Maximum likelihood estimators and a modified maximum likelihood estimator are developed for estimating the zero class from a truncated Poisson sample when the available sample size itself is a random variable. All the estimators considered here are asymptotically equivalent in the usual sense; hence their asymptotic properties are investigated in some detail theoretically as well as by making use of Monte Carlo experiments. One modified estimator appears to be best with respect to the chosen criteria. An example is given to illustrate the results obtained.
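The basic (unmodified) maximum likelihood version of the problem is easy to sketch: estimate the Poisson mean from the zero-truncated sample and back out the expected size of the unseen zero class. The sketch below does only that, with illustrative data; the paper's modified estimator and its random-sample-size refinements are not reproduced.

```python
# MLE for a zero-truncated Poisson sample and the implied zero-class estimate.
import numpy as np
from scipy.optimize import brentq

def zero_class_estimate(counts):
    counts = np.asarray(counts)
    xbar, n = counts.mean(), len(counts)       # truncated-sample mean and size
    # MLE of lambda solves lambda / (1 - exp(-lambda)) = xbar
    lam = brentq(lambda l: l / (1 - np.exp(-l)) - xbar, 1e-8, 50.0)
    p0 = np.exp(-lam)
    return lam, n * p0 / (1 - p0)              # estimated count of unobserved zeros

rng = np.random.default_rng(6)
full = rng.poisson(1.3, size=500)              # simulated complete data
lam_hat, n0_hat = zero_class_estimate(full[full > 0])
print(f"lambda-hat = {lam_hat:.2f}, estimated zeros = {n0_hat:.1f}, actual zeros = {(full == 0).sum()}")
```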

Journal ArticleDOI
TL;DR: In this paper, a partition of the vector space of all deviation score vectors for fixed sample size N is used to show that the (indeterminate) factors of the factor model can always be constructed so as to predict any criterion perfectly, including all those that are entirely uncorrelated with the observed variables.
Abstract: A partition of the vector space of all deviation score vectors for fixed sample size N is used to show that the (indeterminate) factors of the factor model can always be constructed so as to predict any criterion perfectly, including all those that are entirely uncorrelated with the observed variables.

Journal ArticleDOI
TL;DR: Standard sample size calculations for n, the number of observations per group when comparing two independent proportions, P1 and P2, require the specification of four quantities: P1, one of the two proportions of interest; delta, the smallest difference which it is important to detect; alpha, the significance level; and beta, the chance of failing to detect a difference as large as delta.
Abstract: Standard sample size calculations for n, the number of observations per group when comparing two independent proportions, P1 and P2, require the specification of four quantities: P1, one of the two proportions of interest; delta = P2 - P1, the smallest difference which it is important to detect; alpha, the significance level; and beta, the chance of failing to detect a difference as large as delta. In terms of these four quantities, the graphical aid is a series of charts showing isographs of sample size for selected values of n ranging from 35 to 500. The isographs, i.e. curves connecting points of equal sample size, are based on the asymptotic arc sine approximation and are plotted on the grid formed by P1 on the abscissa and delta on the ordinate. Eight separate charts are available for different choices of alpha and beta. These charts are especially useful in situations where the feasible sample size is roughly known, in which case the detectable difference, delta, can be read directly from the graph.
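The arc sine approximation underlying such charts can be written in a few lines; the sketch below gives the per-group sample size for a two-sided test, though whether the charts use the one- or two-sided form is an assumption here, and the numeric inputs are illustrative.

```python
# Per-group sample size for comparing two proportions via the arc sine approximation.
import numpy as np
from scipy.stats import norm

def n_per_group_arcsine(p1, delta, alpha=0.05, beta=0.20):
    p2 = p1 + delta
    d = np.arcsin(np.sqrt(p2)) - np.arcsin(np.sqrt(p1))   # transformed-scale difference
    z = norm.ppf(1 - alpha / 2) + norm.ppf(1 - beta)
    return z ** 2 / (2 * d ** 2)

print(round(n_per_group_arcsine(p1=0.20, delta=0.15)))    # illustrative inputs
```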