
Showing papers on "Sample size determination" published in 1973


Book ChapterDOI
TL;DR: In this article, several matching methods that match all of one sample from another larger sample on a continuous matching variable are compared with respect to their ability to remove the bias of the matching variable.
Abstract: Several matching methods that match all of one sample from another larger sample on a continuous matching variable are compared with respect to their ability to remove the bias of the matching variable. One method is a simple mean-matching method and three are nearest available pair-matching methods. The methods' abilities to remove bias are also compared with the theoretical maximum given fixed distributions and fixed sample sizes. A summary of advice to an investigator is included.

867 citations
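
A minimal sketch of the nearest-available pair-matching idea described in the entry above, on a single continuous matching variable; the function name and the toy normal samples are illustrative, not from the paper:

```python
import numpy as np

def nearest_available_match(treated_x, control_x, rng=None):
    """Nearest-available pair matching on one continuous covariate:
    treated units, taken in random order, each claim the closest
    control unit that is still unmatched."""
    rng = np.random.default_rng() if rng is None else rng
    available = list(range(len(control_x)))
    pairs = {}
    for i in rng.permutation(len(treated_x)):
        j = min(available, key=lambda j: abs(control_x[j] - treated_x[i]))
        pairs[int(i)] = j
        available.remove(j)
    return pairs

# Toy example: a small treated sample matched from a larger control pool.
rng = np.random.default_rng(0)
treated = rng.normal(0.5, 1.0, 5)     # shifted on the matching variable
controls = rng.normal(0.0, 1.0, 50)   # larger reservoir of controls
pairs = nearest_available_match(treated, controls, rng)
matched = controls[list(pairs.values())]
print(f"bias before matching: {treated.mean() - controls.mean():.3f}")
print(f"bias after matching:  {treated.mean() - matched.mean():.3f}")
```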


Journal ArticleDOI
P.K. Gallagher1, D.W. Johnson1
TL;DR: In this paper, isothermal and dynamic methods were used to study the rate of weight loss of CaCO3 and a pronounced dependence of the activation enthalpy, pre-exponential term, and rate constant upon sample weight and heating rate was observed.

214 citations


Journal ArticleDOI
TL;DR: In this article, a trimmed t statistic is defined for the case of two independent samples with equal population variance; using an efficient Monte Carlo technique, it is shown that for a normal underlying distribution and equal sample sizes, the two-sample trimmed t can be satisfactorily approximated by Student's t with degrees of freedom corresponding to the reduced sample.
Abstract: SUMMARY A trimmed t statistic is defined for the case of two independent samples with equal population variance. Using an efficient Monte Carlo technique, it is shown that for a normal underlying distribution and equal sample sizes, the two-sample trimmed t can be satisfactorily approximated by Student's t with degrees of freedom corresponding to the reduced sample. Asymptotically, the statistic approaches N(0, 1). The loss of power efficiency in using trimmed t is small under exact normality while the gain may be appreciable for longer tailed distributions. The case of unequal sample sizes is also discussed.

97 citations
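
A sketch of one common form of the two-sample trimmed t (trimmed means compared using a pooled Winsorized variance, referred to Student's t on degrees of freedom from the reduced samples); the paper's exact definition may differ in detail:

```python
import numpy as np
from scipy import stats

def trimmed_t_two_sample(x, y, prop=0.1):
    """Two-sample trimmed t sketch: trim a proportion `prop` from each
    tail of each sample, compare trimmed means, and use a pooled
    Winsorized variance with df based on the reduced sample sizes."""
    def trim_winsor(a):
        a = np.sort(a)
        g = int(prop * len(a))
        trimmed = a[g:len(a) - g]
        winsorized = np.concatenate(([a[g]] * g, trimmed, [a[-g - 1]] * g))
        return trimmed.mean(), winsorized, len(trimmed)

    mx, wx, hx = trim_winsor(x)
    my, wy, hy = trim_winsor(y)
    # Pooled Winsorized sum of squares (equal-variance case).
    ssw = ((wx - wx.mean()) ** 2).sum() + ((wy - wy.mean()) ** 2).sum()
    df = hx + hy - 2                      # df from the reduced samples
    t = (mx - my) / np.sqrt(ssw / df * (1 / hx + 1 / hy))
    return t, df, 2 * stats.t.sf(abs(t), df)

rng = np.random.default_rng(1)
t, df, p = trimmed_t_two_sample(rng.normal(0, 1, 30), rng.normal(0.5, 1, 30))
print(f"t = {t:.3f}, df = {df}, p = {p:.3f}")
```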


Journal ArticleDOI
TL;DR: Completely general methods are developed to obtain the average sample size and the probability of accepting the hypothesis $p = p_0$ for binomial probabilities in multiple stage sampling procedures.
Abstract: SUMMARY Completely general methods are developed to obtain the average sample size and the probability of accepting the hypothesis $p = p_0$ for binomial probabilities in multiple stage sampling procedures. An example illustrating the use of a multiple-stage plan of this type for a drug screen is considered. Data on the actual performance of the screen are also given.

88 citations
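
For a given plan, the acceptance probability and average sample number (ASN) can be enumerated exactly. A sketch for a hypothetical two-stage attributes plan; all plan constants below are invented for illustration:

```python
from scipy.stats import binom

def two_stage_plan(p, n1, a1, r1, n2, a2):
    """Exact P(accept) and ASN for a two-stage attributes plan: after n1
    items accept if defectives <= a1, reject if >= r1, otherwise draw n2
    more and accept if the cumulative number of defectives is <= a2."""
    p_accept = binom.cdf(a1, n1, p)          # accept outright at stage 1
    asn = n1
    for d in range(a1 + 1, r1):              # continuation outcomes
        w = binom.pmf(d, n1, p)
        p_accept += w * binom.cdf(a2 - d, n2, p)
        asn += w * n2                        # second stage is drawn
    return p_accept, asn

for p in (0.01, 0.05, 0.10):
    pa, asn = two_stage_plan(p, n1=50, a1=1, r1=4, n2=50, a2=4)
    print(f"p = {p:.2f}: P(accept) = {pa:.3f}, ASN = {asn:.1f}")
```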


Journal ArticleDOI
Grace Wyshak

75 citations


Journal ArticleDOI
TL;DR: In this paper, least squares estimates are proposed for the parameters of four survival distributions that can be fit to grouped data: exponential, linear hazard, Gompertz and Weibull. Sample estimates of the hazard function are utilized in the least squares procedures, and a method is given for selecting a distribution for further investigation based on the likelihood under the four survival models.
Abstract: Given a set of grouped survival data, least squares estimates are proposed for the parameters of four survival distributions that can be fit: exponential, linear hazard, Gompertz and Weibull. Sample estimates of the hazard function are utilized in the least squares procedures and a method is given for selecting a distribution for further investigation based on the likelihood under the four survival models. A Monte Carlo study demonstrated that the least squares estimates are nearly as efficient as maximum likelihood when the sample size is 50 or more. The methods are applied to survival data for 112 patients with plasma cell myeloma.

75 citations
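
A sketch of the general approach: estimate the hazard from grouped data, then fit each model by least squares on the scale where its hazard is linear. The grouped counts below are invented (not the myeloma data), and the crude hazard estimator is an assumption:

```python
import numpy as np

# Hypothetical grouped survival data: yearly intervals, numbers at risk
# and deaths in each interval (illustrative numbers only).
t_mid = np.array([0.5, 1.5, 2.5, 3.5, 4.5])   # interval midpoints (years)
at_risk = np.array([112, 80, 55, 35, 20])
deaths = np.array([32, 25, 20, 15, 9])

# Crude hazard estimate per interval (interval width = 1 year).
h_hat = deaths / at_risk

# Least squares fits on the scale where each model's hazard is linear:
#   exponential: h(t) = lambda                      -> constant
#   Gompertz:    log h(t) = log B + c t             -> linear in t
#   Weibull:     log h(t) = log(a*l) + (a-1) log t  -> linear in log t
lam = h_hat.mean()
c, logB = np.polyfit(t_mid, np.log(h_hat), 1)
am1, _ = np.polyfit(np.log(t_mid), np.log(h_hat), 1)

print(f"exponential: lambda = {lam:.3f}")
print(f"Gompertz:    B = {np.exp(logB):.3f}, c = {c:.3f}")
print(f"Weibull:     shape = {am1 + 1:.3f}")
```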


Journal ArticleDOI
TL;DR: In this article, the authors present a procedure for the economic design of Cusum charts to control the mean of a process with a normally distributed quality characteristic, and a model is derived which gives the long-run average cost as a function of both the design parameters of the chart and the cost and risk factors associated with the process.
Abstract: We present a procedure for the economic design of Cusum charts to control the mean of a process with a normally distributed quality characteristic. A model is derived which gives the long-run average cost as a function of both the design parameters of the chart and the cost and risk factors associated with the process. The “pattern-search” technique is employed to determine the optimum values of the sample size, the sampling interval and the decision limit. The cost surfaces are investigated numerically to study the effects of changes in the design parameters and in some of the cost and risk factors.

72 citations
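
For concreteness, a sketch of the one-sided Cusum recursion on subgroup means; the reference value k, decision limit h, and subgroup size below are placeholders, and the paper's economic optimization (pattern search over sample size, sampling interval, and decision limit) is not reproduced here:

```python
import numpy as np

def cusum_upper(xbar, target, k, h):
    """One-sided upper Cusum on subgroup means:
    S_i = max(0, S_{i-1} + (xbar_i - target) - k);
    a signal is raised when S_i exceeds the decision limit h."""
    s, path = 0.0, []
    for x in xbar:
        s = max(0.0, s + (x - target) - k)
        path.append(s)
    return np.array(path)

rng = np.random.default_rng(2)
n = 4                                     # subgroup size (a design parameter)
means = rng.normal(0, 1 / np.sqrt(n), 30)
means[15:] += 0.5                         # process mean shifts upward
s = cusum_upper(means, target=0.0, k=0.25, h=2.0)
signal = int(np.argmax(s > 2.0)) if (s > 2.0).any() else None
print(f"first signal at subgroup: {signal}")
```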


Journal ArticleDOI
TL;DR: In this paper, the Monte Carlo method was used to study maximum likelihood estimation of the parameters of a mixture of two univariate normal distributions for sample sizes less than 300, where the components are not well separated and the sample size is small.
Abstract: There are few results in the literature on the properties of the maximum likelihood estimates of the parameters of a mixture of two normal distributions when the components are not well separated and the sample size is small. In the present investigation mixtures of two univariate normal distributions with …, and sample sizes less than 300, are studied by the Monte Carlo method. For the cases considered, empirical evidence is given that the method of maximum likelihood should be used with extreme caution or not at all.

68 citations
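
A sketch of why caution is warranted, using the standard EM route toward the maximum likelihood estimates: with overlapping components and a modest sample, different starting values can land on quite different solutions. All settings below are illustrative:

```python
import numpy as np

def em_two_normals(x, n_iter=200, rng=None):
    """EM iterations toward the MLE of a two-component univariate
    normal mixture, from a random start."""
    rng = np.random.default_rng() if rng is None else rng
    mu = rng.choice(x, 2)                       # random starting means
    sd = np.array([x.std(), x.std()])
    w = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation
        dens = np.stack([w[j] * np.exp(-0.5 * ((x - mu[j]) / sd[j]) ** 2)
                         / sd[j] for j in range(2)])
        r = dens / dens.sum(axis=0)
        # M-step: weighted parameter updates
        nk = r.sum(axis=1)
        w = nk / len(x)
        mu = (r * x).sum(axis=1) / nk
        sd = np.sqrt((r * (x - mu[:, None]) ** 2).sum(axis=1) / nk)
        sd = np.maximum(sd, 1e-3)               # guard against collapse
    return w, mu, sd

rng = np.random.default_rng(3)
x = np.concatenate([rng.normal(0.0, 1, 60), rng.normal(1.0, 1, 40)])
for run in range(3):                            # different starts differ
    w, mu, _ = em_two_normals(x, rng=np.random.default_rng(run))
    print(f"start {run}: weights {np.round(w, 2)}, means {np.round(mu, 2)}")
```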




Journal ArticleDOI
TL;DR: This paper examined the statistical sampling distributions of Horn's test for the number of factors, and found that the test tended to fail to give an estimate of the number of factors in many samples following the SMC and image factorings.
Abstract: To examine the statistical sampling distributions of Horn's test for the number of factors, 100 sample correlation matrices at each of three sample sizes were generated from a structured population correlation matrix and an unstructured (identity) population correlation matrix of the same order. Each of the 600 sample matrices was submitted to principal components, SMC, image, and Harris rescaled factorings. Two variations of the Horn test (which differed in the number of sets of unstructured eigenvalues averaged for the comparison) were applied to each factoring of each structured sample matrix. It was found that the two variations were not significantly different but that the type of factoring did have a significant effect. It was also found that the test tended to fail to give an estimate of the number of factors in many samples following the SMC and image factorings.
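
A sketch of Horn's test (parallel analysis): retain leading factors while the sample eigenvalue exceeds the average eigenvalue from unstructured random data of the same order. The paper applies the test after several factorings; this sketch uses principal components only, and the simulated two-block data are invented:

```python
import numpy as np

def horn_test(data, n_sets=100, rng=None):
    """Keep leading principal components whose sample eigenvalues exceed
    the average eigenvalues from random-normal (unstructured) data of
    the same n and p."""
    rng = np.random.default_rng() if rng is None else rng
    n, p = data.shape
    eig = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]
    rand = np.zeros(p)
    for _ in range(n_sets):       # average eigenvalues over random sets
        z = rng.standard_normal((n, p))
        rand += np.linalg.eigvalsh(np.corrcoef(z, rowvar=False))[::-1]
    rand /= n_sets
    below = eig <= rand
    return int(np.argmax(below)) if below.any() else p

# Two correlated blocks of three variables -> expect two factors.
rng = np.random.default_rng(4)
f = rng.standard_normal((200, 2))
data = f[:, [0, 0, 0, 1, 1, 1]] + 0.5 * rng.standard_normal((200, 6))
print(f"factors retained: {horn_test(data, rng=rng)}")
```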


Journal ArticleDOI
TL;DR: In this paper, an experimental procedure was developed to measure hypothesis-sampling directly: the subject is allowed to select the attributes he wishes to see, and randomly selected values on those attributes are then presented.


Book ChapterDOI
J. Michaelis
01 Jan 1973
TL;DR: In this paper, a simulation program is described which can be used to obtain estimates of the different types of misclassification probabilities for multiple group linear and quadratic discriminant analysis.
Abstract: Summary A simulation program is described which provides estimates of the different types of misclassification probabilities for multiple group linear and quadratic discriminant analysis. The program can be used to study how these errors depend on sample sizes and the different parameters of the multivariate normal distribution. Examples for several simulation experiments are given and possible conclusions are discussed.
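
A sketch of this kind of simulation for the two-group linear case: generate training samples, build the plug-in linear discriminant, and estimate the actual misclassification probability on fresh data. The dimension, separation, and sample sizes below are arbitrary choices, not the program's settings:

```python
import numpy as np

def lda_error(n_train, mean_sep, dim, reps=1000, rng=None):
    """Monte Carlo estimate of the actual misclassification probability
    of the plug-in linear discriminant for two multivariate normal
    groups, as a function of the training sample size per group."""
    rng = np.random.default_rng() if rng is None else rng
    mu1, mu2 = np.zeros(dim), np.full(dim, mean_sep / np.sqrt(dim))
    errs = []
    for _ in range(reps):
        a = rng.multivariate_normal(mu1, np.eye(dim), n_train)
        b = rng.multivariate_normal(mu2, np.eye(dim), n_train)
        m1, m2 = a.mean(axis=0), b.mean(axis=0)
        S = (np.cov(a, rowvar=False) + np.cov(b, rowvar=False)) / 2
        w = np.linalg.solve(S, m2 - m1)     # plug-in discriminant direction
        c = w @ (m1 + m2) / 2               # midpoint cutoff
        test = rng.multivariate_normal(mu1, np.eye(dim), 200)
        errs.append(np.mean(test @ w > c))  # class-1 points misclassified
    return float(np.mean(errs))

rng = np.random.default_rng(8)
for n in (10, 25, 100):                     # error falls toward Bayes risk
    print(f"n per group = {n:3d}: error = {lda_error(n, 2.0, 3, rng=rng):.3f}")
```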

Journal ArticleDOI
TL;DR: In this paper, the density function of the i-th order statistic from a sample with random size is derived for the case that the size has a binomial distribution, and a simpler derivation is given.
Abstract: Summary Raghunandanan and Patil [1] derived the density function of the i-th order statistic from a sample with random size. For the case that the size has a binomial distribution, a simpler derivation is given below.

Journal ArticleDOI
TL;DR: A combination of trapping and tracking was used to estimate population levels of the masked shrew over a 3-year period, eliminating two large families of errors and resolving the age-old problem of calculating effective size of the trap plot.
Abstract: A combination of trapping and tracking was used to estimate population levels of the masked shrew ( Sorex cinereus ) over a 3-year period. This new technique is based on the principle of measuring the relative decrease in activity in an area due to the removal of a known number of animals. The population estimate is determined by calculating the number of shrews that would have to be removed to reduce activity to zero. The system eliminates two large families of errors that have plagued systems of population estimates. It eliminates all variables relating to changes in rates of animal activity due to climate, season, availability of food, natality, mortality, or others. It also resolves the age-old problem of calculating effective size of the trap plot. The method has built into it procedures and tests that assure adequate sample sizes.
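
The estimation step described above reduces to a regression extrapolated to zero activity: the fitted line's intercept with the removal axis estimates how many removals would extinguish activity, and hence the population. A sketch with invented numbers:

```python
import numpy as np

# Regress the activity index on cumulative removals and extrapolate to
# zero activity; the x-intercept estimates the population size.
removed = np.array([0, 5, 10, 15, 20])      # cumulative shrews removed
activity = np.array([48, 39, 31, 22, 14])   # tracking-board activity index

slope, intercept = np.polyfit(removed, activity, 1)
estimate = -intercept / slope               # removals needed to reach zero
print(f"estimated population: {estimate:.1f} shrews")
```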

Journal ArticleDOI
TL;DR: In this article, interval estimation of the largest mean of k normal populations with a common variance is considered, and it is shown that the confidence interval obtained is unsymmetric for k > 2 and behaves asymptotically as well as the optimal interval.
Abstract: Interval estimation of the largest mean of $k$ normal populations $(k \geqq 1)$ with a common variance $\sigma^2$ is considered. When $\sigma^2$ is known the optimal fixed-width interval is given so that, to have the probability of coverage uniformly lower bounded by $\gamma$ (preassigned), the sample size needed is minimized. This optimal interval is unsymmetric for $k > 2$. When $\sigma^2$ is unknown a sequential procedure is proposed and its behavior is studied. It is shown that the confidence interval obtained, which is also unsymmetric for $k > 2$, behaves asymptotically as well as the optimal interval. This represents an improvement of the procedure of symmetric intervals considered by the author previously; the improvement is significant, especially when $k$ is large.

Journal ArticleDOI
TL;DR: In this paper, the expected value, density, and distribution function (DF) of the gain from selection (GS) are derived and studied in terms of sample size, number of predictors, and the prior distribution assigned to the population multiple correlation.
Abstract: The gain from selection (GS) is defined as the standardized average performance of a group of subjects selected in a future sample using a regression equation derived on an earlier sample. Expressions for the expected value, density, and distribution function (DF) of GS are derived and studied in terms of sample size, number of predictors, and the prior distribution assigned to the population multiple correlation. The DF of GS is further used to determine how large sample sizes must be so that with probability .90 (.95), the expected GS will be within 90 percent of its maximum possible value. An approximately unbiased estimator of the expected GS is also derived.

Journal ArticleDOI
John B. Ofosu
TL;DR: In this paper, a two-sample procedure for selecting the population with the largest mean from k normal populations with unknown variances is proposed, based on a two-sample procedure proposed by Stein (1945).
Abstract: SUMMARY This paper gives a two-sample procedure for selecting the population with the largest mean from k normal populations with unknown variances. The method is based on a two-sample procedure proposed by Stein (1945). Tables necessary for the application of the procedure are given for selected values of k. Comparisons of the minimum values of the expected sample sizes using the proposed procedure are made with the corresponding single-sample sizes for known variances (Bechhofer, 1954). Comparisons are also made of the expected total sample sizes for the single-sample procedure, the two-sample procedure given in this paper and the two-sample procedure proposed by Bechhofer, Dunnett & Sobel (1954) which assumes that the populations have known variance ratios. It is shown that the expected total sample sizes are not much increased by ignorance of the variance ratios.
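
Stein's (1945) two-stage idea, on which the selection procedure builds, is easy to state in code: a first-stage sample estimates the variance, and that estimate fixes the total sample size. The sketch below shows the idea for a fixed-width confidence interval for one mean, not the k-population selection problem itself:

```python
import numpy as np
from scipy import stats

def stein_two_stage(first_stage, halfwidth, conf=0.95):
    """Stein's two-stage idea: use the first-stage variance s^2 to set
    the total sample size N so that a fixed-width interval for the mean
    achieves the required coverage despite the unknown variance."""
    n0 = len(first_stage)
    s2 = first_stage.var(ddof=1)
    t = stats.t.ppf((1 + conf) / 2, df=n0 - 1)
    n_total = max(n0, int(np.ceil(s2 * (t / halfwidth) ** 2)))
    return n_total - n0                   # additional observations needed

rng = np.random.default_rng(5)
pilot = rng.normal(10, 3, size=15)        # first stage, sigma unknown
print(f"second-stage observations required: {stein_two_stage(pilot, 1.0)}")
```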

Journal ArticleDOI
TL;DR: In this paper, a guide to the minimum size of treatment groups is provided, inferred from the relationships between pupil norms and norms for class averages of standardized achievement tests; the minimum number of subjects required per group depends on treatment effectiveness and the method of assignment.
Abstract: Careful planning of educational experiments includes decisions about the sample size to be used. This paper provides a guide to the minimum size of treatment groups, inferred from the relationships between pupil norms and norms for class averages of standardized achievement tests. With "highly effective" treatments and simple random assignment of subjects to conditions, 60 to 85 subjects is derived as the minimum number per group. With "moderately effective" treatments, the minimum number is 235 or more. Use of stratified samples reduces the minimum by fifteen to forty percent. Specification of sample size is one of the thornier problems which must be resolved in the planning of most educational experiments. It is an issue in which practical considerations pull the experimenter in one direction and statistical considerations pull him in the other. Most experienced researchers know only too well the practical difficulties involved in "selling" an experiment to many independent school administrators and research committees, in monitoring an ongoing project in widely scattered districts, and in financing large-scale projects from personal or local funds. But investigators also sense the high cost of performing an experiment with inadequate numbers of cases. Without sufficient numbers, differences of some educational consequence may not be detected, and the experimenter may commit time, energy, and resources to an investigation with negligible chances of success. Statistical consultants are often asked to strike a balance between these competing considerations, to specify a sample size that is "large enough," but is also "realistic." Unfortunately, the magical sample size that is just large enough cannot be defined in a …
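
For comparison, the standard power-based calculation (not the paper's norms-based derivation) gives per-group minima of similar magnitude; the standardized effect sizes attached to "highly" and "moderately" effective below are assumptions made for illustration:

```python
import numpy as np
from scipy import stats

def n_per_group(effect_size, power=0.80, alpha=0.05):
    """Per-group sample size for a two-group comparison via the usual
    normal-approximation power formula: n = 2((z_a + z_b)/d)^2."""
    za = stats.norm.ppf(1 - alpha / 2)
    zb = stats.norm.ppf(power)
    return int(np.ceil(2 * ((za + zb) / effect_size) ** 2))

# Rough agreement with the 60-85 and 235+ minima quoted above.
for label, d in [("highly effective (d = 0.50)", 0.50),
                 ("moderately effective (d = 0.25)", 0.25)]:
    print(f"{label}: n per group = {n_per_group(d)}")
```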

Journal ArticleDOI
TL;DR: This paper examines the long-range costs of lot-by-lot sampling inspection of product which may be inspected on a dichotomous basis, and suggestions are made concerning the choice of parameters to minimize this cost.
Abstract: Military Standard 105D has been almost universally adopted by government and private consumers for the lot-by-lot sampling inspection of product which may be inspected on a dichotomous basis. The plan specifies, for each lot size, a random sample size and set of acceptance numbers (maximum allowable number of defectives in each sample). The acceptance numbers are based upon the binomial distribution and depend upon the quality required by the purchaser. Where several consecutive lots are submitted, a shift to less severe (“reduced”) inspection or more severe (“tightened”) inspection is specified when the ongoing quality is very high or low. Further experience permits a return to normal sampling from either of these states. This paper examines the long-range costs of such a sampling scheme. The three inspection types are considered as three distinct Markov chains, with periodic transitions from chain to chain. The expected sample size and the expected proportion of rejected product are determined as a function of the two parameters under control of the manufacturer, lot size and product quality. Some numerical examples are given which illustrate how to compute the overall cost of sampling inspection. Suggestions are made concerning the choice of parameters to minimize this cost.
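
A sketch of the Markov-chain bookkeeping: given a transition matrix over the three inspection severities, the stationary distribution yields the long-run expected sample size per lot. The transition probabilities and per-state sample sizes below are placeholders, not values derived from the 105D switching rules:

```python
import numpy as np

# Toy 3-state chain over inspection severity; in the paper the
# transition probabilities follow from the switching rules and the
# submitted lot quality p.
P = np.array([[0.80, 0.20, 0.00],     # reduced   -> reduced/normal/tightened
              [0.10, 0.80, 0.10],     # normal
              [0.00, 0.30, 0.70]])    # tightened
sample_size = np.array([32, 80, 80])  # per-lot sample size in each state

# Stationary distribution: left eigenvector of P for eigenvalue 1.
vals, vecs = np.linalg.eig(P.T)
pi = np.real(vecs[:, np.argmin(np.abs(vals - 1))])
pi /= pi.sum()
print(f"long-run state shares: {np.round(pi, 3)}")
print(f"expected sample size per lot: {pi @ sample_size:.1f}")
```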

Journal ArticleDOI
TL;DR: In this article, the authors generalize some one-sample statistics of the Cramer-von Mises type so that they can be used to test grouped data for goodness of fit, and prove that under the null hypothesis the asymptotic distributions of these statistics coincide with the corresponding classical statistics if the ratio of the sample size to the number of groups (k) remains constant.
Abstract: Following Riedwyl [8] we generalize some one-sample statistics of the Cramer-von Mises type so that they can be used to test grouped data for goodness of fit. We prove that under the null hypothesis the asymptotic distributions of these statistics, when suitably standardized, coincide with the corresponding classical statistics if the ratio of the sample size to the number of groups (k) remains constant. For k fixed the asymptotic distributions are given under the null hypothesis, and it is shown how to obtain them under any alternative. Some results for finite sample sizes are also derived.

Journal ArticleDOI
TL;DR: In this article, empirical distributions of $Z = (\bar{X} - \mu)/(\sigma N^{-1/2})$ were obtained for several sample sizes to investigate the approach of the sampling distribution of Z to a normal distribution.
Abstract: From each of two dozen populations varying widely in shape and skewness, empirical distributions of $Z = (\bar{X} - \mu)/(\sigma N^{-1/2})$ were obtained for several sample sizes to investigate the approach of the sampling distribution of Z to a normal distribution. …
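
A sketch of the same experiment: draw repeated samples from a skewed population, standardize the sample means, and measure the distance of the empirical distribution of Z from N(0, 1). The population and sample sizes below are illustrative:

```python
import numpy as np
from scipy import stats

def z_values(pop, n, reps=20000, rng=None):
    """Empirical Z = (Xbar - mu) / (sigma / sqrt(n)) for `reps` samples
    of size n drawn (with replacement) from the population array."""
    rng = np.random.default_rng() if rng is None else rng
    mu, sigma = pop.mean(), pop.std()
    samples = rng.choice(pop, size=(reps, n))
    return (samples.mean(axis=1) - mu) / (sigma / np.sqrt(n))

rng = np.random.default_rng(6)
skewed = rng.exponential(1.0, size=100000)   # a strongly skewed population
for n in (5, 25, 100):
    z = z_values(skewed, n, rng=rng)
    # Kolmogorov-Smirnov distance from N(0, 1) shrinks as n grows.
    d = stats.kstest(z, "norm").statistic
    print(f"n = {n:3d}: KS distance from N(0,1) = {d:.3f}")
```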

Journal ArticleDOI
TL;DR: If one is restricted to a small sample size from one population, it is shown that it is not necessary to "make up" this deficiency by taking a large sample from the other population; best results may be obtained when both sample sizes are small.
Abstract: It is shown in the classification problem, when independent samples are taken from uniform distributions, that for small sample sizes the probability of misclassification when using the nearest-neighbor rule is "close" to its asymptotic value. It is also shown that when using this rule the probability of classification in many cases is close to its Bayes optimum even for small sample sizes. Moreover, if one is restricted to a small sample size from one population, it is shown that it is not necessary to "make up" this deficiency by taking a large sample from the other population; best results may be obtained when both sample sizes are small.
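
A sketch of a uniform-distribution setting with the nearest-neighbour rule, illustrating the point that enlarging only one training sample does not repair a small sample from the other class; the particular overlapping uniform supports are assumptions, not the paper's exact configuration:

```python
import numpy as np

def nn_error_rate(n1, n2, reps=5000, rng=None):
    """Monte Carlo misclassification rate of the 1-nearest-neighbour rule
    for a test point from class 1, with overlapping uniform classes
    U(0, 1) and U(0.5, 1.5) and training sizes n1, n2."""
    rng = np.random.default_rng() if rng is None else rng
    errors = 0
    for _ in range(reps):
        x1 = rng.uniform(0.0, 1.0, n1)       # class-1 training sample
        x2 = rng.uniform(0.5, 1.5, n2)       # class-2 training sample
        t = rng.uniform(0.0, 1.0)            # test point from class 1
        errors += np.abs(x2 - t).min() < np.abs(x1 - t).min()
    return errors / reps

rng = np.random.default_rng(7)
for n1, n2 in [(3, 3), (3, 50), (10, 10)]:   # big n2 does not rescue small n1
    print(f"n1 = {n1:2d}, n2 = {n2:2d}: error = {nn_error_rate(n1, n2, rng=rng):.3f}")
```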

Journal ArticleDOI
TL;DR: In this paper, the properties of a coefficient of correlation $r_n$, defined in a previous paper, are examined theoretically and experimentally to obtain a description of variance-sample size relationships.

Journal ArticleDOI
TL;DR: In this paper, a sequential sampling procedure is developed to estimate the scale parameter θ of the Pareto distribution when the shape parameter is unknown, using a cost function in which A is a positive constant.
Abstract: A sequential sampling procedure is developed to estimate the scale parameter θ of the Pareto distribution when the shape parameter is unknown, using the cost function …, where A is a positive constant. The probability distribution of the stopping time for this procedure is tabulated, and the expected stopping time and cost under the sequential sampling procedure are computed and compared with the optimum sample size and minimum cost under the fixed sample size procedure (for the case when the shape parameter is known).

Journal ArticleDOI
TL;DR: In this article, the authors proposed a sampling rule which guarantees that the Mean Squared Error (M.S.E.) of the estimate does not exceed a given bound when the distributions have a common but unknown scale parameter.
Abstract: Given n observations from each of k populations whose distributions differ by a location parameter, the value of the largest parameter is to be estimated using the largest of the k sample means. It is desired to design a sampling rule which guarantees that the Mean Squared Error (M.S.E.) of the estimate does not exceed a given bound when the distributions have a common but unknown scale parameter. A sequential sampling scheme is devised based on an estimate of the scale parameter and a least favorable configuration of the location parameters. The sample size characteristics of the sampling plan are studied under mild restrictions on the distributions involved. The M.S.E. of the resulting estimator is studied under the additional assumption of normality. A brief discussion is given of an alternate sequential plan which uses information in the sample regarding the configuration of the location parameters.

Journal ArticleDOI
D. R. Jensen
TL;DR: In this article, the authors give monotone bounds for the chi-squared approximation to the distribution of Pearson's statistic for goodness of fit for simple hypotheses, depending on the underlying multinomial probabilities and sample size.
Abstract: Summary Monotone bounds, depending on the underlying multinomial probabilities and the sample size, are given for the chi-squared approximation to the distribution of Pearson's statistic for goodness of fit for simple hypotheses. These bounds apply to the distribution of a single statistic and to the joint distribution of two statistics associated with the margins of a two-way table, in both the central and non-central cases. Let $X_N^2 = \sum (O - E)^2/E$ be Pearson's (1900) statistic for testing goodness of fit in terms of the observed ($O$) and the expected ($E$) cell frequencies in a sample of $N$ independent trials; let $\{x_1, \ldots, x_k\}$ be the underlying multinomial probabilities; and denote by $F_N(\cdot)$ and $G_\nu(\cdot)$ the actual cumulative distribution function (cdf) of $X_N^2$ and its limiting chi-squared ($\chi^2$) form, respectively, where $\nu = k - 1$. Esseen (1945) studied the rate of convergence of $F_N(\cdot)$ to $G_\nu(\cdot)$; in the central case he derived the uniform bound …

Journal ArticleDOI
TL;DR: For a fixed total sample size, a multistage procedure based on generalized U-statistics is developed for choosing a partition of this sample size into individual sample sizes for which the generalized variance of the estimator of the parameter vector is asymptotically minimized.