
Showing papers on "Sample size determination published in 1972"


Journal ArticleDOI
TL;DR: In this paper, the subjective probability of an event, or a sample, is determined by the degree to which it is similar in essential characteristics to its parent population and reflects the salient features of the process by which it was generated.

4,231 citations


Journal ArticleDOI
TL;DR: In this article, tables of percentage points for significance tests concerning the highest or lowest observation in normal samples are extended to larger sample sizes than heretofore published and to probability levels of .005 and .001.
Abstract: Tables of percentage points for significance tests concerning the highest or the lowest observation in normal samples, or the two highest or the two lowest observations in normal samples, were recently extended to larger sample sizes than heretofore published and to probability levels of .005 and .001. These tables are being published to satisfy a long-existing demand for such extensions.

468 citations


Journal ArticleDOI
D. Foley1
TL;DR: The design-set error rate for a two-class problem with multivariate normal distributions is derived as a function of the sample size per class (N) and dimensionality (L), and is demonstrated to be an extremely biased estimate of either the Bayes or test-set error rate.
Abstract: In many practical pattern-classification problems the underlying probability distributions are not completely known. Consequently, the classification logic must be determined on the basis of vector samples gathered for each class. Although it is common knowledge that the error rate on the design set is a biased estimate of the true error rate of the classifier, the amount of bias as a function of sample size per class and feature size has been an open question. In this paper, the design-set error rate for a two-class problem with multivariate normal distributions is derived as a function of the sample size per class (N) and dimensionality (L) . The design-set error rate is compared to both the corresponding Bayes error rate and the test-set error rate. It is demonstrated that the design-set error rate is an extremely biased estimate of either the Bayes or test-set error rate if the ratio of samples per class to dimensions (N/L) is less than three. Also the variance of the design-set error rate is approximated by a function that is bounded by 1/8N .

322 citations
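A quick way to see the bias Foley quantifies is by simulation. The sketch below (an illustration, not Foley's analytic derivation; all parameter values are invented) trains a plug-in linear discriminant on design samples from two multivariate normal classes and compares its design-set (resubstitution) error with its error on a large independent test set, for several N/L ratios:

```python
# Monte Carlo illustration (not Foley's analytic derivation): compare the
# design-set (resubstitution) error of a plug-in linear discriminant with
# its error on a large independent test set, for two multivariate normal
# classes. All parameter values here are invented for illustration.
import numpy as np

rng = np.random.default_rng(0)
L = 5                                   # dimensionality
delta = 1.5                             # mean separation along the first axis
mu0, mu1 = np.zeros(L), np.r_[delta, np.zeros(L - 1)]

def error_rate(w, b, X0, X1):
    # fraction misclassified by the rule: assign class 1 when x @ w + b > 0
    return 0.5 * ((X0 @ w + b > 0).mean() + (X1 @ w + b <= 0).mean())

for N in (5, 10, 15, 50):               # design-set size per class
    X0 = rng.standard_normal((N, L)) + mu0
    X1 = rng.standard_normal((N, L)) + mu1
    # plug-in linear discriminant from the design set (tiny ridge for safety)
    S = 0.5 * (np.cov(X0.T) + np.cov(X1.T)) + 1e-8 * np.eye(L)
    w = np.linalg.solve(S, X1.mean(0) - X0.mean(0))
    b = -0.5 * w @ (X0.mean(0) + X1.mean(0))
    T0 = rng.standard_normal((100_000, L)) + mu0
    T1 = rng.standard_normal((100_000, L)) + mu1
    print(f"N/L = {N / L:4.1f}   design-set error = "
          f"{error_rate(w, b, X0, X1):.3f}   test-set error = "
          f"{error_rate(w, b, T0, T1):.3f}")
```

For N/L near one the design-set error falls far below the test-set error; the gap narrows by N/L of about three, consistent with the ratio the paper singles out.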


Journal ArticleDOI
TL;DR: In this article, the problems of repeated application and masking are discussed, and two new statistics are proposed to overcome them: Lk, based on the k largest observed values, and Ek, based on the k largest (in absolute value) residuals.
Abstract: Several widely used tests for outlying observations are reviewed. Problems of repeated application and “masking” are described. Suggested as appropriate to overcome these problems are two new statistics: Lk, which is based on the k largest (observed) values, and Ek, which is based on the k largest (in absolute value) residuals. Tables of approximate critical values for these statistics are given for the 0.01, 0.025, 0.05, and 0.10 levels of significance and for sample sizes n = 3 (1) 20 (5) 50.

253 citations
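For readers who want to experiment before consulting the published tables, the following sketch approximates critical values of an Lk-type statistic by Monte Carlo. It assumes the commonly cited form of Lk (the ratio of the sum of squares after deleting the k largest observations to the total sum of squares); the paper's tables remain the authoritative reference.

```python
# Monte Carlo approximation to the null distribution of an L_k-type
# statistic. The exact definition used here is an assumption (the commonly
# cited form); prefer the published tables for real work.
import numpy as np

rng = np.random.default_rng(1)

def L_k(x, k):
    x = np.sort(np.asarray(x, float))
    reduced = x[:-k]                        # drop the k largest values
    num = ((reduced - reduced.mean()) ** 2).sum()
    den = ((x - x.mean()) ** 2).sum()
    return num / den                        # small values flag outliers

n, k, reps = 20, 2, 50_000
null = np.array([L_k(rng.standard_normal(n), k) for _ in range(reps)])
crit05 = np.quantile(null, 0.05)            # approximate 5% critical value
print(f"approx 5% critical value of L_{k} for n = {n}: {crit05:.3f}")

x = np.r_[rng.standard_normal(n - 2), 4.5, 5.2]   # two planted outliers
print(f"observed L_{k} = {L_k(x, k):.3f}  (reject if below {crit05:.3f})")
```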


Journal ArticleDOI
TL;DR: In this article, a double sampling scheme is presented to estimate multinomial proportions from data which have been misclassified, where two measuring devices are available to classify units into one of r mutually exclusive categories.
Abstract: In some situations, it is desired to estimate multinomial proportions from data which have been misclassified. One such area is the sampling inspection area of quality control. In this paper, it is assumed that two measuring devices are available to classify units into one of r mutually exclusive categories. The first device is an expensive procedure which classifies units correctly; the second device is a cheaper procedure which tends to misclassify units. In order to estimate the proportions pi (i = 1, 2, …, r) a double sampling scheme is presented. At the first stage, a sample of N units is taken and the fallible classifications are obtained; at the second stage a subsample of n units is drawn from the main sample and the true classifications are obtained. The maximum likelihood estimates of the pi are derived along with their asymptotic variances. Optimum values of n and N which minimize the measurement costs for a fixed precision of estimation, and which maximize the precision for a fixed cost, are derived.

140 citations
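A minimal plug-in version of the double-sampling idea can be sketched as follows. The estimator shown (the subsample's true-class mix within each fallible category, weighted by the cheap full-sample counts) is an assumed simplification with invented data; the paper itself derives the maximum likelihood estimates, their asymptotic variances, and the cost-optimal n and N.

```python
# Minimal plug-in sketch of double-sampling estimation (assumed form; not
# the paper's full MLE treatment). Idea: the subsample, classified by both
# devices, estimates the true-class mix within each fallible category; the
# cheap full-sample counts weight those mixes together. Data are invented.
import numpy as np

# Fallible counts from the main sample of N = 1000 units, r = 3 categories.
N_fallible = np.array([420, 310, 270])

# Subsample cross-classification: rows = true class i, cols = fallible class j.
n_cross = np.array([[40,  5,  2],
                    [ 3, 31,  4],
                    [ 1,  4, 30]])

col_totals = n_cross.sum(axis=0)            # subsample size per fallible class
mix = n_cross / col_totals                  # estimated P(true = i | fallible = j)
p_hat = mix @ (N_fallible / N_fallible.sum())
print("estimated true proportions:", np.round(p_hat, 3))
```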


Journal ArticleDOI
TL;DR: In this paper, an approximate cost model for a quality control procedure for two or more related variables is investigated, and a method is presented to determine the optimal sample size, interval between samples, and critical region parameter for the Hotelling T2 control chart.
Abstract: An approximate cost model for a quality control procedure for two or more related variables is investigated. A method is presented to determine the optimal sample size, interval between samples, and critical region parameter for the Hotelling T2 control chart. This model is a multivariate analog of several well-known models for the univariate X̄-chart. It is assumed that only one assignable cause of variation exists and that the time between occurrences is exponentially distributed. Numerical results are provided in a particular bivariate case for several values of the cost coefficients. The sensitivity of the model to variation of its parameters is discussed.

118 citations


Journal ArticleDOI
TL;DR: In this paper, the authors report the level of power for recent statistical tests reported in the AERJ and propose alternative reporting schemes relative to hypothesis testing to include power and effect size as well as the traditional α.
Abstract: It is almost universally accepted by educational researchers that the power (the probability of rejecting H0 when H0 is false, that is, 1 − β) of a statistical test is important and should be substantial. What is not universally accepted or known is that the power can and should be calculated and reported for every standard statistical test. The power of statistical tests already conducted in educational research is equally unknown. It is the purpose of this paper to report the level of power for recent statistical tests reported in the AERJ and to propose alternative reporting schemes relative to hypothesis testing to include power and effect size as well as the traditional α. Cohen (1962, 1969), Tversky and Kahneman (1971), Overall (1969) and others argue quite strongly that explicit computation of power relative to reasonable hypotheses should be made before any study is completed and subsequently reported. Tversky and Kahneman (1971) suggest three reasons why this computation is important: (1) such computations can lead the researcher to the conclusion that there is no point in running the study unless the sample size is materially increased; (2) the computation is essential to the interpretation of negative results, that is, failures to reject the null hypothesis; and (3) computed power gives the researcher an indication of the level of the probability of a valid rejection of the null hypothesis.

108 citations
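The power computation the authors ask researchers to report is straightforward to carry out. Here is a sketch for a two-sided, two-sample t test at Cohen's conventional effect sizes, using the noncentral t distribution; the per-group sample size is illustrative.

```python
# Power of a two-sided, two-sample t test via the noncentral t distribution.
from scipy import stats

def t_test_power(d, n_per_group, alpha=0.05):
    """Power of a two-sided, two-sample t test at standardized effect size d."""
    df = 2 * (n_per_group - 1)
    nc = d * (n_per_group / 2) ** 0.5       # noncentrality parameter
    t_crit = stats.t.ppf(1 - alpha / 2, df)
    # probability the statistic lands in either tail under the alternative
    return (1 - stats.nct.cdf(t_crit, df, nc)) + stats.nct.cdf(-t_crit, df, nc)

for d in (0.2, 0.5, 0.8):                   # Cohen's small / medium / large
    print(f"d = {d}: power = {t_test_power(d, n_per_group=30):.2f}")
```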


Journal ArticleDOI
TL;DR: In this article, an improved Bayesian method, due to Lindley, for the simultaneous estimation of multiple regressions in m groups is studied by applying the method to test data, and evidence is found to support the belief that in many testing applications the collateral information obtained from each subset of m − 1 colleges will be useful for estimating the regression in the mth college, especially when the sample sizes are small.
Abstract: An improved Bayesian method, due to Lindley, for the simultaneous estimation of multiple regressions in m groups is studied by applying the method to test data. Evidence is found to support the belief that in many testing applications the collateral information obtained from each subset of m − 1 colleges will be useful for estimating the regression in the mth college, especially when the sample sizes are small. Using a 25 per cent sample, the Bayesian prediction equations achieved an average 9.7 per cent reduction in mean square error, as compared with the within-group least squares equations, when cross-validated with a later sample. More importantly, the mean square error for the Bayesian equations based on the 25 per cent sample was only barely greater than that for the least squares equations based on the full sample data. Thus the main virtue of the method is that it permits predictions to be made separately for relevant subpopulations (e.g. male-female) where sample sizes would otherwise be too small to achieve an acceptable degree of accuracy.

96 citations



Journal ArticleDOI
TL;DR: In this article, a linear combination of Student's t statistics is proposed as a practical method of obtaining confidence intervals for the common mean, and the distribution function of T is conveniently approximated for the general case.
Abstract: A linear combination T of Student's t statistics is proposed as a practical method of obtaining confidence intervals for the common mean, and the distribution function of T is conveniently approximated for the general case. Exact percentage points of T are compared with those based on this approximation for two populations and various sample size configurations.

48 citations


Journal ArticleDOI
TL;DR: It appears that the conclusive statistical evidence for the existence of sub-types of depression will have to be the demonstration, using reliable data, that depression is better described by a mixture of distributions than by a single homogeneous distribution.

Journal ArticleDOI
TL;DR: In this article, the effect of misclassification on the linear discriminant function (Anderson's classification statistic), previously studied through sampling experiments, is analysed by means of asymptotic expansions of degrees higher than previously available.
Abstract: The effect of misclassification, for initial samples of finite size from multivariate normal populations, on the linear discriminant function (Anderson's classification statistic [1]) has been considered by analysing the results of sampling experiments (Lachenbruch [2]). This paper presents an alternative approach by which the effect of misclassification is expressed in the form of asymptotic expansions of degrees higher than previously available. Although the results of these expansions are not always in agreement with the conclusions drawn by Lachenbruch from his sampling experiments, in which the sample size was moderately large, the conclusion from the asymptotic approach is that Lachenbruch's large-sample results (obtained when the sample size is infinite) hold in most cases. In those instances in which they apparently do not, a general condition for them to hold is obtained.

Journal ArticleDOI
TL;DR: The purpose of this paper is to extend the Bayesian approach to include consideration of the sample size and the sampling interval in the design of the control procedure, and to show how the optimal time between samples and the optimal sample size can be found and how the optimal decision can be made based on the outcome of the sample.
Abstract: The general problem of process control is one of maintaining a production process in such a state that the output from the process conforms to design specifications. As the process operates it will be subject to changes which cause the quality of the output to deteriorate. Some amount of deterioration can be tolerated but at some point it becomes less costly to stop and overhaul the process. The problem of establishing control procedures to minimize long-run expected costs has been approached by several researchers using Bayesian decision theory. However, the models used by these researchers have been incomplete. The purpose of this paper is to extend the Bayesian approach to include consideration of the sample size and the sampling interval in the design of the control procedure. Using dynamic programming the analysis will show how the optimal time between samples and the optimal sample size can be found and how the optimal decision can be made based on the outcome of the sample.

Journal ArticleDOI
TL;DR: A review of univariate tolerance intervals from an application-oriented point of view is presented in this paper, where both β-content and β-expectation intervals are defined and considered.
Abstract: A review of univariate tolerance intervals is presented from an application-oriented point of view. Both β-content and β-expectation intervals are defined and considered. Standard problems are discussed for the distribution-free case and with various distributional assumptions (normal, gamma, Poisson) which occur most frequently in practice. The determination of sample size is emphasized. A number of examples are used to illustrate the types of problems which permit solutions with the excellent tables now available.
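As a small worked example of the normal-theory case, the sketch below computes the two-sided β-content tolerance factor via Howe's approximation, a standard substitute for the exact tables the review points to; the resulting interval x̄ ± ks is meant to cover at least the stated content of the population with the stated confidence.

```python
# Two-sided normal beta-content tolerance factor via Howe's approximation
# (a standard substitute for the exact tables; values are illustrative).
from scipy import stats

def howe_k(n, content=0.90, conf=0.95):
    """Two-sided tolerance factor k for a normal sample of size n (Howe)."""
    df = n - 1
    z = stats.norm.ppf((1 + content) / 2)
    chi2 = stats.chi2.ppf(1 - conf, df)     # lower-tail chi-square quantile
    return z * (df * (1 + 1 / n) / chi2) ** 0.5

for n in (10, 30, 100):
    print(f"n = {n}: interval is xbar +/- {howe_k(n):.3f} * s")
```

As the printout shows, the required factor k shrinks toward the plain normal quantile as n grows, which is why the review emphasizes sample size determination.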

Journal ArticleDOI
TL;DR: In this paper, upper and lower confidence limits for the geometric mean of a lognormal distribution are derived for a given probability, and the relationships between them are obtained from theoretical considerations.

Journal ArticleDOI
TL;DR: By including the number of failures in the Sobel-Weiss stopping rule, a truncated procedure with good general properties is obtained; in particular, excessive sample sizes are avoided when the population probabilities are small.
Abstract: Sobel and Weiss [2] have given an inverse sampling stopping rule for selecting the better of two binomial populations. Sampling is done by the play-the-winner method. By including the number of failures in the Sobel-Weiss stopping rule a truncated procedure is obtained with good general properties. In particular, excessive sample sizes are avoided when the population probabilities are small.

Journal ArticleDOI
TL;DR: In this article, a set of chemotherapy data published by the Co-operative Breast Cancer Group is discussed, in which a series of test compounds are each compared with testosterone propionate, the standard treatment.
Abstract: SUMMARY The fact that more is generally known about a standard treatment in a clinical trial than about the test treatment is exploited in empirical Bayes estimates based on the results of using the same standard in other trials. Such estimates are proposed for the case of dichotomous response, and discussed in terms of an example in cancer research. Methods of design and analysis of experimental trials comparing a test treatment, T, with a standard treatment, S, usually consider the two treatments in a balanced and symmetric way. However, the description of S as a standard suggests that it has been used many times before, and so prior to the experimental results more may be known about the general effectiveness of S than about T. If data from past trials in which S has been used are systematically tabulated, the empirical Bayes method is one way in which this information may be used. We consider this approach for the case of simple trials in which (i) the response is dichotomous, and (ii) an appropriate experimental analysis may be considered to involve separate estimations for the two treatments. Condition (ii) will be a reasonable approximation if sample sizes are not too small. In § 5, we discuss a set of chemotherapy data published by the Co-operative Breast Cancer Group in which a series of test compounds are each compared with testosterone propionate, which is considered to be the standard treatment. Suppose that out of n patients in a clinical trial who are given S, x successes are observed, and let p be the corresponding response probability. We also suppose that there exist the results of m − 1 trials of a similar type carried out in the past, in which out of ni patients given
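A minimal numerical sketch of the empirical Bayes idea for the standard arm: fit a beta prior to the standard treatment's past trial results (here by simple method of moments, a crude stand-in for the paper's approach) and shrink the current trial's observed rate toward it. All counts are invented.

```python
# Empirical Bayes shrinkage for the standard arm: beta prior fitted by
# method of moments to past trials (a crude stand-in for the paper's
# method; it ignores the varying past sample sizes). Data are invented.
import numpy as np

past_x = np.array([18, 25, 30, 12, 22])     # past successes on standard S
past_n = np.array([60, 80, 90, 40, 70])     # past sample sizes
rates = past_x / past_n

m, v = rates.mean(), rates.var(ddof=1)
common = m * (1 - m) / v - 1                # implied a + b of the beta prior
a, b = m * common, (1 - m) * common

x, n = 9, 25                                # current trial, standard arm
posterior_mean = (a + x) / (a + b + n)      # shrunken estimate of p
print(f"raw = {x / n:.3f}   prior mean = {m:.3f}   "
      f"EB estimate = {posterior_mean:.3f}")
```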

Journal ArticleDOI
TL;DR: In an earlier paper, the author described a method for determining optimum sample sizes in multivariate stratified surveys; this paper studies some of the mathematical properties of the optimum solution.
Abstract: In [1] the author has described a method for determining optimum sample sizes in multivariate stratified surveys. In this paper we study some of the mathematical properties of the optimum solution.

Journal ArticleDOI
TL;DR: In this paper, it is shown that, for the Play-the-Winner rule, the expected number of observations on the poorer treatment, i.e. on the population with smaller p, is never greater than for the Vector-at-a-Time rule.
Abstract: SUMMARY In a sampling problem for selecting the better of two binomial populations with a fixed total sample size, two sampling rules are shown to have equal probabilities of correct selection. One of them is the so-called Play-the-Winner sampling rule and the other is the usual Vector-at-a-Time sampling rule. As a corollary it is also shown that for the Play-the-Winner rule the expected number of observations on the poorer treatment, i.e. on the population with smaller p, is never greater than for the Vector-at-a-Time rule.
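The paper's comparison can be checked by simulation. The sketch below plays both rules with a fixed total sample size and invented success probabilities, and reports the expected number of trials spent on the poorer arm; the play-the-winner implementation, which stays on an arm after a success and switches after a failure, follows the usual description of the rule.

```python
# Simulation of play-the-winner (PW) vs. vector-at-a-time (VT) with a fixed
# total sample size, tracking trials spent on the poorer arm. Probabilities
# and the total sample size are invented for illustration.
import numpy as np

rng = np.random.default_rng(2)
p_good, p_bad, total, reps = 0.7, 0.4, 50, 20_000

def play_the_winner():
    arm, n_bad = rng.integers(2), 0         # arm 1 is the poorer one
    for _ in range(total):
        n_bad += arm
        success = rng.random() < (p_bad if arm else p_good)
        if not success:
            arm = 1 - arm                   # switch arms after a failure
    return n_bad

def vector_at_a_time():
    return total // 2                       # strict alternation: half on each arm

pw = np.mean([play_the_winner() for _ in range(reps)])
print(f"expected trials on poorer arm: PW = {pw:.1f}   VT = {vector_at_a_time()}")
```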

Journal ArticleDOI
TL;DR: In this article, the authors consider the likelihood ratio tests as ensembles of sequential probability ratio tests and compare them with alternative procedures by constructing alternative ensembles, applying a simple inequality of Wald and a new inequality of similar type.
Abstract: Sequential tests of separated hypotheses concerning the parameter θ of a Koopman-Darmois family are studied from the point of view of minimizing expected sample sizes pointwise in θ subject to error probability bounds. Sequential versions of the (generalized) likelihood ratio test are shown to exceed the minimum expected sample sizes by at most M log log(1/α) uniformly in θ, where α is the smallest error probability bound. The proof considers the likelihood ratio tests as ensembles of sequential probability ratio tests and compares them with alternative procedures by constructing alternative ensembles, applying a simple inequality of Wald and a new inequality of similar type. A heuristic approximation is given for the error probabilities of likelihood ratio tests, which provides an upper bound in the case of a normal mean.
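As a concrete point of reference for the expected-sample-size quantities discussed here, the sketch below implements a plain Wald sequential probability ratio test for a normal mean with Wald's classical stopping thresholds; the parameters are illustrative, and the paper's generalized likelihood ratio tests are not reproduced.

```python
# Plain Wald SPRT for a normal mean (simple vs. simple), with Wald's
# classical threshold approximations. Parameters are illustrative.
import numpy as np

rng = np.random.default_rng(3)
mu0, mu1, sigma = 0.0, 0.5, 1.0             # H0: mean mu0 vs. H1: mean mu1
alpha, beta = 0.05, 0.05                    # target error probabilities
A = np.log((1 - beta) / alpha)              # upper stopping threshold
B = np.log(beta / (1 - alpha))              # lower stopping threshold

def sprt(true_mu, max_n=10_000):
    llr, n = 0.0, 0
    while B < llr < A and n < max_n:
        x = rng.normal(true_mu, sigma)
        # log-likelihood-ratio increment for N(mu1, sigma) vs. N(mu0, sigma)
        llr += (mu1 - mu0) * (x - (mu0 + mu1) / 2) / sigma**2
        n += 1
    return ("H1" if llr >= A else "H0"), n

runs = [sprt(true_mu=mu1) for _ in range(2_000)]
mean_n = np.mean([n for _, n in runs])
type2 = np.mean([d == "H0" for d, _ in runs])
print(f"mean sample size = {mean_n:.1f}   estimated type II error = {type2:.3f}")
```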

Journal ArticleDOI
TL;DR: In this article, Abrahams, S. C. and Keve, E. T. present tables of expected ranked exact moduli of normal observations, for sample sizes to 41, which are useful for half-normal probability plots.
Abstract: In using normal probability plots for comparing two sets of crystallographic data [Abrahams, S. C. & Keve, E. T. (1971), Acta Cryst. A27, 157] note should be taken of the fact that the expected values of normal order statistics are not given exactly by the percentage points of the normal distribution. This becomes an important consideration only for small samples. Tables of expected ranked exact moduli of normal observations, for sample sizes to 41, are presented: these are useful for half-normal probability plots.
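The paper's caution is easy to verify numerically: expected values of normal order statistics, computed exactly by quadrature below, differ noticeably from plain normal percentage points in small samples. Blom's plotting-position approximation is shown for comparison; it is a standard device, not the paper's tables.

```python
# Exact expected normal order statistics by quadrature, compared with
# Blom's plotting-position approximation and raw percentage points.
import numpy as np
from math import comb
from scipy import stats, integrate

def expected_order_stat(i, n):
    # E[X_(i)] for a standard normal sample of size n, by numerical integration
    def integrand(x):
        F, f = stats.norm.cdf(x), stats.norm.pdf(x)
        return x * n * comb(n - 1, i - 1) * F**(i - 1) * (1 - F)**(n - i) * f
    val, _ = integrate.quad(integrand, -10, 10)
    return val

n = 5
for i in range(1, n + 1):
    exact = expected_order_stat(i, n)
    blom = stats.norm.ppf((i - 0.375) / (n + 0.25))   # common approximation
    naive = stats.norm.ppf(i / (n + 1))               # raw percentage point
    print(f"i={i}: exact={exact:+.3f}  Blom={blom:+.3f}  percent-pt={naive:+.3f}")
```

For n = 5 the extreme order statistics differ from the raw percentage points by roughly 0.2, exactly the small-sample effect the tables are meant to correct.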

Journal ArticleDOI
TL;DR: An approximate expected cost model for the Hotelling T2 control chart is presented in this article, where the model sensitivity to the cost coefficients, the shift parameter, the sample correlation coefficient, and the sample covariance matrix is discussed.
Abstract: An approximate expected cost model for the Hotelling T2 control chart is presented. This control chart is a multivariate analog of the X̄ control chart. A method is presented to determine the optimal sample size, interval between samples, and critical region parameter for the T2 chart. Numerical results are provided for several bivariate problems for various values of the cost coefficients. The model's sensitivity to the cost coefficients, the shift parameter, the sample correlation coefficient, and the sample covariance matrix is discussed.
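For concreteness, the statistic being charted can be sketched as follows, with a known in-control mean and covariance (assumed values) and the chi-square control limit appropriate to that case:

```python
# Hotelling T^2 chart statistic with known in-control parameters (assumed
# values; with estimated parameters an F-based limit is used instead).
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
mu0 = np.array([0.0, 0.0])                  # in-control mean
Sigma = np.array([[1.0, 0.6],
                  [0.6, 1.0]])              # in-control covariance
Sinv = np.linalg.inv(Sigma)
n, p, alpha = 5, 2, 0.005                   # subgroup size, dimension, false-alarm rate
ucl = stats.chi2.ppf(1 - alpha, df=p)       # chi-square upper control limit

for shift in (np.zeros(2), np.array([1.5, 1.0])):
    xbar = rng.multivariate_normal(mu0 + shift, Sigma / n)   # subgroup mean
    t2 = n * (xbar - mu0) @ Sinv @ (xbar - mu0)
    print(f"mean shift {shift}: T2 = {t2:5.2f}   signal = {t2 > ucl}")
```

The cost model in the paper then trades off n, the sampling interval, and the control limit against the cost coefficients; that optimization is not reproduced here.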

Journal ArticleDOI
TL;DR: This work shows how a data-dependent allocation rule may be used in deciding which of two normally distributed treatment effects has the greater mean, when the variances are assumed to be equal and known.
Abstract: In clinical trials comparing two treatments, one would often like to control the probability of erroneous decision while minimizing not the total sample size but the number of patients given the inferior treatment. To do this obviously requires that one use a data-dependent allocation rule for the two treatments rather than the conventional equal sample size scheme, whether fixed or sequential. We show here how this may be done in the case of deciding which of two normally distributed treatment effects has the greater mean, when the variances are assumed to be equal and known. Similar methods can be used under other hypotheses on the underlying probability distributions, and will provide a considerable increase in flexibility in the design of sequential clinical trials.

Journal ArticleDOI
TL;DR: The magnitude of sampling errors associated with estimates of the mean, median, and standard deviation of voice fundamental frequency (f0) during oral reading is investigated as a function of sample size.
Abstract: Distributional measures such as mean, median, standard deviation, 90% range, and 50% range of fundamental frequency have often been used to characterize voices. This study reports the behavior of such statistical measures for each of six talkers as a function of sample size (from 3.6 up to 200 sec). Fundamental frequencies were obtained by a computer program that used a peak-picking method. The results showed that, for each script reading by a given talker, the standard deviations of the means converge fairly quickly—that is, to within about 2 Hz of the total mean at a sample size of about 1 min. However, when all the various readings from even a single talker are used to get the total mean, the convergence was much slower, indicating that differences among the scripts and day-to-day variations in repeated readings are significant and must be taken into account when interpreting statistical measures of fundamental-frequency variation. Median convergence behavior was the same. Implications of these findings to studies of normal and disordered speech are discussed. [This research was supported in part by the Air Force Cambridge Research Laboratories.]

Journal ArticleDOI
TL;DR: In this article, a sampling technique for directional data has been developed using the circular measures of dispersion and approximate ANOVA of G. S. Watson, which can be used to determine whether the paleocurrent directions from different geological formations belong to significantly different populations.
Abstract: Statistical procedures for (1) sampling, (2) testing the existence of a preferred direction, and (3) testing homogeneity of two-dimensional directional data, which have been developed by the authors for paleocurrent studies, are presented. It is well known that conventional methods of statistical analysis are not applicable to directional data (e.g., crossbedding and ripplemark directions, grain lineations, etc.) which are “circularly distributed” on a compass dial. A sampling technique for directional data has been developed using the circular measures of dispersion and approximate ANOVA of G. S. Watson. On the basis of a pilot survey, it is possible to compute the minimum sample size required for estimating, with a desired precision, the mean paleocurrent direction of a formation. The optimum allocation of sample size between and within outcrops also can be accomplished at a minimum cost. The procedure described for testing uniformity (or lack of preferred direction) is based on the arc lengths made by successive sample points and is simple to use if the sample size is moderate. A table of critical points and a numerical example are given after a description of the test procedure. Finally, the procedures for testing the homogeneity of directional data from several geological formations are described by (1) tests for equality of the resultant directions (polar vectors) and (2) tests for equality of dispersions. With these tests it is possible to determine whether the paleocurrent directions from different geological formations belong to significantly different populations.
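As a small illustration of the circular statistics involved, the sketch below computes the mean direction and resultant length of invented azimuth data and applies a Rayleigh-type test of uniformity; note this is the standard Rayleigh test, not the arc-length test described in the abstract.

```python
# Circular mean, resultant length, and a Rayleigh-type uniformity test for
# directional data. Azimuths are invented; this is the standard Rayleigh
# test, not the arc-length procedure the paper develops.
import numpy as np

rng = np.random.default_rng(5)
# Invented paleocurrent azimuths (degrees), loosely clustered near 60.
theta = np.deg2rad(rng.normal(60, 80, size=40) % 360)

n = len(theta)
C, S = np.cos(theta).sum(), np.sin(theta).sum()
R_bar = np.hypot(C, S) / n                  # mean resultant length, in [0, 1]
mean_dir = np.rad2deg(np.arctan2(S, C)) % 360

# Rayleigh test: large R_bar argues for a preferred direction.
Z = n * R_bar**2
p_value = np.exp(-Z)                        # crude first-order approximation
print(f"mean direction = {mean_dir:.1f} deg   "
      f"R_bar = {R_bar:.2f}   p ~ {p_value:.3g}")
```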

Journal ArticleDOI
TL;DR: In this article, a technique is given for drawing valid inferences in cases where performance characteristics of statistical procedures (e.g., power for a test, or probability of a correct selection for a selection procedure) depend upon unknown parameters.
Abstract: A technique is given for drawing valid inferences in cases where performance characteristics of statistical procedures (e.g. power for a test, or probability of a correct selection for a selection procedure) depend upon unknown parameters (e.g. an unknown variance). The technique is especially useful in situations where sample sizes are small (e.g. in many medical trials); the “usual” approximate procedures are found to be misleading in such cases.

Journal ArticleDOI
TL;DR: A procedure has been developed, and tables presented, for estimating sample size requirements in the planning stage of a clinical trial where patients are to enter the study in cohorts rather than simultaneously; the tables can also be of value in providing guidelines for assessing the adequacy of sample accrual.

Journal ArticleDOI
TL;DR: In this article, the asymptotic normality of the joint distribution of an increasing number of sample quantiles as the sample size increases is investigated in both cases where the basic distributions are equal and are unequal.
Abstract: Uniform (or type (B)_d) asymptotic normality of the joint distribution of an increasing number of sample quantiles, as the sample size increases, is investigated both in the case where the basic distributions are equal and in the case where they are unequal. Under fairly general assumptions, sufficient conditions are derived for the asymptotic normality of sample quantiles.

Journal ArticleDOI
TL;DR: A symposium on the foundations of survey sampling was held in Chapel Hill, N.C., in 1968; the central issue, as Godambe summarized it, was whether sample survey theory should be dealt with by extending the statistical theory with a new model and corresponding formal criteria of optimality or appropriateness.
Abstract: In recent years, the foundations of survey sampling have been subject to considerable debate. A symposium devoted to this topic was held in Chapel Hill, N.C., in 1968. Godambe [10] summarizes what he feels constituted the central issue at the symposium as follows: Should Sample Survey Theory be dealt with (i) "by extending the statistical theory with a new model and corresponding formal criteria of optimality or appropriateness"

Journal ArticleDOI
TL;DR: A survey of major articles dating from October 1969 to May 1971 was made in regard to the statistical tests used, levels of significance, sample size, and power, with the t and ANOVA tests being the most commonly used.
Abstract: This study was designed to investigate the power of the research studies published in the Research Quarterly. A survey of major articles dating from October 1969 to May 1971 was made in regard to statistical tests used, levels of significance, sample size, and power. It was noted that statistical inferential testing was conducted in 136 of the 151 articles, with the t and ANOVA tests being the most commonly used. Cohen's (3) tables and metric-free values for effect size were used to estimate the power of the 82 t and 179 F tests which were reported in 106 of the studies. An analysis of the results showed that the researchers would have made a valid rejection of H0 on the average approximately 78% of the time when using a large effect size, 50% of the time when using a medium effect size, and 13% when using a small effect size. Some suggestions were presented to assist in increasing the power of statistical tests used.