Showing papers in "Journal of the American Statistical Association in 1952"


Journal ArticleDOI
TL;DR: A test of the hypothesis that C samples are from the same population may be made by ranking the observations from 1 to Σn_i (giving each observation in a group of ties the mean of the ranks tied for), finding the C sums of ranks, and computing a statistic H. Under the stated hypothesis, H is distributed approximately as χ²(C − 1), unless the samples are too small, in which case special approximations or exact tables are provided.
Abstract: Given C samples, with n_i observations in the ith sample, a test of the hypothesis that the samples are from the same population may be made by ranking the observations from 1 to Σn_i (giving each observation in a group of ties the mean of the ranks tied for), finding the C sums of ranks, and computing a statistic H. Under the stated hypothesis, H is distributed approximately as χ²(C − 1), unless the samples are too small, in which case special approximations or exact tables are provided. One of the most important applications of the test is in detecting differences among the population means. * Based in part on research supported by the Office of Naval Research at the Statistical Research Center, University of Chicago.

9,365 citations
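
The ranking procedure in the entry above can be sketched in a few lines. The abstract does not print the formula for H; the sketch below uses the standard form H = 12/(N(N+1)) · Σ R_i²/n_i − 3(N+1), where N = Σn_i and R_i is the rank sum of the ith sample, and it assumes no tie correction is applied. The function names are illustrative only.

```python
def mean_ranks(values):
    """Rank values from 1 to N, giving tied observations the mean of the ranks tied for."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the whole group of observations tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = ((i + 1) + (j + 1)) / 2.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(samples):
    """Compute H from C samples (assumed form, without a correction for ties)."""
    pooled = [x for sample in samples for x in sample]
    n_total = len(pooled)
    ranks = mean_ranks(pooled)
    weighted_sum = 0.0
    start = 0
    for sample in samples:
        n_i = len(sample)
        rank_sum = sum(ranks[start:start + n_i])  # R_i for this sample
        weighted_sum += rank_sum ** 2 / n_i
        start += n_i
    return 12.0 / (n_total * (n_total + 1)) * weighted_sum - 3.0 * (n_total + 1)
```

For samples that are not too small, the resulting H would be referred to a χ² distribution with C − 1 degrees of freedom, as the abstract states.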


Journal ArticleDOI
TL;DR: This paper presents a general technique for the treatment of samples drawn without replacement from finite universes when unequal selection probabilities are used, and discusses two sampling schemes in connection with the problem of determining optimum selection probabilities according to the information available in a supplementary variable.
Abstract: This paper presents a general technique for the treatment of samples drawn without replacement from finite universes when unequal selection probabilities are used. Two sampling schemes are discussed in connection with the problem of determining optimum selection probabilities according to the information available in a supplementary variable. Admittedly, these two schemes have limited application. They should prove useful, however, for the first stage of sampling with multi-stage designs, since both permit unbiased estimation of the sampling variance without resorting to additional assumptions. * Journal Paper No. J2139 of the Iowa Agricultural Experiment Station, Ames, Iowa, Project 1005. Presented to the Institute of Mathematical Statistics, March 17, 1951.

3,990 citations
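
The abstract above does not print an estimator. One widely used estimator for sampling without replacement with unequal selection probabilities weights each sampled value by the inverse of its inclusion probability (commonly called the Horvitz-Thompson estimator). The sketch below assumes that form; the function name and the toy population are hypothetical.

```python
from itertools import combinations


def horvitz_thompson_total(values, inclusion_probs):
    """Estimate a population total as the sum of y_i / pi_i over the sampled units."""
    if len(values) != len(inclusion_probs):
        raise ValueError("values and inclusion_probs must have the same length")
    if any(not 0.0 < p <= 1.0 for p in inclusion_probs):
        raise ValueError("inclusion probabilities must lie in (0, 1]")
    return sum(y / p for y, p in zip(values, inclusion_probs))


# Hypothetical population of three units. Simple random samples of size 2
# give every unit inclusion probability 2/3, so averaging the estimate over
# all possible samples should recover the true total (18).
population = [3.0, 6.0, 9.0]
estimates = [
    horvitz_thompson_total(sample, [2 / 3] * 2)
    for sample in combinations(population, 2)
]
average_estimate = sum(estimates) / len(estimates)
```

The equal-probability design is used here only because its sampling distribution is easy to enumerate; the same function accepts unequal inclusion probabilities.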


Journal ArticleDOI
TL;DR: A simple function, in terms of two physically meaningful parameters, has been evolved, which fits survivorship data very well and can be used to compare succinctly the mortality of two groups, different in respect of treatment, type of cancer, or other characteristics.
Abstract: On the basis of experience with calculated survivorships of patients following treatment for cancer, a simple function, in terms of two physically meaningful parameters, has been evolved, which fits such survivorship data very well. These two parameters can be used to compare succinctly the mortality of two groups, different in respect of treatment, type of cancer, or other characteristics. The parameters are c (“cured”), which represents the proportion of the population which is subject only to “normal” death rates, and β, which is the death rate from the cancer, to which the rest of the population [not “cured,” (1 − c)] is subject. Thus if one treatment is characterized by c_1 = 0.30, β_1 = 0.25, another by c_2 = 0.20, β_2 = 0.15, this could be interpreted as meaning that while the first treatment “cured” a larger proportion of the population than did the second treatment, it did not ameliorate the deaths attributable to cancer in the patients not cured as much as did the second treatment. If l T...

785 citations
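
The survivorship function itself is cut off in the abstract above, so the following is only an illustrative reading of the two parameters, not the paper's formula. It assumes that β acts as a constant death rate over time, so that survival relative to a population subject only to normal death rates is c + (1 − c)·exp(−βt). The time unit is unspecified in the abstract and is left abstract here.

```python
import math


def relative_survival(c, beta, t):
    """Survival relative to normal mortality under an ASSUMED constant cancer death rate.

    c    : proportion "cured" (subject only to normal death rates)
    beta : death rate from the cancer among the remaining 1 - c
    t    : elapsed time, in whatever unit beta is expressed in
    """
    return c + (1.0 - c) * math.exp(-beta * t)


# The two treatments quoted in the abstract.
treatment_1 = {"c": 0.30, "beta": 0.25}
treatment_2 = {"c": 0.20, "beta": 0.15}

for t in (1, 5, 10, 20):
    s1 = relative_survival(t=t, **treatment_1)
    s2 = relative_survival(t=t, **treatment_2)
    print(f"t={t:>2}: treatment 1 = {s1:.3f}, treatment 2 = {s2:.3f}")
```

Under this assumed form the second treatment shows higher relative survival early on, while the first shows higher relative survival in the long run because of its larger cured proportion, which matches the interpretation given in the abstract.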


Journal ArticleDOI
D. J. Davis
TL;DR: The rationale and statistical techniques employed in the analysis of some failure data obtained from operations performed by machines and people are summarized and the agreement between theory and data is evaluated.
Abstract: This paper summarizes the rationale and statistical techniques employed in the analysis of some failure data obtained from operations performed by machines and people. These data are compared to frequency distributions arising from either an exponential or a normal theory of failure. The agreement between theory and data is evaluated.

419 citations



Journal ArticleDOI
TL;DR: In a paired comparison test of m brands of a product, each of the ½m(m − 1) pairs is presented to 2r judges: to r in one order, and to r in the other.
Abstract: In a paired comparison test of m brands of a product each of the ½m(m – 1) pairs is presented to 2r judges: to r in one order, and to r in the other. An analysis of variance is developed for the case in which the judges’ preferences are expressed on a 7 or 9-point scale. Account is taken of the effects of order of presentation. Main effects are defined for the brands. The hypothesis of subtractivity, analogous to the hypothesis of additivity in a two-way layout, states roughly that the results for any pair, after order effects are eliminated, can be attributed entirely to the difference of the main effects of the two brands in the pair. Significance tests for the main effects, for the order effects, and for the hypothesis of subtractivity are given, as well as estimates of various parameters and their standard errors. The main effects are analyzed by considering all possible comparisons. A numerical example illustrates the method. * The illustrative data in this paper were originally analyzed by ...

339 citations




Journal ArticleDOI
TL;DR: In this paper, the authors present a review of the more important properties of partially balanced incomplete block designs with two associate classes and give both the intrablock and the combined intra- and inter-block analysis in a simplified and compact form, together with an illustrative numerical example.
Abstract: Incomplete block designs are now in fairly general use, especially the balanced incomplete block and the lattice designs. One of the authors in collaboration with Nair [1] introduced in 1939 a wider class of designs, viz. partially balanced incomplete block designs which included as a special case the balanced incomplete block designs and the square lattices. A slight generalization by Nair and Rao [2] resulted in the inclusion of cubic and other higher dimensional lattices as special cases. Recently other special cases of partially balanced designs are beginning to be used, for example, the rectangular lattices of Harshbarger [3, 4, 5] and the linked block designs of Youden [6]. It is the object of this paper to review the more important properties of partially balanced designs with two associate classes and to give both the intrablock and the combined intra- and inter-block analysis in a simplified and compact form, together with an illustrative numerical example. A special feature is the division of these designs into a small number of distinct types, for each of which the association scheme can be explicitly exhibited. This simplifies the numerical computations as well as the interpretation of the results. A number of useful designs belonging to each type have been given for illustrative purposes, but no complete tabulation has been attempted. Complete tables (up to ten replications) will be published by the Institute of Statistics, University of North Carolina.

310 citations



Journal ArticleDOI
TL;DR: This article gives a numerical tabulation of the distribution of Kolmogorov's statistic for finite sample size.
Abstract: (1952). Numerical Tabulation of the Distribution of Kolmogorov's Statistic for Finite Sample Size. Journal of the American Statistical Association: Vol. 47, No. 259, pp. 425-441.

Journal ArticleDOI
TL;DR: The method of calculating confidence limits for medians described in this paper has been used in the Bureau of the Census for a number of years, first in the survey of radio listening habits conducted by the Bureau for the Federal Communications Commission in 1945.
Abstract: * The method of calculating confidence limits for medians described in this paper has been used in the Bureau of the Census for a number of years, having been used first in the survey of radio listening habits conducted by the Bureau for the Federal Communications Commission in 1945. The author, aided by suggestions from other members of the sampling staff of the Bureau of the Census, has developed the rationale for the method.


Journal ArticleDOI
TL;DR: In this article, the authors presented a chart which can be used to simplify estimation of μ and σ in the case of sampling from a singly truncated normal distribution when the point of truncation and the number of observations in the truncated portion are known.
Abstract: Charts are presented which can be used to simplify estimation of μ and σ in the case of sampling from a singly truncated normal distribution when (a) the point of truncation and the number of observations in the truncated portion are known, (b) the number of observations in the truncated portion is not known. A somewhat different iteration procedure for case (a) than given by other writers is suggested and an example is given.

Book ChapterDOI
TL;DR: In this article, a number of pairs of teams or products are compared on the basis of n binomial trials, and the average probability that the better team or product wins a given trial is estimated to measure the discrimination provided by the test.
Abstract: Suppose a number of pairs of teams or products are compared on the basis of n binomial trials. Although we cannot know from the outcomes which teams or products were actually better, we wish to estimate the average probability that the better team or product wins a given trial, and thus to measure the discrimination provided by our test. World Series data provide an example of such comparisons. The National League has been outclassed by the American League teams in a half-century of World Series competition. The American League has won about 58 per cent of the games and 65 per cent of the Series. The probability that the better team wins the World Series is estimated as 0.80, and the American League is estimated to have had the better team in about 75 per cent of the Series. * This work was facilitated by support from the Laboratory of Social Relations, Harvard University.
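
The abstract relates two quantities: the probability that the better team wins a single game and the probability that it wins the Series. A minimal sketch of that relationship follows, assuming a best-of-seven series, independent games, and a constant single-game probability p. These are simplifying assumptions for illustration and not necessarily the model used in the paper.

```python
from math import comb


def series_win_probability(p, wins_needed=4):
    """Probability of winning a first-to-`wins_needed` series.

    Assumes independent games with constant single-game win probability p.
    The series is won in wins_needed + k games when the opponent takes
    exactly k of the first wins_needed - 1 + k games.
    """
    q = 1.0 - p
    return sum(
        comb(wins_needed - 1 + k, k) * p ** wins_needed * q ** k
        for k in range(wins_needed)
    )


# Tabulate the series probability over a range of single-game probabilities.
for p in (0.50, 0.55, 0.60, 0.65, 0.70):
    print(f"p = {p:.2f} -> series win probability = {series_win_probability(p):.3f}")
```

A longer series amplifies a single-game advantage, which is why a series gives more discrimination between teams than a single game does.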



Journal ArticleDOI
TL;DR: This paper was presented at the Annual Meeting of the American Statistical Association on December 29, 1950.
Abstract: * Paper presented at the Annual Meeting of the American Statistical Association on December 29, 1950.

Journal ArticleDOI
TL;DR: This bibliography contains 999 references on nonparametric statistics and related topics, classified into 17 categories ranging from Surveys and Discussions to Tables and Miscellaneous.
Abstract: This bibliography contains 999 references on nonparametric statistics and related topics, classified as follows: (A) Surveys and Discussions (39), (B) Theory (31), (C) Tchebycheff Inequalities (94), (D) Tolerance Sets (21), (E) Goodness of Fit (122), (F) Multisample Problems (53), (G) Parameter Problems (135), (H) Contingency Tables (75), (I) Randomness (109), (J) Correlation and Curve Fitting (96), (K) Comparative Studies (49), (L) Systematic Statistics (127), (M) Scaling (37), (N) Distribution Theory (383), (O) Applications (89), (P) Tables (228), (X) Miscellaneous (28).


Journal ArticleDOI
TL;DR: In this article, the authors compared several residual methods, including the vital statistics method, the forward, reverse, and average survival rate methods, on the basis of a symbolic model representing the population in an age group in terms of migration cohorts.
Abstract: Census net migration data and estimates of net migration obtained by the residual method, representing the difference between total population change and natural change during a period, are compared, and some general problems in the use of the residual method are discussed. Several residual methods—the vital statistics method and the forward, reverse, and average survival rate methods—are described, compared, and evaluated. On the basis of a symbolic model representing the population in an age group in terms of migration cohorts, it is shown how the various survival rate formulas, unlike the vital statistics method, fail to make an accurate allowance for the net migration of persons who die during the migration period, except under very restricted conditions of migration. The maximum theoretical errors in the use of the various survival rate formulas, resulting from the inability of survival rates to measure deaths occurring in an area exactly, and the theoretical errors under different condition...

Journal ArticleDOI
TL;DR: The problem of sampling without replacement from a discrete, finite, uniform population is discussed; minimum variance unbiased estimators of the parameters are obtained and compared with other estimators which have been suggested.
Abstract: The problem discussed is that of sampling without replacement from a discrete, finite, uniform population. One source of this problem is the analysis of serial numbers on manufactured items in order to estimate the total number of items manufactured. Minimum variance unbiased estimators of the parameters are obtained and compared with other estimators which have been suggested. Tests of hypothesis and confidence intervals are also discussed.
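
The serial-number application can be illustrated with the simplest version of the problem. The sketch below assumes serial numbers run from 1 to an unknown N (a known lower bound of 1 is an assumption, not something stated in the abstract) and that k distinct serials are observed without replacement. In that case the standard minimum variance unbiased estimator is m(k + 1)/k − 1, where m is the largest observed serial. The function name and the sample data are hypothetical.

```python
def estimate_population_size(serials):
    """Estimate N from serial numbers drawn without replacement from 1..N.

    Uses the estimator m * (k + 1) / k - 1, with m the sample maximum
    and k the sample size. Assumes numbering starts at 1.
    """
    if not serials:
        raise ValueError("at least one serial number is required")
    if len(set(serials)) != len(serials):
        raise ValueError("serial numbers must be distinct (sampling without replacement)")
    k = len(serials)
    m = max(serials)
    return m * (k + 1) / k - 1


# Hypothetical observed serial numbers.
observed = [19, 40, 42, 60]
estimate = estimate_population_size(observed)
```

The estimator adds to the sample maximum the average gap between observed serials, which is what removes the downward bias of using the maximum alone.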

Journal ArticleDOI
TL;DR: A partial correlation coefficient which is also a multiple correlation coefficient is discussed in this article, and its relationship with other well-known coefficients is explained. Computational methods for computing the estimating equation and the correlation coefficient are suggested.
Abstract: A partial correlation coefficient which is also a multiple correlation coefficient is discussed. Its relationship with other well-known coefficients is explained. Computational methods for computing the estimating equation and the correlation coefficient are suggested. * The writer wishes to thank Professors Harold Hotelling, George E. Nicholson, and John H. Smith for critically reading the manuscript and offering valuable comments. Professor Hotelling indicated the method of computation which he had suggested in an unpublished paper (see note 5). Professor Smith called the writer's attention to some of the earlier references to the subject in the literature. Since the first draft of this paper was written (June, 1951), it has been learned that Professor C. Horace Hamilton, of the North Carolina State College of Agriculture and Engineering, has written an article entitled “Population Pressure and Other Factors Affecting Net Rural-Urban Migration,” in which the coefficient of multiple-partial corr...



Journal ArticleDOI
G. D. Shellard
TL;DR: This article addresses the problem of estimating the product of several random variables.
Abstract: (1952). Estimating the Product of Several Random Variables. Journal of the American Statistical Association: Vol. 47, No. 258, pp. 216-221.

Journal ArticleDOI
TL;DR: For the analysis of fertility trends the author draws upon official data and presents a preview of some of the materials in P. K. Whelpton's forthcoming monograph, Cohort Fertility: Native White Women in the United States.
Abstract: For the analysis of fertility trends the author draws upon official data and presents a preview of some of the materials in P. K. Whelpton's forthcoming monograph: Cohort Fertility: Native White Women in the United States. His discussion of trends in fertility differentials since 1940 is based mainly upon the Census Bureau's periodic releases giving fertility ratios among groups sampled in the Current Population Survey during the past few years. * The substance of this paper was presented before a joint meeting of the American Statistical Association and the Population Association of America in Chicago on December 27, 1950.

Journal ArticleDOI
TL;DR: This article addresses multiple sampling of attributes.
Abstract: (1952). Multiple Sampling of Attributes. Journal of the American Statistical Association: Vol. 47, No. 258, pp. 203-215.


Journal ArticleDOI
TL;DR: In this article, the authors examined the performance of distribution-free tests when the standard normal situation in fact holds good, for if a test is consistent (i.e., if the probability of rejecting a false alternative hypothesis tends to unity with increasing sample size) there must come a point, for any set of alternatives, where the loss of power involved in its use is negligible.
Abstract: Situations arise in which we are unable to make the assumptions necessary for the application of standard theory based on the normal distribution. Most of the distribution-free tests which have been proposed for such situations are based on statistics which are very easy to compute, and this ease of computation goes some way to compensate for any information, available in the sample, which may be ignored. Quite apart from such situations, it is interesting to investigate the performances of distribution-free tests when the standard normal situation in fact holds good, for if a test is consistent (i.e. if the probability of rejecting a false alternative hypothesis tends to unity with increasing sample size) there must come a point, for any set of alternatives, where the loss of power involved in its use is negligible. In this paper two very simple tests are examined in this light. A distribution-free test of serial independence of N (unequal) observations ordered in time, proposed by Moore and Wallis [6], consists in counting the number of positive first differences in the series. On the null hypothesis that the observations came from the same (continuous) population, every ordering of the observations is equally probable, so that the mean value and variance of the statistic are very simply obtained, and its distribution can easily be shown to be asymptotically normal. A lower bound for the power of the test against a general class of alternatives, implying a trend in the observations, was obtained by Mann [5]. This paper considers its power in the particular case where the alternative is a normal regression model with coefficient β and residual variance σ². The loss of power entailed by the use of this test at the 95% level of significance is unimportant when either N ≥ 25, β/(σ√2) ≥ .5, or N > 75, β/(σ√2) ≥ .3. The difference-sign test is easily generalized to the bivariate case for use in testing the correlation between two series.
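
The first difference-sign test described above can be sketched as follows. The abstract says only that the null mean and variance are "very simply obtained"; the values used here, (N − 1)/2 and (N + 1)/12, are the standard results for the count of positive first differences when every ordering is equally probable, and the normal approximation is the asymptotic one the abstract mentions. The function name is illustrative.

```python
import math


def difference_sign_test(series):
    """Count positive first differences and standardize under the null hypothesis.

    Returns (count, z), where z uses the null mean (N - 1) / 2 and
    null variance (N + 1) / 12. Assumes no two adjacent observations are equal.
    """
    n = len(series)
    if n < 2:
        raise ValueError("at least two observations are required")
    positives = sum(1 for a, b in zip(series, series[1:]) if b > a)
    null_mean = (n - 1) / 2.0
    null_variance = (n + 1) / 12.0
    z = (positives - null_mean) / math.sqrt(null_variance)
    return positives, z
```

A large positive z suggests an upward trend and a large negative z a downward trend; for short series the normal approximation may be rough.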
The approximate power of this second test is tabulated below against the alternative hypothesis that the pairs of observations were drawn from a bivariate normal population with non-zero correlation ρ. Much larger sample sizes are required for the power of this test to approach that of the test based on Fisher's transformation of the correlation coefficient. For