Author

William Kruskal

Bio: William Kruskal is an academic researcher from the University of Chicago. The author has contributed to research in topics including Population and Sampling (statistics). The author has an h-index of 30 and has co-authored 68 publications receiving 15,532 citations.


Papers
Journal Article
TL;DR: A test of the hypothesis that C samples are from the same population may be made by ranking the observations from 1 to Σn_i (giving each observation in a group of ties the mean of the ranks tied for), finding the C sums of ranks, and computing a statistic H. Under the stated hypothesis, H is distributed approximately as χ²(C − 1), unless the samples are too small, in which case special approximations or exact tables are provided.
Abstract: Given C samples, with n_i observations in the i-th sample, a test of the hypothesis that the samples are from the same population may be made by ranking the observations from 1 to Σn_i (giving each observation in a group of ties the mean of the ranks tied for), finding the C sums of ranks, and computing a statistic H. Under the stated hypothesis, H is distributed approximately as χ²(C − 1), unless the samples are too small, in which case special approximations or exact tables are provided. One of the most important applications of the test is in detecting differences among the population means.*
* Based in part on research supported by the Office of Naval Research at the Statistical Research Center, University of Chicago.
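The procedure in this abstract (rank the pooled observations, average the ranks within ties, sum the ranks per sample, compute H) can be sketched in a few lines of Python. The abstract does not print the formula for H, so the sketch below assumes the standard textbook form, H = 12 / (N(N + 1)) · Σ R_i² / n_i − 3(N + 1), divided by the usual tie correction 1 − Σ(t³ − t) / (N³ − N). The function names `midranks` and `kruskal_wallis_h` are illustrative only.

```python
from collections import Counter


def midranks(values):
    """Rank values from 1 to N; tied observations get the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of ties.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(samples):
    """Return (H, degrees of freedom) for a list of C samples.

    Assumes the standard textbook formula with tie correction; it is
    undefined (division by zero) when every observation is identical.
    """
    pooled = [x for sample in samples for x in sample]
    n_total = len(pooled)
    ranks = midranks(pooled)

    # Sum of (rank sum)^2 / n_i over the C samples.
    total = 0.0
    start = 0
    for sample in samples:
        rank_sum = sum(ranks[start:start + len(sample)])
        total += rank_sum ** 2 / len(sample)
        start += len(sample)

    h = 12.0 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)

    # Tie correction: t is the size of each group of tied observations.
    tie_sizes = Counter(pooled).values()
    correction = 1 - sum(t ** 3 - t for t in tie_sizes) / (n_total ** 3 - n_total)
    return h / correction, len(samples) - 1


h, df = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(h, df)
```

H would then be compared with a χ² distribution on C − 1 degrees of freedom, subject to the abstract's caveat about small samples.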

9,365 citations

Book
01 Jan 1979
TL;DR: In this article, a number of alternative measures are considered, almost all based upon a probabilistic model for activity to which the cross-classification may typically lead, and only the case in which the population is completely known is considered, so no question of sampling or measurement error appears.
Abstract: When populations are cross-classified with respect to two or more classifications or polytomies, questions often arise about the degree of association existing between the several polytomies. Most of the traditional measures or indices of association are based upon the standard chi-square statistic or on an assumption of underlying joint normality. In this paper a number of alternative measures are considered, almost all based upon a probabilistic model for activity to which the cross-classification may typically lead. Only the case in which the population is completely known is considered, so no question of sampling or measurement error appears. We hope, however, to publish before long some approximate distributions for sample estimators of the measures we propose, and approximate tests of hypotheses. Our major theme is that the measures of association used by an empirical investigator should not be blindly chosen because of tradition and convention only, although these factors may properly be g...

2,672 citations

Journal Article
TL;DR: Ordinally invariant (rank) measures of association for bivariate populations are discussed, with emphasis on the probabilistic and operational interpretations of their population values; the three measures considered at length are the quadrant measure, Kendall's tau, and Spearman's rho.
Abstract: Ordinally invariant, i.e., rank, measures of association for bivariate populations are discussed, with emphasis on the probabilistic and operational interpretations of their population values. The three measures considered at length are the quadrant measure, Kendall's tau, and Spearman's rho. Relationships between these measures are discussed, as are connections between these measures and certain measures of association for cross classifications. Sampling theory is surveyed with special attention to the motivation for sample values of the measures. The historical development of ordinal measures of association is outlined. * This research was supported in part by the Statistics Branch, Office of Naval Research. Reproduction in whole or in part is permitted for any purpose of the United States Government. A large part of the work leading to this paper was done at the Department of Statistics, University of California, Berkeley. I would like to thank the following persons for their suggestions and c...
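As a rough illustration of the three measures named above, the following Python sketch computes common sample versions of each. These are standard textbook definitions rather than formulas quoted from the paper: tau here is the "tau-a" form (concordant minus discordant pairs over all pairs), rho is the Pearson correlation of mid-ranks, and the quadrant measure compares counts in the quadrants defined by the two sample medians. Dropping points that lie on a median line is an assumption of this sketch; conventions differ.

```python
from itertools import combinations
from statistics import median


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / number of pairs; ties count as neither."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        s = (x1 - x2) * (y1 - y2)
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n = len(xs)
    return (concordant - discordant) / (n * (n - 1) / 2)


def midranks(v):
    """Ranks 1..n, with tied values sharing the mean of their ranks."""
    return [sum(u < x for u in v) + (sum(u == x for u in v) + 1) / 2 for x in v]


def spearman_rho(xs, ys):
    """Pearson correlation computed on the mid-ranks of xs and ys."""
    a, b = midranks(xs), midranks(ys)
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / (var_a * var_b) ** 0.5


def quadrant_measure(xs, ys):
    """Share of points in agreeing quadrants minus share in disagreeing ones.

    Quadrants are taken relative to the sample medians; points on a median
    line are ignored (an assumption of this sketch).
    """
    mx, my = median(xs), median(ys)
    signs = [(x - mx) * (y - my) for x, y in zip(xs, ys)]
    pos = sum(s > 0 for s in signs)
    neg = sum(s < 0 for s in signs)
    return (pos - neg) / (pos + neg)


xs = [1, 2, 3, 4, 5]   # made-up illustrative data
ys = [2, 1, 4, 3, 5]
print(kendall_tau_a(xs, ys), spearman_rho(xs, ys), quadrant_measure(xs, ys))
```

All three depend only on the ordering of the observations, which is what "ordinally invariant" refers to: applying any strictly increasing transformation to `xs` or `ys` leaves the results unchanged.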

684 citations

Book Chapter
TL;DR: In this paper, the authors derived large sample normal distributions with their associated standard errors for various measures of association and various methods of sampling and explained how the large sample normality may be used to test hypotheses about the measures and about differences between them, and to construct corresponding confidence intervals.
Abstract: The population measures of association for cross classifications, discussed in the authors' prior publications, have sample analogues that are approximately normally distributed for large samples. (Some qualifications and restrictions are necessary.) These large sample normal distributions, with their associated standard errors, are derived for various measures of association and various methods of sampling. It is explained how the large sample normality may be used to test hypotheses about the measures and about differences between them, and to construct corresponding confidence intervals. Numerical results are given about the adequacy of the large sample normal approximations. In order to facilitate extension of the large sample results to other measures of association, and to other modes of sampling, than those treated here, the basic manipulative tools of large sample theory are explained and illustrated.
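To make the use of large sample normality concrete, here is a minimal Python sketch of the generic recipe: given a sample estimate of a measure and its standard error, form a normal-theory confidence interval and a two-sided z-test. The abstract does not reproduce the paper's standard error formulas, so the standard error is simply an input here, and the numbers in the example are made up for illustration.

```python
from statistics import NormalDist


def normal_interval(estimate, std_error, level=0.95):
    """Large-sample confidence interval: estimate +/- z * standard error."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return estimate - z * std_error, estimate + z * std_error


def z_test(estimate, std_error, null_value=0.0):
    """Two-sided test of H0: measure == null_value, via the normal approximation."""
    z = (estimate - null_value) / std_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical values: a sample measure of 0.40 with standard error 0.10.
low, high = normal_interval(0.40, 0.10)
z, p = z_test(0.40, 0.10)
print(low, high, z, p)
```

A test of the difference between two independent measures would follow the same pattern, with the difference of the estimates as `estimate` and the square root of the summed squared standard errors as `std_error`.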

470 citations

Journal Article
TL;DR: This paper discusses a test of the null hypothesis $F_1 = F_2 = \cdots = F_C$ against alternatives of the form $F_i(x) = F(x - \theta_i)\quad (\text{all } x, i = 1, \cdots, C)$ with the $\theta_i$'s not all equal.
Abstract: Suppose that $C$ independent random samples of sizes $n_1, \cdots, n_C$ are to be drawn from $C$ univariate populations with unknown cumulative distribution functions $F_1, \cdots, F_C$. This paper discusses a test of the null hypothesis $F_1 = F_2 = \cdots = F_C$ against alternatives of the form $F_i(x) = F(x - \theta_i)\quad (\text{all } x, i = 1, \cdots, C)$ with the $\theta_i$'s not all equal, or against alternatives of a much more general sort to be specified in Section 5. The test to be discussed has as its critical region large values of the ordinary $F$-ratio for one-way analysis of variance, computed after the observations have been replaced by their ranks in the $\sum n_i$-fold over-all sample. This use of ranks simplifies the distribution theory, and permits application of the test to cases where the ranks are available but the numerical values of the observations are difficult to obtain. Briefly, then, we shall consider a non-parametric analogue, based on ranks, of one-way analysis of variance. It is shown in Section 4 that, under quite general conditions, the proposed test statistic, $H$, is asymptotically chi-square with $C - 1$ degrees of freedom when the null hypothesis holds. Section 5 derives a necessary and sufficient condition that the natural family of sequences of tests based on large values of $H$ all be consistent against a given alternative. Section 6 derives the variance of $H$ under the null hypothesis, Section 7 derives the maximum value of $H$, and Section 8 gives a difference equation which may be used to obtain exact small-sample distributions under the null hypothesis. These derivations are made on the assumption of continuity for the cumulative distribution functions; Section 9 considers extensions to the possibly discontinuous case.

434 citations


Cited by
Journal Article
TL;DR: A general statistical methodology for the analysis of multivariate categorical data arising from observer reliability studies is presented. Tests for interobserver bias are presented in terms of first-order marginal homogeneity, and measures of interobserver agreement are developed as generalized kappa-type statistics.
Abstract: This paper presents a general statistical methodology for the analysis of multivariate categorical data arising from observer reliability studies. The procedure essentially involves the construction of functions of the observed proportions which are directed at the extent to which the observers agree among themselves and the construction of test statistics for hypotheses involving these functions. Tests for interobserver bias are presented in terms of first-order marginal homogeneity and measures of interobserver agreement are developed as generalized kappa-type statistics. These procedures are illustrated with a clinical diagnosis example from the epidemiological literature.

64,109 citations

Journal Article
Jacob Cohen
TL;DR: When the only useful level of measurement obtainable is nominal scaling, it becomes important to determine the extent to which judges' categorizations are reproducible, i.e., reliable. The procedure suggested is to have two (or more) judges independently categorize a sample of units.
Abstract: CONSIDER Table 1. It represents in its formal characteristics a situation which arises in the clinical-social-personality areas of psychology, where it frequently occurs that the only useful level of measurement obtainable is nominal scaling (Stevens, 1951, pp. 25-26), i.e. placement in a set of k unordered categories. Because the categorizing of the units is a consequence of some complex judgment process performed by a "two-legged meter" (Stevens, 1958), it becomes important to determine the extent to which these judgments are reproducible, i.e., reliable. The procedure which suggests itself is that of having two (or more) judges independently categorize a sample of units and determine the degree, significance, and

34,965 citations

Journal Article
TL;DR: This study describes and validates a new method for metagenomic biomarker discovery by way of class comparison, tests of biological consistency, and effect size estimation, addressing the challenge of finding organisms, genes, or pathways that consistently explain the differences between two or more microbial communities.
Abstract: This study describes and validates a new method for metagenomic biomarker discovery by way of class comparison, tests of biological consistency and effect size estimation. This addresses the challenge of finding organisms, genes, or pathways that consistently explain the differences between two or more microbial communities, which is a central problem to the study of metagenomics. We extensively validate our method on several microbiomes and a convenient online interface for the method is provided at http://huttenhower.sph.harvard.edu/lefse/.

9,057 citations

Journal Article
TL;DR: A discussion of matching, randomization, random sampling, and other methods of controlling extraneous variation is presented in this paper, where the objective is to specify the benefits of randomization in estimating causal effects of treatments.
Abstract: A discussion of matching, randomization, random sampling, and other methods of controlling extraneous variation is presented. The objective is to specify the benefits of randomization in estimating causal effects of treatments. The basic conclusion is that randomization should be employed whenever possible but that the use of carefully controlled nonrandomized data to estimate causal effects is a reasonable and necessary procedure in many cases. Recent psychological and educational literature has included extensive criticism of the use of nonrandomized studies to estimate causal effects of treatments (e.g., Campbell & Erlebacher, 1970). The implication in much of this literature is that only properly randomized experiments can lead to useful estimates of causal effects. If taken as applying to all fields of study, this position is untenable. Since the extensive use of randomized experiments is limited to the last half century, and in fact is not used in much scientific investigation today, one is led to the conclusion that most scientific "truths" have been established without using randomized experiments. In addition, most of us successfully determine the causal effects of many of our everyday actions, even interpersonal behaviors, without the benefit of randomization. Even if the position that causal effects of treatments can only be well established from randomized experiments is taken as applying only to the social sciences in which

8,377 citations