scispace - formally typeset
Author

William G. Cochran

Bio: William G. Cochran is an academic researcher from Harvard University. The author has contributed to research in topics: Sampling (statistics) & Observational study. The author has an h-index of 49 and has co-authored 95 publications receiving 29,542 citations. Previous affiliations of William G. Cochran include Iowa State University & North Carolina State University.


Papers
Journal ArticleDOI
TL;DR: The problem of making a combined estimate has been discussed previously by Cochran (1937) and Yates and Cochran (1938) for agricultural experiments, and by Bliss (1952) for bioassays in different laboratories.
Abstract: When we are trying to make the best estimate of some quantity μ that is available from the research conducted to date, the problem of combining results from different experiments is encountered. The problem is often troublesome, particularly if the individual estimates were made by different workers using different procedures. This paper discusses one of the simpler aspects of the problem, in which there is sufficient uniformity of experimental methods so that the ith experiment provides an estimate xi of μ, and an estimate si of the standard error of xi. The experiments may be, for example, determinations of a physical or astronomical constant by different scientists, or bioassays carried out in different laboratories, or agricultural field experiments laid out in different parts of a region. The quantity xi may be a simple mean of the observations, as in a physical determination, or the difference between the means of two treatments, as in a comparative experiment, or a median lethal dose, or a regression coefficient. The problem of making a combined estimate has been discussed previously by Cochran (1937) and Yates and Cochran (1938) for agricultural experiments, and by Bliss (1952) for bioassays in different laboratories. The last two papers give recommendations for the practical worker. My purposes in treating the subject again are to discuss it in more general terms, to take account of some recent theoretical research, and, I hope, to bring the practical recommendations to the attention of some biologists who are not acquainted with the previous papers. The basic issue with which this paper deals is as follows. The simplest method of combining estimates made in a number of different experiments is to take the arithmetic mean of the estimates. If, however, the experiments vary in size, or appear to be of different precision, the investigator may wonder whether some kind of weighted mean would be more precise.
This paper gives recommendations about the kinds of weighted mean that are appropriate, the situations in which they …
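As a concrete sketch of the simplest precision-weighted combination the paper discusses, each estimate can be weighted by the reciprocal of its squared standard error. The numbers below are illustrative, not taken from the paper:

```python
# Inverse-variance weighted mean of independent estimates x_i with
# standard errors s_i.  Weights w_i = 1/s_i^2; the combined estimate's
# standard error is 1/sqrt(sum of weights).
def weighted_mean(estimates, std_errors):
    weights = [1.0 / s**2 for s in std_errors]
    total = sum(weights)
    mean = sum(w * x for w, x in zip(weights, estimates)) / total
    se = total ** -0.5  # standard error of the combined estimate
    return mean, se

x = [10.2, 9.8, 10.5]   # estimates from three experiments
s = [0.5, 0.2, 1.0]     # their standard errors
mean, se = weighted_mean(x, s)
```

The second experiment, having the smallest standard error, dominates the combination, which is exactly the behaviour the arithmetic mean lacks.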

4,335 citations

Journal ArticleDOI
TL;DR: In this article, the author discusses two kinds of failure to make the best use of χ² tests observed from time to time in reports of biological research, and proposes a number of methods for strengthening or supplementing the most common uses of the ordinary χ² test.
Abstract: Since the χ² tests of goodness of fit and of association in contingency tables are presented in many courses on statistical methods for beginners in the subject, it is not surprising that χ² has become one of the most commonly-used techniques, even by scientists who profess only a smattering of knowledge of statistics. It is also not surprising that the technique is sometimes misused, e.g. by calculating χ² from data that are not frequencies or by errors in counting the number of degrees of freedom. A good catalogue of mistakes of this kind has been given by Lewis and Burke (1). In this paper I want to discuss two kinds of failure to make the best use of χ² tests which I have observed from time to time in reading reports of biological research. The first arises because χ² tests, as has often been pointed out, are not directed against any specific alternative to the null hypothesis. In the computation of χ², the deviations (fi − mi) between observed and expected frequencies are squared, divided by mi in order to equalize the variances (approximately), and added. No attempt is made to detect any particular pattern of deviations (fi − mi) that may hold if the null hypothesis is false. One consequence is that the usual χ² tests are often insensitive, and do not indicate significant results when the null hypothesis is actually false. Some forethought about the kind of alternative hypothesis that is likely to hold may lead to alternative tests that are more powerful and appropriate. Further, when the ordinary χ² test does give a significant result, it does not direct attention to the way in which the null hypothesis disagrees with the data, although the pattern of deviations may be informative and suggestive for future research. The remedy here is to supplement the ordinary test by additional tests that help to reveal the significant type of deviation.
In this paper a number of methods for strengthening or supplementing the most common uses of the ordinary χ² test will be presented and illustrated by numerical examples. The principal devices are as follows:
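The statistic described above is simple to compute directly. A minimal sketch with made-up frequencies (not from the paper):

```python
# Chi-squared goodness-of-fit statistic: the sum of (f_i - m_i)^2 / m_i
# over observed frequencies f and expected frequencies m, as described
# in the abstract.
def chi_squared(observed, expected):
    return sum((f - m) ** 2 / m for f, m in zip(observed, expected))

observed = [30, 14, 34, 45, 27]                 # observed counts
expected = [30.0, 30.0, 30.0, 30.0, 30.0]       # uniform null hypothesis
x2 = chi_squared(observed, expected)            # compare to chi^2 on 4 df
```

Note that, as the paper stresses, this single number says nothing about the *pattern* of the deviations that produced it.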

3,351 citations


Cited by
Journal ArticleDOI
04 Sep 2003-BMJ
TL;DR: A new quantity, I², is developed, which the authors believe gives a better measure of the consistency between trials in a meta-analysis than the standard test for heterogeneity, which is susceptible to the number of trials included in the meta-analysis.
Abstract: Cochrane Reviews have recently started including the quantity I² to help readers assess the consistency of the results of studies in meta-analyses. What does this new quantity mean, and why is assessment of heterogeneity so important to clinical practice? Systematic reviews and meta-analyses can provide convincing and reliable evidence relevant to many aspects of medicine and health care.1 Their value is especially clear when the results of the studies they include show clinically important effects of similar magnitude. However, the conclusions are less clear when the included studies have differing results. In an attempt to establish whether studies are consistent, reports of meta-analyses commonly present a statistical test of heterogeneity. The test seeks to determine whether there are genuine differences underlying the results of the studies (heterogeneity), or whether the variation in findings is compatible with chance alone (homogeneity). However, the test is susceptible to the number of trials included in the meta-analysis. We have developed a new quantity, I², which we believe gives a better measure of the consistency between trials in a meta-analysis. Assessment of the consistency of effects across studies is an essential part of meta-analysis. Unless we know how consistent the results of studies are, we cannot determine the generalisability of the findings of the meta-analysis. Indeed, several hierarchical systems for grading evidence state that the results of studies must be consistent or homogeneous to obtain the highest grading.2–4 Tests for heterogeneity are commonly used to decide on methods for combining studies and for concluding consistency or inconsistency of findings.5 6 But what does the test achieve in practice, and how should the resulting P values be interpreted? A test for heterogeneity examines the null hypothesis that all studies are evaluating the same effect. The usual test statistic …
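A minimal sketch of the heterogeneity test statistic Q (a weighted sum of squared deviations from the pooled estimate) and the I² quantity derived from it, using illustrative effect estimates rather than data from the paper:

```python
# Cochran's Q heterogeneity statistic, computed with inverse-variance
# weights, and I^2 = max(0, (Q - df)/Q) * 100%, the percentage of
# variation across studies attributable to heterogeneity.
def q_statistic(effects, std_errors):
    w = [1.0 / s**2 for s in std_errors]
    pooled = sum(wi * e for wi, e in zip(w, effects)) / sum(w)
    return sum(wi * (e - pooled) ** 2 for wi, e in zip(w, effects))

effects = [0.10, 0.30, 0.35, 0.65]   # study effect estimates (illustrative)
ses = [0.12, 0.15, 0.20, 0.25]       # their standard errors
q = q_statistic(effects, ses)
df = len(effects) - 1                # Q ~ chi^2 on k-1 df under homogeneity
i2 = max(0.0, (q - df) / q) * 100.0
```

Unlike the P value of the Q test, I² does not grow more extreme simply because more trials are included.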

45,105 citations

Journal ArticleDOI
TL;DR: This paper examines eight published reviews, each reporting results from several related trials conducted to evaluate the efficacy of a certain treatment for a specified medical condition, and suggests a simple noniterative procedure for characterizing the distribution of treatment effects in a series of studies.
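The noniterative procedure referred to here is the DerSimonian–Laird method-of-moments estimate of the between-study variance τ². A minimal sketch under the usual inverse-variance weighting, with illustrative values:

```python
# DerSimonian-Laird between-study variance (method of moments):
#   tau^2 = max(0, (Q - (k-1)) / (S1 - S2/S1)),
# where Q is the heterogeneity statistic, S1 the sum of the
# inverse-variance weights and S2 the sum of their squares.
def dersimonian_laird_tau2(effects, std_errors):
    w = [1.0 / s**2 for s in std_errors]
    s1, s2 = sum(w), sum(wi**2 for wi in w)
    pooled = sum(wi * e for wi, e in zip(w, effects)) / s1
    q = sum(wi * (e - pooled) ** 2 for wi, e in zip(w, effects))
    return max(0.0, (q - (len(effects) - 1)) / (s1 - s2 / s1))

tau2 = dersimonian_laird_tau2([0.10, 0.30, 0.35, 0.65],
                              [0.12, 0.15, 0.20, 0.25])
```

Because it needs no iteration, the estimate can be obtained with a hand calculator, which is part of its appeal for published reviews.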

33,234 citations

Book ChapterDOI
TL;DR: The analysis of censored failure times is considered in this paper, where the hazard function is taken to be a function of the explanatory variables and unknown regression coefficients multiplied by an arbitrary and unknown function of time.
Abstract: The analysis of censored failure times is considered. It is assumed that on each individual are available values of one or more explanatory variables. The hazard function (age-specific failure rate) is taken to be a function of the explanatory variables and unknown regression coefficients multiplied by an arbitrary and unknown function of time. A conditional likelihood is obtained, leading to inferences about the unknown regression coefficients. Some generalizations are outlined.
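For the single-covariate case with no tied failure times, the conditional likelihood described above can be written down directly: each failure contributes its relative hazard divided by the total over the individuals still at risk. A minimal sketch with toy data (purely illustrative):

```python
import math

# Log conditional (partial) likelihood for the proportional-hazards
# model h(t | x) = h0(t) * exp(beta * x).  The baseline hazard h0
# cancels out of each factor, so only beta appears.
def log_partial_likelihood(beta, times, events, x):
    order = sorted(range(len(times)), key=lambda i: times[i])
    ll = 0.0
    for idx, i in enumerate(order):
        if events[i]:                   # a failure occurred at times[i]
            risk_set = order[idx:]      # individuals still at risk then
            denom = sum(math.exp(beta * x[j]) for j in risk_set)
            ll += beta * x[i] - math.log(denom)
    return ll

times  = [2.0, 3.0, 5.0, 7.0]
events = [1, 1, 0, 1]                   # 0 marks a censored observation
x      = [0.5, 1.0, 0.0, 1.5]
ll = log_partial_likelihood(0.3, times, events, x)
```

Maximizing this function over beta yields the inference about the regression coefficient; censored individuals contribute only through the risk sets, never as failures.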

28,264 citations

Book
B. J. Winer
01 Jan 1962
TL;DR: In this book, the author introduces the principles of estimation and inference (means and variances) and the design and analysis of single-factor and factorial experiments, including experiments having repeated measures on the same element.
Abstract: CHAPTER 1: Introduction to Design CHAPTER 2: Principles of Estimation and Inference: Means and Variance CHAPTER 3: Design and Analysis of Single-Factor Experiments: Completely Randomized Design CHAPTER 4: Single-Factor Experiments Having Repeated Measures on the Same Element CHAPTER 5: Design and Analysis of Factorial Experiments: Completely-Randomized Design CHAPTER 6: Factorial Experiments: Computational Procedures and Numerical Example CHAPTER 7: Multifactor Experiments Having Repeated Measures on the Same Element CHAPTER 8: Factorial Experiments in which Some of the Interactions are Confounded CHAPTER 9: Latin Squares and Related Designs CHAPTER 10: Analysis of Covariance

25,607 citations

Journal ArticleDOI
TL;DR: It is concluded that H and I², which can usually be calculated for published meta-analyses, are particularly useful summaries of the impact of heterogeneity, and that one or both should be presented in published meta-analyses in preference to the test for heterogeneity.
Abstract: The extent of heterogeneity in a meta-analysis partly determines the difficulty in drawing overall conclusions. This extent may be measured by estimating a between-study variance, but interpretation is then specific to a particular treatment effect metric. A test for the existence of heterogeneity exists, but depends on the number of studies in the meta-analysis. We develop measures of the impact of heterogeneity on a meta-analysis, from mathematical criteria, that are independent of the number of studies and the treatment effect metric. We derive and propose three suitable statistics: H is the square root of the χ² heterogeneity statistic divided by its degrees of freedom; R is the ratio of the standard error of the underlying mean from a random effects meta-analysis to the standard error of a fixed effect meta-analytic estimate, and I² is a transformation of H that describes the proportion of total variation in study estimates that is due to heterogeneity. We discuss interpretation, interval estimates and other properties of these measures and examine them in five example data sets showing different amounts of heterogeneity. We conclude that H and I², which can usually be calculated for published meta-analyses, are particularly useful summaries of the impact of heterogeneity. One or both should be presented in published meta-analyses in preference to the test for heterogeneity.
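Given only the χ² heterogeneity statistic Q and its degrees of freedom, as usually reported in a published meta-analysis, H and I² follow immediately from the definitions in the abstract. A minimal sketch with illustrative values for Q and df:

```python
import math

# H = sqrt(Q / df) and I^2 = (H^2 - 1) / H^2, truncated at zero,
# computed from a reported chi-squared heterogeneity statistic.
def h_and_i2(q, df):
    h = math.sqrt(q / df)
    i2 = max(0.0, (h**2 - 1.0) / h**2)
    return h, i2

h, i2 = h_and_i2(q=28.6, df=9)   # Q and df are illustrative, not from the paper
```

Note that I² equals 1 − df/Q, which is why it can be recovered from published summaries without access to the study-level data.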

25,460 citations