# Papers in "The American Statistician", 1995

••

TL;DR: A detailed, introductory exposition of the Metropolis-Hastings algorithm, a powerful Markov chain method for simulating multivariate distributions, is given along with a simple, intuitive derivation and guidance on implementation.

Abstract: We provide a detailed, introductory exposition of the Metropolis-Hastings algorithm, a powerful Markov chain method to simulate multivariate distributions. A simple, intuitive derivation of this method is given along with guidance on implementation. Also discussed are two applications of the algorithm, one for implementing acceptance-rejection sampling when a blanketing function is not available and the other for implementing the algorithm with block-at-a-time scans. In the latter situation, many different algorithms, including the Gibbs sampler, are shown to be special cases of the Metropolis-Hastings algorithm. The methods are illustrated with examples.

3,886 citations
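As a minimal illustration of the algorithm the paper expounds (not the paper's own code), here is a random-walk Metropolis sketch in Python; the function name and the 1-D standard-normal target are my choices for demonstration:

```python
import math
import random

def metropolis_hastings(log_target, n_samples, x0=0.0, step=1.0, seed=42):
    """Random-walk Metropolis: propose x' = x + N(0, step^2) and accept
    with probability min(1, pi(x') / pi(x)); the symmetric proposal makes
    the general Hastings ratio reduce to a ratio of target densities."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_samples):
        x_prop = x + rng.gauss(0.0, step)
        log_alpha = log_target(x_prop) - log_target(x)
        if log_alpha >= 0 or rng.random() < math.exp(log_alpha):
            x = x_prop
        samples.append(x)  # on rejection, the current state is repeated
    return samples

# Target: unnormalized log-density of N(0, 1); the constant cancels.
samples = metropolis_hastings(lambda x: -0.5 * x * x, 20000)
burned = samples[5000:]  # discard burn-in
mean = sum(burned) / len(burned)
var = sum((s - mean) ** 2 for s in burned) / len(burned)
```

Only the ratio of target densities is needed, so the normalizing constant of the distribution never has to be computed — the property that makes the method so widely applicable.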

••

TL;DR: Evidence is presented that published results of scientific investigations are not a representative sample of the results of all scientific studies, and that practices leading to publication bias have not changed over a period of 30 years.

Abstract: This article presents evidence that published results of scientific investigations are not a representative sample of the results of all scientific studies. Research studies from 11 major journals demonstrate the existence of biases that favor studies observing effects that, on statistical evaluation, have a low probability of erroneously rejecting the so-called null hypothesis (H0). This practice makes the probability of erroneously rejecting H0 different for the reader than for the investigator. It introduces two biases into the interpretation of the scientific literature: one due to multiple repetition of studies with a false hypothesis, and one due to failure to publish smaller and less significant outcomes of tests of a true hypothesis. These practices distort the results of literature surveys and of meta-analyses. The results also indicate that practices leading to publication bias have not changed over a period of 30 years.

425 citations

••

TL;DR: In this article, the authors established an independence result concerning a progressive Type-II censored sample from the uniform distribution, which was used to present a simple and efficient simulation algorithm for generating a progressive Type-II censored sample from any continuous distribution.

Abstract: We establish an independence result concerning a progressive Type-II censored sample from the uniform distribution. This result is used to present a simple and efficient simulation algorithm for generating a progressive Type-II censored sample from any continuous distribution.

420 citations
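A sketch of the simulation algorithm as I understand it (the function name and coding are mine, not the paper's): under censoring scheme R = (R_1, …, R_m), independent uniforms are transformed so that U_1 < … < U_m is a progressive Type-II censored sample from U(0, 1), after which F⁻¹(U_i) yields a sample from any continuous F:

```python
import random

def progressive_type2_uniform(R, rng=random):
    """Generate a progressive Type-II censored sample U_1 < ... < U_m
    from U(0, 1) under censoring scheme R = (R_1, ..., R_m):
      E_i = i + R_m + R_{m-1} + ... + R_{m-i+1}
      V_i = W_i ** (1 / E_i),  W_i ~ U(0, 1) independent
      U_i = 1 - V_m * V_{m-1} * ... * V_{m-i+1}"""
    m = len(R)
    W = [rng.random() for _ in range(m)]
    V = []
    tail = 0
    for i in range(1, m + 1):
        tail += R[m - i]          # accumulates R_m, R_{m-1}, ...
        V.append(W[i - 1] ** (1.0 / (i + tail)))
    U = []
    prod = 1.0
    for i in range(1, m + 1):
        prod *= V[m - i]          # V_m * V_{m-1} * ... * V_{m-i+1}
        U.append(1.0 - prod)
    return U
```

With R all zero this reduces to generating ordinary uniform order statistics, which gives a quick sanity check on an implementation.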

••

TL;DR: In this article, a dynamic graphical display is proposed for uniting partial regression and partial residual plots, which helps students understand multicollinearity and interpret the variance inflation factor.

Abstract: A dynamic graphical display is proposed for uniting partial regression and partial residual plots. This animated display helps students understand multicollinearity and interpret the variance inflation factor. The variance inflation factor is presented as the square of the ratio of t-statistics associated with the partial regression and partial residual plots. Examples using two small data sets illustrate this approach.

326 citations

••

TL;DR: Maximum likelihood provides a powerful and extremely general method for making inferences over a wide range of data/model combinations, yet many mathematical statistics textbooks do not give this important topic coverage commensurate with its place in the world of modern applications.

Abstract: Maximum likelihood (ML) provides a powerful and extremely general method for making inferences over a wide range of data/model combinations. The likelihood function and likelihood ratios have clear intuitive meanings that make it easy for students to grasp the important concepts. Modern computing technology has made it possible to use these methods over a wide range of practical applications. However, many mathematical statistics textbooks, particularly those at the Senior/Masters level, do not give this important topic coverage commensurate with its place in the world of modern applications. Similarly, in nonlinear estimation problems, standard practice (as reflected by procedures available in the popular commercial statistical packages) has been slow to recognize the advantages of likelihood-based confidence regions/intervals over the commonly used “normal-theory” regions/intervals based on the asymptotic distribution of the “Wald statistic.” In this note we outline our approach for presenting, ...

282 citations

••

TL;DR: In this article, the authors present four examples using data from the 1988 National Maternal and Infant Health Survey to demonstrate that weighted and unweighted estimators can be quite different, and to show the underlying causes of such differences.

Abstract: Unweighted estimators using data collected in a sample survey can be badly biased, whereas weighted estimators are approximately unbiased for population parameters. We present four examples using data from the 1988 National Maternal and Infant Health Survey to demonstrate that weighted and unweighted estimators can be quite different, and to show the underlying causes of such differences.

214 citations

••

TL;DR: In this article, the authors present four simple recursive formulas that translate moments to cumulants and vice versa in both the univariate and multivariate situations.

Abstract: It seems difficult to find a formula in the literature that relates moments to cumulants (and vice versa) and is useful in computational work rather than in an algebraic approach. Hence I present four very simple recursive formulas that translate moments to cumulants and vice versa in the univariate and multivariate situations.

181 citations
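The univariate recursions are short enough to sketch directly; this is my rendering of the standard moment–cumulant recursion, not the paper's notation. Writing m_n for the raw moments and k_n for the cumulants:

```python
from math import comb

def cumulants_from_moments(m):
    """Raw moments to cumulants via the recursion
    k_n = m_n - sum_{j=1}^{n-1} C(n-1, j) * m_j * k_{n-j}.
    m[0] must be 1; index i holds the i-th raw moment."""
    n = len(m) - 1
    k = [0.0] * (n + 1)
    for i in range(1, n + 1):
        k[i] = m[i] - sum(comb(i - 1, j) * m[j] * k[i - j]
                          for j in range(1, i))
    return k

def moments_from_cumulants(k):
    """Inverse recursion: m_n = sum_{j=0}^{n-1} C(n-1, j) * m_j * k_{n-j},
    with m_0 = 1. Index 0 of k is unused."""
    n = len(k) - 1
    m = [1.0] * (n + 1)
    for i in range(1, n + 1):
        m[i] = sum(comb(i - 1, j) * m[j] * k[i - j] for j in range(0, i))
    return m
```

A convenient check: for a Poisson(λ) variable every cumulant equals λ, so feeding in the Poisson raw moments should return a constant sequence.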

••

TL;DR: In this paper, the authors introduce a measure of skewness based on the mode of a distribution that maintains the c-ordering and, for many classes of right-skewed distributions, is easily computed as a function of the shape parameter of the family.

Abstract: There are several measures employed to quantify the degree of skewness of a distribution. These have been based on the expectations or medians of the distributions considered. In 1964, van Zwet showed that all the standardized odd central moments of order 3 or higher maintained the convex or c-ordering of distributions that he introduced. This ordering has been widely accepted as appropriate for ordering two distributions in relation to skewness. More recently, measures based on the medians have been shown to honor the convex ordering. The measure of skewness (μ − M)/σ, where μ, σ, and M are, respectively, the expectation, standard deviation, and mode of the distribution, was initially proposed by Karl Pearson. It unfortunately does not maintain the convex ordering. Here we introduce a measure based on the mode of a distribution that maintains the c-ordering. For many classes of right-skewed distributions, it is easily computed as a function of the shape parameter of the family and ...

153 citations

••

TL;DR: In this paper, several techniques for assessing multivariate normality are described and suggestions are offered for their practical application using two previously published sets of real-life data; it is shown that simply testing each of the marginal distributions for univariate normality can lead to a mistaken conclusion.

Abstract: The assumption of multivariate normality (MVN) underlies many important techniques in multivariate analysis. In the past 50 years, over 50 tests of this assumption have been proposed. However, for various reasons, practitioners are often reluctant to address the MVN issue. In this article, several techniques for assessing MVN based on well-known tests for univariate normality are described and suggestions are offered for their practical application. The techniques are illustrated using two previously published sets of real-life data. In one of the examples it is shown that simply testing each of the marginal distributions for univariate normality can lead to a mistaken conclusion.

125 citations

••

TL;DR: An overview of a statistical method called Q-TWiST (Quality-Adjusted Time Without Symptoms and Toxicity) which incorporates quality-of-life considerations into treatment comparisons, which emphasizes treatment comparisons based on threshold utility analyses that highlight trade-offs between different health state durations.

Abstract: The quality of life of patients is an important component of evaluation of therapies. We present an overview of a statistical method called Q-TWiST (Quality-Adjusted Time Without Symptoms and Toxicity) which incorporates quality-of-life considerations into treatment comparisons. Multivariate censored survival data are used to partition the overall survival time into periods of time spent in a set of progressive clinical health states which may differ in quality of life. Mean health state durations, restricted to the follow-up limits of the clinical trial, are derived from the data and combined with value weights to estimate quality-adjusted survival. The methodology emphasizes treatment comparisons based on threshold utility analyses that highlight trade-offs between different health state durations; it is not intended to provide a unique result combining quality and quantity of life. We also describe three recent extensions of the methodology: covariates can be included using proportional hazard...

121 citations

••

TL;DR: In this paper, the iterative proportional fitting algorithm is used to generate an n-dimensional distribution of correlated categorical data with specified margins of dimension 1, 2, …, k < n.

Abstract: Two recent papers have suggested methods for generating correlated binary data with fixed marginal distributions and specified degrees of pairwise association. Emrich and Piedmonte suggested a method based on the existence of a multivariate normal distribution, while Lee suggested methods based on linear programming and Archimedean copulas. In this paper, a simpler method is described using the iterative proportional fitting algorithm for generating an n-dimensional distribution of correlated categorical data with specified margins of dimension 1, 2, …, k < n. An example of generating a distribution for a generalized estimating equations (GEE) model is discussed.
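The core update of iterative proportional fitting is easy to convey in two dimensions (the paper works in n dimensions; this simplified two-way sketch and its function name are mine): alternately rescale rows and columns of a starting table until both margins match their targets.

```python
def ipf_2d(table, row_targets, col_targets, tol=1e-10, max_iter=1000):
    """Iterative proportional fitting for a two-way table: rescale rows,
    then columns, until both margins match the targets. `table` is a list
    of lists of positive starting cell values (e.g. all 1s)."""
    t = [row[:] for row in table]
    n_rows, n_cols = len(t), len(t[0])
    for _ in range(max_iter):
        # Fit row margins.
        for i in range(n_rows):
            s = sum(t[i])
            t[i] = [x * row_targets[i] / s for x in t[i]]
        # Fit column margins.
        col_sums = [sum(t[i][j] for i in range(n_rows)) for j in range(n_cols)]
        for i in range(n_rows):
            t[i] = [t[i][j] * col_targets[j] / col_sums[j] for j in range(n_cols)]
        # Columns are now exact; stop once rows are also within tolerance.
        if all(abs(sum(t[i]) - row_targets[i]) < tol for i in range(n_rows)):
            return t
    return t
```

Starting from a table of ones yields the independence table with the requested margins; starting from a table encoding pairwise association carries that association into the fitted distribution, which is the idea the paper exploits in higher dimensions.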

••

TL;DR: The authors use an extended example to demonstrate that researchers need to use care when examining what laypersons believe, and recommend making the public aware of Simpson's paradox and other counterintuitive results.

Abstract: A number of psychologists and statisticians are interested in how laypersons make judgments in the face of uncertainties, assess the likelihood of coincidences, and draw conclusions from observation. This is an important and exciting area that has produced a number of interesting articles. This article uses an extended example to demonstrate that researchers need to use care when examining what laypersons believe. In particular, it is argued that the data available to laypersons may be very different from the data available to professional researchers. In addition, laypersons unfamiliar with a counterintuitive result, such as Simpson's paradox, may give the wrong interpretation to the pattern in their data. This paper gives two recommendations to researchers and teachers. First, take care to consider what data are available to laypersons. Second, it is important to make the public aware of Simpson's paradox and other counterintuitive results.

••

TL;DR: In this paper, numerical inversion of the characteristic function is used as a tool for obtaining cumulative distribution functions, which is suitable for instructional purposes, particularly in the illustration of the inversion theorems covered in graduate probability courses.

Abstract: We review and discuss numerical inversion of the characteristic function as a tool for obtaining cumulative distribution functions. With the availability of high-speed computing and symbolic computation software, the method is ideally suited for instructional purposes, particularly in the illustration of the inversion theorems covered in graduate probability courses. The method is also available as an alternative to asymptotic approximations, Monte Carlo, or bootstrap techniques when analytic expressions for the distribution function are not available. We illustrate the method with several examples, including one which is concerned with the detection of possible clusters of disease in an epidemiologic study.
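One standard route to such an inversion is the Gil-Pelaez formula, F(x) = 1/2 − (1/π) ∫₀^∞ Im[e^(−itx) φ(t)]/t dt; the sketch below (my own minimal implementation, approximating the integral by a simple rectangle rule on a truncated range, not the paper's code) recovers a CDF from its characteristic function:

```python
import cmath
import math

def cdf_from_cf(phi, x, t_max=40.0, n=4000):
    """Approximate F(x) from the characteristic function phi via the
    Gil-Pelaez inversion integral, truncated at t_max and discretized
    with step t_max / n (assumes phi decays quickly, as for the normal)."""
    h = t_max / n
    total = 0.0
    for k in range(1, n + 1):
        t = k * h
        total += (cmath.exp(-1j * t * x) * phi(t)).imag / t
    return 0.5 - (total * h) / math.pi

def phi_normal(t):
    """Characteristic function of the standard normal: exp(-t^2 / 2)."""
    return cmath.exp(-0.5 * t * t)
```

For heavier-tailed distributions the truncation point and step size would need care; the crude fixed choices here are only adequate for rapidly decaying characteristic functions.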

••

York University

TL;DR: In this article, a dynamic conceptual model for categorical data is described that likens observations to gas molecules in a pressure chamber; frequency corresponds to pressure, and fitting a statistical model by maximum likelihood corresponds to minimizing energy or balancing of forces.

Abstract: A dynamic conceptual model for categorical data is described that likens observations to gas molecules in a pressure chamber. In this physical model frequency corresponds to pressure, and fitting a statistical model by maximum likelihood corresponds to minimizing energy or balancing of forces. The model provides neat explanations of many results for categorical data, extends readily to multiway tables, and provides a rationale for the graphic representation of counts by area or visual density.

••

TL;DR: The Horvitz-Thompson theorem offers a needed integrating perspective for teaching the methods and fundamental concepts of probability sampling, and helps to avoid some common stumbling blocks of beginning students.

Abstract: Courses in sampling often lack a coherent structure because many related sampling designs, estimators, variances, and variance estimators are presented as separate cases. The Horvitz-Thompson theorem offers a needed integrating perspective for teaching the methods and fundamental concepts of probability sampling. Development of basic concepts in sampling via this approach provides the student with tools to solve more complicated problems, and helps to avoid some common stumbling blocks of beginning students. Examples from natural resource sampling are provided to illustrate applications and insight gained from this approach.
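The unifying result is the Horvitz-Thompson estimator of a population total, Σ yᵢ/πᵢ over sampled units, unbiased for any design with known inclusion probabilities πᵢ > 0. The toy population, values, and probabilities below are invented for illustration; the check enumerates every possible sample under independent (Poisson) sampling to verify unbiasedness exactly:

```python
from itertools import product

# Hypothetical toy population: y-values and inclusion probabilities pi_i.
y = [3.0, 7.0, 2.0, 8.0]
pi = [0.5, 0.25, 0.8, 0.4]

def ht_estimate(indicators):
    """Horvitz-Thompson estimator of the population total:
    sum of y_i / pi_i over the sampled units."""
    return sum(y[i] / pi[i] for i, s in enumerate(indicators) if s)

# Enumerate all 2^4 possible samples under independent selection and
# average the estimates, weighted by each sample's probability.
expected = 0.0
for s in product([0, 1], repeat=len(y)):
    p = 1.0
    for i, si in enumerate(s):
        p *= pi[i] if si else (1 - pi[i])
    expected += p * ht_estimate(s)

total = sum(y)  # the true population total the estimator targets
```

The same estimator specializes to the familiar formulas for simple random sampling, stratified sampling, and unequal-probability designs, which is exactly the integrating perspective the article advocates.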

••

Purdue University, Mount Holyoke College, Iowa State University, University of Minnesota

TL;DR: The state of statistics teaching at the university level at the end of the century is discussed, with a focus on the impact of technology, the reform of teaching, and challenges to the internal culture of higher education.

Abstract: Higher education faces an environment of financial constraints, changing customer demands, and loss of public confidence. Technological advances may at last bring widespread change to college teaching. The movement for education reform also urges widespread change. What will be the state of statistics teaching at the university level at the end of the century? This article attempts to imagine plausible futures as stimuli to discussion. It takes the form of provocations by the first author, with responses from the others on three themes: the impact of technology, the reform of teaching, and challenges to the internal culture of higher education.

••

TL;DR: An examination of a data analysis exposition sponsored by the Section on Statistical Graphics of the ASA in 1988 shows that approaches commonly identified with Exploratory Data Analysis are substantially more effective at revealing the underlying patterns in the data.

Abstract: At a data analysis exposition sponsored by the Section on Statistical Graphics of the ASA in 1988, 15 groups of statisticians analyzed the same data about salaries of major league baseball players. By examining what they did, what worked, and what failed, we can begin to learn about the relative strengths and weaknesses of different approaches to analyzing data. The data are rich in difficulties. They require reexpression, contain errors and outliers, and exhibit nonlinear relationships. They thus pose a realistic challenge to the variety of data analysis techniques used. The analysis groups chose a wide range of model-fitting methods, including regression, principal components, factor analysis, time series, and CART. We thus have an effective framework for comparing these approaches so that we can learn more about them. Our examination shows that approaches commonly identified with Exploratory Data Analysis are substantially more effective at revealing the underlying patterns in the data and at ...

••

Astra

TL;DR: In this paper, the folded empirical distribution function curve, or mountain plot, is described and compared to alternative plots; it is a graphical display that complements other visual representations of the data, as illustrated with several examples.

Abstract: Various graphical methods are available for displaying one or more univariate distributions. In this article the folded empirical distribution function curve, or mountain plot, is described and compared to alternative plots. The mountain plot is a graphical display that complements other visual representations of the data, as is illustrated with several examples.

••

TL;DR: A list of over 300 terms commonly used in mathematical statistics is presented with their apparent first occurrences in print, along with some of the more interesting problems encountered in preparing the list.

Abstract: A list of over 300 terms commonly used in mathematical statistics is presented with their apparent first occurrence in print. Some of the more interesting problems encountered in preparing the list are described.

••

IBM

TL;DR: It is shown that for all positive t and for all distributions, the moment bound is tighter than Chernoff's bound.

Abstract: Chernoff's bound on P[X ⩾ t] is used almost universally when a tight bound on tail probabilities is required. In this article we show that for all positive t and for all distributions, the moment bound is tighter than Chernoff's bound. By way of example, we demonstrate that the improvement is often substantial.
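A numerical illustration of the paper's claim (my example, not one from the article): for X ~ Exponential(1), the Chernoff bound inf_s e^(−st) E[e^(sX)] = t·e^(1−t) can be compared with the moment bound min_k E[X^k]/t^k = min_k k!/t^k, obtained by applying Markov's inequality to X^k.

```python
import math

# Tail of X ~ Exponential(1) at t = 5: P[X >= t] = e^{-t}.
t = 5.0
exact = math.exp(-t)

# Chernoff bound: inf_{0<s<1} e^{-st} / (1 - s), minimized at s = 1 - 1/t,
# which gives t * e^{1 - t}.
chernoff = t * math.exp(1 - t)

# Moment bound: min over k of E[X^k] / t^k = k! / t^k for the Exp(1) law.
moment = min(math.factorial(k) / t ** k for k in range(1, 20))
```

Here the moment bound (≈ 0.038) is less than half the Chernoff bound (≈ 0.092), consistent with the paper's general result, though both remain well above the exact tail probability e⁻⁵ ≈ 0.0067.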

••

TL;DR: It is concluded that much unnecessary complication has been introduced into the computer analysis of linear models by the imposition of constraints on parameters, neglect of marginality relations in forming hypotheses, and confusion between the form of noncentrality parameters and hypotheses.

Abstract: The responses to a recent paper by Dallal in this journal are evaluated by reference to the ideas of Frank Yates. It is concluded that much unnecessary complication has been introduced into the computer analysis of linear models by (1) the imposition of constraints on parameters, (2) neglect of marginality relations in forming hypotheses, and (3) confusion between the form of noncentrality parameters and hypotheses.

••

TL;DR: This work provides exact confidence intervals for noncentrality, power, and sample size, particularly one-sided intervals, that help in planning a future study and in evaluating existing studies.

Abstract: The power of a test, the probability of rejecting the null hypothesis in favor of an alternative, may be computed using estimates of one or more distributional parameters. Statisticians frequently fix mean values and calculate power or sample size using a variance estimate from an existing study. Hence computed power becomes a random variable for a fixed sample size. Likewise, the sample size necessary to achieve a fixed power varies randomly. Standard statistical practice requires reporting uncertainty associated with such point estimates. Previous authors studied an asymptotically unbiased method of obtaining confidence intervals for noncentrality and power of the general linear univariate model in this setting. We provide exact confidence intervals for noncentrality, power, and sample size. Such confidence intervals, particularly one-sided intervals, help in planning a future study and in evaluating existing studies.

••

TL;DR: This article describes the author's experience with projects in the context of a large lecture course where projects are carried out in groups.

Abstract: Introductory statistics courses should incorporate unstructured projects where students themselves generate the problems they choose to study, gather their own data, analyze the information using suitable computer software, and communicate their findings in a report. This article describes my experience with projects in the context of a large lecture course where projects are carried out in groups. Special problems that arise with group projects are discussed. Several successful student projects are described.

••

TL;DR: Simulation data are used to test a student's beliefs about the relative probabilities of two sequences obtained by flipping a fair coin to illustrate general issues in using simulations instructionally.

Abstract: Simulation data are used to test a student's beliefs about the relative probabilities of two sequences obtained by flipping a fair coin. The episode is used to illustrate general issues in using simulations instructionally.

••

TL;DR: Examples are given to show that the conditionality principle should not be taken to be of universal validity, and some consequences of these examples are discussed.

Abstract: The famous theorem of Birnbaum, stating that the likelihood principle follows from the conditionality principle together with the sufficiency principle, has caused much discussion among statisticians. Briefly, many writers dislike the consequences of the likelihood principle (among other things, confidence coefficients and levels of tests are dismissed as meaningless), but at the same time they feel that both the conditionality principle and the sufficiency principle are intuitively obvious. In the present article we give examples to show that the conditionality principle should not be taken to be of universal validity, and we discuss some consequences of these examples.

••

TL;DR: In this article, the authors describe methods used to provide an exact test of significance of the hypothesis that all factors are mutually independent of each other in 2³ and 2⁴ contingency tables, and give bounds on the number of tables needed to perform this exact significance test.

Abstract: We describe methods used to provide an exact test of significance of the hypothesis that all factors are mutually independent of each other in 2³ and 2⁴ contingency tables. Several numerical examples demonstrate the advantages of exact tests over approximate significance levels. We give bounds on the number of tables needed to perform this exact significance test. In four or more dimensions the number of tables in this enumeration becomes astronomical with even modest sample sizes. Inverting the characteristic function of the exact distribution has proved useful in these situations.

••

TL;DR: A comparative review of three meta-analytic software packages: DSTAT, TRUE EPISTAT, and FAST*PRO is presented.

Abstract: DSTAT, Version 1.10: Available from Lawrence Erlbaum Associates, Inc., 10 Industrial Ave., Mahwah, NJ 07430-2262; phone: 800-926-6579 TRUE EPISTAT, Version 4.0: Available from Epistat Services, 2011 Cap Rock Circle, Richardson, TX 75080; phone: 214-680-1376; fax: 214-680-1303. FAST*PRO, Version 1.0: Available from Academic Press, Inc., 955 Massachusetts Avenue, Cambridge, MA 02139; phone: 800-321-5068; fax: 800-336-7377. Meta-analysts conduct studies in which the responses are analytic summary measurements, such as risk differences, effect sizes, p values, or z statistics, obtained from a series of independent studies. The motivation for conducting a meta-analysis is to integrate research findings over studies in order to summarize the evidence about treatment efficacy or risk factors. This article presents a comparative review of three meta-analytic software packages: DSTAT, TRUE EPISTAT, and FAST*PRO.

••

TL;DR: In this article, the authors compared the performance of several treatments to control in a single experiment versus separate experiments in terms of Type I error rate and power, and found that if no Dunnett correction is applied in the single-experiment case with relatively few treatments, the distribution of the number of Type I errors is not that different from what it would be in separate experiments with the same number of subjects in each treatment.

Abstract: We contrast comparisons of several treatments to control in a single experiment versus separate experiments in terms of Type I error rate and power. It is shown that if no Dunnett correction is applied in the single experiment case with relatively few treatments, the distribution of the number of Type I errors is not that different from what it would be in separate experiments with the same number of subjects in each treatment. The difference becomes more pronounced with a larger number of treatments. Extreme outcomes (either very few or very many rejections) are more likely when comparisons are made in a single experiment. When the total number of subjects is the same in a single versus separate experiments, power is generally higher in a single experiment even if a Dunnett adjustment is made.

••

TL;DR: Hasse diagrams summarize the structure of mixed models and can be used by a statistical consultant to help design a complicated experiment or to help clarify the data structure to be analyzed.

Abstract: Hasse diagrams summarize the structure of mixed models and can be used by a statistical consultant to help design a complicated experiment or to help clarify the structure of data to be analyzed. They are also useful in the classroom as an aid for obtaining expected mean squares or deciding which denominator should be used in an F statistic.