
Showing papers in "Statistical Science in 2014"


Journal ArticleDOI
TL;DR: A formal representation called "selection diagrams" for expressing knowledge about differences and commonalities between populations of interest is introduced and questions of transportability are reduced to symbolic derivations in the do-calculus.
Abstract: The generalizability of empirical findings to new environments, settings or populations, often called "external validity," is essential in most scientific explorations. This paper treats a particular problem of generalizability, called "transportability," defined as a license to transfer causal effects learned in experimental studies to a new population, in which only observational studies can be conducted. We introduce a formal representation called "selection diagrams" for expressing knowledge about differences and commonalities between populations of interest and, using this representation, we reduce questions of transportability to symbolic derivations in the do-calculus. This reduction yields graph-based procedures for deciding, prior to observing any data, whether causal effects in the target population can be inferred from experimental findings in the study population. When the answer is affirmative, the procedures identify what experimental and observational findings need be obtained from the two populations, and how they can be combined to ensure bias-free transport.
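As a hedged illustration of what such a symbolic derivation can yield (generic notation, not necessarily the paper's), when a set of covariates Z renders the population differences encoded by the selection nodes irrelevant to the outcome given the intervention, the causal effect in the target population Π* is obtained by reweighting experimental strata from the study population:

```latex
% Transport formula under the assumption that Z is "S-admissible",
% i.e., the selection nodes are separated from Y given {X, Z} in the
% selection diagram (illustrative notation, not taken from the paper).
P^{*}\!\left(y \mid \mathrm{do}(x)\right)
  \;=\; \sum_{z} P\!\left(y \mid \mathrm{do}(x), z\right)\, P^{*}(z)
```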

359 citations


Journal ArticleDOI
TL;DR: It is proved that the doubly robust estimation method uniformly improves over existing techniques, achieving both lower variance in value estimation and better policies, and is expected to become common practice in policy evaluation and optimization.
Abstract: We study sequential decision making in environments where rewards are only partially observed, but can be modeled as a function of observed contexts and the chosen action by the decision maker. This setting, known as contextual bandits, encompasses a wide variety of applications such as health care, content recommendation and Internet advertising. A central task is evaluation of a new policy given historic data consisting of contexts, actions and received rewards. The key challenge is that the past data typically does not faithfully represent proportions of actions taken by a new policy. Previous approaches rely either on models of rewards or models of the past policy. The former are plagued by a large bias whereas the latter have a large variance. In this work, we leverage the strengths and overcome the weaknesses of the two approaches by applying the doubly robust estimation technique to the problems of policy evaluation and optimization. We prove that this approach yields accurate value estimates when we have either a good (but not necessarily consistent) model of rewards or a good (but not necessarily consistent) model of past policy. Extensive empirical comparison demonstrates that the doubly robust estimation uniformly improves over existing techniques, achieving both lower variance in value estimation and better policies. As such, we expect the doubly robust approach to become common practice in policy evaluation and optimization.
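A hedged sketch of the estimator's structure follows (illustrative Python with made-up function signatures, not the authors' code): the doubly robust value estimate for a target policy combines a reward model's prediction with an inverse-propensity correction that is applied only when the logged action agrees with the target policy's choice.

```python
import numpy as np

def dr_policy_value(contexts, actions, rewards, logging_probs,
                    target_policy, reward_model):
    """Doubly robust off-policy value estimate (illustrative sketch).

    contexts:      array of shape (n, d) of observed contexts
    actions:       array of shape (n,) of logged actions
    rewards:       array of shape (n,) of observed rewards
    logging_probs: array of shape (n,), probability the logging policy
                   assigned to the logged action (assumed known or estimated)
    target_policy: callable context -> action chosen by the new policy
    reward_model:  callable (context, action) -> estimated reward
    """
    n = len(rewards)
    values = np.empty(n)
    for i in range(n):
        a_new = target_policy(contexts[i])
        # Model-based term: predicted reward under the new policy's action.
        dm = reward_model(contexts[i], a_new)
        # Importance-weighted correction, applied only when the logged
        # action matches the new policy's action.
        correction = 0.0
        if actions[i] == a_new:
            correction = (rewards[i] - reward_model(contexts[i], actions[i])) \
                         / logging_probs[i]
        values[i] = dm + correction
    return values.mean()
```

The correction term vanishes in expectation when the reward model is correct, and the model term removes most of the variance of pure importance weighting when the logging probabilities are correct, which is the double robustness described above.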

225 citations


Journal ArticleDOI
TL;DR: In clinical practice, physicians make a series of treatment decisions over the course of a patient's disease based on his/her baseline and evolving characteristics, and a dynamic treatment regime is a set of sequential decision rules that operationalizes this process.
Abstract: In clinical practice, physicians make a series of treatment decisions over the course of a patient’s disease based on his/her baseline and evolving characteristics. A dynamic treatment regime is a set of sequential decision rules that operationalizes this process. Each rule corresponds to a decision point and dictates the next treatment action based on the accrued information. Using existing data, a key goal is estimating the optimal regime, that, if followed by the patient population, would yield the most favorable outcome on average. Q- and A-learning are two main approaches for this purpose. We provide a detailed account of these methods, study their performance, and illustrate them using data from a depression study.
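A hedged sketch of Q-learning in a toy two-stage setting follows (simulated data and linear working models chosen purely for illustration, not the depression-study analysis): fit a regression for the outcome at the final decision point, form a pseudo-outcome by maximizing over the final treatment, then regress that pseudo-outcome at the earlier decision point.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 500
# Toy data: baseline covariate X1, stage-1 treatment A1, intermediate X2,
# stage-2 treatment A2, final outcome Y (all simulated for illustration).
X1 = rng.normal(size=n)
A1 = rng.integers(0, 2, size=n)
X2 = X1 + A1 + rng.normal(size=n)
A2 = rng.integers(0, 2, size=n)
Y = X2 + A2 * (1 - X2) + rng.normal(size=n)

# Stage 2: regress Y on the accrued history and A2 (with interaction).
H2 = np.column_stack([X1, A1, X2, A2, A2 * X2])
q2 = LinearRegression().fit(H2, Y)

def q2_pred(a2):
    H = np.column_stack([X1, A1, X2, np.full(n, a2), a2 * X2])
    return q2.predict(H)

# Pseudo-outcome: best achievable predicted outcome at stage 2.
V2 = np.maximum(q2_pred(0), q2_pred(1))

# Stage 1: regress the pseudo-outcome on baseline information and A1.
H1 = np.column_stack([X1, A1, A1 * X1])
q1 = LinearRegression().fit(H1, V2)

def q1_pred(a1):
    H = np.column_stack([X1, np.full(n, a1), a1 * X1])
    return q1.predict(H)

# Estimated optimal rules: choose the action maximizing each fitted Q-function.
opt_stage1 = (q1_pred(1) > q1_pred(0)).astype(int)
opt_stage2 = (q2_pred(1) > q2_pred(0)).astype(int)
```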

167 citations


Journal ArticleDOI
TL;DR: Structural nested models (SNMs) and the associated method of G-estimation were first proposed by James Robins over two decades ago as approaches to modeling and estimating the joint effects of a sequence of treatments or exposures as discussed by the authors.
Abstract: Structural nested models (SNMs) and the associated method of G-estimation were first proposed by James Robins over two decades ago as approaches to modeling and estimating the joint effects of a sequence of treatments or exposures. The models and estimation methods have since been extended to dealing with a broader series of problems, and have considerable advantages over the other methods developed for estimating such joint effects. Despite these advantages, the application of these methods in applied research has been relatively infrequent; we view this as unfortunate. To remedy this, we provide an overview of the models and estimation methods as developed, primarily by Robins, over the years. We provide insight into their advantages over other methods, and consider some possible reasons for failure of the methods to be more broadly adopted, as well as possible remedies. Finally, we consider several extensions of the standard models and estimation methods.

120 citations


Journal ArticleDOI
TL;DR: In this paper, the authors use causal diagrams to distinguish among three causal mechanisms that give rise to interference: direct interference, interference by contagion (in which one individual's outcome may affect the outcomes of other individuals with whom he comes into contact) and allocational interference.
Abstract: The term "interference" has been used to describe any setting in which one subject's exposure may affect another subject's outcome. We use causal diagrams to distinguish among three causal mechanisms that give rise to interference. The first causal mechanism by which interference can operate is a direct causal effect of one individual's treatment on another individual's outcome; we call this direct interference. Interference by contagion is present when one individual's outcome may affect the outcomes of other individuals with whom he comes into contact. Then giving treatment to the first individual could have an indirect effect on others through the treated individual's outcome. The third pathway by which interference may operate is allocational interference. Treatment in this case allocates individuals to groups; through interactions within a group, individuals may affect one another's outcomes in any number of ways. In many settings, more than one type of interference will be present simultaneously. The causal effects of interest differ according to which types of interference are present, as do the conditions under which causal effects are identifiable. Using causal diagrams for interference, we describe these differences, give criteria for the identification of important causal effects, and discuss applications to infectious diseases.

107 citations


Journal ArticleDOI
TL;DR: A Bayesian method for producing probabilistic population projections for most countries that the United Nations could use, which has at its core Bayesian hierarchical models for the total fertility rate and life expectancy at birth.
Abstract: The United Nations regularly publishes projections of the populations of all the world's countries broken down by age and sex. These projections are the de facto standard and are widely used by international organizations, governments and researchers. Like almost all other population projections, they are produced using the standard deterministic cohort-component projection method and do not yield statements of uncertainty. We describe a Bayesian method for producing probabilistic population projections for most countries that the United Nations could use. It has at its core Bayesian hierarchical models for the total fertility rate and life expectancy at birth. We illustrate the method and show how it can be extended to address concerns about the UN's current assumptions about the long-term distribution of fertility. The method is implemented in the R packages bayesTFR, bayesLife, bayesPop and bayesDem.
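As a hedged toy illustration of the probabilistic, rather than deterministic, output such projections provide (a generic random walk with drift stands in for draws from the Bayesian hierarchical models implemented in bayesTFR and bayesLife), one can simulate many trajectories of an indicator and summarize them with projection intervals:

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_trajectories(start, drift, sd, horizon, n_sims):
    """Simulate indicator trajectories under a random walk with drift
    (a stand-in for posterior draws from a Bayesian hierarchical model)."""
    steps = drift + sd * rng.standard_normal((n_sims, horizon))
    return start + np.cumsum(steps, axis=1)

# Toy example: life expectancy at birth starting at 70 years,
# projected 50 years ahead (numbers are purely illustrative).
traj = simulate_trajectories(start=70.0, drift=0.15, sd=0.3,
                             horizon=50, n_sims=10_000)

# Probabilistic projection: median and 80% interval at each horizon,
# instead of a single deterministic trajectory.
median = np.median(traj, axis=0)
lower, upper = np.percentile(traj, [10, 90], axis=0)
print(f"Year 50: median {median[-1]:.1f}, 80% PI "
      f"[{lower[-1]:.1f}, {upper[-1]:.1f}]")
```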

95 citations


Journal ArticleDOI
TL;DR: The authors trace the history of the two-piece normal distribution from its origin in the posthumous Kollektivmasslehre (1897) of Gustav Theodor Fechner to its rediscoveries and generalisations.
Abstract: This paper traces the history of the two-piece normal distribution from its origin in the posthumous Kollektivmasslehre (1897) of Gustav Theodor Fechner to its rediscoveries and generalisations. The denial of Fechner’s originality by Karl Pearson, reiterated a century later by Oscar Sheynin, is shown to be without foundation.

91 citations


Journal ArticleDOI
TL;DR: In this article, a strong, finite-sample version of Bell's inequality is proved, showing that quantum theory is incompatible with the conjunction of locality, realism and freedom, and it is argued that Bell's theorem should lead us to relinquish not locality, but realism.
Abstract: Bell's [Physics 1 (1964) 195–200] theorem is popularly supposed to establish the nonlocality of quantum physics. Violation of Bell's inequality in experiments such as that of Aspect, Dalibard and Roger [Phys. Rev. Lett. 49 (1982) 1804–1807] provides empirical proof of nonlocality in the real world. This paper reviews recent work on Bell's theorem, linking it to issues in causality as understood by statisticians. The paper starts with a proof of a strong, finite sample, version of Bell's inequality and thereby also of Bell's theorem, which states that quantum theory is incompatible with the conjunction of three formerly uncontroversial physical principles, here referred to as locality, realism and freedom. Locality is the principle that the direction of causality matches the direction of time, and that causal influences need time to propagate spatially. Realism and freedom are directly connected to statistical thinking on causality: they relate to counterfactual reasoning, and to randomisation, respectively. Experimental loopholes in state-of-the-art Bell type experiments are related to statistical issues of post-selection in observational studies, and the missing at random assumption. They can be avoided by properly matching the statistical analysis to the actual experimental design, instead of by making untestable assumptions of independence between observed and unobserved variables. Methodological and statistical issues in the design of quantum Randi challenges (QRC) are discussed. The paper argues that Bell's theorem (and its experimental confirmation) should lead us to relinquish not locality, but realism.
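As a hedged reminder of the kind of inequality at issue (the CHSH form commonly used in the experiments cited, not the paper's own finite-sample statement), locality, realism and freedom together imply a bound on the correlations between measurement settings a, a' on one side and b, b' on the other:

```latex
% CHSH form of Bell's inequality: under locality, realism and freedom,
% the correlations E(\cdot,\cdot) of \pm 1-valued outcomes satisfy
\bigl| E(a,b) + E(a,b') + E(a',b) - E(a',b') \bigr| \;\le\; 2 ,
% whereas quantum mechanics predicts values up to 2\sqrt{2}
% for suitably chosen settings.
```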

80 citations


Journal ArticleDOI
TL;DR: In this article, the authors review, within a general framework, the wide range of testing strategies that have been proposed for assessing association between a group of rare variants and a trait, and compare their power across a broad set of genetic models.
Abstract: In the search for genetic factors that are associated with complex heritable human traits, considerable attention is now being focused on rare variants that individually have small effects. In response, numerous recent papers have proposed testing strategies to assess association between a group of rare variants and a trait, with competing claims about the performance of various tests. The power of a given test in fact depends on the nature of any association and on the rareness of the variants in question. We review such tests within a general framework that covers a wide range of genetic models and types of data. We study the performance of specific tests through exact or asymptotic power formulas and through novel simulation studies of over 10,000 different models. The tests considered are also applied to real sequence data from the 1000 Genomes project and provided by the GAW17. We recommend a testing strategy, but our results show that power to detect association in plausible genetic scenarios is low for studies of medium size unless a high proportion of the chosen variants are causal. Consequently, considerable attention must be given to relevant biological information that can guide the selection of variants for testing.
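As a hedged sketch of one of the simplest strategies in this class (a burden-style test; the function and variable names are illustrative and not tied to the paper's notation), rare variants in a gene are collapsed into a single per-subject score that is then tested for association with the trait:

```python
import numpy as np
import statsmodels.api as sm

def burden_test(genotypes, trait, covariates=None):
    """Simple burden test: collapse rare-variant genotypes (0/1/2 counts)
    into a per-subject burden score and test it in a linear model.

    genotypes: (n_subjects, n_variants) array of minor-allele counts
    trait:     (n_subjects,) quantitative trait
    """
    burden = genotypes.sum(axis=1)            # total rare-allele count
    X = burden[:, None]
    if covariates is not None:
        X = np.column_stack([X, covariates])
    X = sm.add_constant(X)
    fit = sm.OLS(trait, X).fit()
    # coefficient and p-value for the burden score (first non-intercept column)
    return fit.params[1], fit.pvalues[1]

# Toy usage with simulated data: 30 rare variants, 5 of them causal
rng = np.random.default_rng(2)
G = rng.binomial(2, 0.01, size=(1000, 30))
y = 0.5 * G[:, :5].sum(axis=1) + rng.normal(size=1000)
beta, pval = burden_test(G, y)
```

Such a test has good power when most included variants are causal and act in the same direction, and poor power otherwise, which is one reason the review above compares it against variance-component-style alternatives.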

75 citations


ReportDOI
TL;DR: In this article, the authors review recent work in the statistics literature on instrumental variables methods from an econometrics perspective; by providing context to the current applications, a better understanding of the applicability of these methods may arise.
Abstract: I review recent work in the statistics literature on instrumental variables methods from an econometrics perspective. I discuss some of the older, economic, applications including supply and demand models and relate them to the recent applications in settings of randomized experiments with noncompliance. I discuss the assumptions underlying instrumental variables methods and in what settings these may be plausible. By providing context to the current applications, a better understanding of the applicability of these methods may arise.
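As a hedged sketch of the simplest case arising in the randomized-experiments-with-noncompliance setting (binary instrument and binary treatment; under the usual IV assumptions the estimand is the local average treatment effect for compliers), the Wald estimator takes the ratio of two intention-to-treat contrasts:

```python
import numpy as np

def wald_iv_estimate(z, d, y):
    """Wald/IV estimate with a binary instrument z, binary treatment d,
    and outcome y: ratio of the instrument's effect on the outcome to
    its effect on treatment uptake (illustrative sketch)."""
    z, d, y = map(np.asarray, (z, d, y))
    itt_y = y[z == 1].mean() - y[z == 0].mean()   # effect of Z on Y
    itt_d = d[z == 1].mean() - d[z == 0].mean()   # effect of Z on D (first stage)
    return itt_y / itt_d

# Toy usage: randomized encouragement z, imperfect compliance d
rng = np.random.default_rng(3)
n = 5000
z = rng.integers(0, 2, size=n)
complier = rng.random(n) < 0.6
d = np.where(complier, z, rng.integers(0, 2, size=n))   # noncompliers ignore z
y = 2.0 * d + rng.normal(size=n)
print(wald_iv_estimate(z, d, y))   # close to 2, the effect among compliers
```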

74 citations


Journal ArticleDOI
TL;DR: Galform, as mentioned in this paper, is a state-of-the-art model of galaxy formation; the authors analyse its input-parameter uncertainty using Bayesian emulation within an iterative history matching strategy, in what represents the most detailed uncertainty analysis of a galaxy formation simulation yet performed.
Abstract: Cosmologists at the Institute of Computational Cosmology, Durham University, have developed a state of the art model of galaxy formation known as Galform, intended to contribute to our understanding of the formation, growth and subsequent evolution of galaxies in the presence of dark matter. Galform requires the specification of many input parameters and takes a significant time to complete one simulation, making comparison between the model’s output and real observations of the Universe extremely challenging. This paper concerns the analysis of this problem using Bayesian emulation within an iterative history matching strategy, and represents the most detailed uncertainty analysis of a galaxy formation simulation yet performed.
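As a hedged sketch of the core quantity in history matching (standard in this literature, though the notation here is generic rather than the paper's), each candidate input setting x is judged by an implausibility measure comparing the emulator's prediction of the simulator output f(x) with the observation z, allowing for emulator uncertainty, model discrepancy and observation error; settings with large I(x) are ruled out in each wave:

```latex
% Implausibility measure used to discard input settings x
% (generic history-matching notation; the three variances are emulator
% uncertainty, model discrepancy and observation error, assumed additive here).
I(x) \;=\; \frac{\bigl| z - \mathrm{E}[f(x)] \bigr|}
               {\sqrt{\mathrm{Var}[f(x)] + \mathrm{Var}[\epsilon_{\mathrm{md}}] + \mathrm{Var}[e]}}
```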

Journal ArticleDOI
TL;DR: Several sensitivity analysis techniques are described for the infectiousness effect which, in a vaccine trial, captures the effect of one person's vaccination on protecting a second person from infection even if the first is infected.
Abstract: Causal inference with interference is a rapidly growing area. The literature has begun to relax the "no-interference" assumption that the treatment received by one individual does not affect the outcomes of other individuals. In this paper we briefly review the literature on causal inference in the presence of interference when treatments have been randomized. We then consider settings in which causal effects in the presence of interference are not identified, either because randomization alone does not suffice for identification or because treatment is not randomized and there may be unmeasured confounders of the treatment–outcome relationship. We develop sensitivity analysis techniques for these settings. We describe several sensitivity analysis techniques for the infectiousness effect which, in a vaccine trial, captures the effect of one person's vaccination on protecting a second person from infection even if the first is infected. We also develop two sensitivity analysis techniques for causal effects under interference in the presence of unmeasured confounding which generalize analogous techniques when interference is absent. These two techniques for unmeasured confounding are compared and contrasted.

Journal ArticleDOI
TL;DR: Empirical Bayes methods use the data from parallel experiments, for instance, observations Xk ∼ N(Θk, 1) for k = 1, 2, ..., N, to estimate the conditional distributions Θk|Xk as discussed by the authors.
Abstract: Empirical Bayes methods use the data from parallel experiments, for instance, observations Xk ∼ N(Θk, 1) for k = 1, 2, . . . , N, to estimate the conditional distributions Θk|Xk. There are two main estimation strategies: modeling on the θ space, called "g-modeling" here, and modeling on the x space, called "f-modeling." The two approaches are described and compared. A series of computational formulas are developed to assess their frequentist accuracy. Several examples, both contrived and genuine, show the strengths and limitations of the two strategies.
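As a hedged illustration of the f-modeling side (using Tweedie's formula for the normal model Xk ∼ N(Θk, 1), which recovers E[Θk | Xk] from the marginal density of the X's; the kernel smoothing choice below is ad hoc, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(4)

# Simulate parallel experiments: theta_k from a sparse prior, X_k ~ N(theta_k, 1)
N = 5000
theta = rng.choice([0.0, 3.0], size=N, p=[0.9, 0.1])
x = theta + rng.standard_normal(N)

# f-modeling: estimate the marginal density f(x) (here with a simple Gaussian
# kernel density estimate) and apply Tweedie's formula for the N(theta, 1) model:
#   E[theta | x] = x + d/dx log f(x).
def kde(x_grid, data, bandwidth=0.3):
    diffs = (x_grid[:, None] - data[None, :]) / bandwidth
    return np.exp(-0.5 * diffs**2).mean(axis=1) / (bandwidth * np.sqrt(2 * np.pi))

grid = np.linspace(x.min() - 1, x.max() + 1, 400)
f_hat = kde(grid, x)
score = np.gradient(np.log(f_hat), grid)   # numerical d/dx log f(x)
posterior_mean = grid + score              # Tweedie's formula on the grid

# Look up the estimated E[theta | x] for each observation
est = np.interp(x, grid, posterior_mean)
```

A g-modeling approach would instead posit a parametric family for the distribution of the Θk and fit it by maximum likelihood, which is the contrast the paper examines.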

Journal ArticleDOI
TL;DR: This paper considers conducting inference about the effect of a treatment (or exposure) on an outcome of interest when that effect is only partially identifiable; bounding and sensitivity-analysis approaches are considered in various settings, including assessing principal strata effects, direct and indirect effects and effects of time-varying exposures.
Abstract: This paper considers conducting inference about the effect of a treatment (or exposure) on an outcome of interest. In the ideal setting where treatment is assigned randomly, under certain assumptions the treatment effect is identifiable from the observable data and inference is straightforward. However, in other settings such as observational studies or randomized trials with noncompliance, the treatment effect is no longer identifiable without relying on untestable assumptions. Nonetheless, the observable data often do provide some information about the effect of treatment, that is, the parameter of interest is partially identifiable. Two approaches are often employed in this setting: (i) bounds are derived for the treatment effect under minimal assumptions, or (ii) additional untestable assumptions are invoked that render the treatment effect identifiable and then sensitivity analysis is conducted to assess how inference about the treatment effect changes as the untestable assumptions are varied. Approaches (i) and (ii) are considered in various settings, including assessing principal strata effects, direct and indirect effects and effects of time-varying exposures. Methods for drawing formal inference about partially identified parameters are also discussed.
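As a hedged example of approach (i) (the classical no-assumption bounds for a bounded outcome, stated in generic notation rather than the paper's), suppose Y ∈ [0, 1], A is a binary treatment with P(A = 1) = p₁ and P(A = 0) = p₀, and μₐ = E[Y | A = a]. Because each unobserved potential outcome can only be bounded by 0 and 1, the average treatment effect is known only up to an interval of width one:

```latex
% No-assumption ("worst-case") bounds on the average treatment effect
% for an outcome Y \in [0,1]; the interval always has width p_0 + p_1 = 1.
\mu_1 p_1 - \mu_0 p_0 - p_1
\;\le\;
\mathrm{E}[Y(1)] - \mathrm{E}[Y(0)]
\;\le\;
\mu_1 p_1 - \mu_0 p_0 + p_0 .
```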

Journal ArticleDOI
TL;DR: In this article, the utility to computational Bayesian analyses of a particular family of recursive marginal likelihood estimators, characterized by the (equivalent) algorithms known as "biased sampling" or "reverse logistic regression" in the statistics literature and "the density of states" in physics, is investigated.
Abstract: We investigate the utility to computational Bayesian analyses of a particular family of recursive marginal likelihood estimators characterized by the (equivalent) algorithms known as "biased sampling" or "reverse logistic regression" in the statistics literature and "the density of states" in physics. Through a pair of numerical examples (including mixture modeling of the well-known galaxy dataset) we highlight the remarkable diversity of sampling schemes amenable to such recursive normalization, as well as the notable efficiency of the resulting pseudo-mixture distributions for gauging prior-sensitivity in the Bayesian model selection context. Our key theoretical contributions are to introduce a novel heuristic ("thermodynamic integration via importance sampling") for qualifying the role of the bridging sequence in this procedure, and to reveal various connections between these recursive estimators and the nested sampling technique.

Journal ArticleDOI
TL;DR: In this article, the authors explore the limitations of the local average treatment effect (LATE) in the context of epidemiologic and public health research, conclude that it is not their primary choice of causal estimand, question the plausibility of the monotonicity condition, and propose alternatives that refocus on the global average treatment effect.
Abstract: We appreciated Imbens' summary and reflections on the state of instrumental variable (IV) methods from an econometrician's perspective. His review was much needed as it clarified several issues that have been historically a source of confusion when individuals from different disciplines discussed IV methods. Among the many topics covered by Imbens, we would like to focus on the common choice of the local average treatment effect (LATE) over the "global" average treatment effect (ATE) in IV analyses of epidemiologic data. As Imbens acknowledges, this choice of the LATE as an estimand has been contentious (Angrist, Imbens and Rubin, 1996; Robins and Greenland, 1996; Deaton, 2010; Imbens, 2010; Pearl, 2011). Several authors have questioned the usefulness of the LATE for informing clinical practice and policy decisions, because it only pertains to an unknown subset of the population of interest: the so-called "compliers". To make things worse, many studies do not even report the expected proportion of compliers in the study population (Swanson and Hernan, 2013). Other authors have wondered whether the LATE is advocated for simply because of the relatively weaker assumptions required for its identification, analogous to the drunk who stays close to the lamp post and declares whatever he finds under its light is what he was looking for all along (Deaton, 2010). Here we explore the limitations of the LATE in the context of epidemiologic and public health research. First we discuss the relevance of LATE as an effect measure and conclude that it is not our primary choice. Second, we discuss the tenability of the monotonicity condition and conclude that this assumption is not a plausible one in many common settings. Finally, we propose further alternatives to the LATE, beyond those discussed by Imbens, that refocus on the global ATE in the population of interest.

Journal ArticleDOI
TL;DR: Connections of a wide class of calibration estimators, constructed based on generalized empirical likelihood, to many existing estimators in biostatistics, econometrics and survey sampling are provided and simulation studies are performed to show that the finite sample properties of calibrated estimators conform well with the theoretical results being studied.
Abstract: In the presence of a missing response, reweighting the complete case subsample by the inverse of nonmissing probability is both intuitive and easy to implement. When the population totals of some auxiliary variables are known and when the inclusion probabilities are known by design, survey statisticians have developed calibration methods for improving efficiencies of the inverse probability weighting estimators and the methods can be applied to missing data analysis. Model-based calibration has been proposed in the survey sampling literature, where multidimensional auxiliary variables are first summarized into a predictor function from a working regression model. Usually, one working model is being proposed for each parameter of interest and results in different sets of calibration weights for estimating different parameters. This paper considers calibration using multiple working regression models for estimating a single or multiple parameters. Contrary to a common belief that overfitting hurts efficiency, we present three rather unexpected results. First, when the missing probability is correctly specified and multiple working regression models for the conditional mean are posited, calibration enjoys an oracle property: the same semiparametric efficiency bound is attained as if the true outcome model is known in advance. Second, when the missing data mechanism is misspecified, calibration can still be a consistent estimator when any one of the outcome regression models is correctly specified. Third, a common set of calibration weights can be used to improve efficiency in estimating multiple parameters of interest and can simultaneously attain semiparametric efficiency bounds for all parameters of interest. We provide connections of a wide class of calibration estimators, constructed based on generalized empirical likelihood, to many existing estimators in biostatistics, econometrics and survey sampling and perform simulation studies to show that the finite sample properties of calibration estimators conform well with the theoretical results being studied.

Journal ArticleDOI
TL;DR: In the early morning hours of June 1, 2009, during a flight from Rio de Janeiro to Paris, Air France Flight AF 447 disappeared during stormy weather over a remote part of the Atlantic carrying 228 passengers and crew to their deaths.
Abstract: In the early morning hours of June 1, 2009, during a flight from Rio de Janeiro to Paris, Air France Flight AF 447 disappeared during stormy weather over a remote part of the Atlantic carrying 228 passengers and crew to their deaths. After two years of unsuccessful search, the authors were asked by the French Bureau d’Enquetes et d’Analyses pour la securite de l’aviation to develop a probability distribution for the location of the wreckage that accounted for all information about the crash location as well as for previous search efforts. We used a Bayesian procedure developed for search planning to produce the posterior target location distribution. This distribution was used to guide the search in the third year, and the wreckage was found with one week of undersea search. In this paper we discuss why Bayesian analysis is ideally suited to solving this problem, review previous non-Bayesian efforts, and describe the methodology used to produce the posterior probability distribution for the location of the wreck.
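As a hedged sketch of the core Bayesian search-planning update (a generic grid version; the cell and detection probabilities below are illustrative, not the AF 447 numbers), an unsuccessful search of some cells shifts posterior probability toward the unsearched or poorly searched cells:

```python
import numpy as np

def update_after_failed_search(prior, detection_prob):
    """Posterior probability that the wreck is in each grid cell, given an
    unsuccessful search. detection_prob[i] is the probability the search
    would have detected the wreck had it been in cell i (0 if unsearched)."""
    unnormalized = prior * (1.0 - detection_prob)
    return unnormalized / unnormalized.sum()

# Toy usage: 5 cells, wreck equally likely a priori; cells 0-2 were searched
prior = np.full(5, 0.2)
detection = np.array([0.9, 0.9, 0.5, 0.0, 0.0])
posterior = update_after_failed_search(prior, detection)
print(posterior)   # probability mass shifts toward cells 3 and 4
```

Iterating this update over successive unsuccessful search efforts is how prior search information is folded into the posterior target location distribution described above.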

Journal ArticleDOI
TL;DR: A Bayesian model is presented that systematically combines disparate data to make country-, region- and global-level estimates of time trends in important health indicators, illustrated here with blood pressure; high blood pressure is the leading risk factor for cardiovascular disease, the leading cause of death worldwide.
Abstract: Improving health worldwide will require rigorous quantification of population-level trends in health status. However, global-level surveys are not available, forcing researchers to rely on fragmentary country-specific data of varying quality. We present a Bayesian model that systematically combines disparate data to make country-, region- and global-level estimates of time trends in important health indicators. The model allows for time and age nonlinearity, and it borrows strength in time, age, covariates, and within and across regional country clusters to make estimates where data are sparse. The Bayesian approach allows us to account for uncertainty from the various aspects of missingness as well as sampling and parameter uncertainty. MCMC sampling allows for inference in a high-dimensional, constrained parameter space, while providing posterior draws that allow straightforward inference on the wide variety of functionals of interest. Here we use blood pressure as an example health metric. High blood pressure is the leading risk factor for cardiovascular disease, the leading cause of death worldwide. The results highlight a risk transition, with decreasing blood pressure in high-income regions and increasing levels in many lower-income regions.

Journal ArticleDOI
Deborah G. Mayo
TL;DR: In this article, the authors provide a new clarification and critique of Birnbaum's argument and show how data may violate the strong likelihood principle while holding both the Weak Conditionality Principle (WCP) and the SP.
Abstract: An essential component of inference based on familiar frequentist notions, such as p-values, significance and confidence levels, is the relevant sampling distribution. This feature results in violations of a principle known as the strong likelihood principle (SLP), the focus of this paper. In particular, if outcomes x* and y* from experiments E1 and E2 (both with unknown parameter θ) have different probability models f1(.), f2(.), then even though f1(x*; θ) = cf2(y*; θ) for all θ, outcomes x* and y* may have different implications for an inference about θ. Although such violations stem from considering outcomes other than the one observed, we argue this does not require us to consider experiments other than the one performed to produce the data. David Cox [Ann. Math. Statist. 29 (1958) 357–372] proposes the Weak Conditionality Principle (WCP) to justify restricting the space of relevant repetitions. The WCP says that once it is known which Ei produced the measurement, the assessment should be in terms of the properties of Ei. The surprising upshot of Allan Birnbaum's [J. Amer. Statist. Assoc. 57 (1962) 269–306] argument is that the SLP appears to follow from applying the WCP in the case of mixtures, and so uncontroversial a principle as sufficiency (SP). But this would preclude the use of sampling distributions. The goal of this article is to provide a new clarification and critique of Birnbaum's argument. Although his argument purports that [(WCP and SP) entails SLP], we show how data may violate the SLP while holding both the WCP and SP. Such cases also refute [WCP entails SLP].

Journal ArticleDOI
TL;DR: In this article, the authors consider the Bayesian analysis of a few complex, high-dimensional models and show that intuitive priors, which are not tailored to the fine details of the model and the estimated parameters, produce estimators which perform poorly in situations in which good, simple frequentist estimators exist.
Abstract: We consider the Bayesian analysis of a few complex, high-dimensional models and show that intuitive priors, which are not tailored to the fine details of the model and the estimated parameters, produce estimators which perform poorly in situations in which good, simple frequentist estimators exist. The models we consider are: stratified sampling, the partial linear model, linear and quadratic functionals of white noise and estimation with stopping times. We present a strong version of Doob's consistency theorem which demonstrates that the existence of a uniformly √n-consistent estimator ensures that the Bayes posterior is √n-consistent for values of the parameter in subsets of prior probability 1. We also demonstrate that it is, at least in principle, possible to construct Bayes priors giving both global and local minimax rates, using a suitable combination of loss functions. We argue that there is no contradiction in these apparently conflicting findings.

Journal ArticleDOI
TL;DR: In this article, higher order tangent spaces and influence functions are reviewed and their use to construct minimax efficient estimators for parameters in highdimensional semiparametric models is discussed.
Abstract: We review higher order tangent spaces and influence functions and their use to construct minimax efficient estimators for parameters in highdimensional semiparametric models.

Journal ArticleDOI
TL;DR: In this paper, the authors describe the emergence of the regression-modelling approach and the refinement of the weighting approach for confounder control in observational studies.
Abstract: Control for confounders in observational studies was generally handled through stratification and standardization until the 1960s. Standardization typically reweights the stratum-specific rates so that exposure categories become comparable. With the development first of loglinear models, soon also of nonlinear regression techniques (logistic regression, failure time regression) that the emerging computers could handle, regression modelling became the preferred approach, just as was already the case with multiple regression analysis for continuous outcomes. Since the mid 1990s it has become increasingly obvious that weighting methods are still often useful, sometimes even necessary. On this background we aim at describing the emergence of the modelling approach and the refinement of the weighting approach for confounder control.
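As a hedged numerical illustration of the reweighting idea described here (direct standardization to a common standard population; the stratum rates and weights are made up for the example), stratum-specific rates in each exposure group are averaged with the same set of stratum weights so that the exposure categories become comparable:

```python
import numpy as np

def directly_standardized_rate(stratum_rates, standard_weights):
    """Average stratum-specific rates using a common set of stratum
    weights (e.g., the age distribution of a standard population)."""
    w = np.asarray(standard_weights, dtype=float)
    w = w / w.sum()
    return float(np.dot(stratum_rates, w))

# Toy example: event rates per 1000 person-years in three age strata
exposed_rates   = np.array([2.0, 6.0, 20.0])
unexposed_rates = np.array([1.0, 4.0, 15.0])
standard_pop    = np.array([5000, 3000, 2000])   # common age structure

std_exposed   = directly_standardized_rate(exposed_rates, standard_pop)
std_unexposed = directly_standardized_rate(unexposed_rates, standard_pop)
print(std_exposed, std_unexposed, std_exposed - std_unexposed)
```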

Journal ArticleDOI
TL;DR: In this paper, a success story regarding Bayesian inference in fishery management in the Baltic Sea is discussed, and the technical and human challenges in using Bayesian modeling to give practical advice to the public and to government.
Abstract: . We review a success story regarding Bayesian inference infisheries management in the Baltic Sea. The management of salmonfisheries is currently based on the results of a complex Bayesian pop-ulation dynamic model, and managers and stakeholders use the prob-abilities in their discussions. We also discuss the technical and humanchallenges in using Bayesian modeling to give practical advice to thepublic and to government officials and suggest future areas in which itcan be applied. In particular, large databases in fisheries science offerflexible ways to use hierarchical models to learn the population dynam-ics parameters for those by-catch species that do not have similar largestock-specific data sets like those that exist for many target species.This information is required if we are to understand the future ecosys-tem risks of fisheries. Key words and phrases: Bayesian inference, Baltic salmon, risk anal-ysis, fishery management, decision analysis.1. INTRODUCTIONWe introduce a case of fisheries managementwhere Bayesian inference has been extensively used.Fisheries management is a field of applied science,and one could easily argue that fisheries science is

Journal ArticleDOI
TL;DR: This work describes a uniformly consistent estimator of both the Markov equivalence class of a linear Gaussian causal structure and the identifiable structural coefficients in the Markov equivalence class under the Causal Markov assumption and the considerably weaker k-Triangle-Faithfulness assumption.
Abstract: Spirtes, Glymour and Scheines [Causation, Prediction, and Search (1993) Springer] described a pointwise consistent estimator of the Markov equivalence class of any causal structure that can be represented by a directed acyclic graph for any parametric family with a uniformly consistent test of conditional independence, under the Causal Markov and Causal Faithfulness assumptions. Robins et al. [Biometrika 90 (2003) 491–515], however, proved that there are no uniformly consistent estimators of Markov equivalence classes of causal structures under those assumptions. Subsequently, Kalisch and Buhlmann [J. Mach. Learn. Res. 8 (2007) 613–636] described a uniformly consistent estimator of the Markov equivalence class of a linear Gaussian causal structure under the Causal Markov and Strong Causal Faithfulness assumptions. However, the Strong Faithfulness assumption may be false with high probability in many domains. We describe a uniformly consistent estimator of both the Markov equivalence class of a linear Gaussian causal structure and the identifiable structural coefficients in the Markov equivalence class under the Causal Markov assumption and the considerably weaker k-Triangle-Faithfulness assumption.

Journal ArticleDOI
TL;DR: In this article, Efron's [Biometrika 58 (1971) 403-417] biased-coin rule is compared with a Bayesian rule, which is shown to have appealing properties; at the cost of slight imbalance, bias is virtually eliminated for large samples.
Abstract: Biased-coin designs are used in clinical trials to allocate treatments with some randomness while maintaining approximately equal allocation. More recent rules are compared with Efron’s [Biometrika 58 (1971) 403–417] biased-coin rule and extended to allow balance over covariates. The main properties are loss of information, due to imbalance, and selection bias. Theoretical results, mostly large sample, are assembled and assessed by small-sample simulations. The properties of the rules fall into three clear categories. A Bayesian rule is shown to have appealing properties; at the cost of slight imbalance, bias is virtually eliminated for large samples.
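As a hedged sketch of the baseline rule being compared (Efron's biased-coin design with bias p = 2/3; the simulation loop is illustrative rather than taken from the paper):

```python
import numpy as np

def efron_biased_coin(n_patients, p=2/3, seed=0):
    """Sequentially allocate patients to arms 0/1 with Efron's rule:
    toss a fair coin when the arms are balanced, otherwise favour the
    under-represented arm with probability p."""
    rng = np.random.default_rng(seed)
    counts = [0, 0]
    allocation = []
    for _ in range(n_patients):
        if counts[0] == counts[1]:
            arm = rng.integers(0, 2)
        else:
            under = int(counts[0] > counts[1])   # arm with fewer patients so far
            arm = under if rng.random() < p else 1 - under
        counts[arm] += 1
        allocation.append(arm)
    return np.array(allocation), counts

alloc, counts = efron_biased_coin(100)
print(counts)   # typically close to 50/50, with some randomness retained
```

Simulating many such sequences is one way to assess the loss of information due to imbalance and the selection bias discussed above, which is how the competing rules are compared.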

Journal ArticleDOI
TL;DR: In this paper, the authors reexamine the data from a weight-judging competition described in an article by Francis Galton published in 1907, and show that this forecasting competition is an interesting precursor of two more recent developments in the statistical forecasting literature.
Abstract: This note reexamines the data from a weight-judging competition described in an article by Francis Galton published in 1907. Following the correction of some errors, it is shown that this forecasting competition is an interesting precursor of two more recent developments in the statistical forecasting literature. One is forecast combination, with the mean forecast here exactly coinciding with the outcome, and the second is the use of two-piece frequency and probability distributions to describe asymmetry.

Journal ArticleDOI
TL;DR: In this paper, a parametric path in the space of models is defined, enabling a systematic study of the properties of robust regression estimators as the groups of data move from being far apart to close together.
Abstract: There are several methods for obtaining very robust estimates of regression parameters that asymptotically resist 50% of outliers in the data. Differences in the behaviour of these algorithms depend on the distance between the regression data and the outliers. We introduce a parameter $\lambda$ that defines a parametric path in the space of models and enables us to study, in a systematic way, the properties of estimators as the groups of data move from being far apart to close together. We examine, as a function of $\lambda$, the variance and squared bias of five estimators and we also consider their power when used in the detection of outliers. This systematic approach provides tools for gaining knowledge and better understanding of the properties of robust estimators.

Journal ArticleDOI
TL;DR: The Neyman-Fisher controversy as discussed by the authors has had a deleterious impact on the development of statistics, with a major consequence being that potential outcomes were ignored in favor of linear models and classical statistical procedures that are imprecise.
Abstract: The Neyman-Fisher controversy considered here originated with the 1935 presentation of Jerzy Neyman's Statistical Problems in Agricultural Experimentation to the Royal Statistical Society. Neyman asserted that the standard ANOVA F-test for randomized complete block designs is valid, whereas the analogous test for Latin squares is invalid in the sense of detecting differentiation among the treatments, when none existed on average, more often than desired (i.e., having a higher Type I error than advertised). However, Neyman's expressions for the expected mean residual sum of squares, for both designs, are generally incorrect. Furthermore, Neyman's belief that the Type I error (when testing the null hypothesis of zero average treatment effects) is higher than desired, whenever the expected mean treatment sum of squares is greater than the expected mean residual sum of squares, is generally incorrect. Simple examples show that, without further assumptions on the potential outcomes, one cannot determine the Type I error of the F-test from expected sums of squares. Ultimately, we believe that the Neyman-Fisher controversy had a deleterious impact on the development of statistics, with a major consequence being that potential outcomes were ignored in favor of linear models and classical statistical procedures that are imprecise without applied contexts.

Journal ArticleDOI
TL;DR: A discussion of "Instrumental Variables: An Econometrician's Perspective" by Guido W. Imbens [arXiv:1410.0163].
Abstract: Discussion of "Instrumental Variables: An Econometrician's Perspective" by Guido W. Imbens [arXiv:1410.0163].