
Showing papers by "Donald B. Rubin" published in 2010


Journal Article

88 citations


Journal Article
TL;DR: The author argues that it is better, for eventual progress, to know that we cannot yet reliably answer questions than to pretend that we know the answers based on bogus assumptions and hopelessly inadequate data.
Abstract: There is little doubt that the topic of the target article is extremely important: how to provide guidance for making medical decisions that are based on evidence concerning which interventions are likely to work well in practice. The hope for this enterprise, now called CER for ‘comparative effectiveness research,’ which is similar to the previously used ‘evidence-based medicine,’ is to use the voluminous amounts of data often routinely collected in our society, such as for insurance and other administrative uses, or from federal, state, and private sample surveys, to learn about the relative effectiveness of all sorts of medical interventions. The target article itself is rich in its statement of goals and general objectives, but it seems to me to be quite thin in its statement of (a) specific suggestions for what characteristics the data sets should possess to make such effectiveness comparisons plausibly accurate or (b) what methodological techniques are most likely to be appropriate to apply to the data sets to achieve these goals. Even the specific examples cited, at least those with which I am familiar, seem to me to be relatively poor sources of guidance for either the kinds of data sets needed or for the kinds of statistical techniques to be applied to them to achieve credible answers. In my limited experience in trying to address such topics, I find that there is too little thoughtful guidance for the correct design and analysis of investigations that use complex data sets, mostly non-randomized observational ones, to estimate the effects of medical interventions; an exception is Cochran [1], but that article does not reflect the substantial changes that have taken place over the last half century in statistics, computing, medicine, and routinely collected databases.
It is important for me to emphasize that much of the required effort when attacking a causal question using observational data takes place before any of the usual statistical analyses of the data are initiated, just as with well-designed randomized experiments. Moreover, often the inevitable conclusion of a thoughtful attack may be that there are no reliable answers available from existing data sets, or even from summaries of them based on meta-analyses [2], effect-size surface estimations [3], or from research syntheses [4]. It is my firm view that it is better, for eventual progress, to know that we cannot yet reliably answer questions than to pretend that we know the answers based on bogus assumptions and hopelessly inadequate data. My comments are admittedly somewhat idiosyncratic in that they rely largely on references to my own work or work with which I have substantial familiarity, although there are many fine sources of ideas on the topics. I do this mainly because, when thinking about the issues that the target article raises, many of these concern what is needed to create a reliable observational study, and I have spent a large amount of time thinking and writing about this in the context of real problems for more than four decades.

69 citations


Journal Article
TL;DR: Reflections on the development of the Rubin causal model (RCM) are offered, which were stimulated by the impressive discussions of the RCM and Campbell's superb contributions to the practical problems of drawing causal inferences written by Will Shadish and Steve West and Felix Thoemmes.
Abstract: This article offers reflections on the development of the Rubin causal model (RCM), which were stimulated by the impressive discussions of the RCM and Campbell's superb contributions to the practical problems of drawing causal inferences written by Will Shadish (2010) and Steve West and Felix Thoemmes (2010). It is not a rejoinder in any real sense but more of a sequence of clarifications of parts of the RCM combined with some possibly interesting personal historical comments, which I do not think can be found elsewhere. Of particular interest in the technical content, I think, are the extended discussions of the stable unit treatment value assumption, the explication of the variety of definitions of causal estimands, and the discussion of the assignment mechanism.

69 citations


Book Chapter
01 Jan 2010
TL;DR: The Rubin Causal Model is a formal mathematical framework for causal inference, first given that name by Holland (1986) for a series of previous articles developing the perspective. It has two essential parts, potential outcomes and an assignment mechanism, plus an optional distribution on the quantities being conditioned on in the assignment mechanism, which allows model-based Bayesian ‘posterior predictive’ (causal) inference.
Abstract: The Rubin Causal Model (RCM) is a formal mathematical framework for causal inference, first given that name by Holland (1986) for a series of previous articles developing the perspective (Rubin, 1974; 1975; 1976; 1977; 1978; 1979; 1980). There are two essential parts to the RCM, and a third optional one. The first part is the use of ‘potential outcomes’ to define causal effects in all situations — this part defines ‘the science’, which is the object of inference, and it requires the explicit consideration of the manipulations that define the treatments whose causal effects we wish to estimate. The second part is an explicit probabilistic model for the assignment of ‘treatments’ to ‘units’ as a function of all quantities that could be observed, including all potential outcomes; this model is called the ‘assignment mechanism’, and defines the structure of experiments designed to learn about the science from observed data or the acts of nature that lead to the observed data. The third possible part of the RCM framework is an optional distribution on the quantities being conditioned on in the assignment mechanism, including the potential outcomes, thereby allowing model-based Bayesian ‘posterior predictive’ (causal) inference. This part of the RCM focuses on the model-based analysis of observed data to draw inferences for causal effects, where the observed data are revealed by applying the assignment mechanism to the science. A full-length text that discusses estimation and inference for causal effects from this perspective is Imbens and Rubin (2006).
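The first two parts of the framework described in this abstract can be illustrated with a small simulation. The following is only a minimal sketch, not code from the chapter: the function names, the normal distribution for control outcomes, the constant treatment effect of 3, and the completely randomized design are all assumptions chosen for illustration. It builds the 'science' (both potential outcomes for every unit), applies an assignment mechanism, reveals one potential outcome per unit, and compares a simple difference in means with the true average effect. The optional third part (a Bayesian model for the potential outcomes) is not shown.

```python
import random
from statistics import mean


def simulate_science(n, seed=0):
    """Build the 'science': a (y0, y1) pair of potential outcomes per unit.

    The outcome distribution and the constant effect of 3.0 are
    hypothetical values used only for this illustration.
    """
    rng = random.Random(seed)
    units = []
    for _ in range(n):
        y0 = rng.gauss(10.0, 2.0)  # potential outcome under control
        y1 = y0 + 3.0              # potential outcome under treatment
        units.append((y0, y1))
    return units


def completely_randomized_assignment(n, n_treated, seed=1):
    """Assignment mechanism: exactly n_treated units, chosen uniformly at random."""
    rng = random.Random(seed)
    treated = set(rng.sample(range(n), n_treated))
    return [1 if i in treated else 0 for i in range(n)]


def observe(units, w):
    """Reveal only the potential outcome that matches each unit's assignment."""
    return [y1 if wi == 1 else y0 for (y0, y1), wi in zip(units, w)]


def difference_in_means(y_obs, w):
    """Simple estimate of the average effect from observed data alone."""
    treated = [y for y, wi in zip(y_obs, w) if wi == 1]
    control = [y for y, wi in zip(y_obs, w) if wi == 0]
    return mean(treated) - mean(control)


units = simulate_science(1000)
w = completely_randomized_assignment(len(units), 500)
y_obs = observe(units, w)

# The true average effect uses both potential outcomes, which are
# never jointly observable outside a simulation.
true_ate = mean(y1 - y0 for y0, y1 in units)
estimate = difference_in_means(y_obs, w)
print(f"true average effect: {true_ate:.2f}, estimate: {estimate:.2f}")
```

Because the assignment here does not depend on the potential outcomes, the difference in means should land close to the true average effect; an assignment mechanism that did depend on them would generally break that property.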

51 citations


Journal Article
TL;DR: A modified general location model is proposed to integrate the ideas of missing data techniques and principal stratification and then analyze the same data as in Barnard, Frangakis, Hill, and Rubin (2003), where a pattern-mixture model was used.
Abstract: Missing data, especially when coupled with noncompliance, are a challenge even in the setting of randomized experiments. Although some existing methods can address each complication, it can be diff...

40 citations


Patent
26 May 2010
TL;DR: A model generation platform guides a user who is generating a predictive model from historical data about a system being modeled. The project includes a series of user choice points and actions or parameter settings that govern the generation of the model based on rules, which direct the user to select and apply an optimal model.
Abstract: Models are generated using a variety of tools and features of a model generation platform. For example, in connection with a project in which a user generates a predictive model based on historical data about a system being modeled, the user is provided through a graphical user interface a structured sequence of model generation activities to be followed, the sequence including dimension reduction, model generation, model process validation, and model re-generation. In connection with a project in which a user generates a predictive model based on historical data about a system being modeled, and in which the project includes a series of user choice points and actions or parameter settings that govern the generation of the model based on rules, which direct the user to select and apply an optimal model.

18 citations



Journal Article
TL;DR: This analysis is applied to a randomized trial of a potentially important intervention designed to reduce the transmission of bacterial colonization between mothers and their infants through vaginal delivery in South Africa: the Prevention of Perinatal Sepsis (PoPs).

3 citations