
Showing papers in "Statistical Science in 2009"


Journal ArticleDOI
TL;DR: This work views population structure and cryptic relatedness as different aspects of a single confounder: the unobserved pedigree defining the (often distant) relationships among the study subjects, and defines and estimates kinship coefficients, both pedigree-based and marker-based.
Abstract: We review the problem of confounding in genetic association studies, which arises principally because of population structure and cryptic relatedness. Many treatments of the problem consider only a simple "island" model of population structure. We take a broader approach, which views population structure and cryptic relatedness as different aspects of a single confounder: the unobserved pedigree defining the (often distant) relationships among the study subjects. Kinship is therefore a central concept, and we review methods of defining and estimating kinship coefficients, both pedigree-based and marker-based. In this unified framework we review solutions to the problem of population structure, including family-based study designs, genomic control, structured association, regression control, principal components adjustment and linear mixed models. The last solution makes the most explicit use of the kinships among the study subjects, and has an established role in the analysis of animal and plant breeding studies. Recent computational developments mean that analyses of human genetic association data are beginning to benefit from its powerful tests for association, which protect against population structure and cryptic kinship, as well as intermediate levels of confounding by the pedigree. © Institute of Mathematical Statistics, 2009.
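
A concrete illustration of the marker-based side of this review is the standardized-genotype relationship matrix used as a kinship proxy in linear mixed models. The sketch below is a generic GRM-style calculation with toy data; the specific estimators compared in the paper may differ.

```python
import numpy as np

def marker_kinship(genotypes):
    """Marker-based relationship matrix from a genotype matrix.

    genotypes: (n_individuals, n_markers) array of 0/1/2 allele counts.
    Entries approximate twice the kinship coefficient under a
    standardized-genotype (GRM-style) estimator.
    """
    G = np.asarray(genotypes, dtype=float)
    p = G.mean(axis=0) / 2.0                     # sample allele frequencies
    keep = (p > 0) & (p < 1)                     # drop monomorphic markers
    G, p = G[:, keep], p[keep]
    Z = (G - 2.0 * p) / np.sqrt(2.0 * p * (1.0 - p))
    return Z @ Z.T / Z.shape[1]

# toy check: a duplicated individual shows up as highly related to the original
rng = np.random.default_rng(0)
freqs = rng.uniform(0.1, 0.9, size=500)
base = rng.binomial(2, freqs, size=(3, 500))
G = np.vstack([base, base[0]])                   # fourth row duplicates the first
print(np.round(marker_kinship(G), 2))
```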

467 citations


Journal ArticleDOI
TL;DR: Prerequisites for exact replication; issues of heterogeneity; advantages and disadvantages of different methods of data synthesis across multiple studies; frequentist vs. Bayesian inferences; and challenges that arise from multi-team collaborations are discussed.
Abstract: Replication helps ensure that a genotype-phenotype association observed in a genome-wide association (GWA) study represents a credible association and is not a chance finding or an artifact due to uncontrolled biases. We discuss prerequisites for exact replication; issues of heterogeneity; advantages and disadvantages of different methods of data synthesis across multiple studies; frequentist vs. Bayesian inferences for replication; and challenges that arise from multi-team collaborations. While consistent replication can greatly improve the credibility of a genotype-phenotype association, it may not eliminate spurious associations due to biases shared by many studies. Conversely, lack of replication in well-powered follow-up studies usually invalidates the initially proposed association, although occasionally it may point to differences in linkage disequilibrium or effect modifiers across studies.

266 citations


Journal ArticleDOI
TL;DR: In this article, a simple design-based estimator with much improved statistical properties is developed to overcome, without modeling assumptions, the bias and invalid standard error of the estimator recommended in the literature, together with a model-based approach that combines some of the benefits of the design-based estimator and of the estimator in the literature.
Abstract: A basic feature of many field experiments is that investigators are only able to randomize clusters of individuals—such as households, communities, firms, medical practices, schools or classrooms—even when the individual is the unit of interest. To recoup the resulting efficiency loss, some studies pair similar clusters and randomize treatment within pairs. However, many other studies avoid pairing, in part because of claims in the literature, echoed by clinical trials standards organizations, that this matched-pair, cluster-randomization design has serious problems. We argue that all such claims are unfounded. We also prove that the estimator recommended for this design in the literature is unbiased only in situations when matching is unnecessary; its standard error is also invalid. To overcome this problem without modeling assumptions, we develop a simple design-based estimator with much improved statistical properties. We also propose a model-based approach that includes some of the benefits of our design-based estimator as well as the estimator in the literature. Our methods also address individual-level noncompliance, which is common in applications but not allowed for in most existing methods. We show that from the perspective of bias, efficiency, power, robustness or research costs, and in large or small samples, pairing should be used in cluster-randomized experiments whenever feasible; failing to do so is equivalent to discarding a considerable fraction of one’s data. We develop these techniques in the context of a randomized evaluation we are conducting of the Mexican Universal Health Insurance Program.
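
To make the matched-pair analysis concrete, here is a minimal design-based-style sketch: average the within-pair differences in cluster means, weighting each pair by its size. The weighting scheme and the simulated clusters are illustrative assumptions, not necessarily the exact estimator derived in the paper.

```python
import numpy as np

def matched_pair_cluster_ate(pairs):
    """Estimate the average treatment effect in a matched-pair
    cluster-randomized experiment as the size-weighted average of
    within-pair differences in cluster means.

    pairs: list of (treated_outcomes, control_outcomes) tuples, one per pair,
           each holding individual-level outcomes for that cluster.
    """
    diffs, weights = [], []
    for treated, control in pairs:
        treated = np.asarray(treated, dtype=float)
        control = np.asarray(control, dtype=float)
        diffs.append(treated.mean() - control.mean())
        weights.append(len(treated) + len(control))   # weight each pair by its size
    return np.average(diffs, weights=weights)

# toy usage: three pairs of clusters with a true effect of roughly +1
rng = np.random.default_rng(1)
pairs = [(rng.normal(1.0, 1, 40), rng.normal(0.0, 1, 35)),
         (rng.normal(3.0, 1, 60), rng.normal(2.1, 1, 55)),
         (rng.normal(-1.0, 1, 25), rng.normal(-2.0, 1, 30))]
print(round(matched_pair_cluster_ate(pairs), 2))
```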

244 citations


Journal ArticleDOI
TL;DR: In this article, a modification of Levene-type tests to increase their power to detect monotonic trends in variances is discussed, which is useful when one is concerned with an alternative of increasing or decreasing variability, for example, increasing volatility of stock prices or "open or closed gramophones" in regression residual analysis.
Abstract: In many applications, the underlying scientific question concerns whether the variances of k samples are equal. There are a substantial number of tests for this problem. Many of them rely on the assumption of normality and are not robust to its violation. In 1960 Professor Howard Levene proposed a new approach to this problem by applying the F-test to the absolute deviations of the observations from their group means. Levene’s approach is powerful and robust to nonnormality and became a very popular tool for checking the homogeneity of variances. This paper reviews the original method proposed by Levene and subsequent robust modifications. A modification of Levene-type tests to increase their power to detect monotonic trends in variances is discussed. This procedure is useful when one is concerned with an alternative of increasing or decreasing variability, for example, increasing volatility of stock prices or “open or closed gramophones” in regression residual analysis. A major section of the paper is devoted to discussion of various scientific problems where Levene-type tests have been used, for example, economic anthropology, accuracy of medical measurements, volatility of the price of oil, studies of the consistency of jury awards in legal cases and the effect of hurricanes on ecological systems.
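
The core construction is short enough to show directly: compute absolute deviations from each group's center and run a one-way ANOVA on them. The sketch below uses SciPy; the three simulated groups, with equal means but increasing spread, are illustrative.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
g1 = rng.normal(0, 1.0, 50)   # equal means, increasing spread
g2 = rng.normal(0, 1.5, 50)
g3 = rng.normal(0, 2.0, 50)

# Levene's original idea: one-way ANOVA F-test on absolute deviations from group means
devs = [np.abs(g - g.mean()) for g in (g1, g2, g3)]
print(stats.f_oneway(*devs))

# the same test, and its robust median-centered (Brown-Forsythe) variant, via SciPy
print(stats.levene(g1, g2, g3, center='mean'))
print(stats.levene(g1, g2, g3, center='median'))
```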

234 citations


Journal ArticleDOI
TL;DR: It is demonstrated how these analysis approaches arise from factorizations of the distribution of longitudinal data and survival information; researchers should consider which method of accommodating deaths is consistent with their research aims and choose analysis methods accordingly.
Abstract: Diverse analysis approaches have been proposed to distinguish data missing due to death from nonresponse, and to summarize trajectories of longitudinal data truncated by death. We demonstrate how these analysis approaches arise from factorizations of the distribution of longitudinal data and survival information. Models are illustrated using cognitive functioning data for older adults. For unconditional models, deaths do not occur, deaths are independent of the longitudinal response, or the unconditional longitudinal response is averaged over the survival distribution. Unconditional models, such as random effects models fit to unbalanced data, may implicitly impute data beyond the time of death. Fully conditional models stratify the longitudinal response trajectory by time of death. Fully conditional models are effective for describing individual trajectories, in terms of either aging (age, or years from baseline) or dying (years from death). Causal models (principal stratification) as currently applied are fully conditional models, since group differences at one timepoint are described for a cohort that will survive past a later timepoint. Partly conditional models summarize the longitudinal response in the dynamic cohort of survivors. Partly conditional models are serial cross-sectional snapshots of the response, reflecting the average response in survivors at a given timepoint rather than individual trajectories. Joint models of survival and longitudinal response describe the evolving health status of the entire cohort. Researchers using longitudinal data should consider which method of accommodating deaths is consistent with research aims, and use analysis methods accordingly.
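
A schematic rendering of the three factorizations may help fix ideas; the notation below (Y_t for the longitudinal response at time t, S for survival time) is chosen here for illustration and is not taken verbatim from the paper.

```latex
% Schematic factorizations of a longitudinal response Y_t and survival time S.
\begin{align*}
\text{fully conditional:} \quad & f(y, s) = f(y \mid S = s)\, f(s)
  && \text{(trajectories stratified by time of death)}\\
\text{partly conditional:} \quad & \mu(t) = \mathbb{E}[\,Y_t \mid S > t\,]
  && \text{(mean response in the dynamic cohort of survivors)}\\
\text{unconditional:} \quad & \mathbb{E}[Y_t] = \mathbb{E}[Y_t \mid S > t]\Pr(S > t)
  + \mathbb{E}[Y_t \mid S \le t]\Pr(S \le t)
  && \text{(implicitly requires a value of } Y_t \text{ after death)}
\end{align*}
```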

160 citations


Journal ArticleDOI
TL;DR: In this article, an integrated approach to the estimation of SNP effects and to the prediction of trait values is proposed, treating SNP effects as random instead of fixed effects, together with a definition of unbiasedness that is a property of the advocated estimator.
Abstract: In genome-wide association studies (GWAS), hundreds of thousands of genetic markers (SNPs) are tested for association with a trait or phenotype. Reported effects tend to be larger in magnitude than the true effects of these markers, the so-called "winner's curse." We argue that the classical definition of unbiasedness is not useful in this context and propose to use a different definition of unbiasedness that is a property of the estimator we advocate. We suggest an integrated approach to the estimation of the SNP effects and to the prediction of trait values, treating SNP effects as random instead of fixed effects. Statistical methods traditionally used in the prediction of trait values in the genetics of livestock, which predates the availability of SNP data, can be applied to analysis of GWAS, giving better estimates of the SNP effects and predictions of phenotypic and genetic values in individuals.
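
The prediction machinery borrowed from livestock genetics is, at its core, ridge-regression/BLUP shrinkage of SNP effects. The sketch below treats the variance components as known, an assumption made only to keep the example short; it is in the spirit of, but not necessarily identical to, the estimator the authors advocate.

```python
import numpy as np

def blup_snp_effects(X, y, var_e, var_beta):
    """Ridge/BLUP-style estimates of SNP effects under a random-effects model.

    Illustrative model: y = X beta + e, beta_j ~ N(0, var_beta), e ~ N(0, var_e).
    The posterior mean shrinks naive estimates toward zero, one way of
    countering the winner's curse when effects are treated as random.
    """
    X = np.asarray(X, dtype=float)
    lam = var_e / var_beta                        # shrinkage parameter
    m = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(m), X.T @ y)

# toy usage: many small random effects
rng = np.random.default_rng(3)
n, m = 200, 50
X = rng.binomial(2, 0.3, size=(n, m)).astype(float)
X -= X.mean(axis=0)                               # center genotypes
beta = rng.normal(0, 0.1, m)
y = X @ beta + rng.normal(0, 1, n)
print(np.round(blup_snp_effects(X, y, var_e=1.0, var_beta=0.01)[:5], 3))
```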

140 citations


Journal ArticleDOI
TL;DR: It is shown that the power is remarkably robust to misspecification of these weights, and two methods for choosing weights in practice are considered: external weighting based on prior information and estimated weighting based on the data.
Abstract: Genetic investigations often involve the testing of vast numbers of related hypotheses simultaneously. To control the overall error rate, a substantial penalty is required, making it difficult to detect signals of moderate strength. To improve the power in this setting, a number of authors have considered using weighted p-values, with the motivation often based upon the scientific plausibility of the hypotheses. We review this literature, derive optimal weights and show that the power is remarkably robust to misspecification of these weights. We consider two methods for choosing weights in practice. The first, external weighting, is based on prior information. The second, estimated weighting, uses the data to choose weights.
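
A minimal version of the weighted p-value idea is a weighted Bonferroni rule with mean-one weights; the weights below encode hypothetical prior plausibility and are purely illustrative, and this sketch is not the paper's derivation of optimal weights.

```python
import numpy as np

def weighted_bonferroni(pvals, weights, alpha=0.05):
    """Weighted multiple-testing rule: reject H_i when p_i <= w_i * alpha / m.

    Weights are rescaled to average one, which preserves the overall
    family-wise error rate of the ordinary Bonferroni correction.
    """
    p = np.asarray(pvals, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.mean()                          # enforce mean-one weights
    return p <= w * alpha / len(p)

# usage: up-weight a subset of hypotheses judged more plausible a priori
pvals = np.array([1e-6, 3e-4, 0.02, 0.4, 0.9])
weights = np.array([4.0, 4.0, 1.0, 0.5, 0.5])   # illustrative prior weights
print(weighted_bonferroni(pvals, weights, alpha=0.05))
```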

123 citations


Journal ArticleDOI
TL;DR: In this article, a review of the literature concerning the theory and applications of log-concave distributions is presented; the MLE exists, requires no tuning parameter such as a bandwidth, and can be computed with readily available algorithms.
Abstract: Log-concave distributions are an attractive choice for modeling and inference, for several reasons: The class of log-concave distributions contains most of the commonly used parametric distributions and thus is a rich and flexible nonparametric class of distributions. Further, the MLE exists and can be computed with readily available algorithms. Thus, no tuning parameter, such as a bandwidth, is necessary for estimation. Due to these attractive properties, there has been considerable recent research activity concerning the theory and applications of log-concave distributions. This article gives a review of these results.
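
For reference, the log-concave maximum likelihood problem can be stated in one display; this is the standard formulation rather than anything new to the review.

```latex
% The class of log-concave densities and the (adjusted) MLE criterion:
% the maximizer is automatically a density, and no bandwidth is involved.
\begin{gather*}
\mathcal{F} = \bigl\{ f = e^{\varphi} : \varphi\ \text{concave} \bigr\},\\[4pt]
\hat{\varphi}_n = \arg\max_{\varphi\ \text{concave}}
  \left\{ \frac{1}{n}\sum_{i=1}^{n} \varphi(X_i) - \int e^{\varphi(x)}\,dx \right\},
\qquad \hat{f}_n = e^{\hat{\varphi}_n}.
\end{gather*}
```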

101 citations


Journal ArticleDOI
TL;DR: In this paper, the authors point out the fundamental aspects of this reference work, especially the thorough coverage of testing problems and the construction of both estimation and testing non-informative priors based on functional divergences.
Abstract: Published nearly seventy years ago, Jeffreys' Theory of Probability (1939) has had a unique impact on the Bayesian community and is now considered to be one of the main classics in Bayesian Statistics as well as the initiator of the objective Bayes school. In particular, its advances on the derivation of noninformative priors as well as on the scaling of Bayes factors have had a lasting impact on the field. However, the book reflects the characteristics of the time, especially in terms of mathematical rigorousness. In this paper, we point out the fundamental aspects of this reference work, especially the thorough coverage of testing problems and the construction of both estimation and testing noninformative priors based on functional divergences. Our major aim here is to help modern readers in navigating in this difficult text and in concentrating on passages that are still relevant today.

99 citations


Journal ArticleDOI
TL;DR: The two-stage design for genome-wide association studies is not a discovery/replication design but simply a more efficient design for discovery using a joint analysis of the data from both stages; multistage sampling designs may now be more useful for selecting subsets of subjects for deep re-sequencing of regions identified in the GWAS.
Abstract: Because of the high cost of commercial genotyping chip technologies, many investigations have used a two-stage design for genome-wide association studies, using part of the sample for an initial discovery of “promising” SNPs at a less stringent significance level and the remainder in a joint analysis of just these SNPs using custom genotyping. Typical cost savings of about 50% are possible with this design to obtain comparable levels of overall type I error and power by using about half the sample for stage I and carrying about 0.1% of SNPs forward to the second stage, the optimal design depending primarily upon the ratio of costs per genotype for stages I and II. However, with the rapidly declining costs of the commercial panels, the generally low observed ORs of current studies, and many studies aiming to test multiple hypotheses and multiple endpoints, many investigators are abandoning the two-stage design in favor of simply genotyping all available subjects using a standard high-density panel. Concern is sometimes raised about the absence of a “replication” panel in this approach, as required by some high-profile journals, but it must be appreciated that the two-stage design is not a discovery/replication design but simply a more efficient design for discovery using a joint analysis of the data from both stages. Once a subset of highly-significant associations has been discovered, a truly independent “exact replication” study is needed in a similar population of the same promising SNPs using similar methods. This can then be followed by (1) “generalizability” studies to assess the full scope of replicated associations across different races, different endpoints, different interactions, etc.; (2) fine-mapping or re-sequencing to try to identify the causal variant; and (3) experimental studies of the biological function of these genes. Multistage sampling designs may be more useful at this stage, say for selecting subsets of subjects for deep re-sequencing of regions identified in the GWAS.
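
The cost trade-off described above is simple bookkeeping. The sketch below reproduces the rough 50%-savings scenario; the stage-II/stage-I per-genotype cost ratio of 20 is an illustrative assumption, and the power calculation that the optimal design also depends on is ignored.

```python
def relative_cost(pi_sample, pi_markers, cost_ratio):
    """Genotyping cost of a two-stage GWAS relative to one-stage genotyping.

    pi_sample:  fraction of subjects typed on the full panel in stage I
    pi_markers: fraction of SNPs carried forward to stage II
    cost_ratio: per-genotype cost in stage II relative to stage I
    """
    stage1 = pi_sample                                   # full panel, part of the sample
    stage2 = (1 - pi_sample) * pi_markers * cost_ratio   # custom genotyping of selected SNPs
    return stage1 + stage2

# half the sample in stage I, 0.1% of SNPs carried forward, assumed cost ratio 20
print(relative_cost(pi_sample=0.5, pi_markers=0.001, cost_ratio=20.0))  # about 0.51
```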

57 citations


Journal ArticleDOI
TL;DR: In designed experiments and surveys, known laws or design features provide checks on the most relevant aspects of a model and identify the target parameters; in most observational studies, by contrast, the primary study data do not identify and may not even bound target parameters, so implausible equality constraints are relaxed into penalty functions derived from plausible prior distributions.
Abstract: In designed experiments and surveys, known laws or design features provide checks on the most relevant aspects of a model and identify the target parameters. In contrast, in most observational studies in the health and social sciences, the primary study data do not identify and may not even bound target parameters. Discrepancies between target and analogous identified parameters (biases) are then of paramount concern, which forces a major shift in modeling strategies. Conventional approaches are based on conditional testing of equality constraints, which correspond to implausible point-mass priors. When these constraints are not identified by available data, however, no such testing is possible. In response, implausible constraints can be relaxed into penalty functions derived from plausible prior distributions. The resulting models can be fit within familiar full or partial likelihood frameworks. The absence of identification renders all analyses part of a sensitivity analysis. In this view, results from single models are merely examples of what might be plausibly inferred. Nonetheless, just one plausible inference may suffice to demonstrate inherent limitations of the data. Points are illustrated with misclassified data from a study of sudden infant death syndrome. Extensions to confounding, selection bias and more complex data structures are outlined.
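
Schematically, the shift described here replaces a point constraint on a bias parameter with a penalty derived from a plausible prior; the notation below (eta for a bias parameter such as a misclassification probability) is introduced only to summarize the idea.

```latex
% Conventional approach: fix the bias parameter at a point value, eta = eta_0.
% Relaxed approach: penalize the likelihood with the log of a plausible prior on eta.
\[
  \ell_{\mathrm{pen}}(\theta, \eta) \;=\; \ell(\theta, \eta) \;+\; \log \pi(\eta),
  \qquad \text{in place of the constraint } \eta = \eta_0 .
\]
```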

Journal ArticleDOI
TL;DR: It is argued, via a series of examples, that Bayesian interval estimation is an attractive way to proceed in this context even for frequentists, because it can be supplied with a diagnostic in the form of a calibration-sensitivity simulation analysis.
Abstract: We review some aspects of Bayesian and frequentist interval estimation, focusing first on their relative strengths and weaknesses when used in “clean” or “textbook” contexts. We then turn attention to observational-data situations which are “messy,” where modeling that acknowledges the limitations of study design and data collection leads to nonidentifiability. We argue, via a series of examples, that Bayesian interval estimation is an attractive way to proceed in this context even for frequentists, because it can be supplied with a diagnostic in the form of a calibration-sensitivity simulation analysis. We illustrate the basis for this approach in a series of theoretical considerations, simulations and an application to a study of silica exposure and lung cancer.

Journal ArticleDOI
TL;DR: In this paper, a model credibility index is defined as the maximum sample size at which samples from the model and those from the true data generating mechanism are nearly indistinguishable, and the authors use standard notions from hypothesis testing to make this definition precise.
Abstract: A standard goal of model evaluation and selection is to find a model that approximates the truth well while at the same time is as parsimonious as possible. In this paper we emphasize the point of view that the models under consideration are almost always false, if viewed realistically, and so we should analyze model adequacy from that point of view. We investigate this issue in large samples by looking at a model credibility index, which is designed to serve as a one-number summary measure of model adequacy. We define the index to be the maximum sample size at which samples from the model and those from the true data generating mechanism are nearly indistinguishable. We use standard notions from hypothesis testing to make this definition precise. We use data subsampling to estimate the index. We show that the definition leads us to some new ways of viewing models as flawed but useful. The concept is an extension of the work of Davies [Statist. Neerlandica 49 (1995) 185–245].
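
A rough subsampling sketch conveys the flavor of the index: find the largest subsample size at which a goodness-of-fit test rejects the working model no more than, say, half the time. The Kolmogorov-Smirnov test and the 50% cutoff below are stand-ins chosen for brevity, not the paper's exact testing formulation.

```python
import numpy as np
from scipy import stats

def credibility_index(data, model_cdf, sizes, alpha=0.05, reps=200, seed=0):
    """Crude subsampling estimate of a model credibility index: the largest
    subsample size at which the model is rejected at level alpha no more
    than half the time (illustrative stand-in for the paper's definition)."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    best = 0
    for n in sizes:                                  # sizes assumed increasing
        rejections = sum(
            stats.kstest(rng.choice(data, size=n, replace=False), model_cdf).pvalue < alpha
            for _ in range(reps)
        )
        if rejections / reps <= 0.5:
            best = n
    return best

# usage: a normal working model for mildly skewed data
rng = np.random.default_rng(4)
x = rng.gamma(shape=20.0, scale=1.0, size=5000)      # truth: slightly skewed
x = (x - x.mean()) / x.std()
print(credibility_index(x, stats.norm.cdf, sizes=[25, 50, 100, 200, 400, 800]))
```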

Journal ArticleDOI
TL;DR: In this article, the authors compared several procedures to combine GWA study data, both in terms of the power to detect a disease-associated SNP while controlling the genome-wide significance level and in terms of the detection probability (DP).
Abstract: Combining data from several case-control genome-wide association (GWA) studies can yield greater efficiency for detecting associations of disease with single nucleotide polymorphisms (SNPs) than separate analyses of the component studies. We compared several procedures to combine GWA study data both in terms of the power to detect a disease-associated SNP while controlling the genome-wide significance level, and in terms of the detection probability (DP). The DP is the probability that a particular disease-associated SNP will be among the T most promising SNPs selected on the basis of low p-values. We studied both fixed effects and random effects models in which associations varied across studies. In settings of practical relevance, meta-analytic approaches that focus on a single degree of freedom had higher power and DP than global tests such as summing chi-square test-statistics across studies, Fisher’s combination of p-values, and forming a combined list of the best SNPs from within each study.
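
Two of the combination strategies compared above are easy to sketch: an inverse-variance fixed-effects meta-analysis of per-study log odds ratios (a single-degree-of-freedom test) and Fisher's combination of p-values. The per-study estimates below are made up for illustration.

```python
import numpy as np
from scipy import stats

def fixed_effects_meta(betas, ses):
    """Inverse-variance fixed-effects combination of per-study log odds ratios.
    Returns the pooled estimate, its standard error and a two-sided p-value."""
    betas, ses = np.asarray(betas, dtype=float), np.asarray(ses, dtype=float)
    w = 1.0 / ses**2
    beta = np.sum(w * betas) / np.sum(w)
    se = np.sqrt(1.0 / np.sum(w))
    return beta, se, 2.0 * stats.norm.sf(abs(beta / se))

def fisher_combination(pvals):
    """Fisher's method: -2 * sum(log p_i) is chi-square with 2k degrees of freedom."""
    pvals = np.asarray(pvals, dtype=float)
    return stats.chi2.sf(-2.0 * np.log(pvals).sum(), df=2 * len(pvals))

# illustrative per-study log odds ratios, standard errors and p-values
print(fixed_effects_meta([0.20, 0.15, 0.25], [0.08, 0.10, 0.09]))
print(fisher_combination([0.012, 0.13, 0.006]))
```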

Journal ArticleDOI
TL;DR: It is concluded that to identify interactions it is often necessary to do some selection of SNPs, for example, based on prior hypothesis or marginal significance, but that to identify SNPs that are marginally associated with a disease it may also be useful to consider larger numbers of interactions.
Abstract: Genome-wide association studies, in which as many as a million single nucleotide polymorphisms (SNP) are measured on several thousand samples, are quickly becoming a common type of study for identifying genetic factors associated with many phenotypes. There is a strong assumption that interactions between SNPs or genes and interactions between genes and environmental factors substantially contribute to the genetic risk of a disease. Identification of such interactions could potentially lead to increased understanding about disease mechanisms; drug × gene interactions could have profound applications for personalized medicine; strong interaction effects could be beneficial for risk prediction models. In this paper we provide an overview of different approaches to model interactions, emphasizing approaches that make specific use of the structure of genetic data, and those that make specific modeling assumptions that may (or may not) be reasonable to make. We conclude that to identify interactions it is often necessary to do some selection of SNPs, for example, based on prior hypothesis or marginal significance, but that to identify SNPs that are marginally associated with a disease it may also be useful to consider larger numbers of interactions.

Journal ArticleDOI
TL;DR: This article argues that, in thinking about prior distributions, we should go beyond Jeffreys's principles and move toward weakly informative priors; that it is natural for those who work in the social and computational sciences to favor complex models, contra Jeffreys's preference for simplicity; and that a key generalization of Jeffreys's ideas is to explicitly include model checking in the process of data analysis.
Abstract: I actually own a copy of Harold Jeffreys's Theory of Probability but have only read small bits of it, most recently over a decade ago to confirm that, indeed, Jeffreys was not too proud to use a classical chi-squared p-value when he wanted to check the misfit of a model to data (Gelman, Meng and Stern, 2006). I do, however, feel that it is important to understand where our probability models come from, and I welcome the opportunity to use the present article by Robert, Chopin and Rousseau as a platform for further discussion of foundational issues. In this brief discussion I will argue the following: (1) in thinking about prior distributions, we should go beyond Jeffreys's principles and move toward weakly informative priors; (2) it is natural for those of us who work in social and computational sciences to favor complex models, contra Jeffreys's preference for simplicity; and (3) a key generalization of Jeffreys's ideas is to explicitly include model checking in the process of data analysis.

Journal ArticleDOI
Xiao-Li Meng1
TL;DR: In this discussion, the author recounts that David Madigan's invitation arrived at the busiest time in his professional life, with four courses and many more meetings attempting to compensate, psychologically, for the lost endowment at Harvard.
Abstract: The invitation for this discussion contribution came at the busiest time in my (professional) life with four courses and many more meetings attempting to compensate, psychologically, for the lost endowment at Harvard. I could not possibly, however, decline David Madigan's kind invitation. The topic is dear to my heart, as it should be to any statistician's, for without "unobservables," we would be unemployable. And I always wanted to know what "h-likelihood" is! I first heard the term from my academic twin brother, Andrew Gelman, who sent me his discussion of Lee and Nelder (1996). Gelman's conclusion was that "To the extent that the methods in this paper give different answers from the full Bayesian treatment, I would trust the latter." This of course did not entice me to read the paper. Indeed, I still did not know its definition when I started to type this Prologue, nor have I had any professional or personal contact with either author. I surmise this qualifies me as an objective discussant, though I hope in this case the phrase objective is not exchangeable with noninformative or ignorant! But surely, one may quibble, Gelman's comment must have influenced me. True, but I'm not the kind of Bayesian who is unwilling to change his/her prior. My pure interest is to decode the h-likelihood. If my brother is right, I'll be more proud of him. If he is wrong, I'll be wiser by learning something new. (But I do ask Professors Lee and Nelder for their tolerance

Journal ArticleDOI
TL;DR: In this paper, the authors show how one extension of likelihood, the hierarchical likelihood, allows likelihood-based inference for statistical models with unobservables; modeling of unobservables leads to rich classes of new probabilistic models from which likelihood-type inferences can be made naturally.
Abstract: There have been controversies among statisticians on (i) what to model and (ii) how to make inferences from models with unobservables. One such controversy concerns the difference between estimation methods for the marginal means not necessarily having a probabilistic basis and statistical models having unobservables with a probabilistic basis. Another concerns likelihood-based inference for statistical models with unobservables. This needs an extended-likelihood framework, and we show how one such extension, hierarchical likelihood, allows this to be done. Modeling of unobservables leads to rich classes of new probabilistic models from which likelihood-type inferences can be made naturally with hierarchical likelihood.
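
For readers unfamiliar with the term, the hierarchical likelihood for data y, unobservables v and parameters theta is usually written as the sum of the two log components; the display below is the standard Lee-Nelder form rather than anything specific to this article.

```latex
% Hierarchical (h-)likelihood: joint log-density of the data and the unobservables.
% Inference about v uses h directly; inference about fixed parameters proceeds
% through adjusted profile (marginal-type) likelihoods derived from h.
\[
  h(\theta, v) \;=\; \log f_{\theta}(y \mid v) \;+\; \log f_{\theta}(v).
\]
```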

Journal ArticleDOI
TL;DR: A novel two-stage method for analysis of untyped SNPs that can use any flexible external algorithm for genotype imputation followed by a powerful association test based on the retrospective likelihood is described.
Abstract: Although prospective logistic regression is the standard method of analysis for case-control data, it has been recently noted that in genetic epidemiologic studies one can use the "retrospective" likelihood to gain major power by incorporating various population genetics model assumptions such as Hardy-Weinberg-Equilibrium (HWE), gene-gene and gene-environment independence. In this article, we review these modern methods and contrast them with the more classical approaches through two types of applications (i) association tests for typed and untyped single nucleotide polymorphisms (SNPs) and (ii) estimation of haplotype effects and haplotype-environment interactions in the presence of haplotype-phase ambiguity. We provide novel insights to existing methods by construction of various score-tests and pseudo-likelihoods. In addition, we describe a novel two-stage method for analysis of untyped SNPs that can use any flexible external algorithm for genotype imputation followed by a powerful association test based on the retrospective likelihood. We illustrate applications of the methods using simulated and real data.
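
A schematic version of the retrospective likelihood indicates where the power gain comes from: the genotype distribution P(G) is constrained by population-genetics assumptions such as HWE instead of being left unrestricted. The notation below is ours and suppresses covariates and finer points of the case-control sampling correction.

```latex
% Retrospective likelihood for case-control genotype data, with the genotype
% g in {0,1,2} copies of the risk allele constrained by HWE.
\[
  L_{\mathrm{retro}}(\beta, p) \;=\; \prod_{i=1}^{n} P(G_i \mid D_i)
  \;=\; \prod_{i=1}^{n}
  \frac{P(D_i \mid G_i; \beta)\, P(G_i; p)}{\sum_{g} P(D_i \mid g; \beta)\, P(g; p)},
  \qquad
  P(g; p) = \binom{2}{g} p^{g} (1-p)^{2-g}.
\]
```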

Journal ArticleDOI
TL;DR: In this article, the authors discuss some of the special features of family-based association designs and their relevance in the era of genome-wide association studies (GWAS) and discuss the advantages of traditional case-control and cohort studies.
Abstract: Genome-Wide Association Studies (GWAS) offer an exciting and promising new research avenue for finding genes for complex diseases. Traditional case-control and cohort studies offer many advantages for such designs. Family-based association designs have long been attractive for their robustness properties, but robustness can mean a loss of power. In this paper we discuss some of the special features of family designs and their relevance in the era of GWAS.

Journal ArticleDOI
TL;DR: In this article, the authors propose a Bayesian approach to estimate the genealogy of the case-control sample by taking advantage of the HapMap haplotypes across the genome.
Abstract: The standard paradigm for the analysis of genome-wide association studies involves carrying out association tests at both typed and imputed SNPs. These methods will not be optimal for detecting the signal of association at SNPs that are not currently known or in regions where allelic heterogeneity occurs. We propose a novel association test, complementary to the SNP-based approaches, that attempts to extract further signals of association by explicitly modeling and estimating both unknown SNPs and allelic heterogeneity at a locus. At each site we estimate the genealogy of the case-control sample by taking advantage of the HapMap haplotypes across the genome. Allelic heterogeneity is modeled by allowing more than one mutation on the branches of the genealogy. Our use of Bayesian methods allows us to assess directly the evidence for a causative SNP not well correlated with known SNPs and for allelic heterogeneity at each locus. Using simulated data and real data from the WTCCC project, we show that our method (i) produces a significant boost in signal and accurately identifies the form of the allelic heterogeneity in regions where it is known to exist, (ii) can suggest new signals that are not found by testing typed or imputed SNPs and (iii) can provide more accurate estimates of effect sizes in regions of association.

Journal ArticleDOI
TL;DR: In this article, the authors consider a more general incomplete linkage disequilibrium (LD) model and examine the impact of penetrances at the marker locus when the genetic models are defined at the disease locus.
Abstract: Under complete linkage disequilibrium (LD), robust tests often have greater power than Pearson’s chi-square test and trend tests for the analysis of case-control genetic association studies. Robust statistics have been used in candidate-gene and genome-wide association studies (GWAS) when the genetic model is unknown. We consider here a more general incomplete LD model, and examine the impact of penetrances at the marker locus when the genetic models are defined at the disease locus. Robust statistics are then reviewed and their efficiency and robustness are compared through simulations in GWAS of 300,000 markers under the incomplete LD model. Applications of several robust tests to the Wellcome Trust Case-Control Consortium [Nature 447 (2007) 661–678] are presented.
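
As a concrete example of the robust statistics being reviewed, the sketch below computes Cochran-Armitage trend statistics under recessive, additive and dominant scores and takes their maximum (a MAX3-type statistic). The genotype counts are made up, and the assessment of significance for the maximum, which requires simulation or specialized approximations, is not shown.

```python
import numpy as np

def trend_z(case_counts, control_counts, theta):
    """Cochran-Armitage trend statistic for genotype counts (aa, Aa, AA)
    with scores (0, theta, 1); theta = 0, 0.5, 1 target recessive,
    additive and dominant models respectively."""
    r = np.asarray(case_counts, dtype=float)
    s = np.asarray(control_counts, dtype=float)
    n = r + s
    N, R, S = n.sum(), r.sum(), s.sum()
    x = np.array([0.0, theta, 1.0])
    T = np.sum(x * (S * r - R * s))
    var_T = (R * S / N) * (N * np.sum(x**2 * n) - np.sum(x * n)**2)
    return T / np.sqrt(var_T)

def max3(case_counts, control_counts):
    """MAX3-type robust statistic: largest absolute trend statistic across
    the three genetic-model scores."""
    return max(abs(trend_z(case_counts, control_counts, t)) for t in (0.0, 0.5, 1.0))

# toy genotype counts (aa, Aa, AA) for cases and controls
print(round(max3([30, 120, 100], [60, 130, 80]), 2))
```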

Journal ArticleDOI
TL;DR: In this article, Imai et al. proposed a design-based estimator for matched pair cluster randomized studies that in many circumstances is a better estimator than the harmonic mean estimator.
Abstract: We congratulate Imai, King and Nall on a valuable paper which will help to improve the design and analysis of cluster randomized studies. Imai et al. make two key contributions: (1) they propose a design-based estimator for matched pair cluster randomized studies that in many circumstances is a better estimator than the harmonic mean estimator; (2) they present convincing evidence that the matched pair design, when accompanied with good inference methods, is more powerful than the unmatched pair design and should be used routinely.

Journal ArticleDOI
TL;DR: In this article, the authors give an overview of CNV genomics in humans, highlighting patterns that inform methods for identifying copy number variants (CNVs) and provide some recommendations for identifying CNVs contributing to common complex disorders.
Abstract: Copy number variants (CNVs) account for more polymorphic base pairs in the human genome than do single nucleotide polymorphisms (SNPs). CNVs encompass genes as well as noncoding DNA, making these polymorphisms good candidates for functional variation. Consequently, most modern genome-wide association studies test CNVs along with SNPs, after inferring copy number status from the data generated by high-throughput genotyping platforms. Here we give an overview of CNV genomics in humans, highlighting patterns that inform methods for identifying CNVs. We describe how genotyping signals are used to identify CNVs and provide an overview of existing statistical models and methods used to infer location and carrier status from such data, especially the most commonly used methods exploring hybridization intensity. We compare the power of such methods with the alternative method of using tag SNPs to identify CNV carriers. As such methods are only powerful when applied to common CNVs, we describe two alternative approaches that can be informative for identifying rare CNVs contributing to disease risk. We focus particularly on methods identifying de novo CNVs and show that such methods can be more powerful than case-control designs. Finally we present some recommendations for identifying CNVs contributing to common complex disorders.

Journal ArticleDOI
TL;DR: This article explores the trade-offs between using the inferential framework advocated by IKN and fitting fairly standard multilevel models (see, for instance, Gelman and Hill, 2007), noting that the IKN design-based treatment effect estimators have the advantage of being simple to calculate and of having better statistical properties in general than the harmonic mean estimator.
Abstract: We appreciate having the opportunity to comment on the well-motivated, highly informative and carefully constructed article by Imai, King and Nall (IKN). There has been a great deal of confusion over the years about the issue of pair-matching, often due to a conflation of the implications of design versus analysis choice. This article sheds light on the debate and offers a set of helpful alternative analysis choices. Our discussion does not take issue with IKN's provocative assertion that one should pair-match in cluster randomized trials "whenever feasible." Instead we will explore the trade-offs between using the inferential framework advocated by IKN versus fitting fairly standard multilevel models (see, for instance, Gelman and Hill, 2007). The IKN design-based treatment effect estimators have the advantage of being simple to calculate and having better statistical properties in general than the harmonic mean estimator that IKN view to be the most

Journal ArticleDOI
TL;DR: Rejoinder to "The Essential Role of Pair Matching in Cluster-Randomized Experiments, with Application to the Mexican Universal Health Insurance Evaluation" [arXiv:0910.3752]
Abstract: We are grateful to our four discussants for their agreement with and contributions to the central points in our article (Imai et al., 2009b). As Zhang and Small (2009) write, "[our article] present[s] convincing evidence that the matched pair design, when accompanied with good inference methods, is more powerful than the unmatched pair design and should be used routinely." And, as they put it, Hill and Scott (2009) "do not take issue with [our article's] provocative assertion that one should pair-match in cluster randomized trials 'whenever feasible.'" Whether denominated in terms of research dollars saved, or additional knowledge learned for the same expenditure, the advantages in any one research project of switching standard experimental protocols from complete randomization to a matched pair design (along with the accompanying new statistical methods) can be considerable. In the two sections that follow, we address our discussants' points regarding ways to pair clusters (Section 2) and the costs and benefits of design- and model-based estimation (Section 3). But first we offer a sense of how many experiments across fields of inquiry can be improved in the ways we discuss in our article. We do this by collecting data from the last 106 cluster-randomized experiments published in 27 leading journals in medicine, public health, political science, economics, and education. We then counted how many experiments used complete randomization, blocking (on some but not all pre-treatment information), or pair-matching, which respectively exploit none, some and all of the available pre-randomization covariate information. Table 1 gives a summary. Overall, only 19% of cluster-randomized experiments used pair-matching, which means that 81% left at least some pre-randomization covariate information on the table. Indeed, almost 60% of these experiments used complete randomization and so took no advantage of the information in pre-treatment covariates. The table conveys that there is some variation in these figures across fields, but in no field is the use of pair matching in cluster-randomized designs very high, and it never occurs in even as many as 30% of published experiments. Administrative constraints may have prevented some of these experiments from being pair matched, but as using this information involves no modeling risks, the opportunities for improving experimental research across many fields of inquiry seem quite substantial.

Journal ArticleDOI
TL;DR: In 2006, David Brillinger and Richard Davis sat down with Murray and Ady Rosenblatt at their home in La Jolla, California for an enjoyable day of reminiscences and conversation as discussed by the authors.
Abstract: On an exquisite March day in 2006, David Brillinger and Richard Davis sat down with Murray and Ady Rosenblatt at their home in La Jolla, California for an enjoyable day of reminiscences and conversation. Our mentor, Murray Rosenblatt, was born on September 7, 1926 in New York City and attended City College of New York before entering graduate school at Cornell University in 1946. After completing his Ph.D. in 1949 under the direction of the renowned probabilist Mark Kac, the Rosenblatts moved to Chicago where Murray became an instructor/assistant professor in the Committee of Statistics at the University of Chicago. Murray's academic career then took him to the University of Indiana and Brown University before his joining the University of California at San Diego in 1964. Along the way, Murray established himself as one of the most celebrated and leading figures in probability and statistics with particular emphasis on time series and Markov processes. In addition to being a fellow of the Institute of Mathematical Statistics and American Association for the Advancement of Science, he was a Guggenheim fellow (1965–1966, 1971–1972) and was elected to the National Academy of Sciences in 1984. Among his many contributions, Murray conducted seminal work on density estimation, central limit theorems under strong mixing, spectral domain methods and long memory processes. Murray and Ady Rosenblatt were married in 1949 and have two children, Karin and Daniel.

Journal ArticleDOI
TL;DR: Research funding and reputation in the UK have long depended on a regular peer review of all UK departments and are to move to a system based more on bibliometrics; because assessment exercises of this kind influence the behavior of institutions, departments and individuals, bibliometrics will have effects beyond simple measurement.
Abstract: Research funding and reputation in the UK have, for over two decades, been increasingly dependent on a regular peer-review of all UK departments. This is to move to a system more based on bibliometrics. Assessment exercises of this kind influence the behavior of institutions, departments and individuals, and therefore bibliometrics will have effects beyond simple measurement.


Journal ArticleDOI
TL;DR: This paper reviewed the maxims used by three early modern fictional detectives: Monsieur Lecoq, C. Auguste Dupin and Sherlock Holmes and found similarities between these maxims and Bayesian thought.
Abstract: This paper reviews the maxims used by three early modern fictional detectives: Monsieur Lecoq, C. Auguste Dupin and Sherlock Holmes. It finds similarities between these maxims and Bayesian thought. Poe's Dupin uses ideas very similar to Bayesian game theory. Sherlock Holmes' statements also show thought patterns justifiable in Bayesian terms.