
Showing papers on "Random effects model published in 2013"


Journal ArticleDOI
TL;DR: It is argued that researchers using LMEMs for confirmatory hypothesis testing should minimally adhere to the standards that have been in place for many decades, and it is shown that LMEMs generalize best when they include the maximal random effects structure justified by the design.
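A minimal lme4 sketch of the recommendation, under assumptions: a hypothetical factor condition varying within both subjects and items, a response rt, and a data frame dat (none of these names come from the paper):

```r
library(lme4)

# Maximal random effects structure justified by a design in which
# condition varies within both subjects and items:
m_max <- lmer(rt ~ condition +
                (1 + condition | subject) + (1 + condition | item),
              data = dat)

# Random-intercepts-only comparison model, which the paper argues can be
# anticonservative for confirmatory hypothesis tests:
m_ri <- lmer(rt ~ condition + (1 | subject) + (1 | item), data = dat)
```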

6,878 citations


Journal ArticleDOI
TL;DR: This issue focuses on statistical methods in medical research; among the contributions are two probabilistic models to estimate the male-to-female HIV-1 transmission rate in a single sexual contact.
Abstract: Since John Snow first conducted a modern epidemiological study in 1854 during a cholera epidemic in London, statistics has been associated with medical research. After Austin Bradford Hill published a series of articles on the use of statistical methodology in medical research in 1937, statistical considerations and computational tools have been paramount in conducting medical research [1]. For the past century, statistics has played an important role in the advancement of medical research, and medical research has stimulated rapid development of statistical methods. For example, the development of modern survival analysis, an important branch of statistics, has aimed to solve problems encountered in clinical trials and large-scale epidemiological studies. In this era of evidence-based medicine, the development of novel statistical methods will continue to be crucial in medical research. With the expansion of computer capacity and the advancement of computational techniques, modern statistical methods will inevitably incorporate, to a greater degree, complex computational procedures. This issue focuses on statistical methods in medical research. Several novel methods aimed at solving different medical research questions are introduced, and some unique approaches to statistical analysis are also presented. Hanagal and Sharma contribute two papers. The first deals with a bivariate survival model: they examine a parameter estimation issue when the samples are taken from a bivariate log-logistic distribution with shared gamma frailty, and they propose a Bayesian approach implemented with the Markov Chain Monte Carlo computational technique. A computer simulation is conducted for performance evaluation, and two well-known datasets, one about acute leukemia and the other about kidney infection, are used as examples. The second paper by Hanagal and Sharma examines the shared inverse Gaussian frailty model with a bivariate exponential baseline hazard. They first derive the likelihood of the joint survival function. In their Bayesian approach, the parameters of the baseline hazard are assumed to follow a gamma distribution, while the regression coefficients are assumed to follow independent normal distributions. The dependence of the two components of the survival function is tested, and three information criteria are used for model comparison. The proposed method is applied to analyze diabetic retinopathy data. The paper by Chang, Lyer, Bullitt and Wang provides a method to find determinants of the brain arterial system. They represent the brain arterial system as a binary tree and apply a mixed logistic regression model to find significant covariates. The authors also demonstrate model selection methods for both fixed and random effects, and a case study is presented using the method. This paper provides a rigorous approach for analyzing binary branching structure data and is potentially applicable to other tree-structured data. Chakraborty proposes two probabilistic models to estimate the male-to-female HIV-1 transmission rate in a single sexual contact. One model is applicable when the transmitter cell counts are known, and the other when the receptor cell counts are known. By first uniformizing each transmitter (or receptor) cell count and assuming a beta distribution, the paper algebraically derives the transition probability by imposing boundary conditions based on scientific phenomena related to HIV infection.
The paper by Yeh, Jiang, Garrard, Lei and Gajewski proposes a zero-truncated Poisson model to analyze human cancer tissues transplanted to mice when the positive count of affected ducts is subject to right censoring. A Bayesian approach choosing a Gamma distribution as the prior is adopted. After implementation via complex computational procedures, the paper obtains estimates of the coefficients and demonstrates model fitting through

1,127 citations


Journal ArticleDOI
TL;DR: The authors discuss the limitations of demeaning the dependent variable with respect to the group and of adding the mean of the group's dependent variable as a control, and show that the fixed effects estimator is consistent and should be used instead.
Abstract: Controlling for unobserved heterogeneity (or “common errors”), such as industry-specific shocks, is a fundamental challenge in empirical research. This paper discusses the limitations of two approaches widely used in corporate finance and asset pricing research: demeaning the dependent variable with respect to the group (e.g., “industry-adjusting”) and adding the mean of the group’s dependent variable as a control. We show that these methods produce inconsistent estimates and can distort inference. In contrast, the fixed effects estimator is consistent and should be used instead. We also explain how to estimate the fixed effects model when traditional methods are computationally infeasible. Additional programming advice can be found on our websites.
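A minimal base-R sketch of the contrast, under assumptions: a hypothetical data frame dat with outcome y, regressor x, and grouping factor industry (none of these names come from the paper):

```r
# "Industry-adjusting": demean only the dependent variable within groups.
# The paper shows this approach produces inconsistent estimates.
fit_adj <- lm(I(y - ave(y, industry)) ~ x, data = dat)

# Fixed effects estimator: include group dummies, which demeans both y and x
# within groups; the paper's recommended approach.
fit_fe <- lm(y ~ x + factor(industry), data = dat)
```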

680 citations


Journal ArticleDOI
TL;DR: A new guideline, justified by the logic of mixed-model ANOVA, is proposed: models testing interactions in designs with replications should include random slopes for the highest-order combination of within-unit factors subsumed by each interaction.
Abstract: In a recent paper on mixed-effects models for confirmatory analysis, Barr et al. (2013) offered the following guideline for testing interactions: “one should have by-unit [subject or item] random slopes for any interactions where all factors comprising the interaction are within-unit; if any one factor involved in the interaction is between-unit, then the random slope associated with that interaction cannot be estimated, and is not needed” (p. 275). Although this guideline is technically correct, it is inadequate for many situations, including mixed factorial designs. The following new guideline is therefore proposed: models testing interactions in designs with replications should include random slopes for the highest-order combination of within-unit factors subsumed by each interaction. Designs with replications are designs where there are multiple observations per sampling unit per cell. Psychological experiments typically involve replicated observations, because multiple stimulus items are usually presented to the same subjects within a single condition. If observations are not replicated (i.e., there is only a single observation per unit per cell), random slope variance cannot be distinguished from random error variance and thus random slopes need not be included. This new guideline implies that a model testing AB in a 2 × 2 design where A is between and B within should include a random slope for B. Likewise, a model testing all two- and three-way interactions in a 2 × 2 × 2 design where A is between and B, C are within should include random slopes for B, C, and BC. The justification for the guideline comes from the logic of mixed-model ANOVA. In an ANOVA analysis of the 2 × 2 design described above, the appropriate error term for the test of AB is MSUB, the mean squares for the unit-by-B interaction (e.g., the subjects-by-B or items-by-B interaction). For the 2 × 2 × 2 design, the appropriate error term for ABC and BC is MSUBC, the unit-by-BC interaction; for AB, it is MSUB; and for AC, it is MSUC. To what extent is this ANOVA logic applicable to tests of interactions in mixed-effects models? To address this question, Monte Carlo simulations were performed using R (R Core Team, 2013). Models were estimated using the lmer() function of lme4 (Bates et al., 2013), with p-values derived from model comparison (α = 0.05). The performance of mixed-effects models (in terms of Type I error and power) was assessed over two sets of simulations, one for each of two different mixed factorial designs. The first set focused on the test of the AB interaction in a 2 × 2 design with A between and B within; the second focused on the test of the ABC interaction in a 2 × 2 × 2 design with A between and B, C within. For simplicity, all datasets included only a single source of random effect variance (e.g., by-subject but not by-item variance). The number of replications per cell was 4, 8, or 16. Predictors were coded using deviation (−0.5, 0.5) coding; identical results were obtained using treatment coding. In the rare case (~2%) that a model did not converge, it was removed from the analysis. Power was reported with and without adjustment for Type I error rate, using the adjustment method reported in Barr et al. (2013). For each set of simulations at each of the three replication levels, 10,000 datasets were randomly generated, each with 24 sampled units (e.g., subjects).
The dependent variable was continuous and normally distributed, with all data-generating parameters drawn from uniform distributions. Fixed effects were either between −2 and −1 or between 1 and 2 (with equal probability). The error variance was fixed at 6, and the random effects variance/covariance matrix had variances ranging from 0 to 3 and covariances corresponding to correlations ranging from −0.9 to 0.9. For the 2 × 2 design, mixed-effects models with two different random effects structures were fit to the data: (1) a by-unit random intercept but no random slope for B (“RI”), and (2) a maximal model including a slope for B in addition to the random intercept (“Max”). For comparison purposes, a test of the interaction using mixed-model ANOVA (“AOV”) was performed using R's aov() function. Results for the test of the AB interaction in the 2 × 2 design are in Tables 1 and 2. As expected, the Type I error rates for ANOVA and maximal models were very close to the stated α-level of 0.05. In contrast, models lacking the random slope for B (“RI”) showed unacceptably high Type I error rates, increasing with the number of replications. Adjusted power was comparable for all three types of analyses (Table 2), albeit with a slight overall advantage for RI. Table 1: Type I error rate for the test of AB in the 2 × 2 design. Table 2: Power for the test of AB in the 2 × 2 design, adjusted (raw) p-values. The test of the ABC interaction in the 2 × 2 × 2 design was evaluated under four different random effects structures, all including a random intercept but varying in which random slopes were included. The models were: (1) random intercept only (“RI”); (2) slopes for B and C but not for BC (“nBC”); (3) a slope for BC but not for B or C (“BC”); and (4) maximal (slopes for B, C, and BC; “Max”). For the test of the ABC interaction, ANOVA and maximal models both yielded acceptable Type I performance (Table 3); the model with the BC slope alone (“BC”) was comparably good. However, the model excluding the BC slope had unacceptably high Type I error rates; surprisingly, omitting this random slope may be even worse than a random-intercept-only model. Adjusted power was comparable across all analyses (Table 4). Table 3: Type I error rate for the test of ABC in the 2 × 2 × 2 design. Table 4: Power for the test of ABC in the 2 × 2 × 2 design, adjusted (raw) p-values. To summarize: when testing interactions in mixed designs with replications, it is critical to include the random slope corresponding to the highest-order combination of within-subject factors subsumed by each interaction of interest. It is just as important to attend to this guideline when one seeks to simplify a non-converging model as when one is deciding on what structure to fit in the first place. Failing to include the critical slope in the test of an interaction can yield unacceptably high Type I error rates. Indeed, a model that includes all relevant random slopes except for the single critical slope may perform just as badly as (or possibly even worse than) a random-intercepts-only model, even though such a model is nearly maximal. Finally, note that including only the critical random slope in the model was sufficient to obtain acceptable performance, as illustrated by the “BC” model in the 2 × 2 × 2 design.
Although the current simulations only considered interactions between categorical variables, the guideline applies whenever there are replicated observations, regardless of what types of variables are involved in an interaction (e.g., continuous only, or a mix of categorical and continuous). For example, consider a design with two independent groups of subjects, where there are observations at multiple time points for each subject. When testing the time-by-group interaction, the model should include a random slope for the continuous variable of time; if time is modeled using multiple terms of a polynomial, then there should be a slope for each of the terms in the polynomial that interact with group. For instance, if the effect of time is modeled as Y = β0 + β1t + β2t² and the interest is in whether the β1 and β2 parameters vary across groups, then the random effects structure should include random slopes for both t and t².
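A sketch of the four random effects structures compared in the simulations, using lme4; the data frame dat and variables y, A, B, C, unit are hypothetical stand-ins for the simulated 2 × 2 × 2 datasets:

```r
library(lme4)
# Hypothetical 2 x 2 x 2 data: A between-unit; B, C within-unit with
# replications; predictors deviation-coded (-0.5, 0.5).
m_ri  <- lmer(y ~ A * B * C + (1 | unit),         data = dat)  # "RI"
m_nbc <- lmer(y ~ A * B * C + (1 + B + C | unit), data = dat)  # "nBC"
m_bc  <- lmer(y ~ A * B * C + (1 + B:C | unit),   data = dat)  # "BC"
m_max <- lmer(y ~ A * B * C + (1 + B * C | unit), data = dat)  # "Max"

# p-value for the ABC interaction via model comparison, as in the simulations:
anova(update(m_max, . ~ . - A:B:C), m_max)
```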

459 citations


Journal ArticleDOI
TL;DR: Simulations of one prototypical scenario indicate that LME modeling strikes a balance between controlling false positives and retaining sensitivity for activation detection; the simulations also illustrate the importance of hypothesis formulation.

362 citations


Book
25 Feb 2013
TL;DR: This book discusses APC analysis of data from three common research designs and the conceptualization of cohort effects, distinguishing age, period, and cohort effects.
Abstract (table of contents):
- Introduction: Why Cohort Analysis? (Introduction; The Conceptualization of Cohort Effects; Distinguishing Age, Period, and Cohort; Summary)
- APC Analysis of Data from Three Common Research Designs (Introduction; Repeated Cross-Sectional Data Designs; Research Design I: Age-by-Time Period Tabular Array of Rates/Proportions; Research Design II: Repeated Cross-Sectional Sample Surveys; Research Design III: Prospective Cohort Panels and the Accelerated Longitudinal Design)
- Formalities of the Age-Period-Cohort Analysis Conundrum and a Generalized Linear Mixed Models (GLMM) Framework (Introduction; Descriptive APC Analysis; Algebra of the APC Model Identification Problem; Conventional Approaches to the APC Identification Problem; Generalized Linear Mixed Models (GLMM) Framework)
- APC Accounting/Multiple Classification Model, Part I: Model Identification and Estimation Using the Intrinsic Estimator (Introduction; Algebraic, Geometric, and Verbal Definitions of the Intrinsic Estimator; Statistical Properties; Model Validation: Empirical Example; Model Validation: Monte Carlo Simulation Analyses; Interpretation and Use of the Intrinsic Estimator)
- APC Accounting/Multiple Classification Model, Part II: Empirical Applications (Introduction; Recent U.S. Cancer Incidence and Mortality Trends by Sex and Race: A Three-Step Procedure; APC Model-Based Demographic Projection and Forecasting)
- Mixed Effects Models: Hierarchical APC-Cross-Classified Random Effects Models (HAPC-CCREM), Part I: The Basics (Introduction; Beyond the Identification Problem; Basic Model Specification; Fixed versus Random Effects HAPC Specifications; Interpretation of Model Estimates; Assessing the Significance of Random Period and Cohort Effects; Random Coefficients HAPC-CCREM)
- Mixed Effects Models: Hierarchical APC-Cross-Classified Random Effects Models (HAPC-CCREM), Part II: Advanced Analyses (Introduction; Level 2 Covariates: Age and Temporal Changes in Social Inequalities in Happiness; HAPC-CCREM Analysis of Aggregate Rate Data on Cancer Incidence and Mortality; Full Bayesian Estimation; HAPC-Variance Function Regression)
- Mixed Effects Models: Hierarchical APC-Growth Curve Analysis of Prospective Cohort Data (Introduction; Intercohort Variations in Age Trajectories; Intracohort Heterogeneity in Age Trajectories; Intercohort Variations in Intracohort Heterogeneity Patterns; Summary)
- Directions for Future Research and Conclusion (Introduction; Additional Models; Longitudinal Cohort Analysis of Balanced Cohort Designs of Age Trajectories; Conclusion)
- Index
References appear at the end of each chapter.
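The book's HAPC-CCREM can be sketched with lme4; this is a minimal illustration under assumptions (a hypothetical repeated cross-section data frame dat with outcome y, respondent age, and survey period and birth cohort identifiers), not the book's full specification:

```r
library(lme4)
# Level 1: individuals, with a fixed quadratic age effect.
# Level 2: random intercepts for survey period and birth cohort,
# cross-classified rather than nested.
hapc <- lmer(y ~ age + I(age^2) + (1 | period) + (1 | cohort), data = dat)
summary(hapc)
```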

358 citations


Journal ArticleDOI
TL;DR: The present meta-analysis provides support for the use of implementation intentions to promote physical activity, although the effect size is small to medium.
Abstract: Implementation intentions are a powerful strategy to promote health-related behaviours, but mixed results are observed regarding physical activity. The primary aim of this study was to systematically and quantitatively review the literature on the effectiveness of implementation intentions on physical activity. The second aim was to identify conditions under which effectiveness is optimal. A literature search was performed in several databases for published and non-published reports. The inverse variance method with a random-effects model was used for the meta-analysis of results. Effect sizes were reported as standardized mean differences. Twenty-six independent studies were included in the systematic review. The overall effect size of implementation intentions was 0.31, 95% confidence intervals (CI) [0.11, 0.51] at post-intervention and 0.24, 95% CI [0.13, 0.35] at follow-up. The duration of follow-up had no significant effect on effect size (F(1, 18) = 0.21, p = 0.66). This strategy was more effective a...
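A minimal sketch of this kind of inverse-variance random-effects analysis in R with metafor (a stand-in for whatever software the authors used; the column names in dat are hypothetical per-study summaries):

```r
library(metafor)
# Per-study summaries: means, SDs and sizes of intervention/control groups.
es  <- escalc(measure = "SMD",
              m1i = m_int, sd1i = sd_int, n1i = n_int,
              m2i = m_ctl, sd2i = sd_ctl, n2i = n_ctl, data = dat)
res <- rma(yi, vi, data = es, method = "REML")  # random-effects pooling
summary(res)  # pooled SMD, 95% CI, tau^2 and heterogeneity statistics
```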

357 citations


Journal ArticleDOI
TL;DR: Correlated random effects (Mundlak, 1978, Econometrica 46: 69–85) and hybrid models (Allison, 2009, Fixed Effects Regression Models [Sage]) are attractive alternatives to standard random-effects and fixed-effects models because they provide within estimates of level 1 variables and allow for the inclusion of level 2 variables.
Abstract: Correlated random-effects (Mundlak, 1978, Econometrica 46: 69–85; Wooldridge, 2010, Econometric Analysis of Cross Section and Panel Data [MIT Press]) and hybrid models (Allison, 2009, Fixed Effects Regression Models [Sage]) are attractive alternatives to standard random-effects and fixed-effects models because they provide within estimates of level 1 variables and allow for the inclusion of level 2 variables. I discuss these models, give estimation examples, and address some complications that arise when interaction effects are included.
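A minimal sketch of the Mundlak device in R (the article's own examples are in Stata; the data frame dat and variables y, x, z, id are hypothetical):

```r
library(lme4)
# Correlated random effects: add the cluster mean of the level-1 regressor.
# The coefficient on x then reproduces the within (fixed effects) estimate,
# while the level-2 covariate z remains estimable.
dat$x_mean <- ave(dat$x, dat$id)   # cluster means of x
cre <- lmer(y ~ x + x_mean + z + (1 | id), data = dat)
```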

326 citations


Journal ArticleDOI
TL;DR: This work proposes a new parameterization of the spatial generalized linear mixed model that alleviates spatial confounding and speeds computation by greatly reducing the dimension of the spatial random effects.
Abstract: Summary. Non-Gaussian spatial data are very common in many disciplines. For instance, count data are common in disease mapping, and binary data are common in ecology. When fitting spatial regressions for such data, one needs to account for dependence to ensure reliable inference for the regression coefficients. The spatial generalized linear mixed model offers a very popular and flexible approach to modelling such data, but this model suffers from two major shortcomings: variance inflation due to spatial confounding and high dimensional spatial random effects that make fully Bayesian inference for such models computationally challenging. We propose a new parameterization of the spatial generalized linear mixed model that alleviates spatial confounding and speeds computation by greatly reducing the dimension of the spatial random effects. We illustrate the application of our approach to simulated binary, count and Gaussian spatial data sets, and to a large infant mortality data set.
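A rough sketch of the dimension-reduction idea (restricting the spatial random effects to the leading eigenvectors of a Moran-type operator); X, A, n, and q are assumed objects (design matrix, adjacency matrix, number of areas, reduced rank), and this compresses the paper's construction considerably:

```r
# n areas, design matrix X (n x p), binary adjacency matrix A (n x n),
# reduced rank q << n; all assumed to exist already.
P_perp <- diag(n) - X %*% solve(crossprod(X), t(X))  # project off col(X)
M      <- P_perp %*% A %*% P_perp                    # Moran-type operator
basis  <- eigen(M, symmetric = TRUE)$vectors[, 1:q]  # q smooth patterns
# The reparameterized model keeps X as fixed effects and uses 'basis' as
# synthetic covariates whose q coefficients are the random effects, so the
# random effects are orthogonal to X (less confounding) and low dimensional.
```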

312 citations


Journal ArticleDOI
TL;DR: In this paper, the authors considered a panel data model with time-varying individual effects, where the unobservable individual effects are assumed to have a factor structure, and proposed a generalized method of moments procedure to estimate the true number of individual effects.

201 citations


Posted Content
TL;DR: In this paper, the authors show that the unrestricted weighted least squares estimator is superior to conventional random effects meta-analysis when there is publication (or small-sample) bias and better than a fixed-effect weighted average if there is heterogeneity.
Abstract: This study challenges two core conventional meta-analysis methods: fixed effect and random effects. We show how and explain why an unrestricted weighted least squares estimator is superior to conventional random-effects meta-analysis when there is publication (or small-sample) bias and better than a fixed-effect weighted average if there is heterogeneity. Statistical theory and simulations of effect sizes, log odds ratios and regression coefficients demonstrate that this unrestricted weighted least squares estimator provides satisfactory estimates and confidence intervals that are comparable to random effects when there is no publication (or small-sample) bias and identical to fixed-effect meta-analysis when there is no heterogeneity. When there is publication selection bias, the unrestricted weighted least squares approach dominates random effects; when there is excess heterogeneity, it is clearly superior to fixed-effect meta-analysis. In practical applications, an unrestricted weighted least squares weighted average will often provide superior estimates to both conventional fixed and random effects.
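The unrestricted WLS estimator reduces to a one-line weighted regression: regress the effect sizes on a constant with inverse-variance weights and, unlike fixed-effect meta-analysis, keep the regression's estimated multiplicative error variance rather than restricting it to 1. A minimal sketch (yi and vi are hypothetical vectors of effects and their variances):

```r
# yi: effect sizes, vi: their estimated variances.
uwls <- lm(yi ~ 1, weights = 1 / vi)
summary(uwls)$coefficients
# The intercept equals the fixed-effect (inverse-variance) weighted average;
# its standard error is scaled by the estimated residual SD instead of being
# restricted to 1, which is what "unrestricted" refers to.
```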

Book ChapterDOI
01 Jan 2013
TL;DR: This chapter discusses methods for exploiting the features of longitudinal data to study causal effects, broadly termed fixed effects and random effects models, and describes hybrid models that combine attractive features of each.
Abstract: Longitudinal data are becoming increasingly common in social science research. In this chapter, we discuss methods for exploiting the features of longitudinal data to study causal effects. The methods we discuss are broadly termed fixed effects and random effects models. We begin by discussing some of the advantages of fixed effects models over traditional regression approaches and then present a basic notation for the fixed effects model. This notation serves also as a baseline for introducing the random effects model, a common alternative to the fixed effects approach. After comparing fixed effects and random effects models – paying particular attention to their underlying assumptions – we describe hybrid models that combine attractive features of each. To provide a deeper understanding of these models, and to help researchers determine the most appropriate approach to use when analyzing longitudinal data, we provide three empirical examples. We also briefly discuss several extensions of fixed/random effects models. We conclude by suggesting additional literature that readers may find helpful.
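A minimal sketch of the hybrid (within-between) specification the chapter describes, complementing the Mundlak version sketched earlier; dat, y, x, and id are hypothetical:

```r
library(lme4)
# Decompose the time-varying regressor into between- and within-unit parts.
dat$x_between <- ave(dat$x, dat$id)      # unit means (level 2)
dat$x_within  <- dat$x - dat$x_between   # deviations from unit means
hyb <- lmer(y ~ x_within + x_between + (1 | id), data = dat)
# The x_within coefficient reproduces the fixed-effects (within) estimate,
# while x_between captures the between-unit association.
```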

Journal ArticleDOI
TL;DR: A new user-written command, stjm, is described that allows the user to jointly model a continuous longitudinal response and the time to an event of interest through application to a dataset investigating the effect of serum bilirubin level on time to death from any cause in 312 patients with primary biliary cirrhosis.
Abstract: The joint modeling of longitudinal and survival data has received remarkable attention in the methodological literature over the past decade; however, the availability of software to implement the methods lags behind. The most common form of joint model assumes that the association between the survival and the longitudinal processes is underlined by shared random effects. As a result, computationally intensive numerical integration techniques such as adaptive Gauss–Hermite quadrature are required to evaluate the likelihood. We describe a new user-written command, stjm, that allows the user to jointly model a continuous longitudinal response and the time to an event of interest. We assume a linear mixed-effects model for the longitudinal submodel, allowing flexibility through the use of fixed or random fractional polynomials of time. Four choices are available for the survival submodel: the exponential, Weibull or Gompertz proportional hazard models, and the flexible parametric model (stpm2). Flexible parametric models are fit on the log cumulative-hazard scale, which has direct computational benefits because it avoids the use of numerical integration to evaluate the cumulative hazard. We describe the features of stjm through application to a dataset investigating the effect of serum bilirubin level on time to death from any cause in 312 patients with primary biliary cirrhosis.
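stjm is a Stata command; for orientation, an analogous shared random-effects joint model can be fit in R with the JM package, whose bundled pbc2 data come from a primary biliary cirrhosis study similar to the one analyzed in the paper (this is JM's canonical example, not the stjm syntax):

```r
library(JM)  # loads nlme and survival; ships the pbc2 / pbc2.id data
# Longitudinal submodel: linear mixed model for log serum bilirubin.
lmeFit <- lme(log(serBilir) ~ year, random = ~ year | id, data = pbc2)
# Survival submodel: Cox model for time to death (x = TRUE keeps the design matrix).
coxFit <- coxph(Surv(years, status2) ~ drug, data = pbc2.id, x = TRUE)
# Joint model linking the two through shared random effects; the likelihood
# is evaluated by (adaptive) Gauss-Hermite quadrature, as described above.
jointFit <- jointModel(lmeFit, coxFit, timeVar = "year")
summary(jointFit)
```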

Journal ArticleDOI
TL;DR: An Italy-specific value set, elicited with the time trade-off technique, is now available to calculate QALYs for health economic studies targeted at the Italian health care system; its values are higher than those estimated in the United Kingdom and Spain.

Journal ArticleDOI
TL;DR: In this article, a standard linear model with one additional random effect is used in settings where many predictors have been collected on the same subjects and each predictor is analyzed separately, and the methodology is applied to a large-scale association study of multiple sclerosis including over 20,000 individuals and 500,000 genetic variants.
Abstract: Motivated by genome-wide association studies, we consider a standard linear model with one additional random effect in situations where many predictors have been collected on the same subjects and each predictor is analyzed separately. Three novel contributions are (1) a transformation between the linear and log-odds scales which is accurate for the important genetic case of small effect sizes; (2) a likelihood-maximization algorithm that is an order of magnitude faster than the previously published approaches; and (3) efficient methods for computing marginal likelihoods which allow Bayesian model comparison. The methodology has been successfully applied to a large-scale association study of multiple sclerosis including over 20,000 individuals and 500,000 genetic variants.

Journal ArticleDOI
TL;DR: The model results indicate that the effects of the selected variables on crash occurrence vary across seasons and crash units, and that geometric characteristic variables contribute to the segment variations: the more unobserved heterogeneity is accounted for, the better the classification ability.

01 Jan 2013
TL;DR: In this paper, the authors propose tools to compute minimum detectable effect sizes (MDES) for existing studies and to estimate minimum required sample sizes (MRSS) for studies under design.
Abstract: This paper complements existing power analysis tools by offering tools to compute minimum detectable effect sizes (MDES) for existing studies and to estimate minimum required sample sizes (MRSS) for studies under design. The tools that accompany this paper support estimates of MDES or MRSS for 21 different study designs, comprising 14 random assignment designs (6 designs in which individuals are randomly assigned to treatment or control condition and 8 in which clusters of individuals are randomly assigned to condition, with models differing depending on whether the sample was blocked prior to random assignment and on whether the analytic models assume constant, fixed, or random effects across blocks or assignment clusters) and 7 quasi-experimental designs (an interrupted time series design and 6 regression discontinuity designs that vary depending on whether the sample was blocked prior to randomization, whether individuals or clusters of individuals are assigned to treatment or control condition, and whether the analytic models assume fixed or random effects).
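As one concrete instance of the MDES logic, a hedged sketch of the textbook formula for a simple two-level cluster-randomized design (not the paper's actual tool, and omitting covariate adjustments): the MDES is a degrees-of-freedom multiplier times the standard error of the standardized treatment effect.

```r
# MDES for a cluster-randomized design: J clusters of size n, proportion P
# of clusters treated, intraclass correlation rho (no covariates).
mdes_cra <- function(J, n, rho, P = 0.5, alpha = 0.05, power = 0.80) {
  df <- J - 2                                   # clusters minus 2
  M  <- qt(1 - alpha / 2, df) + qt(power, df)   # two-tailed multiplier
  M * sqrt(rho / (P * (1 - P) * J) +
           (1 - rho) / (P * (1 - P) * J * n))
}
mdes_cra(J = 40, n = 20, rho = 0.10)  # ~0.35 SD under these assumptions
```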

Journal ArticleDOI
TL;DR: In this article, a meta-analysis of a selected sample of 87 estimates from studies based on panel data techniques published through 2012 is performed to obtain a summary measure of the effects of tourism on economic growth, applying models for both fixed and random effects.
Abstract: This article provides a meta-analysis of a selected sample of 87 estimates from studies based on panel data techniques published through 2012. The purpose is to obtain a summary measure of the effects of tourism on economic growth by applying models for both fixed and random effects. The results show a positive elasticity between GDP and tourism, although the magnitude of the effect varies according to the methodological procedure employed in the original studies for the empirical estimates. In particular, when estimates exclude other explanatory variables of economic growth, elasticities are overvalued.

Journal ArticleDOI
TL;DR: In this article, a generalized panel data model with random effects and first-order spatially autocorrelated residuals is proposed, and three Lagrange multiplier (LM) and likelihood ratio (LR) tests are derived that restrict the generalized model to the Anselin model, the Kapoor, Kelejian, and Prucha model, or the simple random effects model.
Abstract: This paper proposes a generalized panel data model with random effects and first-order spatially autocorrelated residuals that encompasses two previously suggested specifications. The first one is described in Anselin's (1988) book and the second one by Kapoor et al. (2007). Our encompassing specification allows us to test for these models as restricted specifications. In particular, we derive three Lagrange multiplier (LM) and likelihood ratio (LR) tests that restrict our generalized model to obtain (i) the Anselin model, (ii) the Kapoor, Kelejian, and Prucha model, and (iii) the simple random effects model that ignores the spatial correlation in the residuals. For two of these three tests, we obtain closed-form solutions and derive their large sample distributions. Our Monte Carlo results show that the suggested tests are powerful in testing for these restricted specifications even in small and medium-sized samples.

Journal ArticleDOI
TL;DR: This paper aims to extend the concept of safety performance functions to areal models of crash frequency and shows that the multivariate spatial model performs better than its univariate counterpart in terms of the penalized goodness-of-fit measure, the Deviance Information Criterion.

Journal ArticleDOI
28 Oct 2013-PLOS ONE
TL;DR: A new statistic, effective degrees of freedom, is introduced as a metric of model complexity, along with a novel low-rank linear mixed model (LRLMM) that learns the dimensionality of the correction for population structure and kinship; its performance is assessed through simulations.
Abstract: Population structure and kinship are widespread confounding factors in genome-wide association studies (GWAS). It has been standard practice to include principal components of the genotypes in a regression model in order to account for population structure. More recently, the linear mixed model (LMM) has emerged as a powerful method for simultaneously accounting for population structure and kinship. The statistical theory underlying the differences in empirical performance between modeling principal components as fixed versus random effects has not been thoroughly examined. We undertake an analysis to formalize the relationship between these widely used methods and elucidate the statistical properties of each. Moreover, we introduce a new statistic, effective degrees of freedom, that serves as a metric of model complexity and a novel low rank linear mixed model (LRLMM) to learn the dimensionality of the correction for population structure and kinship, and we assess its performance through simulations. A comparison of the results of LRLMM and a standard LMM analysis applied to GWAS data from the Multi-Ethnic Study of Atherosclerosis (MESA) illustrates how our theoretical results translate into empirical properties of the mixed model. Finally, the analysis demonstrates the ability of the LRLMM to substantially boost the strength of an association for HDL cholesterol in Europeans.

Book ChapterDOI
01 Jan 2013
TL;DR: In this article, the authors consider the analysis of continuous, hierarchical data using a different class of models, namely, linear mixed-effects models (LMMs), which make it possible to take into account the correlation of observations contained in a dataset and to partition the overall variation of the dependent variable into components corresponding to different levels of the data hierarchy.
Abstract: In Chap. 10, we presented linear models (LMs) with fixed effects for correlated data. They are examples of population-averaged models, because their mean-structure parameters can be interpreted as effects of covariates on the mean value of the dependent variable in the entire population. The association between the observations in a dataset was a result of a grouping of the observations sharing the same level of a grouping factor(s). In this chapter, we consider the analysis of continuous, hierarchical data using a different class of models, namely, linear mixed-effects models (LMMs). They make it possible to take into account the correlation of observations contained in a dataset. Moreover, they make it possible to effectively partition the overall variation of the dependent variable into components corresponding to different levels of the data hierarchy. The models are examples of subject-specific models, because they include subject-specific coefficients. In particular, in Sects. 13.2–13.4, we describe the formulation of the model. Sections 13.5, 13.6, and 13.7 are devoted to, respectively, the estimation approaches, diagnostic tools, and inferential methods used for LMMs in which the (conditional) residual variance-covariance matrix is independent of the mean value; this is the most common type of LMM used in practice. In Sect. 13.8, we focus on LMMs in which the (conditional) residual variance-covariance matrix depends on the mean value. Section 13.9 summarizes the contents of this chapter and offers some general concluding comments.
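A minimal R sketch with nlme of the two model types the chapter distinguishes (dat, y, time, and id are hypothetical; nlme's varPower is one way to let the residual variance depend on the mean):

```r
library(nlme)
# LMM whose residual variance does not depend on the mean value:
fit1 <- lme(y ~ time, random = ~ time | id, data = dat)
# LMM whose residual SD is a power of the fitted mean value,
# specified through a variance function:
fit2 <- lme(y ~ time, random = ~ time | id, data = dat,
            weights = varPower(form = ~ fitted(.)))
```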

Book
04 Nov 2013
TL;DR: A survey of methods for mixed linear models opens this book, which presents a collection of tools for exploring the restricted likelihood for two-variance models, as well as a discussion of the complexity of these models.
Abstract (table of contents):
- Mixed Linear Models: Syntax, Theory, and Methods
- An Opinionated Survey of Methods for Mixed Linear Models (Mixed linear models in the standard formulation; Conventional analysis of the mixed linear model; Bayesian analysis of the mixed linear model; Conventional and Bayesian approaches compared; A few words about computing)
- Two More Tools: Alternative Formulation, Measures of Complexity (Alternative formulation: the "constraint-case" formulation; Measuring the complexity of a mixed linear model fit)
- Richly Parameterized Models as Mixed Linear Models
- Penalized Splines as Mixed Linear Models (Penalized splines: basis, knots, and penalty; More on basis, knots, and penalty; Mixed linear model representation)
- Additive Models and Models with Interactions (Additive models as mixed linear models; Models with interactions)
- Spatial Models as Mixed Linear Models (Geostatistical models; Models for areal data; Two-dimensional penalized splines)
- Time-Series Models as Mixed Linear Models (Example: linear growth model; Dynamic linear models in some generality; Example of a multi-component DLM)
- Two Other Syntaxes for Richly Parameterized Models (Schematic comparison of the syntaxes; Gaussian Markov random fields; Likelihood inference for models with unobservables)
- From Linear Models to Richly Parameterized Models: Mean Structure
- Adapting Diagnostics from Linear Models (Preliminaries; Added variable plots; Transforming variables; Case influence; Residuals)
- Puzzles from Analyzing Real Datasets (Four puzzles; Overview of the next three chapters)
- A Random Effect Competing with a Fixed Effect (Slovenia data: spatial confounding; Kids and crowns: informative cluster size)
- Differential Shrinkage (The simplified model and an overview of the results; Details of derivations; Conclusion: what might cause differential shrinkage?)
- Competition between Random Effects (Collinearity between random effects in three simpler models; Testing hypotheses on the optical-imaging data and DLM models; Discussion)
- Random Effects Old and New (Old-style random effects; New-style random effects; Practical consequences; Conclusion)
- Beyond Linear Models: Variance Structure
- Mysterious, Inconvenient, or Wrong Results from Real Datasets (Periodontal data and the ICAR model; Periodontal data and the ICAR with two classes of neighbor pairs; Two very different smooths of the same data; Misleading zero variance estimates; Multiple maxima in posteriors and restricted likelihoods; Overview of the remaining chapters)
- Re-Expressing the Restricted Likelihood: Two-Variance Models (The re-expression; Examples; A tentative collection of tools)
- Exploring the Restricted Likelihood for Two-Variance Models (Which vj tell us about which variance?; Two mysteries explained)
- Extending the Re-Expressed Restricted Likelihood (Restricted likelihoods that can and can't be re-expressed; Expedients for restricted likelihoods that can't be re-expressed)
- Zero Variance Estimates (Some observations about zero variance estimates; Some thoughts about tools)
- Multiple Maxima in the Restricted Likelihood and Posterior (Restricted likelihoods with multiple local maxima; Posteriors with multiple modes)

Journal ArticleDOI
Simon N. Wood
TL;DR: In this paper, the authors exploit the link between random effects and penalized regression to develop a simple test for a zero effect, which can be used with generalized linear mixed models, including those estimated by penalized quasilikelihood.
Abstract: Summary. Testing that random effects are zero is difficult, because the null hypothesis restricts the corresponding variance parameter to the edge of the feasible parameter space. In the context of generalized linear mixed models, this paper exploits the link between random effects and penalized regression to develop a simple test for a zero effect. The idea is to treat the variance components not being tested as fixed at their estimates and then to express the likelihood ratio as a readily computed quadratic form in the predicted values of the random effects. Under the null hypothesis this has the distribution of a weighted sum of squares of independent standard normal random variables. The test can be used with generalized linear mixed models, including those estimated by penalized quasilikelihood.
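In practice this test is available when random effects are expressed as penalized smooths in mgcv; a self-contained sketch (simulated data, random intercepts for a factor g) in which summary() reports a p-value for the variance component computed along these lines:

```r
library(mgcv)
set.seed(1)
g <- factor(rep(1:20, each = 10))                    # 20 groups of 10
y <- rep(rnorm(20, sd = 0.5), each = 10) + rnorm(200)
# Random intercepts expressed as a penalized "re" smooth:
fit <- gam(y ~ s(g, bs = "re"), method = "REML")
summary(fit)  # the s(g) p-value tests the random-effect variance against zero
```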

Journal ArticleDOI
01 Dec 2013-Chest
TL;DR: Semirigid thoracoscopy is an efficacious and safe procedure in the diagnosis of EPE, and more well-designed trials are required to confirm the results of this study.

Book
03 May 2013
TL;DR: Introduction to R: What Is R?
Abstract (table of contents):
- Introduction to R (What Is R?; Steps on Installing R and Updating R Packages; Database Management and Data Manipulations; A Simple Simulation on Multi-Center Studies; Summary and Recommendations for Further Reading)
- Research Protocol for Meta-Analyses (Introduction; Defining the Research Objective; Criteria for Identifying Studies to Include in the Meta-Analysis; Searching For and Collecting the Studies; Data Abstraction and Extraction; Meta-Analysis Methods; Results; Summary and Discussion)
- Fixed Effects and Random Effects in Meta-Analysis (Two Datasets from Clinical Studies; Fixed-Effects and Random-Effects Models in Meta-Analysis; Data Analysis in R; Which Model Should We Use? Fixed Effects or Random Effects?; Summary and Conclusions)
- Meta-Analysis with Binary Data (Meta-Analysis Methods; Meta-Analysis of Lamotrigine Studies; Discussions)
- Meta-Analysis for Continuous Data (Two Published Datasets; Methods for Continuous Data; Meta-Analysis of Tubeless versus Standard Percutaneous Nephrolithotomy; Discussion)
- Heterogeneity in Meta-Analysis (Heterogeneity Quantity Q and the Test of Heterogeneity in R meta; Quantifying Heterogeneity in R meta; Step-by-Step Implementations in R; Discussions)
- Meta-Regression (Data; Meta-Regression Data Analysis Using R; Discussion)
- Individual Patient-Level Data Analysis versus Meta-Analysis (Introduction; Treatment Comparison for Changes in HAMD; Treatment Comparison for Changes in MADRS; Summary; Simulation Study on Continuous Outcomes; Discussions)
- Meta-Analysis for Rare Events (The Rosiglitazone Meta-Analysis; Step-by-Step Data Analysis in R; Discussion)
- Other R Packages for Meta-Analysis (Combining p-Values in Meta-Analysis; R Packages for Meta-Analysis of Correlation Coefficients; Multivariate Meta-Analysis; Discussions)
- Index
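Matching the book's use of the R meta package, a minimal sketch of fixed effect and random effects pooling for a continuous outcome (the column names in dat are hypothetical arm-level summaries):

```r
library(meta)
# Standardized mean differences pooled under both models; the printout
# reports the common-effect and random-effects results side by side.
m <- metacont(n.e = n_trt, mean.e = mean_trt, sd.e = sd_trt,
              n.c = n_ctl, mean.c = mean_ctl, sd.c = sd_ctl,
              studlab = study, data = dat, sm = "SMD")
m
forest(m)  # forest plot with both pooled estimates
```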

Journal ArticleDOI
TL;DR: It is concluded that multiple imputation provides a practicable approach that can handle arbitrary patterns of systematic missingness, and that bias is reduced by including sufficient between-study random effects in the imputation model.
Abstract: A variable is 'systematically missing' if it is missing for all individuals within particular studies in an individual participant data meta-analysis. When a systematically missing variable is a potential confounder in observational epidemiology, standard methods either fail to adjust the exposure-disease association for the potential confounder or exclude studies where it is missing. We propose a new approach to adjust for systematically missing confounders based on multiple imputation by chained equations. Systematically missing data are imputed via multilevel regression models that allow for heterogeneity between studies. A simulation study compares various choices of imputation model. An illustration is given using data from eight studies estimating the association between carotid intima media thickness and subsequent risk of cardiovascular events. Results are compared with standard methods and also with an extension of a published method that exploits the relationship between fully adjusted and partially adjusted estimated effects through a multivariate random effects meta-analysis model. We conclude that multiple imputation provides a practicable approach that can handle arbitrary patterns of systematic missingness. Bias is reduced by including sufficient between-study random effects in the imputation model.

Journal ArticleDOI
TL;DR: Bayes modal estimation performs well by avoiding boundary estimates; having smaller root mean squared error for the between-study standard deviation; and having better coverage for the overall effects than the other methods when the true model has at least a small or moderate amount of unexplained heterogeneity.
Abstract: Fixed-effects meta-analysis has been criticized because the assumption of homogeneity is often unrealistic and can result in underestimation of parameter uncertainty. Random-effects meta-analysis and meta-regression are therefore typically used to accommodate explained and unexplained between-study variability. However, it is not unusual to obtain a boundary estimate of zero for the (residual) between-study standard deviation, resulting in fixed-effects estimates of the other parameters and their standard errors. To avoid such boundary estimates, we suggest using Bayes modal (BM) estimation with a gamma prior on the between-study standard deviation. When no prior information is available regarding the magnitude of the between-study standard deviation, a weakly informative default prior can be used (with shape parameter 2 and rate parameter close to 0) that produces positive estimates but does not overrule the data, leading to only a small decrease in the log likelihood from its maximum. We review the most commonly used estimation methods for meta-analysis and meta-regression including classical and Bayesian methods and apply these methods, as well as our BM estimator, to real datasets. We then perform simulations to compare BM estimation with the other methods and find that BM estimation performs well by (i) avoiding boundary estimates; (ii) having smaller root mean squared error for the between-study standard deviation; and (iii) better coverage for the overall effects than the other methods when the true model has at least a small or moderate amount of unexplained heterogeneity.
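The Bayes modal idea can be sketched directly: maximize the profile log-likelihood for the between-study SD plus the log of the Gamma(shape 2, rate ≈ 0) prior, whose log(τ) term rules out a boundary estimate at zero. A rough base-R sketch (yi and vi are hypothetical effect sizes and variances; REML-type adjustments are omitted):

```r
# Profile log-likelihood of tau with the overall effect mu profiled out.
prof_ll <- function(tau, yi, vi) {
  w  <- 1 / (vi + tau^2)
  mu <- sum(w * yi) / sum(w)
  0.5 * sum(log(w)) - 0.5 * sum(w * (yi - mu)^2)
}
# Bayes modal estimate: add the log Gamma(shape = 2, rate = eps) prior,
# log(tau) - eps * tau, which barely perturbs the likelihood away from 0.
bm_tau <- function(yi, vi, eps = 1e-4) {
  optimize(function(tau) prof_ll(tau, yi, vi) + log(tau) - eps * tau,
           interval = c(1e-8, 10), maximum = TRUE)$maximum
}
```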

Journal ArticleDOI
TL;DR: It was found that the rankings of the fixed-over-time random effects models are very consistent with one another, and that the standard errors of the crash frequency estimates are significantly reduced for the majority of the segments at the top of the ranking.

Journal ArticleDOI
TL;DR: Climate information was found to improve the estimation of malaria relative risk in 41% of the districts in Malawi, particularly at higher altitudes where transmission is irregular, highlighting the potential value of climate-driven seasonal malaria forecasts.
Abstract: Background: Malaria transmission is influenced by variations in meteorological conditions, which affect the biology of the parasite and its vector, but also by socio-economic conditions, such as levels of urbanization, poverty and education, which affect human vulnerability and vector habitat. The many potential drivers of malaria, both extrinsic, such as climate, and intrinsic, such as population immunity, are often difficult to disentangle. This presents a challenge for the modelling of malaria risk in space and time.
Methods: A statistical mixed model framework is proposed to model malaria risk at the district level in Malawi, using an age-stratified spatio-temporal dataset of malaria cases from July 2004 to June 2011. Several climatic, geographic and socio-economic factors thought to influence malaria incidence were tested in an exploratory model. In order to account for the unobserved confounding factors that influence malaria and are not captured by measured covariates, a generalized linear mixed model was adopted, which included structured and unstructured spatial and temporal random effects. A hierarchical Bayesian framework using Markov chain Monte Carlo simulation was used for model fitting and prediction.
Results: Using a stepwise model selection procedure, several explanatory variables were identified to have significant associations with malaria, including climatic, cartographic and socio-economic data. Once intervention variations, unobserved confounding factors and spatial correlation were considered in a Bayesian framework, a final model emerged with statistically significant predictor variables limited to average precipitation (quadratic relation) and average temperature during the three months previous to the month of interest.
Conclusions: When modelling malaria risk in Malawi it is important to account for spatial and temporal heterogeneity and correlation between districts. Once observed and unobserved confounding factors are allowed for, precipitation and temperature in the months prior to the malaria season of interest are found to significantly determine spatial and temporal variations of malaria incidence. Climate information was found to improve the estimation of malaria relative risk in 41% of the districts in Malawi, particularly at higher altitudes where transmission is irregular. This highlights the potential value of climate-driven seasonal malaria forecasts.
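For orientation, one common way to specify such a model in R; this sketch uses R-INLA rather than the paper's MCMC approach, and every name (dat, cases, expected, precip, temp, district, month_id, month_iid, adj_graph) is a hypothetical stand-in:

```r
library(INLA)
# Poisson counts with expected cases as offset; structured + unstructured
# spatial effects (BYM) and structured + unstructured temporal effects.
form <- cases ~ precip + I(precip^2) + temp +
  f(district, model = "bym", graph = adj_graph) +  # spatial: ICAR + iid
  f(month_id, model = "rw1") +                     # temporal trend (RW1)
  f(month_iid, model = "iid")                      # unstructured temporal
fit <- inla(form, family = "poisson", E = expected, data = dat)
summary(fit)
```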