
Showing papers on "Random effects model" published in 2018


Journal Article
TL;DR: Clear guidelines for reporting effect size in multilevel models have not been provided; this report suggests appropriate measures (the ICC for random effects, and standardized regression coefficients or f2 for fixed effects) and explores the complexities of reporting R2 as an effect size measure, as well as appropriate measures for more complex models.
Abstract: Effect size reporting is crucial for interpretation of applied research results and for conducting meta-analysis. However, clear guidelines for reporting effect size in multilevel models have not been provided. This report suggests and demonstrates appropriate effect size measures including the ICC for random effects and standardized regression coefficients or f2 for fixed effects. Following this, complexities associated with reporting R2 as an effect size measure are explored, as well as appropriate effect size measures for more complex models including the three-level model and the random slopes model. An example using TIMSS data is provided.
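The two measures named in this abstract can be illustrated with a short sketch. The abstract gives no formulas, so the code below uses the standard textbook definitions (ICC as the between-cluster share of total variance, and Cohen's global f2 = R2 / (1 - R2)); the function names and the variance values are hypothetical, and the paper itself may use a different variant of f2.

```python
def icc(between_var: float, within_var: float) -> float:
    """Intraclass correlation: share of total variance that lies between clusters."""
    total = between_var + within_var
    if between_var < 0 or within_var < 0 or total <= 0:
        raise ValueError("variances must be non-negative with a positive total")
    return between_var / total


def cohens_f2(r2: float) -> float:
    """Cohen's f2 for a model whose explained-variance proportion is r2."""
    if not 0.0 <= r2 < 1.0:
        raise ValueError("r2 must lie in [0, 1)")
    return r2 / (1.0 - r2)


# Hypothetical variance components from a two-level random-intercept model.
example_icc = icc(between_var=20.0, within_var=80.0)
example_f2 = cohens_f2(0.2)
```

With these made-up inputs, one fifth of the total variance sits between clusters.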

351 citations


Posted Content
TL;DR: This paper shows that a linear model for the observed associations approximately holds in a wide variety of settings when all the genetic variants satisfy the exclusion restriction assumption, or in genetic terms, when there is no pleiotropy.
Abstract: Mendelian randomization (MR) is a method of exploiting genetic variation to unbiasedly estimate a causal effect in the presence of unmeasured confounding. MR is being widely used in epidemiology and other related areas of population science. In this paper, we study statistical inference in the increasingly popular two-sample summary-data MR design. We show a linear model for the observed associations approximately holds in a wide variety of settings when all the genetic variants satisfy the exclusion restriction assumption, or in genetic terms, when there is no pleiotropy. In this scenario, we derive a maximum profile likelihood estimator with provable consistency and asymptotic normality. However, through analyzing real datasets, we find strong evidence of both systematic and idiosyncratic pleiotropy in MR, echoing the omnigenic model of complex traits that was recently proposed in genetics. We model the systematic pleiotropy by a random effects model, where no genetic variant satisfies the exclusion restriction condition exactly. In this case we propose a consistent and asymptotically normal estimator by adjusting the profile score. We then tackle the idiosyncratic pleiotropy by robustifying the adjusted profile score. We demonstrate the robustness and efficiency of the proposed methods using several simulated and real datasets.

290 citations


Journal Article
TL;DR: This review discusses the large success of spatial modeling with R-INLA and the types of spatial models that can be fitted, gives an overview of recent developments for areal models, and describes the stochastic partial differential equation (SPDE) approach and some of the ways it can be extended beyond the assumptions of isotropy and separability.
Abstract: Coming up with Bayesian models for spatial data is easy, but performing inference with them can be challenging. Writing fast inference code for a complex spatial model with realistically-sized datasets from scratch is time-consuming, and if changes are made to the model, there is little guarantee that the code performs well. The key advantages of R-INLA are the ease with which complex models can be created and modified, without the need to write complex code, and the speed at which inference can be done even for spatial problems with hundreds of thousands of observations. R-INLA handles latent Gaussian models, where fixed effects, structured and unstructured Gaussian random effects are combined linearly in a linear predictor, and the elements of the linear predictor are observed through one or more likelihoods. The structured random effects can be both standard areal models, such as the Besag and the BYM models, and geostatistical models from a subset of the Matern Gaussian random fields. In this review, we discuss the large success of spatial modeling with R-INLA and the types of spatial models that can be fitted, we give an overview of recent developments for areal models, and we give an overview of the stochastic partial differential equation (SPDE) approach and some of the ways it can be extended beyond the assumptions of isotropy and separability. In particular, we describe how slight changes to the SPDE approach lead to straightforward approaches for nonstationary spatial models and nonseparable space–time models. This article is categorized under: Statistical and Graphical Methods of Data Analysis > Bayesian Methods and Theory; Statistical Models > Bayesian Models; Data: Types and Structure > Massive Data.

231 citations


Journal Article
TL;DR: This work uses dynamic multilevel modeling, as it is incorporated in the new dynamic structural equation modeling (DSEM) toolbox in Mplus, to analyze the affective data from the COGITO study, and investigates whether prior depression affects later depression scores through the random effects of the daily diary measures.
Abstract: With the growing popularity of intensive longitudinal research, the modeling techniques and software options for such data are also expanding rapidly. Here we use dynamic multilevel modeling, as it is incorporated in the new dynamic structural equation modeling (DSEM) toolbox in Mplus, to analyze the affective data from the COGITO study. These data consist of two samples of over 100 individuals each who were measured for about 100 days. We use composite scores of positive and negative affect and apply a multilevel vector autoregressive model to allow for individual differences in means, autoregressions, and cross-lagged effects. Then we extend the model to include random residual variances and covariance, and finally we investigate whether prior depression affects later depression scores through the random effects of the daily diary measures. We end with discussing several urgent—but mostly unresolved—issues in the area of dynamic multilevel modeling.

199 citations


Journal Article
TL;DR: This meta‐analysis provides strong evidence for the adverse effect of PM2.5 on mortality, that studies with poorer exposure have lower effect size estimates, that more control for SES increases effect size estimates, and that significant effects are seen below 10 μg/m3.

194 citations


Journal Article
TL;DR: It is concluded that generalised linear mixed models can result in better statistical inference than the conventional 2‐stage approach but also that this type of model presents issues and difficulties.
Abstract: Comparative trials that report binary outcome data are commonly pooled in systematic reviews and meta-analyses. This type of data can be presented as a series of 2-by-2 tables. The pooled odds ratio is often presented as the outcome of primary interest in the resulting meta-analysis. We examine the use of 7 models for random-effects meta-analyses that have been proposed for this purpose. The first of these models is the conventional one that uses normal within-study approximations and a 2-stage approach. The other models are generalised linear mixed models that perform the analysis in 1 stage and have the potential to provide more accurate inference. We explore the implications of using these 7 models in the context of a Cochrane Review, and we also perform a simulation study. We conclude that generalised linear mixed models can result in better statistical inference than the conventional 2-stage approach but also that this type of model presents issues and difficulties. These challenges include more demanding numerical methods and determining the best way to model study specific baseline risks. One possible approach for analysts is to specify a primary model prior to performing the systematic review but also to present the results using other models in a sensitivity analysis. Only one of the models that we investigate is found to perform poorly so that any of the other models could be considered for either the primary or the sensitivity analysis.
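The first stage of a conventional 2-stage analysis reduces each 2-by-2 table to an effect estimate and a within-study variance. A minimal sketch follows, assuming the common choice of the log odds ratio with its large-sample (Woolf) variance; the abstract does not spell out this formula, the function name and counts are hypothetical, and tables with zero cells are rejected rather than corrected.

```python
import math


def log_odds_ratio(events_trt: int, n_trt: int, events_ctl: int, n_ctl: int):
    """Return (log odds ratio, approximate variance) for one 2-by-2 table."""
    a = events_trt              # treated, event
    b = n_trt - events_trt      # treated, no event
    c = events_ctl              # control, event
    d = n_ctl - events_ctl      # control, no event
    if min(a, b, c, d) <= 0:
        # A continuity correction is one option here; this sketch does not apply one.
        raise ValueError("all four cells must be positive")
    log_or = math.log((a * d) / (b * c))
    variance = 1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d
    return log_or, variance


# Hypothetical trial: 10/100 events under treatment, 20/100 under control.
estimate, within_study_var = log_odds_ratio(10, 100, 20, 100)
```

The need to reject or patch sparse tables in this stage is one practical reason a 1-stage generalised linear mixed model, which works with the binomial counts directly, can be attractive.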

143 citations


Journal Article
TL;DR: A new R package lme4qtl is developed that enables QTL mapping models with a versatile structure of random effects and efficient computation for sparse covariances and offers a flexible framework for scenarios with multiple levels of relatedness and becomes efficient when covariance matrices are sparse.
Abstract: Quantitative trait locus (QTL) mapping in genetic data often involves analysis of correlated observations, which need to be accounted for to avoid false association signals. This is commonly performed by modeling such correlations as random effects in linear mixed models (LMMs). The R package lme4 is a well-established tool that implements major LMM features using sparse matrix methods; however, it is not fully adapted for QTL mapping association and linkage studies. In particular, two LMM features are lacking in the base version of lme4: the definition of random effects by custom covariance matrices; and parameter constraints, which are essential in advanced QTL models. Apart from applications in linkage studies of related individuals, such functionalities are of high interest for association studies in situations where multiple covariance matrices need to be modeled, a scenario not covered by many genome-wide association study (GWAS) software. To address the aforementioned limitations, we developed a new R package lme4qtl as an extension of lme4. First, lme4qtl contributes new models for genetic studies within a single tool integrated with lme4 and its companion packages. Second, lme4qtl offers a flexible framework for scenarios with multiple levels of relatedness and becomes efficient when covariance matrices are sparse. We showed the value of our package using real family-based data in the Genetic Analysis of Idiopathic Thrombophilia 2 (GAIT2) project. Our software lme4qtl enables QTL mapping models with a versatile structure of random effects and efficient computation for sparse covariances. lme4qtl is available at https://github.com/variani/lme4qtl .

118 citations


Journal Article
TL;DR: The median rate ratio is the median relative change in the rate of the occurrence of the event when comparing identical subjects from 2 randomly selected different clusters that are ordered by rate; the paper also describes how the variance partition coefficient, which denotes the proportion of the variation in the outcome that is attributable to between‐cluster differences, can be computed with count outcomes.
Abstract: Multilevel data occur frequently in many research areas like health services research and epidemiology. A suitable way to analyze such data is through the use of multilevel regression models. These models incorporate cluster-specific random effects that allow one to partition the total variation in the outcome into between-cluster variation and between-individual variation. The magnitude of the effect of clustering provides a measure of the general contextual effect. When outcomes are binary or time-to-event in nature, the general contextual effect can be quantified by measures of heterogeneity like the median odds ratio or the median hazard ratio, respectively, which can be calculated from a multilevel regression model. Outcomes that are integer counts denoting the number of times that an event occurred are common in epidemiological and medical research. The median (incidence) rate ratio in multilevel Poisson regression for counts that corresponds to the median odds ratio or median hazard ratio for binary or time-to-event outcomes respectively is relatively unknown and is rarely used. The median rate ratio is the median relative change in the rate of the occurrence of the event when comparing identical subjects from 2 randomly selected different clusters that are ordered by rate. We also describe how the variance partition coefficient, which denotes the proportion of the variation in the outcome that is attributable to between-cluster differences, can be computed with count outcomes. We illustrate the application and interpretation of these measures in a case study analyzing the rate of hospital readmission in patients discharged from hospital with a diagnosis of heart failure.
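The median rate ratio can be computed directly from the estimated random-intercept variance. The sketch below assumes normally distributed cluster intercepts on the log-rate scale and uses the same closed form that is standard for the median odds ratio, exp(sqrt(2 * variance) * z), where z is the 75th percentile of the standard normal; the abstract does not state the formula, and the function name and input value are hypothetical.

```python
import math
from statistics import NormalDist


def median_rate_ratio(cluster_variance: float) -> float:
    """Median rate ratio implied by a random-intercept variance on the log-rate scale."""
    if cluster_variance < 0:
        raise ValueError("variance must be non-negative")
    z75 = NormalDist().inv_cdf(0.75)  # 75th percentile of N(0, 1), about 0.6745
    return math.exp(math.sqrt(2.0 * cluster_variance) * z75)


# Hypothetical between-hospital variance from a multilevel Poisson model.
mrr = median_rate_ratio(0.25)
```

A value of 1 means no between-cluster variation; larger values mean that moving an otherwise identical subject to a higher-rate cluster typically raises the event rate by that factor.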

117 citations


Journal Article
TL;DR: In this paper, the authors present the results of a simulation study that examines the estimation quality of univariate 2-level autoregressive models of order 1, AR(1), using Bayesian analysis in Mplus Version 8.
Abstract: Dynamic structural equation modeling (DSEM) is a novel, intensive longitudinal data (ILD) analysis framework. DSEM models intraindividual changes over time on Level 1 and allows the parameters of these processes to vary across individuals on Level 2 using random effects. DSEM merges time series, structural equation, multilevel, and time-varying effects models. Despite the well-known properties of these analysis areas by themselves, it is unclear how their sample size requirements and recommendations transfer to the DSEM framework. This article presents the results of a simulation study that examines the estimation quality of univariate 2-level autoregressive models of order 1, AR(1), using Bayesian analysis in Mplus Version 8. Three features are varied in the simulations: complexity of the model, number of subjects, and number of time points per subject. Samples with many subjects and few time points are shown to perform substantially better than samples with few subjects and many time points.
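To make the two-level AR(1) structure concrete, the following sketch generates data of the kind such a simulation study would analyze: each subject has its own mean and autoregressive coefficient (Level 2), and observations follow an AR(1) process around that mean (Level 1). All parameter values, the clipping of the coefficient, and the function name are illustrative assumptions and do not reproduce the study's design or its Mplus estimation.

```python
import random


def simulate_multilevel_ar1(n_subjects: int, n_times: int, seed: int = 0):
    """Simulate a list of per-subject AR(1) series with random means and AR coefficients."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_subjects):
        mu = rng.gauss(0.0, 1.0)                 # subject-specific mean (random effect)
        phi = rng.gauss(0.3, 0.1)                # subject-specific autoregression
        phi = max(-0.95, min(0.95, phi))         # keep each series stationary
        previous = mu
        series = []
        for _ in range(n_times):
            # Level 1: deviation from the subject mean carries over with weight phi.
            current = mu + phi * (previous - mu) + rng.gauss(0.0, 1.0)
            series.append(current)
            previous = current
        data.append(series)
    return data


# Two hypothetical designs with the same total number of observations.
many_subjects = simulate_multilevel_ar1(n_subjects=200, n_times=10)
few_subjects = simulate_multilevel_ar1(n_subjects=10, n_times=200)
```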

110 citations


Journal Article
TL;DR: This work presents the user-friendly and comprehensive R package TreeBUGS, which implements the two most important hierarchical MPT approaches for participant heterogeneity—the beta-MPT approach (Smith & Batchelder, Journal of Mathematical Psychology 54:167-183, 2010) and the latent-trait MPT approach.
Abstract: Multinomial processing tree (MPT) models are a class of measurement models that account for categorical data by assuming a finite number of underlying cognitive processes. Traditionally, data are aggregated across participants and analyzed under the assumption of independently and identically distributed observations. Hierarchical Bayesian extensions of MPT models explicitly account for participant heterogeneity by assuming that the individual parameters follow a continuous hierarchical distribution. We provide an accessible introduction to hierarchical MPT modeling and present the user-friendly and comprehensive R package TreeBUGS, which implements the two most important hierarchical MPT approaches for participant heterogeneity—the beta-MPT approach (Smith & Batchelder, Journal of Mathematical Psychology 54:167-183, 2010) and the latent-trait MPT approach (Klauer, Psychometrika 75:70-98, 2010). TreeBUGS reads standard MPT model files and obtains Markov-chain Monte Carlo samples that approximate the posterior distribution. The functionality and output are tailored to the specific needs of MPT modelers and provide tests for the homogeneity of items and participants, individual and group parameter estimates, fit statistics, and within- and between-subjects comparisons, as well as goodness-of-fit and summary plots. We also propose and implement novel statistical extensions to include continuous and discrete predictors (as either fixed or random effects) in the latent-trait MPT model.

91 citations


Journal Article
04 Feb 2018-Extremes
TL;DR: This work estimates a high non-stationary threshold using a gamma distribution for precipitation intensities that incorporates spatial and temporal random effects and develops a penalized complexity (PC) prior specification for the tail index that shrinks the GP model towards the exponential distribution, thus preventing unrealistically heavy tails.
Abstract: This work is motivated by the challenge organized for the 10th International Conference on Extreme-Value Analysis (EVA2017) to predict daily precipitation quantiles at the $99.8\%$ level for each month at observed and unobserved locations. Our approach is based on a Bayesian generalized additive modeling framework that is designed to estimate complex trends in marginal extremes over space and time. First, we estimate a high non-stationary threshold using a gamma distribution for precipitation intensities that incorporates spatial and temporal random effects. Then, we use the Bernoulli and generalized Pareto (GP) distributions to model the rate and size of threshold exceedances, respectively, which we also assume to vary in space and time. The latent random effects are modeled additively using Gaussian process priors, which provide high flexibility and interpretability. We develop a penalized complexity (PC) prior specification for the tail index that shrinks the GP model towards the exponential distribution, thus preventing unrealistically heavy tails. Fast and accurate estimation of the posterior distributions is performed thanks to the integrated nested Laplace approximation (INLA). We illustrate this methodology by modeling the daily precipitation data provided by the EVA2017 challenge, which consist of observations from 40 stations in the Netherlands recorded during the period 1972–2016. Capitalizing on INLA’s fast computational capacity and powerful distributed computing resources, we conduct an extensive cross-validation study to select the model parameters that govern the smoothness of trends. Our results clearly outperform simple benchmarks and are comparable to the best-scoring approaches of the other teams.

Journal Article
TL;DR: Sequencing strength training prior to endurance in concurrent training appears to be beneficial for lower body strength adaptations, while the improvement of aerobic capacity is not affected by training order.
Abstract: We conducted a systematic literature review and meta-analysis to assess the chronic effects of the sequence of concurrent strength and endurance training on selected important physiological and performance parameters, namely lower body 1 repetition maximum (1RM) and maximal aerobic capacity (VO2max/peak). Based on predetermined eligibility criteria, chronic effect trials comparing strength-endurance (SE) with endurance-strength (ES) training sequence in the same session were included. Data on effect sizes, sample size and SD as well as other related study characteristics were extracted. The effect sizes were pooled using fixed or random effects models according to the level of heterogeneity between studies, and a further sensitivity analysis was carried out using Inverse Variance Heterogeneity (IVHet) models to adjust for potential bias due to heterogeneity. Lower body 1RM was significantly higher when strength training preceded endurance with a pooled mean change of 3.96 kg (95%CI: 0.81 to 7.10 kg). However, ...

Journal Article
TL;DR: This paper illustrates why and how to implement random effects in multilevel modeling and provides resources to promote implementation of analyses that control for the nonindependence inherent in many quasi-random sampling designs.
Abstract: Observations are rarely independent in many common discipline-based education research designs. Specifically, students may be clustered (e.g., within sections or courses) and student outcomes may b...

Journal Article
TL;DR: The beta‐binomial model performed best for meta‐analysis of few studies considering the balance between coverage probability and power, and most inverse variance random effects models showed unsatisfactory statistical properties even when more studies were included in the meta‐analysis.
Abstract: Meta-analyses often include only a small number of studies (≤5). Estimating between-study heterogeneity is difficult in this situation. An inaccurate estimation of heterogeneity can result in biased effect estimates and too narrow confidence intervals. The beta-binomial model has shown good statistical properties for meta-analysis of sparse data. We compare the beta-binomial model with different inverse variance random (eg, DerSimonian-Laird, modified Hartung-Knapp, and Paule-Mandel) and fixed effects methods (Mantel-Haenszel and Peto) in a simulation study. The underlying true parameters were obtained from empirical data of actually performed meta-analyses to best mirror real-life situations. We show that valid methods for meta-analysis of a small number of studies are available. In fixed effects situations, the Mantel-Haenszel and Peto methods performed best. In random effects situations, the beta-binomial model performed best for meta-analysis of few studies considering the balance between coverage probability and power. We recommend the beta-binomial model for practical application. If very strong evidence is needed, using the Paule-Mandel heterogeneity variance estimator combined with modified Hartung-Knapp confidence intervals might be useful to confirm the results. Notably, most inverse variance random effects models showed unsatisfactory statistical properties even when more studies (10-50) were included in the meta-analysis.
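Of the inverse variance random effects methods named here, DerSimonian-Laird is the simplest to write down. The sketch below is a plain method-of-moments implementation for illustration only: it is not the beta-binomial model the authors recommend, it omits the Hartung-Knapp and Paule-Mandel variants, and the function name and inputs are hypothetical.

```python
import math


def dersimonian_laird(effects, variances):
    """Return (pooled effect, standard error, tau^2) for an inverse variance random effects model."""
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching variances")
    if any(v <= 0 for v in variances):
        raise ValueError("within-study variances must be positive")

    # Fixed effects (inverse variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q and the moment estimator of the between-study variance.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random effects weights add tau^2 to every within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    sum_w_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum_w_star
    return pooled, math.sqrt(1.0 / sum_w_star), tau2


# Two hypothetical studies with clearly different effects.
pooled, se, tau2 = dersimonian_laird([0.0, 1.0], [0.1, 0.1])
```

With only a handful of studies, Q carries very little information, so the estimate of tau^2 is unstable; that is the small-sample difficulty the abstract describes.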

Journal Article
01 Oct 2018-Medicine
TL;DR: The prevalence of NSSI in Chinese middle-school students is relatively high and substantial heterogeneity in prevalence estimates was revealed, which may not be ideal for Chinese populations.

Journal Article
TL;DR: This meta-analysis is the first quantitative systematic overview of the Big-Fish-Little-Pond effect (BFLPE) and reveals the necessity for educators from all countries to learn about operative means to help students avoid its potential negative effect.
Abstract: The Big-Fish-Little-Pond effect (BFLPE) is well acknowledged as the negative effect of class/school average achievement on student academic self-concept, which profoundly impacts student academic performance and mental development. Although a few studies have been done with regard to this effect, inconsistency exists in the effect size, with little success in finding moderators. Here, we present a meta-analysis that synthesizes the related literature to reach a summary conclusion on the BFLPE. Furthermore, student age, comparison target, academic self-concept domain, student location, sample size, and publication year were examined as potential moderators. Thirty-three studies with fifty-six effect sizes (total N = 1,276,838) were finally included. The random effects model led to a mean of the BFLPE at β = -0.28 (p < 0.001). Moreover, moderator analyses revealed that the BFLPE is an age-based process and an intercultural phenomenon, which is stronger among high school students, in Asia, and when verbal self-concept is considered. This meta-analysis is the first quantitative systematic overview of the BFLPE, whose results are valuable to the understanding of the BFLPE and reveal the necessity for educators from all countries to learn about operative means to help students avoid the potential negative effect. Future research expectations are offered subsequently.

Journal Article
TL;DR: In this article, the spatial, temporal, and spatio-temporal interaction random effects are reparameterized using the spectral decomposition of their precision matrices to establish the appropriate identifiability constraints.
Abstract: Disease mapping studies the distribution of relative risks or rates in space and time, and typically relies on generalized linear mixed models (GLMMs) including fixed effects and spatial, temporal, and spatio-temporal random effects. These GLMMs are typically not identifiable and constraints are required to achieve sensible results. However, automatic specification of constraints can sometimes lead to misleading results. In particular, the penalized quasi-likelihood fitting technique automatically centers the random effects even when this is not necessary. In the Bayesian approach, the recently-introduced integrated nested Laplace approximations computing technique can also produce wrong results if constraints are not well-specified. In this paper the spatial, temporal, and spatio-temporal interaction random effects are reparameterized using the spectral decompositions of their precision matrices to establish the appropriate identifiability constraints. Breast cancer mortality data from Spain is used to illustrate the ideas.

01 Jan 2018
TL;DR: Of these three models investigated, the results showed that the CRPNB model provided better goodness-of-fit and offered more insights into the factors that contribute to tunnel safety, and was not only able to allocate the part of the otherwise unobserved heterogeneity to the individual model parameters but also was able to estimate the cross-correlations between these parameters.
Abstract: The majority of past road safety studies focused on open road segments while only a few focused on tunnels. Moreover, the past tunnel studies produced some inconsistent results about the safety effects of the traffic patterns, the tunnel design, and the pavement conditions. The effects of these conditions therefore remain unknown, especially for freeway tunnels in China. The study presented in this paper investigated the safety effects of these various factors utilizing a four-year period (2009-2012) of data as well as three models: 1) a random effects negative binomial model (RENB), 2) an uncorrelated random parameters negative binomial model (URPNB), and 3) a correlated random parameters negative binomial model (CRPNB). Of these three, the results showed that the CRPNB model provided better goodness-of-fit and offered more insights into the factors that contribute to tunnel safety. The CRPNB was not only able to allocate the part of the otherwise unobserved heterogeneity to the individual model parameters but also was able to estimate the cross-correlations between these parameters. Furthermore, the study results showed that traffic volume, tunnel length, proportion of heavy trucks, curvature, and pavement rutting were associated with higher frequencies of traffic crashes, while the distance to the tunnel wall, distance to the adjacent tunnel, distress ratio, International Roughness Index (IRI), and friction coefficient were associated with lower crash frequencies. In addition, the effects of the heterogeneity of the proportion of heavy trucks, the curvature, the rutting depth, and the friction coefficient were identified and their inter-correlations were analyzed.

Journal Article
TL;DR: This paper clarifies and illustrates issues, focusing on the comparison of conditional and marginal Deviance Information Criteria (DICs) and Watanabe-Akaike Information Criteria (WAICs) in psychometric modeling, and makes recommendations on the general application of the criteria to models with latent variables.
Abstract: Typical Bayesian methods for models with latent variables (or random effects) involve directly sampling the latent variables along with the model parameters. In high-level software code for model definitions (using, e.g., BUGS, JAGS, Stan), the likelihood is therefore specified as conditional on the latent variables. This can lead researchers to perform model comparisons via conditional likelihoods, where the latent variables are considered model parameters. In other settings, however, typical model comparisons involve marginal likelihoods where the latent variables are integrated out. This distinction is often overlooked despite the fact that it can have a large impact on the comparisons of interest. In this paper, we clarify and illustrate these issues, focusing on the comparison of conditional and marginal Deviance Information Criteria (DICs) and Watanabe-Akaike Information Criteria (WAICs) in psychometric modeling. The conditional/marginal distinction corresponds to whether the model should be predictive for the clusters that are in the data or for new clusters (where "clusters" typically correspond to higher-level units like people or schools). Correspondingly, we show that marginal WAIC corresponds to leave-one-cluster out (LOcO) cross-validation, whereas conditional WAIC corresponds to leave-one-unit out (LOuO). These results lead to recommendations on the general application of the criteria to models with latent variables.

Journal Article
TL;DR: In regression analyses of spatially structured data, it is common practice to introduce spatially correlated random effects into the regression model to reduce or even avoid unobserved variable bias.
Abstract: In regression analyses of spatially structured data, it is common practice to introduce spatially correlated random effects into the regression model to reduce or even avoid unobserved variable bia...

Journal Article
TL;DR: This tutorial article discusses how to select models in the Bayesian distributional regression setting, how to monitor convergence of the Markov chains and how to use simulation-based inference also for quantities derived from the original model parametrization.
Abstract: Bayesian methods have become increasingly popular in the past two decades. With the constant rise of computational power, even very complex models can be estimated on virtually any modern computer. Moreover, interest has shifted from conditional mean models to probabilistic distributional models capturing location, scale, shape and other aspects of a response distribution, where covariate effects can have flexible forms, for example, linear, non-linear, spatial or random effects. This tutorial article discusses how to select models in the Bayesian distributional regression setting, how to monitor convergence of the Markov chains and how to use simulation-based inference also for quantities derived from the original model parametrization. We exemplify the workflow using daily weather data on (a) temperatures on Germany's highest mountain and (b) extreme values of precipitation for the whole of Germany.

Journal Article
TL;DR: Genetic and permanent environmental variances and heritability for particular days in milk were high at the beginning and at the end of lactation, and the residual variance decreased throughout the lactation.
Abstract: A multiple-lactation random regression model was applied to test-day records of milk, fat and protein yields in the first three lactations of the Czech Holstein breed. Data included 9 583 cows with 89 584, 44 207 and 11 266 test-day records in the first, second and third lactation, respectively. Milk, fat and protein in the first three lactations were analysed separately and in a multiple-trait analysis. The linear model included herd-test date, fixed regressions within age-season class and two random effects: animal genetic and permanent environment modelled by regressions. The Gibbs sampling method was used to generate samples from marginal posterior distributions of the model parameters. The single- and multiple-trait models provided similar results. Genetic and permanent environmental variances and heritability for particular days in milk were high at the beginning and at the end of lactation. The residual variance decreased throughout the lactation. The resulting heritability ranged from 0.13 to 0.52 and increased with parity.
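Heritability at a given day in milk is the share of phenotypic variance attributable to additive genetic effects. The sketch below assumes the usual decomposition for a model with animal genetic, permanent environmental and residual components; the abstract does not give the formula, and the function name and variance values are hypothetical rather than taken from the study.

```python
def heritability(genetic_var: float, perm_env_var: float, residual_var: float) -> float:
    """Additive genetic variance as a proportion of total phenotypic variance."""
    components = (genetic_var, perm_env_var, residual_var)
    total = sum(components)
    if any(c < 0 for c in components) or total <= 0:
        raise ValueError("variances must be non-negative with a positive total")
    return genetic_var / total


# Hypothetical variance components for one day in milk.
h2 = heritability(genetic_var=2.0, perm_env_var=1.0, residual_var=5.0)
```

In a random regression model the genetic and permanent environmental variances are functions of days in milk, so this ratio is evaluated separately for each day.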

Journal Article
TL;DR: This research proposes a class of basis functions extracted from thin-plate splines that are ordered in terms of their degrees of smoothness with higher-order functions corresponding to larger-scale features and lower-order ones corresponding to smaller-scale details, leading to a parsimonious representation of a (nonstationary) spatial covariance function.
Abstract: The spatial random effects model is flexible in modeling spatial covariance functions and is computationally efficient for spatial prediction via fixed rank kriging (FRK). However, the model depend...

Posted Content
TL;DR: This paper introduces the R-package cAIC4 that allows for the computation of the conditional Akaike Information Criterion (cAIC), and introduces a fast and stable implementation for the calculation of the cAIC for linear mixed models estimated with lme4 and additive mixed models estimated with gamm4.
Abstract: Model selection in mixed models based on the conditional distribution is appropriate for many practical applications and has been a focus of recent statistical research. In this paper we introduce the R-package cAIC4 that allows for the computation of the conditional Akaike Information Criterion (cAIC). Computation of the conditional AIC needs to take into account the uncertainty of the random effects variance and is therefore not straightforward. We introduce a fast and stable implementation for the calculation of the cAIC for linear mixed models estimated with lme4 and additive mixed models estimated with gamm4. Furthermore, cAIC4 offers a stepwise function that allows for a fully automated stepwise selection scheme for mixed models based on the conditional AIC. Examples of many possible applications are presented to illustrate the practical impact and easy handling of the package.

Posted ContentDOI
08 Sep 2018-bioRxiv
TL;DR: This work takes advantage of the fact that the conditional logistic regression model (i.e., the SSF) is likelihood-equivalent to a Poisson model with stratum-specific intercepts, and interprets the intercepts as a random effect with a large (fixed) variance, so inference becomes feasible with standard Bayesian techniques, but also with frequentist methods that allow one to fix the variance of a random effect.
Abstract: Popular frameworks for studying habitat selection include resource-selection functions (RSFs) and step-selection functions (SSFs) estimated using logistic and conditional logistic regression, respectively. Both frameworks compare environmental covariates associated with locations animals visit with environmental covariates at a set of locations assumed available to the animal. Conceptually, random coefficients could be used to accommodate inter-individual heterogeneity with either approach, but straightforward and efficient one-step procedures for fitting SSFs with random coefficients are currently lacking. We take advantage of the fact that the conditional logistic regression model (i.e., the SSF) is likelihood-equivalent to a Poisson model with stratum-specific intercepts. By interpreting the intercepts as a random effect with a large (fixed) variance, inference becomes feasible with standard Bayesian techniques, but also with frequentist methods that allow one to fix the variance of a random effect. We compare this approach to other commonly applied alternatives, including random intercept-only models, and to a two-step algorithm for fitting mixed-effects models. We also reinforce the need to weight available points when fitting RSFs, since models fit using "infinitely weighted logistic regression" have been shown to be equivalent to an inhomogeneous Poisson process (IPP). We generalize this result to "infinitely weighted Poisson regression", which converges to the same underlying IPP distribution. Using data from Eurasian otters (Lutra lutra) and mountain goats (Oreamnos americanus), we illustrate that our models lead to valid and feasible inference. In addition, we conduct a simulation study to demonstrate the importance of including random slopes when estimating individual- and population-level habitat-selection parameters. By providing coded examples using integrated nested Laplace approximations (INLA) and Template Model Builder (TMB) for Bayesian and frequentist analysis via the R packages R-INLA and glmmTMB, we hope to make efficient estimation of RSFs and SSFs with random effects accessible to anyone in the field. SSFs with individual-specific coefficients are particularly attractive since they can provide insights into movement and habitat-selection processes at fine spatial and temporal scales, but these models had previously been very challenging to fit.
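
The likelihood equivalence this abstract relies on can be checked numerically for a single stratum. The sketch below is not the authors' code (they provide R examples with R-INLA and glmmTMB). It assumes one used location per stratum and a single covariate, uses made-up covariate values, and profiles out the stratum intercept exactly instead of treating it as a random effect with a large fixed variance. The function names are hypothetical.

```python
import numpy as np

def clogit_loglik(beta, x, case_idx):
    """Conditional logistic log-likelihood for one stratum with one used point."""
    eta = beta * x
    return eta[case_idx] - np.log(np.sum(np.exp(eta)))

def poisson_profile_loglik(beta, x, case_idx):
    """Poisson log-likelihood with the stratum-specific intercept profiled out."""
    eta = beta * x
    y = np.zeros_like(x)
    y[case_idx] = 1.0
    # The score equation for the intercept, sum(y) = sum(exp(alpha + eta)),
    # has a closed-form solution because sum(y) = 1 in every stratum.
    alpha_hat = -np.log(np.sum(np.exp(eta)))
    mu = np.exp(alpha_hat + eta)
    # log(y!) is zero for y in {0, 1}, so it is omitted.
    return np.sum(y * (alpha_hat + eta) - mu)

# Illustrative (made-up) covariate values: index 2 is the used location.
x = np.array([0.3, -1.2, 0.8, 2.0])
case_idx = 2
betas = np.linspace(-2.0, 2.0, 9)
diff = np.array([
    poisson_profile_loglik(b, x, case_idx) - clogit_loglik(b, x, case_idx)
    for b in betas
])
```

The two log-likelihoods differ only by a constant that does not depend on `beta`, so they are maximized at the same coefficient values.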

Journal ArticleDOI
TL;DR: In this paper, a meta-regression analysis was performed by examining 1,661 efficiency scores retrieved from 120 papers published over the period 2000-2014, and it was shown that parametric methods yield lower levels of banking efficiency than nonparametric studies.
Abstract: One learns two main lessons from studying the great quantity of banking efficiency literature. These lessons regard the heterogeneity in results and the absence of a comprehensive review aimed at understanding the reasons for this variability. Surprisingly, although this issue is well-known, it has not been systematically analyzed before. In order to fill this gap, we perform a Meta-Regression-Analysis (MRA) by examining 1,661 efficiency scores retrieved from 120 papers published over the period 2000-2014. The meta-regression is estimated by using the Random Effects Multilevel Model (REML), because it controls for within-study and between-study heterogeneity. The analysis yields four main results. Firstly, parametric methods yield lower levels of banking efficiency than nonparametric studies. This holds true even after controlling for the approach used in selecting the inputs and outputs of the frontier. Secondly, we show that banking efficiency is highest when using the value added approach, followed by estimates from studies based on the intermediation method, whereas those based on the hybrid approach are the lowest. Thirdly, efficiency scores also depend on the quality of studies and on the number of observations and variables used in the primary papers. As far as the effects of sample size, dimension and quality of papers are concerned, there are significant differences in sign and magnitude between parametric and nonparametric studies. Finally, cost efficiency is found to be higher than profit and production efficiency. Interestingly, MRA results are robust to the potential outliers in efficiency and sample size distributions.

Journal ArticleDOI
TL;DR: This tutorial illustrates the similarities of and differences between these three modelling approaches, explains the reasons why they may provide different conclusions and offers advice on which model to choose depending on the characteristics of the study.
Abstract: Multicentre studies are common in epidemiological research aiming at identifying disease risk factors. A major advantage of multicentre over single-centre studies is that, by including a larger number of participants, they allow consideration of rare outcomes and exposures. Their multicentric nature introduces some complexities at the step of data analysis, in particular when it comes to controlling for confounding by centre, which is the focus of this tutorial. Commonly, epidemiologists use one of the following options: pooling all centre-specific data and adjusting for centre using fixed effects; adjusting for centre using random effects; or fitting centre-specific models and combining the results in a meta-analysis. Here, we illustrate the similarities of and differences between these three modelling approaches, explain the reasons why they may provide different conclusions and offer advice on which model to choose depending on the characteristics of the study. Two key issues to examine during the analyses are to distinguish within-centre from between-centre associations, and the possible heterogeneity of the effects (of exposure and/or confounders) by centre. A real epidemiological study is used to illustrate a situation in which these various options yield different results. A synthetic dataset and R and Stata codes are provided to reproduce the results.

Journal ArticleDOI
03 Jun 2018-Anemia
TL;DR: It has been identified that the number of children under five in the household, wealth index, age of children, mothers' current working status, education level, and source of drinking water have a significant effect on prevalence of anemia.
Abstract: Background. Anemia is a widely spread public health problem and affects individuals at all levels. However, there is a considerable regional variation in its distribution. Objective. Thus, this study aimed to assess and model the determinants of prevalence of anemia among children aged 6–59 months in Ethiopia. Data. Cross-sectional data from Ethiopian Demographic and Health Survey was used for the analysis. It was implemented by the Central Statistical Agency from 27 December 2010 through June 2011 and the sampling technique employed was multistage. Method. The statistical models that suit the hierarchical data such as variance components model, random intercept model, and random coefficients model were used to analyze the data. Likelihood and Bayesian approaches were used to estimate both fixed effects and random effects in multilevel analysis. Result. This study revealed that the prevalence of anemia among children aged between 6 and 59 months in the country was around 42.8%. The multilevel binary logistic regression analysis was performed to investigate the variation of predictor variables of the prevalence of anemia among children aged between 6 and 59 months. Accordingly, it has been identified that the number of children under five in the household, wealth index, age of children, mothers’ current working status, education level, given iron pills, size of child at birth, and source of drinking water have a significant effect on prevalence of anemia. It is found that variances related to the random term were statistically significant implying that there is variation in prevalence of anemia across regions. From the methodological aspect, it was found that random intercept model is better compared to the other two models in fitting the data well. Bayesian analysis gave consistent estimates with the respective multilevel models and additional solutions as posterior distribution of the parameters. Conclusion. 
The current study confirmed that the prevalence of anemia among children aged 6–59 months in Ethiopia was a severe public health problem, with 42.8% of them anemic. Thus, stakeholders should pay attention to all significant factors identified in this analysis, but wealth index (improving household income) and availability of pure drinking water are the most influential factors and should be improved in any case.

Journal ArticleDOI
TL;DR: This article describes an approach using numerical quadrature to obtain population-averaged (PA) estimates from their subject-specific (SS) counterparts in models with multiple random effects, and illustrates the proposed method using data from a smoking cessation study in which a binary outcome was measured longitudinally.
Abstract: This article discusses marginalization of the regression parameters in mixed models for correlated binary outcomes. As is well known, the regression parameters in such models have the "subject-specific" (SS) or conditional interpretation, in contrast to the "population-averaged" (PA) or marginal estimates that represent the unconditional covariate effects. We describe an approach using numerical quadrature to obtain PA estimates from their SS counterparts in models with multiple random effects. Standard errors for the PA estimates are derived using the delta method. We illustrate our proposed method using data from a smoking cessation study in which a binary outcome (smoking, Y/N) was measured longitudinally. We compare our estimates to those obtained using GEE and marginalized multilevel models, and present results from a simulation study.
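
To make the quadrature idea concrete, the sketch below averages a subject-specific probability over a random-effect distribution with Gauss-Hermite quadrature. It is a simplified illustration, not the authors' method: it assumes a logit link and a single normally distributed random intercept with standard deviation `sigma`, whereas the article covers multiple random effects. The function names are hypothetical.

```python
import numpy as np

def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-z))

def marginal_prob(eta, sigma, n_nodes=30):
    """Population-averaged P(Y=1) for a subject-specific linear predictor eta.

    Approximates the integral of expit(eta + b) over b ~ N(0, sigma^2)
    using Gauss-Hermite quadrature.
    """
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    # Change of variables b = sqrt(2) * sigma * node maps the N(0, sigma^2)
    # density onto the exp(-x^2) weight function of the quadrature rule.
    b = np.sqrt(2.0) * sigma * nodes
    return np.sum(weights * expit(eta + b)) / np.sqrt(np.pi)

ss_prob = expit(1.0)               # subject-specific probability at b = 0
pa_prob = marginal_prob(1.0, 2.0)  # population-averaged probability
```

With a nonzero random-intercept variance, the population-averaged probability is pulled toward 0.5 relative to the subject-specific one, which is why the two sets of regression parameters carry different interpretations.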

Journal ArticleDOI
TL;DR: Wang et al. investigated the safety effects of these various factors using a four-year period (2009-2012) of data and three models: (1) a random effects negative binomial model (RENB), (2) an uncorrelated random parameters negative binomial model (URPNB), and (3) a correlated random parameters negative binomial model (CRPNB).