
Showing papers on "Random effects model published in 2015"


Journal ArticleDOI
TL;DR: In this article, the authors challenge fixed effects (FE) modelling as the default for time-series-cross-sectional and panel data, arguing not simply for technical solutions to endogeneity but for the substantive importance of context/heterogeneity, modelled using RE.
Abstract: This article challenges Fixed Effects (FE) modelling as the ‘default’ for time-series-cross-sectional and panel data. Understanding different within- and between-effects is crucial when choosing modelling strategies. The downside of Random Effects (RE) modelling – correlated lower-level covariates and higher-level residuals – is omitted-variable bias, solvable with Mundlak’s (1978a) formulation. Consequently, RE can provide everything FE promises and more, as confirmed by Monte-Carlo simulations, which additionally show problems with Plumper and Troeger’s FE Vector Decomposition method when data are unbalanced. As well as incorporating time-invariant variables, RE models are readily extendable, with random coefficients, cross-level interactions, and complex variance functions. We argue not simply for technical solutions to endogeneity, but the substantive importance of context/heterogeneity, modelled using RE. The implications extend beyond political science, to all multilevel datasets. However, omitted variables could still bias estimated higher-level variable effects; as with any model, care is required in interpretation.
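The within/between distinction at the heart of Mundlak's formulation can be sketched in a few lines of pure Python (all names and simulated values below are illustrative, not taken from the paper). When the group mean of a covariate is correlated with the group effect, a pooled slope is biased; demeaning by group, the same quantity a Mundlak-style RE model recovers via the coefficient on the within-group deviation, isolates the within effect:

```python
import random
from collections import defaultdict

random.seed(1)

# Simulate grouped data where the group-level mean of x is correlated with
# the group effect u_j -- exactly the case in which naive pooled/RE slopes
# are biased. BETA_WITHIN is an assumed true value for the demo.
BETA_WITHIN = 2.0
groups, per_group = 200, 20

x, y, gid = [], [], []
for j in range(groups):
    xbar = random.gauss(0, 1)
    u = 0.8 * xbar + random.gauss(0, 0.5)   # group effect correlated with xbar
    for _ in range(per_group):
        xi = xbar + random.gauss(0, 1)
        x.append(xi)
        y.append(BETA_WITHIN * xi + u + random.gauss(0, 1))
        gid.append(j)

def slope(xs, ys):
    """OLS slope of ys on xs (single regressor)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    return num / sum((a - mx) ** 2 for a in xs)

# Pooled OLS mixes the contaminated between-group variation into the slope.
beta_pooled = slope(x, y)

# Within ("fixed effects") estimator: demean x and y by group first.
gx, gy = defaultdict(list), defaultdict(list)
for xi, yi, j in zip(x, y, gid):
    gx[j].append(xi); gy[j].append(yi)
dx = [xi - sum(gx[j]) / len(gx[j]) for xi, j in zip(x, gid)]
dy = [yi - sum(gy[j]) / len(gy[j]) for yi, j in zip(y, gid)]
beta_within = slope(dx, dy)
```

Here beta_within lands near the true 2.0 while beta_pooled is inflated; adding the group means of x as a covariate (the Mundlak device) reproduces the within estimate inside an RE model.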

1,036 citations


Posted Content
TL;DR: This work shows that failure to converge typically is not due to a suboptimal estimation algorithm, but is a consequence of attempting to fit a model that is too complex to be properly supported by the data, irrespective of whether estimation is based on maximum likelihood or on Bayesian hierarchical modeling with uninformative or weakly informative priors.
Abstract: The analysis of experimental data with mixed-effects models requires decisions about the specification of the appropriate random-effects structure. Recently, Barr, Levy, Scheepers, and Tily, 2013 recommended fitting `maximal' models with all possible random effect components included. Estimation of maximal models, however, may not converge. We show that failure to converge typically is not due to a suboptimal estimation algorithm, but is a consequence of attempting to fit a model that is too complex to be properly supported by the data, irrespective of whether estimation is based on maximum likelihood or on Bayesian hierarchical modeling with uninformative or weakly informative priors. Importantly, even under convergence, overparameterization may lead to uninterpretable models. We provide diagnostic tools for detecting overparameterization and guiding model simplification.

889 citations


Journal ArticleDOI
TL;DR: In this article, the authors perform a series of Monte Carlo simulations to evaluate the total error due to bias and variance in the inferences of fixed- and random-effects models, for typical sizes and types of datasets encountered in applied research.
Abstract: Empirical analyses in social science frequently confront quantitative data that are clustered or grouped. To account for group-level variation and improve model fit, researchers will commonly specify either a fixed- or random-effects model. But current advice on which approach should be preferred, and under what conditions, remains vague and sometimes contradictory. This study performs a series of Monte Carlo simulations to evaluate the total error due to bias and variance in the inferences of each model, for typical sizes and types of datasets encountered in applied research. The results offer a typology of dataset characteristics to help researchers choose a preferred model.

621 citations


Journal ArticleDOI
TL;DR: TMB as discussed by the authors is an open source R package that enables quick implementation of complex nonlinear random effect (latent variable) models in a manner similar to the established AD Model Builder package (ADMB, this http URL).
Abstract: TMB is an open source R package that enables quick implementation of complex nonlinear random effect (latent variable) models in a manner similar to the established AD Model Builder package (ADMB, this http URL). In addition, it offers easy access to parallel computations. The user defines the joint likelihood for the data and the random effects as a C++ template function, while all the other operations are done in R; e.g., reading in the data. The package evaluates and maximizes the Laplace approximation of the marginal likelihood where the random effects are automatically integrated out. This approximation, and its derivatives, are obtained using automatic differentiation (up to order three) of the joint likelihood. The computations are designed to be fast for problems with many random effects (~10^6) and parameters (~10^3). Computation times using ADMB and TMB are compared on a suite of examples ranging from simple models to large spatial models where the random effects are a Gaussian random field. Speedups ranging from 1.5 to about 100 are obtained with increasing gains for large problems. The package and examples are available at this http URL.
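The marginalization TMB performs rests on the Laplace approximation; the one-dimensional version of the idea can be sketched in plain Python. This toy uses numeric derivatives rather than TMB's automatic differentiation, and a Gaussian test function for which the approximation happens to be exact:

```python
import math

def laplace(f, u0, iters=50, h=1e-5):
    """Laplace approximation of I = integral of exp(f(u)) du:
    I ~= exp(f(u_hat)) * sqrt(2*pi / -f''(u_hat)), u_hat the mode of f.
    The mode is found by Newton's method with numeric derivatives."""
    u = u0
    for _ in range(iters):
        d1 = (f(u + h) - f(u - h)) / (2 * h)
        d2 = (f(u + h) - 2 * f(u) + f(u - h)) / h**2
        u -= d1 / d2
    d2 = (f(u + h) - 2 * f(u) + f(u - h)) / h**2
    return math.exp(f(u)) * math.sqrt(2 * math.pi / -d2)

# Gaussian log-kernel with variance 0.25: exact integral is sqrt(2*pi*0.25),
# and the Laplace approximation is exact for Gaussians.
f = lambda u: -0.5 * (u - 1.0) ** 2 / 0.25
approx = laplace(f, u0=0.0)
exact = math.sqrt(2 * math.pi * 0.25)
```

For non-Gaussian joint likelihoods the approximation is no longer exact, which is why its accuracy (and higher-order derivatives) matter in practice.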

506 citations


Journal ArticleDOI
TL;DR: Adaptive branch-site random effects likelihood (aBSREL), whose key innovation is variable parametric complexity chosen with an information theoretic criterion, delivers statistical performance matching or exceeding best-in-class existing approaches, while running an order of magnitude faster.
Abstract: Over the past two decades, comparative sequence analysis using codon-substitution models has been honed into a powerful and popular approach for detecting signatures of natural selection from molecular data. A substantial body of work has focused on developing a class of “branch-site” models which permit selective pressures on sequences, quantified by the ω ratio, to vary among both codon sites and individual branches in the phylogeny. We develop and present a method in this class, adaptive branch-site random effects likelihood (aBSREL), whose key innovation is variable parametric complexity chosen with an information theoretic criterion. By applying models of different complexity to different branches in the phylogeny, aBSREL delivers statistical performance matching or exceeding best-in-class existing approaches, while running an order of magnitude faster. Based on simulated data analysis, we offer guidelines for what extent and strength of diversifying positive selection can be detected reliably and suggest that there is a natural limit on the optimal parametric complexity for “branch-site” models. An aBSREL analysis of 8,893 Euteleostomes gene alignments demonstrates that over 80% of branches in typical gene phylogenies can be adequately modeled with a single ω ratio model, that is, current models are unnecessarily complicated. However, there are a relatively small number of key branches, whose identities are derived from the data using a model selection procedure, for which it is essential to accurately model evolutionary complexity.

501 citations


Journal ArticleDOI
TL;DR: It is shown that the known issues of underestimation of the statistical error and spuriously overconfident estimates with the RE model can be resolved by the use of an estimator under the fixed effect model assumption with a quasi-likelihood based variance structure - the IVhet model.

386 citations


Journal ArticleDOI
TL;DR: The authors showed that for typical psychological and psycholinguistic data, higher power is achieved without inflating Type I error rate if a model selection criterion is used to select a random effect structure that is supported by the data.
Abstract: Linear mixed-effects models have increasingly replaced mixed-model analyses of variance for statistical inference in factorial psycholinguistic experiments. Although LMMs have many advantages over ANOVA, like ANOVAs, setting them up for data analysis also requires some care. One simple option, when numerically possible, is to fit the full variance-covariance structure of random effects (the maximal model; Barr et al. 2013), presumably to keep Type I error down to the nominal alpha in the presence of random effects. Although it is true that fitting a model with only random intercepts may lead to higher Type I error, fitting a maximal model also has a cost: it can lead to a significant loss of power. We demonstrate this with simulations and suggest that for typical psychological and psycholinguistic data, higher power is achieved without inflating Type I error rate if a model selection criterion is used to select a random effect structure that is supported by the data.

330 citations


Journal ArticleDOI
21 Jul 2015 - PeerJ
TL;DR: Simulation results suggest that OLRE are a useful tool for modelling overdispersion in Binomial data, but that they do not perform well in all circumstances and researchers should take care to verify the robustness of parameter estimates of OLRE models.
Abstract: Overdispersion is a common feature of models of biological data, but researchers often fail to model the excess variation driving the overdispersion, resulting in biased parameter estimates and standard errors. Quantifying and modeling overdispersion when it is present is therefore critical for robust biological inference. One means to account for overdispersion is to add an observation-level random effect (OLRE) to a model, where each data point receives a unique level of a random effect that can absorb the extra-parametric variation in the data. Although some studies have investigated the utility of OLRE to model overdispersion in Poisson count data, studies doing so for Binomial proportion data are scarce. Here I use a simulation approach to investigate the ability of both OLRE models and Beta-Binomial models to recover unbiased parameter estimates in mixed effects models of Binomial data under various degrees of overdispersion. In addition, as ecologists often fit random intercept terms to models when the random effect sample size is low (<5 levels), I investigate the performance of both model types under a range of random effect sample sizes when overdispersion is present. Simulation results revealed that the efficacy of OLRE depends on the process that generated the overdispersion; OLRE failed to cope with overdispersion generated from a Beta-Binomial mixture model, leading to biased slope and intercept estimates, but performed well for overdispersion generated by adding random noise to the linear predictor. Comparison of parameter estimates from an OLRE model with those from its corresponding Beta-Binomial model readily identified when OLRE were performing poorly due to disagreement between effect sizes, and this strategy should be employed whenever OLRE are used for Binomial data to assess their reliability. 
Beta-Binomial models performed well across all contexts, but showed a tendency to underestimate effect sizes when modelling non-Beta-Binomial data. Finally, both OLRE and Beta-Binomial models performed poorly when models contained <5 levels of the random intercept term, especially for estimating variance components, and this effect appeared independent of total sample size. These results suggest that OLRE are a useful tool for modelling overdispersion in Binomial data, but that they do not perform well in all circumstances and researchers should take care to verify the robustness of parameter estimates of OLRE models.
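The kind of overdispersion discussed here is easy to reproduce in a few lines of Python (parameter values are made up for illustration; this demonstrates only the dispersion check, not the OLRE or Beta-Binomial model fits themselves):

```python
import random, statistics

random.seed(42)

# Simulate Beta-Binomial counts, one common source of overdispersion in
# Binomial data: each observation draws its own success probability p
# from a Beta(a, b) mixing distribution.
n_trials, n_obs = 20, 5000
a, b = 2.0, 3.0

counts = []
for _ in range(n_obs):
    p = random.betavariate(a, b)
    counts.append(sum(random.random() < p for _ in range(n_trials)))

p_hat = statistics.mean(counts) / n_trials
var_obs = statistics.variance(counts)
var_binom = n_trials * p_hat * (1 - p_hat)   # variance a plain Binomial predicts

dispersion = var_obs / var_binom   # substantially > 1 signals overdispersion
```

With these parameters the observed variance is several times the Binomial prediction, the excess that an OLRE or Beta-Binomial term is meant to absorb.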

287 citations


Journal ArticleDOI
TL;DR: In this article, the authors show that the unrestricted weighted least squares estimator is superior to conventional random effects meta-analysis when there is publication (or small-sample) bias and better than a fixed-effect weighted average if there is heterogeneity.
Abstract: This study challenges two core conventional meta-analysis methods: fixed effect and random effects. We show how and explain why an unrestricted weighted least squares estimator is superior to conventional random-effects meta-analysis when there is publication (or small-sample) bias and better than a fixed-effect weighted average if there is heterogeneity. Statistical theory and simulations of effect sizes, log odds ratios and regression coefficients demonstrate that this unrestricted weighted least squares estimator provides satisfactory estimates and confidence intervals that are comparable to random effects when there is no publication (or small-sample) bias and identical to fixed-effect meta-analysis when there is no heterogeneity. When there is publication selection bias, the unrestricted weighted least squares approach dominates random effects; when there is excess heterogeneity, it is clearly superior to fixed-effect meta-analysis. In practical applications, an unrestricted weighted least squares weighted average will often provide superior estimates to both conventional fixed and random effects.

183 citations


Journal ArticleDOI
TL;DR: It is proved that the monotone control limit policy is optimal, and a sensitivity analysis of the effects of the model parameters on the optimal policy is conducted.

180 citations


Journal ArticleDOI
TL;DR: This article examines the performance of the updated quality effects (QE) estimator for meta-analysis of heterogeneous studies and shows that this approach leads to a decreased mean squared error (MSE) of the estimator while maintaining the nominal level of coverage probability of the confidence interval.

Journal ArticleDOI
TL;DR: It is shown that RSR provides computational benefits relative to the confounded SGLMM, but that Bayesian credible intervals under RSR can be inappropriately narrow under model misspecification.
Abstract: In spatial generalized linear mixed models (SGLMMs), covariates that are spatially smooth are often collinear with spatially smooth random effects. This phenomenon is known as spatial confounding and has been studied primarily in the case where the spatial support of the process being studied is discrete (e.g., areal spatial data). In this case, the most common approach suggested is restricted spatial regression (RSR) in which the spatial random effects are constrained to be orthogonal to the fixed effects. We consider spatial confounding and RSR in the geostatistical (continuous spatial support) setting. We show that RSR provides computational benefits relative to the confounded SGLMM, but that Bayesian credible intervals under RSR can be inappropriately narrow under model misspecification. We propose a posterior predictive approach to alleviating this potential problem and discuss the appropriateness of RSR in a variety of situations. We illustrate RSR and SGLMM approaches through simulation studies and an analysis of malaria frequencies in The Gambia, Africa. Copyright © 2015 John Wiley & Sons, Ltd.

Journal ArticleDOI
TL;DR: The modified Knapp-Hartung method (mKH) as discussed by the authors applies an ad hoc correction and has been proposed to prevent counterintuitive effects and to yield more conservative inference.
Abstract: Random-effects meta-analysis is commonly performed by first deriving an estimate of the between-study variation, the heterogeneity, and subsequently using this as the basis for combining results, i.e., for estimating the effect, the figure of primary interest. The heterogeneity variance estimate however is commonly associated with substantial uncertainty, especially in contexts where there are only few studies available, such as in small populations and rare diseases. Confidence intervals and tests for the effect may be constructed via a simple normal approximation, or via a Student-t distribution, using the Hartung-Knapp-Sidik-Jonkman (HKSJ) approach, which additionally uses a refined estimator of variance of the effect estimator. The modified Knapp-Hartung method (mKH) applies an ad hoc correction and has been proposed to prevent counterintuitive effects and to yield more conservative inference. We performed a simulation study to investigate the behaviour of the standard HKSJ and modified mKH procedures in a range of circumstances, with a focus on the common case of meta-analysis based on only a few studies. The standard HKSJ procedure works well when the treatment effect estimates to be combined are of comparable precision, but nominal error levels are exceeded when standard errors vary considerably between studies (e.g. due to variations in study size). Application of the modification on the other hand yields more conservative results with error rates closer to the nominal level. Differences are most pronounced in the common case of few studies of varying size or precision. Use of the modified mKH procedure is recommended, especially when only a few studies contribute to the meta-analysis and the involved studies’ precisions (standard errors) vary.
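A minimal sketch of the standard HKSJ computation, with a DerSimonian-Laird heterogeneity estimate and made-up data (the modified mKH correction is not shown):

```python
import math

# Toy meta-analysis: effect estimates and their standard errors.
y  = [0.42, 0.18, 0.55, 0.12, 0.30]
se = [0.08, 0.20, 0.25, 0.10, 0.15]
k = len(y)

# DerSimonian-Laird estimate of the between-study variance tau^2.
w = [1 / s**2 for s in se]
mu_fe = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
q = sum(wi * (yi - mu_fe) ** 2 for wi, yi in zip(w, y))
c = sum(w) - sum(wi**2 for wi in w) / sum(w)
tau2 = max(0.0, (q - (k - 1)) / c)

# Random-effects pooled estimate with weights 1/(se_i^2 + tau^2).
wr = [1 / (s**2 + tau2) for s in se]
mu_re = sum(wi * yi for wi, yi in zip(wr, y)) / sum(wr)

# HKSJ variance: weighted residual variance around mu_re divided by k-1;
# inference then uses a Student-t distribution with k-1 degrees of freedom.
var_hksj = (sum(wi * (yi - mu_re) ** 2 for wi, yi in zip(wr, y))
            / ((k - 1) * sum(wr)))
se_hksj = math.sqrt(var_hksj)
```

The refined variance adapts to how well the studies agree, which is why HKSJ behaves well for comparably precise studies but can miss nominal error levels when precisions differ sharply, the case the mKH modification targets.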

Journal ArticleDOI
Chien-Yu Peng1
TL;DR: The properties of the lifetime distribution and parameter estimation using the EM-type algorithm are presented, along with a simple model-checking procedure to assess the validity of different stochastic processes.
Abstract: Degradation models are widely used to assess the lifetime information of highly reliable products. This study proposes a degradation model based on an inverse normal-gamma mixture of an inverse Gaussian process. This article presents the properties of the lifetime distribution and parameter estimation using the EM-type algorithm, in addition to providing a simple model-checking procedure to assess the validity of different stochastic processes. Several case applications are performed to demonstrate the advantages of the proposed model with random effects and explanatory variables. Technical details, data, and R code are available online as supplementary materials.


Journal ArticleDOI
TL;DR: In this article, a unified Bayesian approach for multivariate structured additive distributional regression analysis comprising a huge class of continuous, discrete and latent multivariate response distributions is proposed, where each parameter of these potentially complex distributions is modelled by a structured additive predictor.
Abstract: We propose a unified Bayesian approach for multivariate structured additive distributional regression analysis comprising a huge class of continuous, discrete and latent multivariate response distributions, where each parameter of these potentially complex distributions is modelled by a structured additive predictor. The latter is an additive composition of different types of covariate effects, e.g. non-linear effects of continuous covariates, random effects, spatial effects or interaction effects. Inference is realized by a generic, computationally efficient Markov chain Monte Carlo algorithm based on iteratively weighted least squares approximations and with multivariate Gaussian priors to enforce specific properties of functional effects. Applications to illustrate our approach include a joint model of risk factors for chronic and acute childhood undernutrition in India and ecological regressions studying the drivers of election results in Germany.

Journal ArticleDOI
TL;DR: The present results suggest that evening orientation is associated with worse academic performance in both school pupils and university students; for the first time, it is shown that this relationship changes over time, being weaker in university students.
Abstract: The association between circadian preference and academic achievement has been assessed through a systematic review and meta-analysis. The literature searches retrieved 1647 studies; 31 studies, with a total sample size of 27,309 participants, fulfilled the inclusion criteria and were included in the meta-analysis. With reference to all these 31 studies, before running the meta-analysis, the sign of the correlation between the investigated variables was set in a way that a positive correlation showed that eveningness was related to worse academic performance. The meta-analysis yielded a small overall effect size of 0.143 (CI [0.129; 0.156]) under a fixed effects model (Z = 20.584, p < 0.001, I² = 72.656; Q = 109.715) and of 0.145 (CI [0.117; 0.172]) under a random effects model (Z = 10.077, p < 0.001). A random effects model with a grouping variable (participants) revealed 15 studies based on school pupils and 16 on university students. The random model showed a higher effect size in school pupils (0.166,...

Journal ArticleDOI
TL;DR: In this article, a quasi-maximum likelihood (QML) estimator for dynamic panel models with spatial errors is proposed, where the cross-sectional dimension n is large and the time dimension T is fixed.

Journal ArticleDOI
TL;DR: This paper describes the inversion scheme using a worked example based upon simulated electrophysiological responses, and uses empirical priors from the second level to iteratively optimize posterior densities over parameters at the first level.
Abstract: This technical note considers a simple but important methodological issue in estimating effective connectivity; namely, how do we integrate measurements from multiple subjects to infer functional brain architectures that are conserved over subjects. We offer a solution to this problem that rests on a generalization of random effects analyses to Bayesian inference about nonlinear models of electrophysiological time-series data. Specifically, we present an empirical Bayesian scheme for group or hierarchical models, in the setting of dynamic causal modeling (DCM). Recent developments in approximate Bayesian inference for hierarchical models enable the efficient estimation of group effects in DCM studies of multiple trials, sessions, or subjects. This approach estimates second (e.g., between-subject) level parameters based on posterior estimates from the first (e.g., within-subject) level. Here, we use empirical priors from the second level to iteratively optimize posterior densities over parameters at the first level. The motivation for this iterative application is to finesse the local minima problem inherent in the (first level) inversion of nonlinear and ill-posed models. Effectively, the empirical priors shrink the first level parameter estimates toward the global maximum, to provide more robust and efficient estimates of within (and between-subject) effects. This paper describes the inversion scheme using a worked example based upon simulated electrophysiological responses. In a subsequent paper, we will assess its robustness and reproducibility using an empirical example.
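The shrinkage idea described above, empirical priors pulling first-level estimates toward the group mean, can be illustrated with the scalar Gaussian case (a toy method-of-moments sketch with made-up numbers, not the DCM inversion scheme itself):

```python
import statistics

# First-level (e.g., per-subject) estimates and their assumed known
# sampling variance. All values are illustrative.
theta = [1.8, 0.4, 2.6, 1.1, 0.9, 2.0]
s2 = 0.5

# Empirical Bayes by method of moments: the group mean and the excess of
# the observed spread over sampling noise act as an empirical prior.
mu = statistics.mean(theta)
tau2 = max(0.0, statistics.variance(theta) - s2)   # between-subject variance

# Each estimate is shrunk toward the group mean; noisier data (larger s2
# relative to tau2) means stronger shrinkage.
shrink = tau2 / (tau2 + s2)
posterior = [mu + shrink * (t - mu) for t in theta]
```

Every shrunken estimate sits between the raw estimate and the group mean, which is the sense in which the empirical prior regularizes the first-level inversion.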

Book
02 Nov 2015
TL;DR: This book introduces multilevel modeling, discussing theoretical and statistical reasons for its use, and builds from a review of single-level regression and nested data structures to random intercept and random coefficient models.
Abstract (table of contents):
Chapter 1: What Is Multilevel Modeling and Why Should I Use It? – Mixing levels of analysis; Theoretical reasons for multilevel modeling; What are the advantages of using multilevel models?; Statistical reasons for multilevel modeling; Assumptions of OLS; Software; How this book is organized.
Chapter 2: Random Intercept Models: When intercepts vary – A review of single-level regression; Nesting structures in our data; Getting started with random intercept models; What do our findings mean so far?; Changing the grouping to schools; Adding Level 1 explanatory variables; Adding Level 2 explanatory variables; Group mean centring; Interactions; Model fit; What about R-squared?; A further assumption and a short note on random and fixed effects.
Chapter 3: Random Coefficient Models: When intercepts and coefficients vary – Getting started with random coefficient models; Trying a different random coefficient; Shrinkage; Fanning in and fanning out; Examining the variances; A dichotomous variable as a random coefficient; More than one random coefficient; A note on parsimony and fitting a model with multiple random coefficients; A model with one random and one fixed coefficient; Adding Level 2 variables; Residual diagnostics; First steps in model-building; Some tasters of further extensions to our basic models; Where to next?
Chapter 4: Communicating Results to a Wider Audience – Creating journal-formatted tables; The fixed part of the model; The importance of the null model; Centring variables; Stata commands to make table-making easier; What do you talk about?; Models with random coefficients; What about graphs?; Cross-level interactions; Parting words.

Journal ArticleDOI
TL;DR: This work focuses on a crossed-random effects extension of the Bayesian latent-trait pair-clustering MPT model that assumes that participant and item effects combine additively on the probit scale and postulates (multivariate) normal distributions for the random effects.
Abstract: Multinomial processing tree (MPT) models are theoretically motivated stochastic models for the analysis of categorical data. Here we focus on a crossed-random effects extension of the Bayesian latent-trait pair-clustering MPT model. Our approach assumes that participant and item effects combine additively on the probit scale and postulates (multivariate) normal distributions for the random effects. We provide a WinBUGS implementation of the crossed-random effects pair-clustering model and an application to novel experimental data. The present approach may be adapted to handle other MPT models.

Journal ArticleDOI
TL;DR: The main findings of this simulation are that in many cases parameter estimates of the extended RE-ESF are more accurate than those of other ESF models; that eliminating the spatial component confounding with explanatory variables results in biased parameter estimates; and that the efficiency of an accuracy maximization-based conventional ESF is comparable to RE-ESF in many cases.
Abstract: Eigenvector spatial filtering (ESF) is becoming a popular way to address spatial dependence. Recently, a random effects specification of ESF (RE-ESF) is receiving considerable attention because of its usefulness for spatial dependence analysis considering spatial confounding. The objective of this study was to analyze theoretical properties of RE-ESF and extend it to overcome some of its disadvantages. We first compare the properties of RE-ESF and ESF with geostatistical and spatial econometric models. There, we suggest two major disadvantages of RE-ESF: it is specific to its selected spatial connectivity structure, and while the current form of RE-ESF eliminates the spatial dependence component confounding with explanatory variables to stabilize the parameter estimation, the elimination can yield biased estimates. RE-ESF is extended to cope with these two problems. A computationally efficient residual maximum likelihood estimation is developed for the extended model. Effectiveness of the extended RE-ESF is examined by a comparative Monte Carlo simulation. The main findings of this simulation are as follows: Our extension successfully reduces errors in parameter estimates; in many cases, parameter estimates of our RE-ESF are more accurate than other ESF models; the elimination of the spatial component confounding with explanatory variables results in biased parameter estimates; efficiency of an accuracy maximization-based conventional ESF is comparable to RE-ESF in many cases.

Journal ArticleDOI
TL;DR: In this paper, the authors validate the Endogenous Growth Model by examining the impacts of Human Capital (HK) and Foreign Direct Investment (FDI) on economic growth in ten countries from Commonwealth of Independent States (CIS).
Abstract: Purpose – The purpose of this paper is to validate the Endogenous Growth Model by examining the impacts of Human Capital (HK) and Foreign Direct Investment (FDI) on economic growth in ten countries of the Commonwealth of Independent States (CIS). Design/methodology/approach – For the empirical investigation, a linear regression model based on growth theory and a panel data set covering the period from 1993 to 2011 are used. Fixed and random effects models are applied. On the basis of the Hausman test, the fixed effects model has been preferred over the random effects model. Findings – The results support the hypothesis of the study by confirming that HK development is critical for economic growth. Similarly, FDI has been found to have a facilitating role in promoting growth in the former Soviet republics now comprising the Central Asian independent economies. This is despite the fact that there are country-specific differences across the CIS. Practical implications – The findings suggest that investment climate i...

Journal ArticleDOI
TL;DR: The effect of missing observations on the estimated range of influence depended to some extent on the missing data mechanism, and the overall effect of missing observations was small compared to the uncertainty of the range estimate.
Abstract: The range of influence refers to the average distance between locations at which the observed outcome is no longer correlated. In many studies, missing data occur and a popular tool for handling missing data is multiple imputation. The objective of this study was to investigate how the estimated range of influence is affected when 1) the outcome is only observed at some of a given set of locations, and 2) multiple imputation is used to impute the outcome at the non-observed locations. The study was based on the simulation of missing outcomes in a complete data set. The range of influence was estimated from a logistic regression model with a spatially structured random effect, modelled by a Gaussian field. Results were evaluated by comparing estimates obtained from complete, missing, and imputed data. In most simulation scenarios, the range estimates were consistent with ≤25% missing data. In some scenarios, however, the range estimate was affected by even a moderate number of missing observations. Multiple imputation provided a potential improvement in the range estimate with ≥50% missing data, but also increased the uncertainty of the estimate. The effect of missing observations on the estimated range of influence depended to some extent on the missing data mechanism. In general, the overall effect of missing observations was small compared to the uncertainty of the range estimate.

Journal ArticleDOI
TL;DR: In this article, the authors introduce a class of models for analyzing degradation data with dynamic covariate information, i.e., dynamically recorded usage and environmental variables, which provide a useful resource for obtaining reliability information for some highly reliable products and systems.
Abstract: Degradation data provide a useful resource for obtaining reliability information for some highly reliable products and systems. In addition to product/system degradation measurements, it is common nowadays to dynamically record product/system usage as well as other life-affecting environmental variables, such as load, amount of use, temperature, and humidity. We refer to these variables as dynamic covariate information. In this article, we introduce a class of models for analyzing degradation data with dynamic covariate information. We use a general path model with individual random effects to describe degradation paths and a vector time series model to describe the covariate process. Shape-restricted splines are used to estimate the effects of dynamic covariates on the degradation process. The unknown parameters in the degradation data model and the covariate process model are estimated by using maximum likelihood. We also describe algorithms for computing an estimate of the lifetime distribution induced...

Journal ArticleDOI
TL;DR: A new approach is proposed that exploits recent distributional results for the extended skew normal family to allow exact likelihood inference for a flexible class of random-effects models and places no restriction on the times at which repeated measurements are made.
Abstract: Random effects or shared parameter models are commonly advocated for the analysis of combined repeated measurement and event history data, including dropout from longitudinal trials. Their use in practical applications has generally been limited by computational cost and complexity, meaning that only simple special cases can be fitted by using readily available software. We propose a new approach that exploits recent distributional results for the extended skew normal family to allow exact likelihood inference for a flexible class of random-effects models. The method uses a discretization of the timescale for the time-to-event outcome, which is often unavoidable in any case when events correspond to dropout. We place no restriction on the times at which repeated measurements are made. An analysis of repeated lung function measurements in a cystic fibrosis cohort is used to illustrate the method.

Journal ArticleDOI
TL;DR: Inverse-variance methods perform poorly when the data contains zeros in either the control or intervention arms, and methods based on Poisson regression with random effect terms for the variance components are very flexible and offer substantial improvement.
Abstract: When summary results from studies of counts of events in time contain zeros, the study-specific incidence rate ratio (IRR) and its standard error cannot be calculated because the log of zero is undefined. This poses problems for the widely used inverse-variance method that weights the study-specific IRRs to generate a pooled estimate. We conducted a simulation study to compare the inverse-variance method of conducting a meta-analysis (with and without the continuity correction) with alternative methods based on either Poisson regression with fixed intervention effects or Poisson regression with random intervention effects. We manipulated the percentage of zeros in the intervention group (from no zeros to approximately 80 percent zeros), the levels of baseline variability and heterogeneity in the intervention effect, and the number of studies that comprise each meta-analysis. We applied these methods to an example from our own work in suicide prevention and to a recent meta-analysis of the effectiveness of condoms in preventing HIV transmission. As the percentage of zeros in the data increased, the inverse-variance method of pooling data showed increased bias and reduced coverage. Estimates from Poisson regression with fixed intervention effects also displayed evidence of bias and poor coverage, due to their inability to account for heterogeneity. Pooled IRRs from Poisson regression with random intervention effects were unaffected by the percentage of zeros in the data or the amount of heterogeneity. Inverse-variance methods perform poorly when the data contain zeros in either the control or intervention arms. Methods based on Poisson regression with random effect terms for the variance components are very flexible and offer substantial improvement.
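The log(0) problem and the continuity correction can be sketched directly. The function below is an illustration, not the authors' code: it pools study-specific log IRRs by inverse variance, adding 0.5 to the event counts of any study with a zero cell.

```python
import numpy as np

def pooled_log_irr(events_t, time_t, events_c, time_c, cc=0.5):
    """Inverse-variance pooled log IRR, with a continuity correction
    applied to studies that contain a zero event count."""
    log_irrs, weights = [], []
    for et, tt, ec, tc in zip(events_t, time_t, events_c, time_c):
        if et == 0 or ec == 0:        # log(0) undefined: add cc to both arms
            et, ec = et + cc, ec + cc
        log_irr = np.log((et / tt) / (ec / tc))
        var = 1 / et + 1 / ec         # approximate variance of the log IRR
        log_irrs.append(log_irr)
        weights.append(1 / var)
    w = np.array(weights)
    return float(np.sum(w * np.array(log_irrs)) / np.sum(w))

# Three hypothetical studies; study 3 has zero events in the treated arm
pooled = pooled_log_irr(events_t=[4, 6, 0], time_t=[100, 120, 80],
                        events_c=[9, 11, 5], time_c=[100, 120, 80])
print(f"pooled IRR = {np.exp(pooled):.2f}")
```

The correction makes the calculation possible but, as the simulation study shows, it is precisely this ad hoc adjustment that biases the inverse-variance estimate as the proportion of zero-count studies grows.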

Journal ArticleDOI
TL;DR: This article considers a multilevel first-order autoregressive [AR(1)] model with random intercepts, random autoregression, and random innovation variance (i.e., the level 1 residual variance) and shows that modeling the innovation variance as fixed across individuals, when it should be modeled as a random effect, leads to biased parameter estimates.
Abstract: In this article we consider a multilevel first-order autoregressive [AR(1)] model with random intercepts, random autoregression, and random innovation variance (i.e., the level 1 residual variance). Including random innovation variance is an important extension of the multilevel AR(1) model for two reasons. First, between-person differences in innovation variance are important from a substantive point of view, in that they capture differences in sensitivity and/or exposure to unmeasured internal and external factors that influence the process. Second, using simulation methods we show that modeling the innovation variance as fixed across individuals, when it should be modeled as a random effect, leads to biased parameter estimates. Additionally, we use simulation methods to compare maximum likelihood estimation to Bayesian estimation of the multilevel AR(1) model and investigate the trade-off between the number of individuals and the number of time points. We provide an empirical illustration by applying the extended multilevel AR(1) model to daily positive affect ratings from 89 married women over the course of 42 consecutive days.
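The extended model can be illustrated by simulation. The sketch below generates data from a multilevel AR(1) process with person-specific intercepts, autoregressive parameters, and innovation variances; all population values are hypothetical, with only the panel dimensions (89 persons, 42 days) taken from the abstract's empirical illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
n_persons, n_days = 89, 42   # dimensions as in the empirical illustration

# Person-level random effects (hypothetical population values)
mu = rng.normal(3.0, 0.5, n_persons)                          # random intercepts
phi = np.clip(rng.normal(0.4, 0.1, n_persons), -0.95, 0.95)   # random AR(1) coefficients
log_sigma2 = rng.normal(-1.0, 0.3, n_persons)                 # random (log) innovation variances

y = np.zeros((n_persons, n_days))
for i in range(n_persons):
    sigma = np.sqrt(np.exp(log_sigma2[i]))
    y[i, 0] = mu[i] + rng.normal(0, sigma)
    for t in range(1, n_days):
        # AR(1) dynamics around the person-specific mean
        y[i, t] = mu[i] + phi[i] * (y[i, t - 1] - mu[i]) + rng.normal(0, sigma)

print(y.shape)
```

Modeling log_sigma2 as a random effect, rather than forcing one common sigma, is the extension the article argues for: fitting a common innovation variance to data like these is what produces the biased parameter estimates reported in the simulations.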

Journal ArticleDOI
TL;DR: In this article, the authors theoretically and empirically evaluate the impacts of this simplified approach and show that the potential bias depends on three quantities: the amount of missingness, the intraclass correlation, and the cluster size.
Abstract: Multiple imputation is widely accepted as the method of choice to address item-nonresponse in surveys. However, research on imputation strategies for the hierarchical structures that are typically found in the data in educational contexts is still limited. While a multilevel imputation model should be preferred from a theoretical point of view if the analysis model of interest is also a multilevel model, many practitioners prefer a fixed effects imputation model with dummies for the clusters since these models are easy to set up with standard imputation software. In this article, we theoretically and empirically evaluate the impacts of this simplified approach. We illustrate that the cluster effects that are often of central interest in educational research can be biased if a fixed effects imputation model is used. We show that the potential bias depends on three quantities: the amount of missingness, the intraclass correlation, and the cluster size. We argue that the bias for the random effects can be su...

Journal ArticleDOI
TL;DR: The use of meta-analytic techniques to combine these estimates is often a useful addition, because it allows a more precise overall estimate of the size of the treatment effects to be generated, and provides a complete and concise summary of the results of the trials.
Abstract: Meta-analysis plays an important role in the analysis and interpretation of clinical trials in medicine and of trials in the social sciences, but is of importance in other fields (e.g., particle physics [1]) as well. In 2001, Hartung and Knapp [2],[3] introduced a new approach to test for a nonzero treatment effect in a meta-analysis of k studies. Hartung and Knapp [2],[3] suggest using the random effects estimate according to DerSimonian and Laird [4] and propose a variance estimator q so that the test statistic for the treatment effect is t distributed with k − 1 degrees of freedom. In their paper on dichotomous endpoints, results of a simulation study with 6 and 12 studies illustrate, for risk differences, log relative risks, and log odds ratios, the excellent properties regarding control of the type I error and the achieved power [2]. They investigate different sample sizes in each study and different amounts of heterogeneity between studies, and compare their new approach (the Hartung and Knapp approach, HK) with the fixed effects approach (FE) and the classical random effects approach by DerSimonian and Laird (DL). It can be clearly seen that, with increasing heterogeneity, the FE as well as the DL does not control the type I error rate, while the HK keeps the type I error rate in nearly every situation and on every scale. Advantages and disadvantages of the two standard approaches and respective test statistics have been extensively discussed (e.g., [5–7]). While it is well known that the FE is too liberal in the presence of heterogeneity, the DL is often thought to be rather conservative, because heterogeneity is incorporated into the standard error of the estimate for the treatment effect and this should lead to larger confidence intervals and smaller test statistics for the treatment effect ([8], chapter 9.4.4.3). This was disproved, among others, by Ziegler and Victor [7], who observed, in situations with increasing heterogeneity, severe inflation of the type I error
for the DerSimonian and Laird test statistic. Notably, the asymptotic properties of this approach will be valid if both the number of studies and the number of patients per study are large enough ([8], chapter 9.5.4; [9,10]). Although power issues of meta-analysis tests have received some interest, comparisons between the approaches and the situation with two studies were not the main interest [11,12]. Borenstein et al. ([10], pp. 363/364) recommend the random effects approach in general for meta-analysis and do not recommend meta-analyses of small numbers of studies. However, meta-analyses of few, and of even only two, trials are of importance. In drug licensing, in many instances two successful phase III clinical trials have to be submitted as pivotal evidence for drug licensing [13], and summarizing the findings of these studies is required according to the International Conference on Harmonisation guidelines E9 and M4E [14,15]. It is stated that 'An overall summary and synthesis of the evidence on safety and efficacy from all the reported clinical trials is required for a marketing application [...] This may be accompanied, when appropriate, by a statistical combination of results' ([14], p. 31). For the summary, 'The use of meta-analytic techniques to combine these estimates is often a useful addition, because it allows a more precise overall estimate of the size of the treatment effects to be generated, and provides a complete and concise summary of the results of the trials' ([14], p. 32). While in standard drug development this summary will usually include more than two studies, in rare diseases barely ever more than two studies of the same intervention are available because of the limited number of patients. Likewise, decision making in the context of health technology assessment is based on systematic reviews and meta-analyses. Often in practice, only two studies are considered homogeneous enough on clinical grounds to be included into a meta-analysis and then form the
basis for decision making about reimbursement [16]. Despite the fact that meta-analysis is non-experimental observational (secondary) research [17] and p-values should be interpreted with caution, meta-analyses of randomized clinical trials are termed highest-level information in evidence-based medicine and are the recommended basis for decision making [18]. As statistical significance plays an important role in the assessment of the meta-analysis, it is mandatory to understand the statistical properties of the relevant methodology also in a situation where only two clinical trials are included in a meta-analysis. We found Cochrane reviews including meta-analyses with two studies only, which are considered for evidence-based decision making even in the presence of a large amount of heterogeneity (I2 ≈ 75%) [19–21]. We repeated the simulation study for dichotomous endpoints of Hartung and Knapp [2] with programs written in R 3.1.0 [22] to compare the statistical properties of the FE, the DL, and the HK for testing the overall treatment effect θ (H0: θ = 0) in a situation with two to six clinical trials. We considered scenarios under the null and alternative hypothesis for the treatment effect, with and without underlying heterogeneity. We present the findings for the odds ratio with pC = 0.2, and varied the probability of success in the treatment group, pT, to investigate the type I error and the power characteristics. The total sample size per meta-analysis was kept constant in the different scenarios (n = 480), with n/k patients per study, to clearly demonstrate the effect of the number of included studies on power and type I error of the various approaches. Likewise, we attempted to avoid problems with zero cell counts or extremely low event rates that may impact on type I error and power as well. I2 was used to describe heterogeneity because thresholds have been published (low: I2 = 25%, moderate: I2 = 50%, and high: I2 = 75%) [23] for the quantification of the degree of
heterogeneity with this measure. We termed I2 ≤ 15% negligible, and this refers to simulations assuming no heterogeneity (i.e., the fixed effects model). Table I summarizes the results of our simulation study, giving an overview of the empirical type I error and power. The well-known anticonservative behavior of the FE and the DL in the presence of even low heterogeneity is visible for small numbers of studies in the meta-analysis. Particularly for the FE, the increase in the type I error is pronounced. With more than four studies, even in situations with substantial heterogeneity, the HK perfectly controls the type I error. There is almost no impact on the power of the test in situations with no or low heterogeneity, and overall it seems as if the only price to be paid for increased heterogeneity is a reduced power of the test. This is in strong contrast to the situation with only two studies. Again, the HK perfectly controls the prespecified type I error. However, even in a homogeneous situation, the power of the meta-analysis test was lower than 15% in situations where the power of the FE and the DL approximates 70% and 60%, respectively. In the presence of even low heterogeneity, with the HK there is not much chance to arrive at a positive conclusion even with substantial treatment effects. Figure 1 (a–d) summarizes the main finding of our simulation study impressively, showing the influence of heterogeneity on empirical power in meta-analyses with k = 2 (left column) and 6 studies for the FE, DL, and HK approaches. In the homogeneous situation with two studies, the DL, and even better the FE, can be used to efficiently base conclusions on a meta-analysis. In contrast, already with mild to moderate heterogeneity, both standard tests severely violate the prespecified type I error, and there is a high risk of false positive conclusions with the classical
approaches. This has major implications for decision making in drug licensing as well. We have noted previously that a meta-analysis can be confirmatory if a drug development program was designed to include a preplanned meta-analysis of the two pivotal trials [24]. As an example, thrombosis prophylaxis was discussed in the paper by Koch and Rohmel [24], where venous thromboembolism is accepted as the primary endpoint in the pivotal trials. In the case when both pivotal trials are successful, they can be combined to demonstrate a positive impact on, for example, mortality. This can be preplanned as a hierarchical testing procedure: first, both pivotal trials will be assessed individually, before confirmatory conclusions are based on the meta-analysis. As explained, neither the FE, nor the DL, nor the HK can be the methodology to be recommended for a priori planning in this sensitive area, unless any indication for heterogeneity is taken as a trigger not to combine studies in a meta-analysis at all. It is our belief that not enough emphasis has been given to this finding in the original paper, and that the important role of heterogeneity is not acknowledged enough in the discussion of findings from meta-analyses in general.
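The three tests compared here can be sketched for a two-study meta-analysis. The sketch below uses invented log odds ratios, the standard DerSimonian-Laird heterogeneity estimator, and the Hartung-Knapp variance estimator q with a t test on k − 1 degrees of freedom; it is an illustration of the formulas, not the authors' simulation code.

```python
import numpy as np
from scipy import stats

def meta_tests(y, v):
    """p-values of the FE z-test, the DL random-effects z-test, and the
    Hartung-Knapp t-test of H0: theta = 0, for study effects y with
    within-study variances v."""
    y, v = np.asarray(y, float), np.asarray(v, float)
    k = len(y)
    w = 1 / v                                        # fixed-effect weights
    mu_fe = np.sum(w * y) / np.sum(w)
    q_stat = np.sum(w * (y - mu_fe) ** 2)            # Cochran's Q
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q_stat - (k - 1)) / c)          # DL heterogeneity estimate
    ws = 1 / (v + tau2)                              # random-effects weights
    mu_re = np.sum(ws * y) / np.sum(ws)
    se_fe = np.sqrt(1 / np.sum(w))
    se_dl = np.sqrt(1 / np.sum(ws))
    # Hartung-Knapp variance estimator q; the test uses t with k-1 df
    q_hk = np.sum(ws * (y - mu_re) ** 2) / ((k - 1) * np.sum(ws))
    se_hk = np.sqrt(q_hk)
    p_fe = 2 * stats.norm.sf(abs(mu_fe / se_fe))
    p_dl = 2 * stats.norm.sf(abs(mu_re / se_dl))
    p_hk = 2 * stats.t.sf(abs(mu_re / se_hk), k - 1)
    return p_fe, p_dl, p_hk

# Two hypothetical, clearly heterogeneous log odds ratios
p_fe, p_dl, p_hk = meta_tests(y=[0.9, 0.1], v=[0.04, 0.04])
print(f"FE p = {p_fe:.4f}, DL p = {p_dl:.4f}, HK p = {p_hk:.4f}")
```

With these numbers the FE test is highly significant while the HK p-value is far larger, reflecting the pattern described above: with k = 2 heterogeneous studies, HK controls the type I error at the price of very low power.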