Showing papers on "Random effects model published in 2019"


Journal ArticleDOI
TL;DR: Simulations reveal that failing to include random slopes can generate anti-conservative standard errors, and that assuming random intercepts are Normally distributed, when they are not, introduces only modest biases, which strengthens the case for the use of fixed and random effects models.
Abstract: This paper assesses the options available to researchers analysing multilevel (including longitudinal) data, with the aim of supporting good methodological decision-making. Given the confusion in the literature about the key properties of fixed and random effects (FE and RE) models, we present these models’ capabilities and limitations. We also discuss the within-between RE model, sometimes misleadingly labelled a ‘hybrid’ model, showing that it is the most general of the three, with all the strengths of the other two. As such, and because it allows for important extensions—notably random slopes—we argue it should be used (as a starting point at least) in all multilevel analyses. We develop the argument through simulations, evaluating how these models cope with some likely mis-specifications. These simulations reveal that (1) failing to include random slopes can generate anti-conservative standard errors, and (2) assuming random intercepts are Normally distributed, when they are not, introduces only modest biases. These results strengthen the case for the use of, and need for, these models.

509 citations
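The within-between RE specification argued for above is easy to set up in standard mixed-model software. The sketch below is an illustration rather than code from the paper: it simulates clustered data, decomposes the covariate into its cluster mean and the within-cluster deviation, and fits the within-between model with a random slope on the within component using statsmodels MixedLM. All variable names and parameter values are invented for the example.

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(42)
    n_groups, n_per = 50, 10
    g = np.repeat(np.arange(n_groups), n_per)
    # Covariate with both a between-cluster and a within-cluster component.
    x = rng.normal(size=g.size) + 0.6 * rng.normal(size=n_groups)[g]
    u = rng.normal(0.0, 1.0, n_groups)[g]                  # cluster random intercepts
    y = 2.0 + 0.5 * x + u + rng.normal(0.0, 1.0, g.size)
    df = pd.DataFrame({"y": y, "x": x, "g": g})

    # Within-between decomposition: cluster mean (between) and deviation from it (within).
    df["x_between"] = df.groupby("g")["x"].transform("mean")
    df["x_within"] = df["x"] - df["x_between"]

    # Within-between RE model with a random slope on the within-cluster deviation.
    model = smf.mixedlm("y ~ x_within + x_between", df, groups=df["g"],
                        re_formula="~x_within")
    print(model.fit().summary())

Separate coefficients on x_within and x_between recover the FE-style within estimate while keeping the RE machinery, which is the sense in which the within-between model generalises both.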


Journal ArticleDOI
TL;DR: The estimated summary effect of the meta-analysis and its confidence interval derived from the Hartung-Knapp-Sidik-Jonkman method are more robust to changes in the heterogeneity variance estimate and show minimal deviation from the nominal coverage of 95% under most of the simulated scenarios.
Abstract: Studies combined in a meta-analysis often have differences in their design and conduct that can lead to heterogeneous results. A random-effects model accounts for these differences in the underlying study effects, which includes a heterogeneity variance parameter. The DerSimonian-Laird method is often used to estimate the heterogeneity variance, but simulation studies have found the method can be biased and other methods are available. This paper compares the properties of nine different heterogeneity variance estimators using simulated meta-analysis data. Simulated scenarios include studies of equal size and of moderate and large differences in size. Results confirm that the DerSimonian-Laird estimator is negatively biased in scenarios with small studies and in scenarios with a rare binary outcome. Results also show the Paule-Mandel method has considerable positive bias in meta-analyses with large differences in study size. We recommend the method of restricted maximum likelihood (REML) to estimate the heterogeneity variance over other methods. However, considering that meta-analyses of health studies typically contain few studies, the heterogeneity variance estimate should not be used as a reliable gauge for the extent of heterogeneity in a meta-analysis. The estimated summary effect of the meta-analysis and its confidence interval derived from the Hartung-Knapp-Sidik-Jonkman method are more robust to changes in the heterogeneity variance estimate and show minimal deviation from the nominal coverage of 95% under most of our simulated scenarios.

408 citations
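For readers who want to see the mechanics, here is a minimal numpy sketch of the DerSimonian-Laird heterogeneity variance estimator discussed above, followed by the conventional random-effects pooled estimate and Wald interval. The effect sizes and variances are invented toy numbers, not data from the study.

    import numpy as np
    from scipy import stats

    # Toy study effects (e.g., log odds ratios) and within-study variances.
    y = np.array([0.21, -0.05, 0.34, 0.12, 0.40, 0.02])
    v = np.array([0.08, 0.05, 0.12, 0.03, 0.15, 0.06])
    k = len(y)

    # Fixed-effect weights and Cochran's Q statistic.
    w = 1 / v
    mu_fe = np.sum(w * y) / np.sum(w)
    Q = np.sum(w * (y - mu_fe) ** 2)

    # DerSimonian-Laird heterogeneity variance, truncated at zero.
    tau2_dl = max(0.0, (Q - (k - 1)) / (np.sum(w) - np.sum(w ** 2) / np.sum(w)))

    # Random-effects pooled estimate with a conventional Wald 95% CI.
    w_re = 1 / (v + tau2_dl)
    mu_re = np.sum(w_re * y) / np.sum(w_re)
    se_re = np.sqrt(1 / np.sum(w_re))
    z = stats.norm.ppf(0.975)
    print(f"tau^2 (DL) = {tau2_dl:.3f}")
    print(f"pooled effect = {mu_re:.3f}, "
          f"95% CI = ({mu_re - z * se_re:.3f}, {mu_re + z * se_re:.3f})")

REML, the estimator recommended above, replaces the closed-form DL step with an iterative fit (for example via a mixed-model routine), but the pooling step is unchanged.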


Journal ArticleDOI
TL;DR: This paper presents a special capability of Sisvar for dealing with fixed effect models that have several restrictions in the randomization procedure, which lead to models with fixed treatment effects but several random error terms.
Abstract: This paper presents a special capability of Sisvar for dealing with fixed effect models that have several restrictions in the randomization procedure. These restrictions lead to models with fixed treatment effects, but with several random error terms. One way to deal with models of this kind is to perform a mixed model analysis, treating only the error effects in the model as random effects and allowing different covariance structures for the error terms. Another way is to perform an analysis of variance with several error terms. When the data are balanced, this kind of analysis can be done using Sisvar. The software provides an exact $F$ test for the fixed effects and allows the user to apply multiple comparison procedures or regression analysis to the levels of the fixed effect factors, whether they are single effects, interaction effects, or hierarchical effects. Sisvar is an interesting statistical computer system for use with balanced agricultural and industrial data sets.

398 citations


Journal ArticleDOI
TL;DR: In this paper, the Chamberlain-Mundlak approach for balanced panels is extended to unbalanced panels, allowing unobserved heterogeneity to be correlated with the observed covariates and with sample selection.

356 citations


Journal ArticleDOI
TL;DR: This paper provides a worked example of using Dynamic Causal Modelling (DCM) and Parametric Empirical Bayes (PEB) to characterise inter-subject variability in neural circuitry (effective connectivity) and provides a tutorial style explanation of the underlying theory and assumptions.

220 citations


Book ChapterDOI
28 Oct 2019
TL;DR: In this article, the authors describe a class of statistical model that is able to account for most of the cases of nonindependence that are typically encountered in psychological experiments, linear mixed-effects models, or mixed models for short.
Abstract: This chapter describes a class of statistical model that is able to account for most of the cases of nonindependence that are typically encountered in psychological experiments, linear mixed-effects models, or mixed models for short. It introduces the concepts underlying mixed models and how they allow accounting for different types of nonindependence that can occur in psychological data. The chapter discusses how to set up a mixed model and how to perform statistical inference with a mixed model. The most important concept for understanding how to estimate and how to interpret mixed models is the distinction between fixed and random effects. One important characteristic of mixed models is that they allow random effects for multiple, possibly independent, random effects grouping factors. Mixed models are a modern class of statistical models that extend regular regression models by including random-effects parameters to account for dependencies among related data points.

211 citations
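In the matrix notation commonly used for the linear mixed-effects models this chapter describes, the model combines fixed effects with one or more sets of random effects:

\[
\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}\mathbf{u} + \boldsymbol{\varepsilon},
\qquad
\mathbf{u} \sim \mathcal{N}(\mathbf{0}, \mathbf{G}),
\qquad
\boldsymbol{\varepsilon} \sim \mathcal{N}(\mathbf{0}, \mathbf{R}),
\]

where \(\boldsymbol{\beta}\) collects the fixed effects, \(\mathbf{u}\) the random effects for the grouping factors (for example, participants and items in a psychological experiment), \(\mathbf{G}\) their covariance matrix, and \(\mathbf{R}\) the residual covariance (often \(\sigma^2\mathbf{I}\)). Nonindependence between observations that share a grouping level enters through \(\mathbf{Z}\mathbf{u}\).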


Journal ArticleDOI
TL;DR: It is shown how the one-stage method for meta-analysis of non-linear curves is particularly suited for dose–response meta-analyses of aggregated data where the complexity of the research question is better addressed by including all the studies.
Abstract: The standard two-stage approach for estimating non-linear dose-response curves based on aggregated data typically excludes those studies with fewer than three exposure groups. We develop the one-stage method as a linear mixed model and present the main aspects of the methodology, including model specification, estimation, testing, prediction, goodness-of-fit, model comparison, and quantification of between-studies heterogeneity. Using both fictitious and real data from a published meta-analysis, we illustrated the main features of the proposed methodology and compared it to a traditional two-stage analysis. In a one-stage approach, the pooled curve and estimates of the between-studies heterogeneity are based on the whole set of studies without any exclusion. Thus, even complex curves (splines, spike at zero exposure) defined by several parameters can be estimated. We showed how the one-stage method may facilitate several applications, in particular quantification of heterogeneity over the exposure range, prediction of marginal and conditional curves, and comparison of alternative models. The one-stage method for meta-analysis of non-linear curves is implemented in the dosresmeta R package. It is particularly suited for dose-response meta-analyses of aggregated data where the complexity of the research question is better addressed by including all the studies.

173 citations
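One common way to write a one-stage dose-response model of the kind described above is as a single linear mixed model over all dose levels of all studies; the formulation below restates that standard setup rather than copying notation from the paper:

\[
\mathbf{y}_i = \mathbf{X}_i\left(\boldsymbol{\beta} + \mathbf{b}_i\right) + \boldsymbol{\varepsilon}_i,
\qquad
\mathbf{b}_i \sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Psi}),
\qquad
\boldsymbol{\varepsilon}_i \sim \mathcal{N}(\mathbf{0}, \mathbf{S}_i),
\]

where \(\mathbf{y}_i\) holds the (log) relative risks at the non-referent dose levels of study \(i\), \(\mathbf{X}_i\) the chosen dose transformations (linear terms, splines, a spike at zero), \(\boldsymbol{\beta}\) the pooled curve coefficients, \(\mathbf{b}_i\) study-specific deviations with between-study covariance \(\boldsymbol{\Psi}\), and \(\mathbf{S}_i\) the approximated within-study covariance of the correlated log relative risks. Because each study contributes however many dose levels it has, nothing is excluded, which is the point made in the abstract.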


Journal ArticleDOI
TL;DR: This paper aims to provide a comprehensive overview of available methods for calculating point estimates, confidence intervals, and prediction intervals for the overall effect size under the random-effects model, and indicates whether some methods are preferable to others by considering the results of comparative simulation and real-life data studies.
Abstract: Meta-analyses are an important tool within systematic reviews to estimate the overall effect size and its confidence interval for an outcome of interest. If heterogeneity between the results of the relevant studies is anticipated, then a random-effects model is often preferred for analysis. In this model, a prediction interval for the true effect in a new study also provides additional useful information. However, the DerSimonian and Laird method, frequently used as the default method for meta-analyses with random effects, has long been challenged due to its unfavorable statistical properties. Several alternative methods have been proposed that may have better statistical properties in specific scenarios. In this paper, we aim to provide a comprehensive overview of available methods for calculating point estimates, confidence intervals, and prediction intervals for the overall effect size under the random-effects model. We indicate whether some methods are preferable to others by considering the results of comparative simulation and real-life data studies.

127 citations
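As a concrete example of one of the quantities surveyed above, the widely used Higgins-Thompson-Spiegelhalter prediction interval for the effect in a new study takes only a few lines. The numbers below are illustrative placeholders (for instance, carried over from a DerSimonian-Laird fit like the sketch shown earlier on this page), not results from the paper.

    import numpy as np
    from scipy import stats

    # Illustrative pooled random-effects results: estimate, its standard error,
    # the heterogeneity variance, and the number of studies.
    mu_re, se_re, tau2, k = 0.18, 0.09, 0.04, 6

    # Higgins-Thompson-Spiegelhalter 95% prediction interval for a new study,
    # based on a t distribution with k - 2 degrees of freedom.
    t = stats.t.ppf(0.975, k - 2)
    half_width = t * np.sqrt(tau2 + se_re ** 2)
    print(f"95% prediction interval: "
          f"({mu_re - half_width:.3f}, {mu_re + half_width:.3f})")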


Journal ArticleDOI
TL;DR: It is suggested that the RPMV-Tobit model is a superior approach for comprehensive crash rate modeling and traffic safety evaluation purposes, and that accounting for unobserved heterogeneity can further improve model fit.

90 citations


Journal ArticleDOI
TL;DR: LonGP is presented, an additive Gaussian process regression model specifically designed for statistical analysis of longitudinal experimental data that can model time-varying random effects and non-stationary signals, incorporate multiple kernel learning, and provide interpretable results for the effects of individual covariates and their interactions.
Abstract: Biomedical research typically involves longitudinal study designs where samples from individuals are measured repeatedly over time and the goal is to identify risk factors (covariates) that are associated with an outcome value. General linear mixed effect models are the standard workhorse for statistical analysis of longitudinal data. However, analysis of longitudinal data can be complicated for reasons such as difficulties in modelling correlated outcome values, functional (time-varying) covariates, nonlinear and non-stationary effects, and model inference. We present LonGP, an additive Gaussian process regression model that is specifically designed for statistical analysis of longitudinal data, which solves these commonly faced challenges. LonGP can model time-varying random effects and non-stationary signals, incorporate multiple kernel learning, and provide interpretable results for the effects of individual covariates and their interactions. We demonstrate LonGP’s performance and accuracy by analysing various simulated and real longitudinal -omics datasets.

90 citations
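LonGP itself is not reproduced here, but the additive Gaussian process idea behind it can be sketched directly with numpy: each covariate gets its own kernel, the kernels are summed, and the posterior mean of a single component isolates that covariate's effect. Everything below (the covariate names age and bmi, the length-scales, the noise level) is invented for illustration.

    import numpy as np

    def sqexp(a, b, ell, sf):
        """Squared-exponential kernel on a single 1-D input."""
        d = a[:, None] - b[None, :]
        return sf ** 2 * np.exp(-0.5 * (d / ell) ** 2)

    rng = np.random.default_rng(0)
    n = 80
    age = np.sort(rng.uniform(0, 10, n))      # longitudinal time axis
    bmi = rng.normal(25, 3, n)                # a second, hypothetical covariate
    y = np.sin(age) + 0.05 * (bmi - 25) + rng.normal(0, 0.2, n)

    X = np.column_stack([age, bmi])
    sigma_n = 0.2

    def additive_K(A, B):
        # Additive kernel: one squared-exponential component per covariate, summed.
        return (sqexp(A[:, 0], B[:, 0], ell=1.5, sf=1.0)
                + sqexp(A[:, 1], B[:, 1], ell=5.0, sf=0.5))

    # Standard GP regression algebra via a Cholesky factorisation.
    K = additive_K(X, X) + sigma_n ** 2 * np.eye(n)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))

    # Posterior mean of the age component alone on a test grid,
    # i.e. the kind of interpretable per-covariate effect the abstract refers to.
    age_star = np.linspace(0, 10, 50)
    age_effect = sqexp(age_star, X[:, 0], ell=1.5, sf=1.0) @ alpha
    print(age_effect[:5])

LonGP adds the pieces this sketch omits: individual-specific (random-effect) kernels, non-stationary components, multiple kernel learning, and model inference and selection.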


Journal ArticleDOI
TL;DR: In this article, the authors investigated the determinants of renewable energy consumption in Africa, with a view to understanding the current pattern and its potential determinants, and employed the panel data analysis involving five most populous and biggest economy in each of the five regions of Africa namely; Nigeria (West), Egypt (North), Ethiopia (East), DR Congo (Central) and South Africa (Southern) and using annual data from 1990 to 2015.

Journal ArticleDOI
TL;DR: In this paper, the authors compared the performance of different meta-analysis methods, including the DerSimonian-Laird approach, empirically and in a simulation study, based on few studies, imbalanced study sizes, and considering odds-ratio and risk ratio (RR) effect sizes.
Abstract: Standard random-effects meta-analysis methods perform poorly when applied to few studies only. Such settings however are commonly encountered in practice. It is unclear, whether or to what extent small-sample-size behaviour can be improved by more sophisticated modeling. We consider likelihood-based methods, the DerSimonian-Laird approach, Empirical Bayes, several adjustment methods and a fully Bayesian approach. Confidence intervals are based on a normal approximation, or on adjustments based on the Student-t-distribution. In addition, a linear mixed model and two generalized linear mixed models (GLMMs) assuming binomial or Poisson distributed numbers of events per study arm are considered for pairwise binary meta-analyses. We extract an empirical data set of 40 meta-analyses from recent reviews published by the German Institute for Quality and Efficiency in Health Care (IQWiG). Methods are then compared empirically as well as in a simulation study, based on few studies, imbalanced study sizes, and considering odds-ratio (OR) and risk ratio (RR) effect sizes. Coverage probabilities and interval widths for the combined effect estimate are evaluated to compare the different approaches. Empirically, a majority of the identified meta-analyses include only 2 studies. Variation of methods or effect measures affects the estimation results. In the simulation study, coverage probability is, in the presence of heterogeneity and few studies, mostly below the nominal level for all frequentist methods based on normal approximation, in particular when sizes in meta-analyses are not balanced, but improve when confidence intervals are adjusted. Bayesian methods result in better coverage than the frequentist methods with normal approximation in all scenarios, except for some cases of very large heterogeneity where the coverage is slightly lower. Credible intervals are empirically and in the simulation study wider than unadjusted confidence intervals, but considerably narrower than adjusted ones, with some exceptions when considering RRs and small numbers of patients per trial-arm. Confidence intervals based on the GLMMs are, in general, slightly narrower than those from other frequentist methods. Some methods turned out impractical due to frequent numerical problems. In the presence of between-study heterogeneity, especially with unbalanced study sizes, caution is needed in applying meta-analytical methods to few studies, as either coverage probabilities might be compromised, or intervals are inconclusively wide. Bayesian estimation with a sensibly chosen prior for between-trial heterogeneity may offer a promising compromise.
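The fully Bayesian option mentioned above is easy to prototype for the normal-normal model with a weakly informative heterogeneity prior. The grid-approximation sketch below is a generic illustration, not the IQWiG analysis or the exact priors used in the paper; the three toy studies and the half-normal(0.5) prior are assumptions made for the example.

    import numpy as np

    # Toy data: effect estimates (e.g., log odds ratios) and standard errors, k = 3 studies.
    y = np.array([0.10, 0.35, -0.05])
    se = np.array([0.25, 0.30, 0.20])

    tau_grid = np.linspace(0.0, 2.0, 401)
    mu_grid = np.linspace(-2.0, 2.0, 801)

    # Half-normal(scale 0.5) prior on tau, flat prior on mu.
    log_prior_tau = -0.5 * (tau_grid / 0.5) ** 2

    # Log-likelihood on the (mu, tau) grid under y_i ~ N(mu, se_i^2 + tau^2).
    V = se[None, None, :] ** 2 + tau_grid[None, :, None] ** 2
    ll = -0.5 * np.sum(np.log(V) + (y - mu_grid[:, None, None]) ** 2 / V, axis=-1)

    log_post = ll + log_prior_tau[None, :]
    post = np.exp(log_post - log_post.max())
    post /= post.sum()

    # Marginal posterior of the combined effect mu and its 95% credible interval.
    post_mu = post.sum(axis=1)
    cdf = np.cumsum(post_mu)
    lo, hi = mu_grid[np.searchsorted(cdf, [0.025, 0.975])]
    print(f"posterior mean of mu = {np.sum(mu_grid * post_mu):.3f}, "
          f"95% CrI = ({lo:.3f}, {hi:.3f})")

With only two or three studies the heterogeneity prior does a lot of work, which is consistent with the paper's conclusion that Bayesian estimation with a sensibly chosen prior offers a promising compromise.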

Journal ArticleDOI
TL;DR: A new approach to account for heteroscedasticity and covariance among observations present in residual error or induced by random effects is proposed and is universally applicable for arbitrary variance-covariance structures including spatial models and repeated measures.
Abstract: Extensions of linear models are very commonly used in the analysis of biological data. Whereas goodness of fit measures such as the coefficient of determination (R2 ) or the adjusted R2 are well established for linear models, it is not obvious how such measures should be defined for generalized linear and mixed models. There are by now several proposals but no consensus has yet emerged as to the best unified approach in these settings. In particular, it is an open question how to best account for heteroscedasticity and for covariance among observations present in residual error or induced by random effects. This paper proposes a new approach that addresses this issue and is universally applicable for arbitrary variance-covariance structures including spatial models and repeated measures. It is exemplified using three biological examples.
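The paper's own proposal is not reproduced here; for orientation, the sketch below computes the familiar marginal and conditional R² of Nakagawa and Schielzeth for a random-intercept model fitted with statsmodels MixedLM, one of the existing proposals this work relates to. The data and variable names are simulated and hypothetical.

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(4)
    n_groups, n_per = 30, 20
    g = np.repeat(np.arange(n_groups), n_per)
    x = rng.normal(size=g.size)
    u = rng.normal(0.0, 0.8, n_groups)[g]                     # random intercepts
    y = 1.0 + 0.5 * x + u + rng.normal(0.0, 1.0, g.size)
    df = pd.DataFrame({"y": y, "x": x, "g": g})

    fit = smf.mixedlm("y ~ x", df, groups=df["g"]).fit()

    # Variance components: fixed-effect predictions, random intercept, residual.
    var_fixed = np.var(fit.fe_params["Intercept"] + fit.fe_params["x"] * df["x"])
    var_rand = float(fit.cov_re.iloc[0, 0])
    var_resid = fit.scale

    r2_marginal = var_fixed / (var_fixed + var_rand + var_resid)
    r2_conditional = (var_fixed + var_rand) / (var_fixed + var_rand + var_resid)
    print(f"marginal R2 = {r2_marginal:.3f}, conditional R2 = {r2_conditional:.3f}")

The paper's contribution is precisely to generalise this kind of summary to arbitrary variance-covariance structures (spatial models, repeated measures), which the simple decomposition above does not handle.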

Journal ArticleDOI
TL;DR: Given the complex interactive influence among sample sizes, effect sizes and predictor distribution characteristics, it seems unwarranted to make generic rule-of-thumb sample size recommendations for multilevel logistic regression, aside from the fact that larger sample sizes are required when the distributions of the predictors are not symmetric or balanced.
Abstract: Despite its popularity, issues concerning the estimation of power in multilevel logistic regression models are prevalent because of the complexity involved in its calculation (i.e., computer-simulation-based approaches). These issues are further compounded by the fact that the distribution of the predictors can play a role in the power to estimate these effects. To address both matters, we present a sample of cases documenting the influence that predictor distributions have on statistical power, as well as a user-friendly, web-based application to conduct power analysis for multilevel logistic regression. Computer simulations are implemented to estimate statistical power in multilevel logistic regression with varying numbers of clusters, varying cluster sample sizes, and non-normal and non-symmetrical distributions of the Level 1/2 predictors. Power curves were simulated to see in what ways non-normal/unbalanced distributions of a binary predictor and a continuous predictor affect the detection of population effect sizes for main effects, a cross-level interaction and the variance of the random effects. Skewed continuous predictors and unbalanced binary ones require larger sample sizes at both levels than balanced binary predictors and normally-distributed continuous ones. In the most extreme case of imbalance (10% incidence) and skewness of a chi-square distribution with 1 degree of freedom, even 110 Level 2 units and 100 Level 1 units were not sufficient for all predictors to reach power of 80%, mostly hovering at around 50% with the exception of the skewed, continuous Level 2 predictor. Given the complex interactive influence among sample sizes, effect sizes and predictor distribution characteristics, it seems unwarranted to make generic rule-of-thumb sample size recommendations for multilevel logistic regression, aside from the fact that larger sample sizes are required when the distributions of the predictors are not symmetric or balanced. The more skewed or imbalanced the predictor is, the larger the sample size requirements. To assist researchers in planning research studies, a user-friendly web application that conducts power analysis via computer simulations in the R programming language is provided. With this web application, users can conduct simulations, tailored to their study design, to estimate statistical power for multilevel logistic regression models.
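The simulation logic is easy to reproduce in outline. The sketch below generates clustered binary outcomes with a cluster random intercept and a skewed (chi-square, 1 df) Level-1 predictor, then estimates power across replicates. As a simplification it tests the slope with an ordinary logistic regression using cluster-robust standard errors rather than a full multilevel logistic fit, and all sample sizes and effect sizes are illustrative, not those of the paper or its web application.

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(2019)

    def one_replicate(n_clusters=110, n_per=100, beta=0.3, tau=0.5):
        """One simulated data set; returns True if the Level-1 slope is detected at 5%."""
        g = np.repeat(np.arange(n_clusters), n_per)
        u = rng.normal(0.0, tau, n_clusters)[g]       # cluster random intercepts
        x = rng.chisquare(df=1, size=g.size)          # heavily skewed predictor
        x = (x - x.mean()) / x.std()
        eta = -1.0 + beta * x + u
        y = rng.binomial(1, 1.0 / (1.0 + np.exp(-eta)))
        X = sm.add_constant(x)
        # Stand-in for a multilevel logistic fit: plain logit with cluster-robust SEs.
        fit = sm.Logit(y, X).fit(disp=0, cov_type="cluster", cov_kwds={"groups": g})
        return fit.pvalues[1] < 0.05

    power = np.mean([one_replicate() for _ in range(200)])
    print(f"approximate power for the Level-1 slope: {power:.2f}")

Replacing the stand-in fit with a genuine multilevel logistic estimator (or with the paper's web application) is what turns this outline into the power analysis the authors describe.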

Journal ArticleDOI
TL;DR: In this paper, the authors evaluated the best model for drought forecasting and determined which differences if any were present in model performance using standardised precipitation index (SPI), in addition, the most effective combination of the SPI with its respective timescale and lead time was investigated.
Abstract: Quality and reliable drought prediction is essential for mitigation strategies and planning in disaster-stricken regions globally. Prediction models such as empirical or data-driven models play a fundamental role in forecasting drought. However, selecting a suitable prediction model remains a challenge because of the lack of succinct information available on model performance. Therefore, this review evaluated the best model for drought forecasting and determined which differences if any were present in model performance using standardised precipitation index (SPI). In addition, the most effective combination of the SPI with its respective timescale and lead time was investigated. The effectiveness of data-driven models was analysed using meta-regression analysis by applying a linear mixed model to the coefficient of determination and the root mean square error of the validated model results. Wavelet-transformed neural networks had superior performance with the highest correlation and minimum error. Preprocessing data to eliminate non-stationarity performed substantially better than did the regular artificial neural network (ANN) model. Additionally, the best timescale to calculate the SPI was 24 and 12 months and a lead time of 1–3 months provided the most accurate forecasts. Studies from China and Sicily had the most variation based on geographical location as a random effect; while studies from India rendered consistent results overall. Variation in the result can be attributed to geographical differences, seasonal influence, incorporation of climate indices and author bias. Conclusively, this review recommends use of the wavelet-based ANN (WANN) model to forecast drought indices.

Journal ArticleDOI
TL;DR: Developing spatio-temporal random-effect models, considering other priors, using a dataset that covers an extended time period, and investigating other covariates would help to better understand and control DF transmission.
Abstract: Dengue fever (DF) is one of the world's most disabling mosquito-borne diseases, with a variety of approaches available to model its spatial and temporal dynamics. This paper aims to identify and compare the different spatial and spatio-temporal Bayesian modelling methods that have been applied to DF and examine influential covariates that have been reportedly associated with the risk of DF. A systematic search was performed in December 2017, using Web of Science, Scopus, ScienceDirect, PubMed, ProQuest and Medline (via Ebscohost) electronic databases. The search was restricted to refereed journal articles published in English from January 2000 to November 2017. Thirty-one articles met the inclusion criteria. Using a modified quality assessment tool, the median quality score across studies was 14/16. The most popular Bayesian statistical approach to dengue modelling was a generalised linear mixed model with spatial random effects described by a conditional autoregressive prior. A limited number of studies included spatio-temporal random effects. Temperature and precipitation were shown to often influence the risk of dengue. Developing spatio-temporal random-effect models, considering other priors, using a dataset that covers an extended time period, and investigating other covariates would help to better understand and control DF transmission.

Journal ArticleDOI
TL;DR: This work proposes an extended inverse Gaussian (EIG) process model incorporating skew-normal random effects and derives its analytical lifetime distribution; two illustrative examples of GaAs laser degradation and fatigue crack growth are provided.

Journal ArticleDOI
TL;DR: It is shown that inferences from fitted random‐effects models, using both the conventional and the Hartung‐Knapp method, are equivalent to those from closely related intercept only weighted least squares regression models.
Abstract: The Hartung-Knapp method for random-effects meta-analysis, that was also independently proposed by Sidik and Jonkman, is becoming advocated for general use. This method has previously been justified by taking all estimated variances as known and using a different pivotal quantity to the more conventional one when making inferences about the average effect. We provide a new conceptual framework for, and justification of, the Hartung-Knapp method. Specifically, we show that inferences from fitted random-effects models, using both the conventional and the Hartung-Knapp method, are equivalent to those from closely related intercept only weighted least squares regression models. This observation provides a new link between Hartung and Knapp's methodology for meta-analysis and standard linear models, where it can be seen that the Hartung-Knapp method can be justified by a linear model that makes a slightly weaker assumption than taking all variances as known. This provides intuition for why the Hartung-Knapp method has been found to perform better than the conventional one in simulation studies. Furthermore, our new findings give more credence to ad hoc adjustments of confidence intervals from the Hartung-Knapp method that ensure these are at least as wide as more conventional confidence intervals. The conceptual basis for the Hartung-Knapp method that we present here should replace the established one because it more clearly illustrates the potential benefit of using it.
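The equivalence claimed above can be verified numerically in a few lines: compute the Hartung-Knapp interval directly, then fit an intercept-only weighted least squares regression with the random-effects weights and compare. The toy effect sizes and the use of the DerSimonian-Laird τ² are illustrative choices, not taken from the paper.

    import numpy as np
    import statsmodels.api as sm
    from scipy import stats

    y = np.array([0.12, 0.30, -0.10, 0.22, 0.05])   # toy study effects
    v = np.array([0.04, 0.06, 0.05, 0.03, 0.07])    # within-study variances
    k = len(y)

    # Any heterogeneity estimate works for the comparison; use DerSimonian-Laird.
    w_f = 1 / v
    mu_f = np.sum(w_f * y) / np.sum(w_f)
    Q = np.sum(w_f * (y - mu_f) ** 2)
    tau2 = max(0.0, (Q - (k - 1)) / (np.sum(w_f) - np.sum(w_f ** 2) / np.sum(w_f)))

    # Hartung-Knapp interval computed directly.
    w = 1 / (v + tau2)
    mu = np.sum(w * y) / np.sum(w)
    q = np.sum(w * (y - mu) ** 2) / (k - 1)
    se_hk = np.sqrt(q / np.sum(w))
    t = stats.t.ppf(0.975, k - 1)
    print("HK :", round(mu, 4), (round(mu - t * se_hk, 4), round(mu + t * se_hk, 4)))

    # The same point estimate and interval from intercept-only WLS with weights w.
    wls = sm.WLS(y, np.ones(k), weights=w).fit()
    print("WLS:", round(float(wls.params[0]), 4), tuple(np.round(wls.conf_int()[0], 4)))

Both lines print identical numbers, which is the weighted-least-squares reading of Hartung and Knapp's method that the paper uses to justify it.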

Book ChapterDOI
01 Jan 2019
TL;DR: In this article, the authors consider two types of nonlinear models: a probit conditional mean function for binary or fractional responses and an exponential conditional mean for nonnegative responses.
Abstract: We study testing and estimation in panel data models with two potential sources of endogeneity: correlation of covariates with time-constant, unobserved heterogeneity and correlation of covariates with time-varying idiosyncratic errors. In the linear case, we show that two control function approaches allow us to test exogeneity with respect to the idiosyncratic errors while being silent about exogeneity with respect to heterogeneity. The linear case suggests a general approach for nonlinear models. We consider two leading cases of nonlinear models: an exponential conditional mean function for nonnegative responses and a probit conditional mean function for binary or fractional responses. In the former case, we exploit the full robustness of the fixed effects Poisson quasi-MLE; for the probit case, we propose correlated random effects.
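A minimal version of the correlated random effects device mentioned for the probit case is the Mundlak construction: add the time averages of the covariates to a pooled probit and cluster the standard errors by panel unit. The sketch below simulates a toy panel and does exactly that; it is a generic illustration of the device, not the paper's estimator or its control function tests, and all names and values are made up.

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    rng = np.random.default_rng(3)
    N, T = 300, 5
    ids = np.repeat(np.arange(N), T)
    c = rng.normal(size=N)                        # time-constant unobserved heterogeneity
    x = 0.5 * c[ids] + rng.normal(size=N * T)     # covariate correlated with heterogeneity
    y = (0.7 * x + c[ids] + rng.normal(size=N * T) > 0).astype(int)

    df = pd.DataFrame({"y": y, "x": x, "id": ids})
    df["x_bar"] = df.groupby("id")["x"].transform("mean")   # Mundlak time average

    # Pooled probit with the time average added (correlated random effects device),
    # with standard errors clustered by panel unit.
    X = sm.add_constant(df[["x", "x_bar"]])
    cre = sm.Probit(df["y"], X).fit(disp=0, cov_type="cluster",
                                    cov_kwds={"groups": df["id"]})
    print(cre.summary())

A nonzero coefficient on x_bar indicates correlation between the covariate and the time-constant heterogeneity, one of the two endogeneity sources the abstract distinguishes.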

Journal ArticleDOI
TL;DR: This work proposes an alternative based on the bootstrap and shows by simulation that its coverage is close to the nominal level, unlike the Higgins–Thompson–Spiegelhalter method and its extensions.
Abstract: Prediction intervals are commonly used in meta-analysis with random-effects models. One widely used method, the Higgins–Thompson–Spiegelhalter prediction interval, replaces the heterogeneity parame...

Journal ArticleDOI
TL;DR: Understanding how nested data structures and data with repeated measures work enables researchers and managers to define several types of constructs for which multilevel models can be used.

Journal ArticleDOI
TL;DR: It is concluded that both AB and CB models are suitable for the analysis of NMA data, but using random study intercepts requires a strong rationale such as relating treatment effects to study intercepts.
Abstract: Differences between arm-based (AB) and contrast-based (CB) models for network meta-analysis (NMA) are controversial. We compare the CB model of Lu and Ades (2006), the AB model of Hong et al (2016), and two intermediate models, using hypothetical data and a selected real data set. Differences between models arise primarily from study intercepts being fixed effects in the Lu-Ades model but random effects in the Hong model, and we identify four key differences. (1) If study intercepts are fixed effects then only within-study information is used, but if they are random effects then between-study information is also used and can cause important bias. (2) Models with random study intercepts are suitable for deriving a wider range of estimands, e.g., the marginal risk difference, when underlying risk is derived from the NMA data; but underlying risk is usually best derived from external data, and then models with fixed intercepts are equally good. (3) The Hong model allows treatment effects to be related to study intercepts, but the Lu-Ades model does not. (4) The Hong model is valid under a more relaxed missing data assumption, that arms (rather than contrasts) are missing at random, but this does not appear to reduce bias. We also describe an AB model with fixed study intercepts and a CB model with random study intercepts. We conclude that both AB and CB models are suitable for the analysis of NMA data, but using random study intercepts requires a strong rationale such as relating treatment effects to study intercepts.

Journal ArticleDOI
TL;DR: In this article, a new model was proposed to estimate under-five mortality rate across regions and years and to investigate the association between the under five mortality rate and spatially varying covariate surfaces.
Abstract: Accurate estimates of the under-five mortality rate in a developing world context are a key barometer of the health of a nation. This paper describes a new model to analyze survey data on mortality in this context. We are interested in both spatial and temporal description, that is wishing to estimate under-five mortality rate across regions and years and to investigate the association between the under-five mortality rate and spatially varying covariate surfaces. We illustrate the methodology by producing yearly estimates for subnational areas in Kenya over the period 1980-2014 using data from the Demographic and Health Surveys, which use stratified cluster sampling. We use a binomial likelihood with fixed effects for the urban/rural strata and random effects for the clustering to account for the complex survey design. Smoothing is carried out using Bayesian hierarchical models with continuous spatial and temporally discrete components. A key component of the model is an offset to adjust for bias due to the effects of HIV epidemics. Substantively, there has been a sharp decline in Kenya in the under-five mortality rate in the period 1980-2014, but large variability in estimated subnational rates remains. A priority for future research is understanding this variability. In exploratory work, we examine whether a variety of spatial covariate surfaces can explain the variability in under-five mortality rate. Temperature, precipitation, a measure of malaria infection prevalence, and a measure of nearness to cities were candidates for inclusion in the covariate model, but the interplay between space, time, and covariates is complex.

Journal ArticleDOI
TL;DR: This paper reviews statistical methods for analyzing zero-inflated nonnegative outcome data, discussing ways to separate zero and positive values and introducing flexible models to characterize right skewness and heteroscedasticity in the positive values.
Abstract: Zero-inflated nonnegative continuous (or semicontinuous) data arise frequently in biomedical, economical, and ecological studies. Examples include substance abuse, medical costs, medical care utilization, biomarkers (e.g., CD4 cell counts, coronary artery calcium scores), single cell gene expression rates, and (relative) abundance of microbiome. Such data are often characterized by the presence of a large portion of zero values and positive continuous values that are skewed to the right and heteroscedastic. Both of these features suggest that no simple parametric distribution may be suitable for modeling such type of outcomes. In this paper, we review statistical methods for analyzing zero-inflated nonnegative outcome data. We will start with the cross-sectional setting, discussing ways to separate zero and positive values and introducing flexible models to characterize right skewness and heteroscedasticity in the positive values. We will then present models of correlated zero-inflated nonnegative continuous data, using random effects to tackle the correlation on repeated measures from the same subject and that across different parts of the model. We will also discuss expansion to related topics, for example, zero-inflated count and survival data, nonlinear covariate effects, and joint models of longitudinal zero-inflated nonnegative continuous data and survival. Finally, we will present applications to three real datasets (i.e., microbiome, medical costs, and alcohol drinking) to illustrate these methods. Example code will be provided to facilitate applications of these methods.
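In the cross-sectional setting described first, the workhorse is a two-part model: a binary model for whether the outcome is zero, and a skew-friendly model for the positive values. The sketch below is a generic two-part fit on simulated data (a logistic part plus a Gamma GLM with a log link), not the specific models or datasets of the review; variable names and parameters are invented.

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(1)
    n = 500
    x = rng.normal(size=n)

    # Simulate semicontinuous data: many exact zeros, right-skewed positive values.
    p_pos = 1 / (1 + np.exp(-(-0.5 + 0.8 * x)))
    pos = rng.binomial(1, p_pos)
    y = np.where(pos == 1, np.exp(1.0 + 0.5 * x + rng.normal(0, 0.7, n)), 0.0)

    X = sm.add_constant(x)

    # Part 1: logistic regression for Pr(y > 0).
    part1 = sm.Logit((y > 0).astype(int), X).fit(disp=0)

    # Part 2: Gamma GLM with a log link on the positive values only,
    # accommodating right skewness and heteroscedasticity.
    mask = y > 0
    part2 = sm.GLM(y[mask], X[mask],
                   family=sm.families.Gamma(link=sm.families.links.Log())).fit()

    print(part1.params)
    print(part2.params)

The correlated-data extensions discussed in the review add random effects to each part (and possibly correlate them across parts), which this cross-sectional sketch does not attempt.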

Journal ArticleDOI
01 Apr 2019-Genetics
TL;DR: Using simulations, it is shown that the proposed methodology yields (nearly) unbiased estimates when the sample size is not too small relative to the number of SNPs used, suggesting that effect heterogeneity varies between regions of the genome.
Abstract: In humans, most genome-wide association studies have been conducted using data from Caucasians and many of the reported findings have not replicated in other populations. This lack of replication may be due to statistical issues (small sample sizes or confounding) or perhaps more fundamentally to differences in the genetic architecture of traits between ethnically diverse subpopulations. What aspects of the genetic architecture of traits vary between subpopulations and how can this be quantified? We consider studying effect heterogeneity using Bayesian random effect interaction models. The proposed methodology can be applied using shrinkage and variable selection methods, and produces useful information about effect heterogeneity in the form of whole-genome summaries (e.g., the proportions of variance of a complex trait explained by a set of SNPs and the average correlation of effects) as well as SNP-specific attributes. Using simulations, we show that the proposed methodology yields (nearly) unbiased estimates when the sample size is not too small relative to the number of SNPs used. Subsequently, we used the methodology for the analyses of four complex human traits (standing height, high-density lipoprotein, low-density lipoprotein, and serum urate levels) in European-Americans (EAs) and African-Americans (AAs). The estimated correlations of effects between the two subpopulations were well below unity for all the traits, ranging from 0.73 to 0.50. The extent of effect heterogeneity varied between traits and SNP sets. Height showed less differences in SNP effects between AAs and EAs whereas HDL, a trait highly influenced by lifestyle, exhibited a greater extent of effect heterogeneity. For all the traits, we observed substantial variability in effect heterogeneity across SNPs, suggesting that effect heterogeneity varies between regions of the genome.

Journal ArticleDOI
TL;DR: In this article, the authors have examined the liquidity determinants of Indian listed commercial banks and applied both GMM and pooled, fixed and random effect models to a panel of banks.
Abstract: The objective of this study is to examine the liquidity (LQD) determinants of Indian listed commercial banks. The study has applied both GMM and pooled, fixed and random effect models to a panel of...

Journal ArticleDOI
TL;DR: This paper presents the results of a generalized linear mixed-effects model applied to single-subject data taken from Ackerlund Brandt, Dozier, Juanico, Laudont, and Mick (2015), in which children chose one of three reinforcers for completing a task.
Abstract: Behavior analysis and statistical inference have shared a conflicted relationship for over fifty years. However, a significant portion of this conflict is directed toward statistical tests (e.g., t-tests, ANOVA) that aggregate group and/or temporal variability into means and standard deviations and as a result remove much of the data important to behavior analysts. Mixed-effects modeling, a more recently developed statistical test, addresses many of the limitations of more basic tests by incorporating random effects. Random effects quantify individual subject variability without eliminating it from the model, hence producing a model that can predict both group and individual behavior. We present the results of a generalized linear mixed-effects model applied to single-subject data taken from Ackerlund Brandt, Dozier, Juanico, Laudont, & Mick, 2015, in which children chose from one of three reinforcers for completing a task. Results of the mixed-effects modeling are consistent with visual analyses and importantly provide a statistical framework to predict individual behavior without requiring aggregation. We conclude by discussing the implications of these results and provide recommendations for further integration of mixed-effects models in the analyses of single-subject designs.

Journal ArticleDOI
TL;DR: Variance components estimation and mixed model analysis are central themes in statistics, with applications in numerous scientific disciplines; as this paper notes, challenges remain despite the best efforts of generations of statisticians.
Abstract: Variance components estimation and mixed model analysis are central themes in statistics with applications in numerous scientific disciplines. Despite the best efforts of generations of statisticia...

Journal ArticleDOI
TL;DR: This paper develops a methodology for high-resolution mapping of vaccination coverage using areal data in settings where point-referenced survey data are inaccessible; the approach can be readily applied to wider disaggregation problems in related contexts, including mapping other health and development indicators.
Abstract: The growing demand for spatially detailed data to advance the Sustainable Development Goals agenda of ‘leaving no one behind’ has resulted in a shift in focus from aggregate national and province-based metrics to small areas and high-resolution grids in the health and development arena. Vaccination coverage is customarily measured through aggregate-level statistics, which mask fine-scale heterogeneities and ‘coldspots’ of low coverage. This paper develops a methodology for high-resolution mapping of vaccination coverage using areal data in settings where point-referenced survey data are inaccessible. The proposed methodology is a binomial spatial regression model with a logit link and a combination of covariate data and random effects modelling two levels of spatial autocorrelation in the linear predictor. The principal aspect of the model is the melding of the misaligned areal data and the prediction grid points using the regression component and each of the conditional autoregressive and the Gaussian spatial process random effects. The Bayesian model is fitted using the INLA-SPDE approach. We demonstrate the predictive ability of the model using simulated data sets. The results obtained indicate a good predictive performance by the model, with correlations of between 0.66 and 0.98 obtained at the grid level between true and predicted values. The methodology is applied to predicting the coverage of measles and diphtheria-tetanus-pertussis vaccinations at 5 × 5 km2 in Afghanistan and Pakistan using subnational Demographic and Health Surveys data. The predicted maps are used to highlight vaccination coldspots and assess progress towards coverage targets to facilitate the implementation of more geographically precise interventions. The proposed methodology can be readily applied to wider disaggregation problems in related contexts, including mapping other health and development indicators.

Journal ArticleDOI
TL;DR: In this article, the authors proposed a new parametric model for the precision matrix based on a directed acyclic graph (DAG) representation of the spatial dependence, which guarantees positive definiteness and, hence, can also directly model the outcome from dependent data like images and networks.
Abstract: Hierarchical models for regionally aggregated disease incidence data commonly involve region specific latent random effects that are modeled jointly as having a multivariate Gaussian distribution. The covariance or precision matrix incorporates the spatial dependence between the regions. Common choices for the precision matrix include the widely used ICAR model, which is singular, and its nonsingular extension which lacks interpretability. We propose a new parametric model for the precision matrix based on a directed acyclic graph (DAG) representation of the spatial dependence. Our model guarantees positive definiteness and, hence, in addition to being a valid prior for regional spatially correlated random effects, can also directly model the outcome from dependent data like images and networks. Theoretical results establish a link between the parameters in our model and the variance and covariances of the random effects. Substantive simulation studies demonstrate that the improved interpretability of our model reaps benefits in terms of accurately recovering the latent spatial random effects as well as for inference on the spatial covariance parameters. Under modest spatial correlation, our model far outperforms the CAR models, while the performances are similar when the spatial correlation is strong. We also assess sensitivity to the choice of the ordering in the DAG construction using theoretical and empirical results which testify to the robustness of our model. We also present a large-scale public health application demonstrating the competitive performance of the model.