
Showing papers on "Mixed model published in 2004"


Book
11 Aug 2004
TL;DR: This book presents the theory and applications of mixed models, covering maximum likelihood estimation and statistical properties of the linear mixed effects (LME) model, growth curve and meta-analysis models, nonlinear and generalized linear mixed models, diagnostics and influence analysis, and applications to tumor regrowth curves, statistical analysis of shape, and statistical image analysis.
Abstract: Preface. 1. Introduction: Why Mixed Models? 2. MLE for LME Model. 3. Statistical Properties of the LME Model. 4. Growth Curve Model and Generalizations. 5. Meta-analysis Model. 6. Nonlinear Marginal Model. 7. Generalized Linear Mixed Models. 8. Nonlinear Mixed Effects Model. 9. Diagnostics and Influence Analysis. 10. Tumor Regrowth Curves. 11. Statistical Analysis of Shape. 12. Statistical Image Analysis. 13. Appendix: Useful Facts and Formulas. References. Index.

789 citations


Journal ArticleDOI
TL;DR: Maximum likelihood estimation and empirical Bayes latent score prediction within the GLLAMM framework can be performed using adaptive quadrature in gllamm, a freely available program running in Stata.
Abstract: A unifying framework for generalized multilevel structural equation modeling is introduced. The models in the framework, called generalized linear latent and mixed models (GLLAMM), combine features of generalized linear mixed models (GLMM) and structural equation models (SEM) and consist of a response model and a structural model for the latent variables. The response model generalizes GLMMs to incorporate factor structures in addition to random intercepts and coefficients. As in GLMMs, the data can have an arbitrary number of levels and can be highly unbalanced with different numbers of lower-level units in the higher-level units and missing data. A wide range of response processes can be modeled including ordered and unordered categorical responses, counts, and responses of mixed types. The structural model is similar to the structural part of a SEM except that it may include latent and observed variables varying at different levels. For example, unit-level latent variables (factors or random coefficients) can be regressed on cluster-level latent variables. Special cases of this framework are explored and data from the British Social Attitudes Survey are used for illustration. Maximum likelihood estimation and empirical Bayes latent score prediction within the GLLAMM framework can be performed using adaptive quadrature in gllamm, a freely available program running in Stata.

755 citations


Journal ArticleDOI
TL;DR: The objective of this paper is to provide a background understanding of mixed model methodology in a repeated measures analysis and to use balanced steer data from a growth study to illustrate the use of PROC MIXED in the SAS system using five covariance structures.
Abstract: The analysis of data containing repeated observations measured on animals (experimental unit) allocated to different treatments over time is a common design in animal science. Conventionally, repeated measures data were analyzed either as a univariate ANOVA (split-plot in time) or as a multivariate ANOVA (analysis of contrasts), both being handled by the General Linear Model procedure of SAS. In recent times, the mixed model has become more appealing for analyzing repeated data. The objective of this paper is to provide a background understanding of mixed model methodology in a repeated measures analysis and to use balanced steer data from a growth study to illustrate the use of PROC MIXED in the SAS system using five covariance structures. The split-plot in time approach assumes a constant variance and equal correlations (covariance) between repeated measures, or compound symmetry, regardless of their proximity in time, and often these assumptions are not true. Recognizing this limitation, the analysis of contrast...

461 citations
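The practical difference between these covariance structures can be sketched directly. The construction below is illustrative (arbitrary variance and correlation values, not fitted to the steer data); it contrasts compound symmetry, where correlation is constant across lags, with a first-order autoregressive structure, where correlation decays with separation in time:

```python
import numpy as np

def compound_symmetry(n_times, sigma2, rho):
    """Equal variance, equal correlation between all pairs of times."""
    return sigma2 * ((1 - rho) * np.eye(n_times) + rho * np.ones((n_times, n_times)))

def ar1(n_times, sigma2, rho):
    """Correlation decays geometrically with the lag between times."""
    lags = np.abs(np.subtract.outer(np.arange(n_times), np.arange(n_times)))
    return sigma2 * rho ** lags

V_cs = compound_symmetry(5, 4.0, 0.6)   # corr(y1, y5) = corr(y1, y2) = 0.6
V_ar = ar1(5, 4.0, 0.6)                 # corr(y1, y5) = 0.6**4, much weaker
```

Under compound symmetry the split-plot-in-time analysis is valid; when the true structure is closer to AR(1), assuming constant correlation misstates the standard errors of time contrasts.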


Journal ArticleDOI
TL;DR: The purpose of this article is to demonstrate the advantages of using the mixed model for analyzing nonlinear, longitudinal datasets with multiple missing data points by comparing the mixed model to the widely used repeated measures ANOVA using an experimental set of data.
Abstract: Longitudinal methods are the methods of choice for researchers who view their phenomena of interest as dynamic. Although statistical methods have remained largely fixed in a linear view of biology and behavior, more recent methods, such as the general linear mixed model (mixed model), can be used to analyze dynamic phenomena that are often of interest to nurses. Two strengths of the mixed model are (1) the ability to accommodate missing data points often encountered in longitudinal datasets and (2) the ability to model nonlinear, individual characteristics. The purpose of this article is to demonstrate the advantages of using the mixed model for analyzing nonlinear, longitudinal datasets with multiple missing data points by comparing the mixed model to the widely used repeated measures ANOVA using an experimental set of data. The decision-making steps in analyzing the data using both the mixed model and the repeated measures ANOVA are described.

453 citations


Journal ArticleDOI
TL;DR: The phylogenetic mixed model is particularly rich in terms of the evolutionary insight that might be drawn from model parameters, so the interpretation of the model parameters in a specific comparative analysis is illustrated and discussed.
Abstract: The phylogenetic mixed model is an application of the quantitative-genetic mixed model to interspecific data. Although this statistical framework provides a potentially unifying approach to quantitative-genetic and phylogenetic analysis, the model has been applied infrequently because of technical difficulties with parameter estimation. We recommend a reparameterization of the model that eliminates some of these difficulties, and we develop a new estimation algorithm for both the original maximum likelihood and new restricted maximum likelihood estimators. The phylogenetic mixed model is particularly rich in terms of the evolutionary insight that might be drawn from model parameters, so we also illustrate and discuss the interpretation of the model parameters in a specific comparative analysis.

271 citations
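The covariance structure the model implies can be sketched for a hypothetical three-taxon tree ((A,B),C): the heritable component of trait covariance is proportional to shared branch length, plus an independent residual. Branch lengths and variance components below are invented for illustration:

```python
import numpy as np

# Shared-ancestry matrix for tips A, B, C on the tree ((A,B),C) with
# total depth 1.0: A and B share the root-to-split path of length 0.6.
shared = np.array([[1.0, 0.6, 0.0],
                   [0.6, 1.0, 0.0],
                   [0.0, 0.0, 1.0]])

s2_phylo, s2_resid = 2.0, 0.5   # heritable and residual variance components
V = s2_phylo * shared + s2_resid * np.eye(3)

# phylogenetic analog of heritability: fraction of variance that is heritable
h2 = s2_phylo / (s2_phylo + s2_resid)
```

The ratio h2 is one of the evolutionarily interpretable parameters the authors discuss: as it approaches zero the model collapses toward a nonphylogenetic analysis.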


Journal ArticleDOI
TL;DR: In this article, the authors consider mean squared errors (MSE) of empirical predictors under a general setup, where ML or REML estimators are used for the second stage.
Abstract: The term “empirical predictor” refers to a two-stage predictor of a linear combination of fixed and random effects. In the first stage, a predictor is obtained but it involves unknown parameters; thus, in the second stage, the unknown parameters are replaced by their estimators. In this paper, we consider mean squared errors (MSE) of empirical predictors under a general setup, where ML or REML estimators are used for the second stage. We obtain second-order approximation to the MSE as well as an estimator of the MSE correct to the same order. The general results are applied to mixed linear models to obtain a second-order approximation to the MSE of the empirical best linear unbiased predictor (EBLUP) of a linear mixed effect and an estimator of the MSE of EBLUP whose bias is correct to second order. The general mixed linear model includes the mixed ANOVA model and the longitudinal model as special cases.

204 citations
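The two-stage logic is easiest to see in the one-way random-effects model y_ij = mu + a_i + e_ij, where the BLUP of a group effect shrinks the observed deviation toward zero; plugging in estimated variance components gives the empirical predictor whose extra MSE the paper's second-order approximation accounts for. The function below is a simplified sketch, not the paper's general formulation:

```python
def eblup_group_effect(group_mean, grand_mean, n_i, sigma2_a, sigma2_e):
    """BLUP of the random effect a_i in y_ij = mu + a_i + e_ij.

    The raw deviation (group mean minus grand mean) is shrunk toward
    zero; the shrinkage weakens as the group size n_i or the
    between-group variance sigma2_a grows."""
    shrink = (n_i * sigma2_a) / (n_i * sigma2_a + sigma2_e)
    return shrink * (group_mean - grand_mean)

# With sigma2_a = 1, sigma2_e = 4 and n_i = 5, the raw deviation of 2.0
# is multiplied by 5/9; in practice sigma2_a and sigma2_e are ML or
# REML estimates, which is what makes the predictor "empirical".
pred = eblup_group_effect(12.0, 10.0, 5, 1.0, 4.0)
```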


Journal ArticleDOI
TL;DR: In this article, the authors compared the performance of Simple Fixed Effects Models (SFEM) and Hierarchical Linear Models (HLM) for value-added analysis, adjusting for important student and school-level covariates such as socioeconomic status.
Abstract: Hierarchical Linear Models (HLM) have been used extensively for value-added analysis, adjusting for important student and school-level covariates such as socioeconomic status. A recently proposed alternative, the Layered Mixed Effects Model (LMEM) also analyzes learning gains, but ignores sociodemographic factors. Other features of LMEM, such as its ability to apportion credit for learning gains among multiple schools and its utilization of incomplete observations, make it appealing. A third model that is appealing due to its simplicity is the Simple Fixed Effects Models (SFEM). Statistical and computing specifications are given for each of these models. The models were fitted to obtain value-added measures of school performance by grade and subject area, using a common data set with two years of test scores. We investigate the practical impact of differences among these models by comparing their value-added measures. The value-added measures obtained from the SFEM were highly correlated with those from t...

184 citations


Journal ArticleDOI
TL;DR: The present paper proposes a general method to formulate mixed models for designed experiments with repeated measurements and illustrates it with several examples.
Abstract: Repeated measurements on the same experimental unit are common in plant research. Due to lack of randomization and the serial ordering of observations on the same unit, such data give rise to correlations, which need to be accounted for in statistical analysis. Mixed modelling provides a flexible framework for this task. The present paper proposes a general method to formulate mixed models for designed experiments with repeated measurements. The approach is illustrated with several examples.

183 citations


Journal ArticleDOI
TL;DR: This work shows that smoothing methods based on penalized basis functions can be formulated as mixed model fits, so that software for mixed model analysis can be used for smoothing; this is illustrated for several smoothing models, such as additive and varying coefficient models, in both S-PLUS and SAS.
Abstract: Smoothing methods that use basis functions with penalization can be formulated as fits in a mixed model framework. One of the major benefits is that software for mixed model analysis can be used for smoothing. We illustrate this for several smoothing models such as additive and varying coefficient models for both S-PLUS and SAS software. Code for each of the illustrations is available on the Internet.

177 citations
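The penalization/mixed-model equivalence can be sketched with a truncated-line basis: treating the knot coefficients as random effects is the same as ridge-penalizing them, with the penalty lam playing the role of the variance ratio sigma2_e/sigma2_u. This is an illustrative construction with invented data, not the authors' S-PLUS or SAS code:

```python
import numpy as np

def pspline_fit(x, y, knots, lam):
    """Penalized truncated-line spline fit.

    Equivalent to BLUP in a mixed model where the knot coefficients
    are i.i.d. random effects and lam = sigma2_e / sigma2_u."""
    C = np.column_stack([np.ones_like(x), x] +
                        [np.maximum(x - k, 0.0) for k in knots])
    D = np.diag([0.0, 0.0] + [1.0] * len(knots))   # penalize knots only
    theta = np.linalg.solve(C.T @ C + lam * D, C.T @ y)
    return C @ theta

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.2, x.size)
fit = pspline_fit(x, y, knots=np.linspace(0.1, 0.9, 9), lam=1.0)
```

In the mixed-model formulation lam need not be chosen by hand: REML estimates of the two variance components select the amount of smoothing automatically, which is the main practical payoff the paper describes.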


Journal ArticleDOI
TL;DR: In this paper, a mixed-model approach, available through SAS PROC MIXED, was compared to a Welch-James type statistic, which is known to provide generally robust tests of treatment effects in a repeated measures between- by within-subjects design under assumption violations given certain sample size requirements.
Abstract: One approach to the analysis of repeated measures data allows researchers to model the covariance structure of their data rather than presume a certain structure, as is the case with conventional univariate and multivariate test statistics. This mixed-model approach, available through SAS PROC MIXED, was compared to a Welch-James type statistic. The Welch-James approach is known to provide generally robust tests of treatment effects in a repeated measures between- by within-subjects design under assumption violations given certain sample size requirements. The mixed-model F tests were based on Kenward-Roger’s adjusted degrees of freedom solution, an approach specifically proposed for small sample settings. The authors investigated Type I error control for repeated measures main and interaction effects in unbalanced designs when normality and covariance homogeneity assumptions did not hold. The mixed-model Kenward-Roger’s adjusted F tests showed superior Type I error control in small sample size conditions ...

122 citations


Book ChapterDOI
TL;DR: This chapter considers mixed-model regression analysis, which is a specific technique for analyzing longitudinal data that properly deals with within- and between-subjects variance, and applies nonlinear mixed- model regression analysis of the data at hand to demonstrate the considerable potential of this relatively novel statistical approach.
Abstract: Publisher Summary This chapter considers mixed-model regression analysis, which is a specific technique for analyzing longitudinal data that properly deals with within- and between-subjects variance. The term "mixed model" refers to the inclusion of both fixed effects, which are model components used to define systematic relationships such as overall changes over time and/or experimentally induced group differences; and random effects, which account for variability among subjects around the systematic relationships captured by the fixed effects. To illustrate how the mixed-model regression approach can help analyze longitudinal data with large inter-individual differences, the psychomotor vigilance data is considered from an experiment involving 88 h of total sleep deprivation, during which subjects received either sustained low-dose caffeine or placebo. The traditional repeated-measures analysis of variance (ANOVA) is applied, and it is shown that this method is not robust against systematic interindividual variability. The data are then reanalyzed using linear mixed-model regression analysis in order to properly take into account the interindividual differences. The study concludes with an application of nonlinear mixed-model regression analysis of the data at hand, to demonstrate the considerable potential of this relatively novel statistical approach.

Journal ArticleDOI
TL;DR: A linear mixed model with a smooth random effects density is proposed and applied to the cholesterol data first analyzed by Zhang and Davidian; a simulation study shows that it yields almost unbiased estimates of the regression and the smoothing parameters in small sample settings.
Abstract: A linear mixed model with a smooth random effects density is proposed. A similar approach to P-spline smoothing of Eilers and Marx (1996, Statistical Science 11, 89-121) is applied to yield a more flexible estimate of the random effects density. Our approach differs from theirs in that the B-spline basis functions are replaced by approximating Gaussian densities. Fitting the model involves maximizing a penalized marginal likelihood. The best penalty parameters minimize Akaike's Information Criterion employing Gray's (1992, Journal of the American Statistical Association 87, 942-951) results. Although our method is applicable to any dimension of the random effects structure, in this article the two-dimensional case is explored. Our methodology is conceptually simple and relatively easy to fit in practice; it is applied to the cholesterol data first analyzed by Zhang and Davidian (2001, Biometrics 57, 795-802). A simulation study shows that our approach yields almost unbiased estimates of the regression and the smoothing parameters in small sample settings. Consistency of the estimates is shown in a particular case.

Journal ArticleDOI
TL;DR: Two likelihood-based methods proposed to handle left-censoring of the outcome in linear mixed models are reviewed; it is shown how to fit these models using SAS Proc NLMIXED, and this tool is compared with other programs.

Journal ArticleDOI
01 Apr 2004-Ecology
TL;DR: This paper developed a restricted maximum likelihood (REML)-based method for estimating trend, process variation, and sampling error from a single time series based on a discrete-time model of density-independent growth coupled with a model of the sampling process.
Abstract: Time series of population abundance estimates often are the only data available for evaluating the prospects for persistence of a species of concern. With such data, it is possible to perform a population viability analysis (PVA) with diffusion approximation methods using estimates of the mean population trend and the variance of the trend, the so-called process variation. Sampling error in the data, however, may bias estimators of process variation derived by simple methods. We develop a restricted maximum likelihood (REML)-based method for estimating trend, process variation, and sampling error from a single time series based on a discrete-time model of density-independent growth coupled with a model of the sampling process. Transformation of the data yields a conventional linear mixed model, in which the variance components are functions of the process variation and sampling error. Simulation results show essentially unbiased estimators of trend, process variation, and sampling error over a range of process variation/sampling error combinations. Example data analyses are provided for the Whooping Crane (Grus americana), grizzly bear (Ursus arctos horribilis), California Condor (Gymnogyps californianus), and Puerto Rican Parrot (Amazona vittata). This REML-based method is useful for PVA methods that depend on accurate estimation of process variation from time-series data.
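Why sampling error matters can be shown with a short simulation: the naive process-variance estimator, the sample variance of observed log growth rates, absorbs twice the sampling variance. The sketch below (all parameter values invented) illustrates the bias the REML-based method is designed to avoid, not the method itself:

```python
import numpy as np

rng = np.random.default_rng(1)
mu, s2_proc, s2_samp, T = 0.02, 0.01, 0.04, 5000

# latent density-independent growth on the log scale
true_log_n = np.cumsum(rng.normal(mu, np.sqrt(s2_proc), T))
# abundance estimates carry independent sampling error
obs = true_log_n + rng.normal(0.0, np.sqrt(s2_samp), T)

# naive process-variance estimate from observed log growth rates:
# E[naive] = s2_proc + 2 * s2_samp, so sampling error inflates it badly
naive = np.var(np.diff(obs), ddof=1)
```

With these values the naive estimate lands near 0.09 rather than the true 0.01, which would grossly overstate extinction risk in a diffusion-approximation PVA.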

Journal ArticleDOI
TL;DR: In this paper, the authors discuss the interpretation of predictions formed including or excluding random terms, and show the need for different weighting schemes that recognize nesting and aliasing during prediction, and the necessity of being able to detect inestimable predictions.
Abstract: Summary Following estimation of effects from a linear mixed model, it is often useful to form predicted values for certain factor/variate combinations. The process has been well defined for linear models, but the introduction of random effects into the model means that a decision has to be made about the inclusion or exclusion of random model terms from the predictions. This paper discusses the interpretation of predictions formed including or excluding random terms. Four datasets are used to illustrate circumstances where different prediction strategies may be appropriate: in an orthogonal design, an unbalanced nested structure, a model with cubic smoothing spline terms and for kriging after spatial analysis. The examples also show the need for different weighting schemes that recognize nesting and aliasing during prediction, and the necessity of being able to detect inestimable predictions.

Journal ArticleDOI
TL;DR: In this paper, a technique for finding robust maximum likelihood (RML) estimates of the model parameters in generalized linear mixed models (GLMM's) is presented; although standard likelihood algorithms fit GLMM's efficiently under strict model assumptions, they can be highly influenced by unusual data points, which the robust estimates downweight.
Abstract: The method of maximum likelihood (ML) is widely used for analyzing generalized linear mixed models (GLMM's). A full maximum likelihood analysis requires numerical integration techniques for calculation of the log-likelihood, and to avoid the computational problems involving irreducibly high-dimensional integrals, several maximum likelihood algorithms have been proposed in the literature to estimate the model parameters by approximating the log-likelihood function. Although these likelihood algorithms are useful for fitting the GLMM's efficiently under strict model assumptions, they can be highly influenced by the presence of unusual data points. In this article, the author develops a technique for finding robust maximum likelihood (RML) estimates of the model parameters in GLMM's, which appears to be useful in downweighting the influential data points when estimating the parameters. The asymptotic properties of the robust estimators are investigated under some regularity conditions. Small simulations are ...

Journal ArticleDOI
TL;DR: Using OLS to analyze repeated measures data is inappropriate when the covariance structure is not known to be CS, and random coefficients growth curve models may be useful when the variance/covariance structure of the data set is unknown.
Abstract: UGRINOWITSCH, C., G. W. FELLINGHAM, and M. D. RICARD. Limitations of Ordinary Least Squares Models in Analyzing Repeated Measures Data. Med. Sci. Sports. Exerc., Vol. 36, No. 12, pp. 2144–2148, 2004. Purpose: To a) introduce and present the advantages of linear mixed models using generalized least squares (GLS) when analyzing repeated measures data; and b) show how model misspecification and an inappropriate analysis using repeated measures ANOVA with ordinary least squares (OLS) methodology can negatively impact the probability of occurrence of Type I error. Methods: The effects of three strength-training groups were simulated. Strength gains had two slope conditions: null (no gain), and moderate (moderate gain). Ten subjects were hypothetically measured at five time points, and the correlation between measurements within a subject was modeled as compound symmetric (CS), autoregressive lag 1 (AR(1)), and random coefficients (RC). A thousand data sets were generated for each correlation structure. Then, each was analyzed four times—once using OLS, and three times using GLS, assuming the following variance/covariance structures: CS, AR(1), and RC. Results: OLS produced substantially inflated probabilities of Type I errors when the variance/covariance structure of the data set was not CS. The RC model was less affected by the actual variance/covariance structure of the data set, and gave good estimates across all conditions. Conclusions: Using OLS to analyze repeated measures data is inappropriate when the covariance structure is not known to be CS. Random coefficients growth curve models may be useful when the variance/covariance structure of

Journal ArticleDOI
TL;DR: The properties of (t,m,s)-nets imply that two of them should outperform the other two, and the results confirm this expectation.
Abstract: We describe the properties of (t,m,s)-nets and Halton draws. Four types of (t,m,s)-nets, two types of Halton draws, and independent draws are compared in an application of maximum simulated likelihood estimation of a mixed logit model. All of the quasi-random procedures are found to perform far better than independent draws. The best performance is attained by one of the (t,m,s)-nets. The properties of the nets imply that two of them should outperform the other two, and our results confirm this expectation. The two more-accurate nets perform better than both types of Halton draws, while the two less-accurate nets perform worse than the Halton draws.
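For reference, a Halton draw in prime base b is the radical inverse of the draw's index: the base-b digits of the index are reflected about the radix point. A minimal sketch of the standard construction (not the authors' implementation):

```python
def halton(index, base):
    """Radical inverse of `index` in the given (prime) base."""
    h, f = 0.0, 1.0
    while index > 0:
        f /= base
        h += (index % base) * f
        index //= base
    return h

# Base-2 draws fill (0, 1) far more evenly than pseudo-random numbers,
# which is why quasi-random sequences cut simulation noise in maximum
# simulated likelihood estimation of mixed logit models.
first_five = [halton(i, 2) for i in range(1, 6)]   # 0.5, 0.25, 0.75, 0.125, 0.625
```

(t,m,s)-nets pursue the same goal with stronger equidistribution guarantees over dyadic boxes, which is why the better nets outperform Halton draws in the authors' comparison.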

Journal ArticleDOI
TL;DR: In this paper, a method for approximating prediction error variances and covariances among estimates of individual animals genetic effects for multiple-trait and random regression models is described.
Abstract: Summary A method for approximating prediction error variances and covariances among estimates of individual animals' genetic effects for multiple-trait and random regression models is described. These approximations are used to calculate the prediction error variances of linear functions of the terms in the model. In the multiple-trait case these are indexes of estimated breeding values, and for random regression models these are estimated breeding values at individual points on the longitudinal scale. Approximate reliabilities for terms in the model and linear functions thereof are compared with corresponding reliabilities obtained from the inverse of the coefficient matrix in the mixed model equations. Results show good agreement between approximate and true values.

Book
24 Aug 2004
TL;DR: In this book, the authors present exact methods for analysis of variance in two-way models with and without replicates, as well as for two-factor nested designs and two- and four-sequence crossover designs.
Abstract: Preface.1. Exact Generalized Inference.1.1 Introduction.1.2 Test Statistics and p-Values.1.3 Test Variables and Generalized p-Values.1.4 Substitution Method.1.5 Fixed Level Testing.1.6 Generalized Confidence Intervals.1.7 Substitution Method in Interval Estimation.1.8 Generalized p-Values Based Intervals.2. Methods in Analysis of Variance.2.1 Introduction.2.2 Comparing Two Population Means.2.3 Case of Unequal Variances.2.4 One-Way ANOVA.2.5 Multiple Comparisons: Case of Equal Variances.2.6 Multiple Comparisons: Case of Unequal Variances.2.7 Two-Way ANOVA Under Equal Variances.2.8 Two-Way ANOVA Under Heteroscedasticity.2.9 Two-Factor Nested Design.3. Introduction to Mixed Models.3.1 Introduction.3.2 Random Effects One-Way ANOVA.3.3 Inference About Variance Components.3.4 Fixed Level Testing.3.5 Inference About the Mean.3.6 Two-Way Mixed Model without Replicates.4. Higher-Way Mixed Models.4.1 Introduction.4.2 Canonical Form of the Problem.4.3 Testing Fixed Effects.4.4 Estimating Variance Components.4.5 Testing Variance Components.4.6 Confidence Intervals.4.7 Functions of Variance Components.5. Multivariate Populations.5.1 Introduction.5.2 Multivariate Normal Populations.5.3 Inferences About the Mean Vector.5.4 Inferences About Linear Functions of .5.5 Multivariate Regression.6. Multivariate Analysis of Variance.6.1 Introduction.6.2 Comparing Two Multivariate Populations.6.3 Multivariate Behrens-Fisher Problem.6.4 MANOVA with Equal Covariances.6.5 MANOVA with Unequal Covariances.7. Mixed Models in Repeated Measures.7.1 Introduction.7.2 Mixed Models for One Group.7.3 Analysis of Data from Two Factors.7.4 ANOVA Under Equal Error Variances.7.5 Other Two-Factor Models.7.6 Regression and RM ANCOVA.8. Repeated Measures Under Heteroscedasticity.8.1 Introduction.8.2 Two-Factor Model with Unequal Group Variances.8.3 Point Estimation.8.4 Testing Fixed Effects.8.5 Multiple Comparisons.8.6 Inference on Variance Components.8.7 RM ANCOVA Under Heteroscedasticity.9.
Crossover Designs.9.1 Introduction.9.2 Two-Sequence Design.9.3 Comparing Treatments.9.4 Four-Sequence Design.9.5 Distributional Results.9.6 Testing and Interval Estimation.10. Growth Curves.10.1 Introduction.10.2 Growth Curve Models.10.3 Inference with Unstructured Covariances.10.4 Inferences on General Linear Contrasts.10.5 Simultaneous Confidence Intervals.10.6 Mixed Models in Growth Curves.10.7 Exact Inference Under Structured Covariances.10.8 Comparing Growth Curves.10.9 Case of Unequal Group Variances.Appendix A: Univariate Technical Arguments.Appendix B: Multivariate Technical Arguments.References.

Journal ArticleDOI
TL;DR: An efficient computational strategy for calculating predictions and their standard errors is given, which includes the ability to detect the invariance of predictions to the parameterisation used in the model.

Journal ArticleDOI
TL;DR: This paper proposes a method whereby height–diameter regression from an inventory can be incorporated into a height imputation algorithm, implying that substantial environmental variation existed in the height–diameter relati...
Abstract: This paper proposes a method whereby height–diameter regression from an inventory can be incorporated into a height imputation algorithm. Point-level subsampling is often employed in forest inventory for efficiency. Some trees will be measured for diameter and species, while others will be measured for height and 10-year increment. Predictions of these missing measures would be useful for estimating volume and growth, respectively, so they are often imputed. We present and compare three imputation strategies: using a published model, using a localized version of a published model, and using best linear unbiased predictions from a mixed-effects model. The bases of our comparison are four-fold: minimum fitted root mean squared error and minimum predicted root mean squared error under a 2000-fold cross-validation for tree-level height and volume imputations. In each case the mixed-effects model proved superior. This result implies that substantial environmental variation existed in the height–diameter relati...

01 Jan 2004
TL;DR: In this article, a generalised linear model (GLM) with a more rigorous theoretical basis is introduced, where catch is modelled as the response variable using a power variance function, with the power parameter estimated using a profile extended quasi-likelihood, and a log link function with log of effort as an offset; random vessel effects extend this to a generalised linear mixed model (GLMM).
Abstract: The current standard method for modelling catch and effort data for Patagonian toothfish (Dissostichus eleginoides) for CCAMLR areas is to model the haul-by-haul ratios of catch to effort as the response variable in a generalised linear model (GLM) with a square-root link function and a unit variance function. A time series of standardised CPUE estimates and their precision can be obtained from the ‘fishing year’ parameter estimates together with ‘baseline’ parameter estimates, their variance–covariance matrix, and the inverse-link function. An alternative GLM with a more rigorous theoretical basis is introduced here. Catch is modelled as the response variable using a GLM with a power variance function, with the power parameter (λ) estimated using a profile extended quasi-likelihood, and a log link function with log of effort as an offset. For 1 < λ < 2 this model is equivalent to assuming a compound Poisson-gamma distribution (i.e. Tweedie distribution) for catch that, unlike lognormal or gamma distributions, admits zero values. In addition, random vessel effects are introduced into the GLM, as specified by a generalised linear mixed model (GLMM), in order to provide more efficient estimates of the standardised CPUE time series and more realistic estimates of their precision. Extra efficiency is gained by recovery of inter-vessel information as a result of the imbalance in the number of hauls in the year-by-vessel cross-classification. Further, the inclusion of an area stratum by fishing year interaction as an additional random effect in the GLMM is investigated. Fitting the stratum-by-year interaction as a fixed effect is problematic since it requires weighting of the individual stratum estimates by the areal extent of the stratum in order to obtain overall yearly standardised catch-per-unit-effort (CPUE) estimates.
Without stratified random sampling, the determination of stratum areas that will give unbiased standardised CPUE estimates may be difficult. Fitting the stratum-by-year interaction as a random effect avoids this difficulty, and diagnostic methods to evaluate the validity of considering this interaction as random are described.
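The compound Poisson-gamma mechanism behind the Tweedie assumption is easy to sketch: per-haul catch is a Poisson number of gamma-distributed components, so exact zeros occur with probability exp(-lambda), something lognormal and gamma models cannot produce. All parameter values below are invented for illustration:

```python
import numpy as np

def tweedie_catch(lam, shape, scale, size, rng):
    """Compound Poisson-gamma (Tweedie-type) catch draws.

    Each draw is the sum of a Poisson(lam) number of gamma components,
    giving a continuous positive distribution with a point mass at zero."""
    counts = rng.poisson(lam, size)
    return np.array([rng.gamma(shape, scale, k).sum() for k in counts])

rng = np.random.default_rng(2)
catch = tweedie_catch(lam=1.5, shape=2.0, scale=3.0, size=1000, rng=rng)
zero_fraction = (catch == 0).mean()   # close to exp(-1.5), about 0.22
```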

Journal ArticleDOI
TL;DR: In this article, a generalized linear mixed model (GLMM) was used to examine fishing power among chartered industry-based vessels and a research trawler, the FRV Miller Freeman, for bottom trawl surveys on the upper continental slope of U.S. West coast.

Journal ArticleDOI
TL;DR: It is shown how test-retest reliability can be derived using linear mixed models when the scale is continuous or quasi-continuous, and the full modeling power of mixed models can be used.

Journal ArticleDOI
TL;DR: This paper proposes a method to detect influential observations in nonlinear mixed-effects models, on the basis of the maximum likelihood estimates that are obtained by a stochastic approximation algorithm with Markov chain Monte Carlo method.

Journal ArticleDOI
TL;DR: The mixed model partitioned BW variation into between- and within-bird variation, and the covariance structure assumed with the random effect accounted for part of the BW correlation across ages in the same individual.

Journal ArticleDOI
TL;DR: A scaled chi-squared test based on the mixed model representation of the proposed model is developed to test whether an underlying varying coefficient is a polynomial of certain degree, and evaluate the performance of the procedures through simulation studies and illustrate their application with Indonesian children infectious disease data.
Abstract: Summary. The routinely assumed parametric functional form in the linear predictor of a generalized linear mixed model for longitudinal data may be too restrictive to represent true underlying covariate effects. We relax this assumption by representing these covariate effects by smooth but otherwise arbitrary functions of time, with random effects used to model the correlation induced by among-subject and within-subject variation. Due to the usually intractable integration involved in evaluating the quasi-likelihood function, the double penalized quasi-likelihood (DPQL) approach of Lin and Zhang (1999, Journal of the Royal Statistical Society, Series B61, 381–400) is used to estimate the varying coefficients and the variance components simultaneously by representing a nonparametric function by a linear combination of fixed effects and random effects. A scaled chi-squared test based on the mixed model representation of the proposed model is developed to test whether an underlying varying coefficient is a polynomial of certain degree. We evaluate the performance of the procedures through simulation studies and illustrate their application with Indonesian children infectious disease data.

Journal ArticleDOI
TL;DR: In this paper, the authors compared generalized estimating equation (GEE) solutions utilizing the robust empirical sandwich estimator for modeling of the error structure and general linear mixed model (GLMM) solutions that utilized the commonly employed restricted maximum likelihood (REML) procedure.
Abstract: Generalized linear model analyses of repeated measurements typically rely on simplifying mathematical models of the error covariance structure for testing the significance of differences in patterns of change across time. The robustness of the tests of significance depends, not only on the degree of agreement between the specified mathematical model and the actual population data structure, but also on the precision and robustness of the computational criteria for fitting the specified covariance structure to the data. Generalized estimating equation (GEE) solutions utilizing the robust empirical sandwich estimator for modeling of the error structure were compared with general linear mixed model (GLMM) solutions that utilized the commonly employed restricted maximum likelihood (REML) procedure. Under the conditions considered, the GEE and GLMM procedures were identical in assuming that the data are normally distributed and that the variance-covariance structure of the data is the one specified by the user. The question addressed in this article concerns relative sensitivity of tests of significance for treatment effects to varying degrees of misspecification of the error covariance structure model when fitted by the alternative procedures. Simulated data that were subjected to Monte Carlo evaluation of actual Type I error and power of tests of the equal slopes hypothesis conformed to assumptions of ordinary linear model ANOVA for repeated measures except for autoregressive covariance structures and missing data due to dropouts. The actual within-groups correlation structures of the simulated repeated measurements ranged from AR(1) to compound symmetry in graded steps, whereas the GEE and GLMM formulations restricted the respective error structure models to be either AR(1), compound symmetry (CS), or unstructured (UN).
The GEE-based tests utilizing empirical sandwich estimator criteria were documented to be relatively insensitive to misspecification of the covariance structure models, whereas GLMM tests which relied on restricted maximum likelihood (REML) were highly sensitive to relatively modest misspecification of the error correlation structure even though normality, variance homogeneity, and linearity were not an issue in the simulated data. Goodness-of-fit statistics were of little utility in identifying cases in which relatively minor misspecification of the GLMM error structure model resulted in inadequate alpha protection for tests of the equal slopes hypothesis. Both GEE and GLMM formulations that relied on unstructured (UN) error model specification produced nonconservative results regardless of the actual correlation structure of the repeated measurements. A random coefficients model produced robust tests with competitive power across all conditions examined. (© 2004 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim)
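The empirical sandwich estimator underlying the robustness of the GEE-based tests can be sketched directly. The simulation below is a minimal stand-in for the designs studied, not a reproduction of the article's Monte Carlo study: it uses one group, plain NumPy, a working-independence point estimate, and arbitrary values for the AR(1) parameter and sample sizes.

```python
import numpy as np

rng = np.random.default_rng(1)

# Repeated measures: 50 subjects, 5 time points, AR(1) within-subject errors.
G, T = 50, 5
rho, time = 0.6, np.arange(T, dtype=float)
ar1 = rho ** np.abs(time[:, None] - time[None, :])   # true AR(1) correlation
L = np.linalg.cholesky(ar1)

X = np.column_stack([np.ones(G * T), np.tile(time, G)])  # intercept + slope
beta_true = np.array([1.0, 0.5])
e = (L @ rng.normal(size=(T, G))).T.ravel()              # correlated errors
y = X @ beta_true + e

# Working-independence point estimate (plain least squares).
beta = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta

# Empirical sandwich: bread^-1 * meat * bread^-1, with the "meat"
# accumulated subject by subject so within-subject correlation is
# absorbed without ever specifying its form.
bread_inv = np.linalg.inv(X.T @ X)
meat = np.zeros((2, 2))
for g in range(G):
    idx = slice(g * T, (g + 1) * T)
    s = X[idx].T @ resid[idx]
    meat += np.outer(s, s)
cov_robust = bread_inv @ meat @ bread_inv
se_robust = np.sqrt(np.diag(cov_robust))

print("slope:", beta[1], "robust SE:", se_robust[1])
```

Because the meat term is built from observed per-subject score contributions rather than a postulated correlation model, the resulting standard errors remain approximately valid under the misspecification scenarios the abstract describes.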

Journal ArticleDOI
TL;DR: Various genetic models that can be analysed using an extended family structure are described, with a generalized linear mixed model handling the family structure and likelihood‐based methodology used for parameter inference.
Abstract: While the family-based analysis of genetic and environmental contributions to continuous or Gaussian traits is now straightforward using the linear mixed models approach, the corresponding analysis of complex binary traits is still rather limited. In the latter we usually rely on twin studies or pairs of relatives, but these studies often have limited sample size or have difficulties in dealing with the dependence between the pairs. Direct analysis of extended family data can potentially overcome these limitations. In this paper, we will describe various genetic models that can be analysed using an extended family structure. We use the generalized linear mixed model to deal with the family structure and likelihood-based methodology for parameter inference. The method is completely general, accommodating arbitrary family structures and incomplete data. We illustrate the methodology in great detail using the Swedish birth registry data on pre-eclampsia, a hypertensive condition induced by pregnancy. The statistical challenges include the specification of sensible models that contain a relatively large number of variance components compared to standard mixed models. In our illustration the models will account for maternal or foetal genetic effects, environmental effects, or a combination of these and we show how these effects can be readily estimated using family data.
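The kind of dependence that these variance components are estimated from can be illustrated with a small liability-threshold simulation. This is a hedged sketch, not the authors' model for the pre-eclampsia data: it assumes a single shared family-level random effect, siblings only, and arbitrary values for the variance and threshold.

```python
import numpy as np

rng = np.random.default_rng(2)

# Liability-threshold sketch: each family shares a random effect b_f,
# and the binary trait is observed when latent liability crosses a threshold.
F, S = 2000, 3           # families, siblings per family
sigma_f = 1.0            # family-level (shared genetic/environmental) SD
b = rng.normal(scale=sigma_f, size=F)              # family random effects
liability = b[:, None] + rng.normal(size=(F, S))   # shared + individual parts
trait = (liability > 1.0).astype(int)              # threshold sets prevalence

# The shared random effect induces correlation between relatives' binary
# traits; a GLMM recovers sigma_f^2 from exactly this dependence.
within = np.corrcoef(trait[:, 0], trait[:, 1])[0, 1]
prevalence = trait.mean()
print("prevalence:", prevalence, "sibling correlation:", within)
```

Extending the sketch to the models in the paper would mean adding further variance components (e.g. maternal versus foetal effects) and arbitrary pedigree structures, which is where the likelihood-based GLMM machinery the abstract describes becomes necessary.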