scispace - formally typeset

Showing papers on "Mixed model published in 2006"


Journal ArticleDOI
TL;DR: The likelihood-based methods studied are considerably faster computationally than MCMC, but steady improvements in both hardware speed and the efficiency of Monte Carlo algorithms, together with the poor calibration of likelihood-based methods in some common hierarchical settings, make MCMC-based Bayesian fitting of multilevel models an attractive approach.
Abstract: We use simulation studies, whose design is realistic for educational and medical research (as well as other fields of inquiry), to compare Bayesian and likelihood-based methods for fitting variance-components (VC) and random-effects logistic regression (RELR) models. The likelihood (and approximate likelihood) approaches we examine are based on the methods most widely used in current applied multilevel (hierarchical) analyses: maximum likelihood (ML) and restricted ML (REML) for Gaussian outcomes, and marginal and penalized quasi-likelihood (MQL and PQL) for Bernoulli outcomes. Our Bayesian methods use Markov chain Monte Carlo (MCMC) estimation, with adaptive hybrid Metropolis-Gibbs sampling for RELR models, and several diffuse prior distributions (Γ⁻¹(ε, ε) and U(0, 1/ε) priors for variance components). For evaluation criteria we consider bias of point estimates and nominal versus actual coverage of interval estimates in repeated sampling. In two-level VC models we find that (a) both likelihood-based and Bayesian approaches can be made to produce approximately unbiased estimates, although the automatic manner in which REML accomplishes this is an advantage, but (b) both approaches had difficulty achieving nominal coverage in small samples and with small values of the intraclass correlation. With the three-level RELR models we examine we find that (c) quasi-likelihood methods for estimating random-effects variances perform badly with respect to bias and coverage in the example we simulated, and (d) Bayesian diffuse-prior methods lead to well-calibrated point and interval RELR estimates. While it is true that the likelihood-based methods we study are considerably faster computationally than MCMC, (i) steady improvements in recent years in both hardware speed and efficiency of Monte Carlo algorithms and (ii) the lack of calibration of likelihood-based methods in some common hierarchical settings combine to make MCMC-based Bayesian fitting of multilevel models an attractive approach, even with rather large datasets.
Other analytic strategies based on less approximate likelihood methods are also possible but would benefit from further study of the type summarized here.
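The ML/REML bias contrast in variance-components estimation can be sketched numerically. The following is a minimal illustration, not the paper's simulation design: for a balanced one-way random-effects model, the ANOVA estimator of the between-group variance coincides with REML and is unbiased, while the ML estimator shrinks the between-group sum of squares by (m-1)/m and is biased downward. All parameter values below are assumed for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 10, 5                      # groups, observations per group (assumed)
sigma_a2, sigma_e2 = 2.0, 1.0     # true between- and within-group variances

reml_est, ml_est = [], []
for _ in range(2000):
    a = rng.normal(0.0, np.sqrt(sigma_a2), m)
    y = a[:, None] + rng.normal(0.0, np.sqrt(sigma_e2), (m, n))
    gm = y.mean(axis=1)
    msb = n * gm.var(ddof=1)                # between-group mean square
    msw = y.var(axis=1, ddof=1).mean()      # within-group mean square
    reml_est.append((msb - msw) / n)        # ANOVA = REML estimator, unbiased
    ml_est.append(((m - 1) / m * msb - msw) / n)  # ML shrinks SSB by (m-1)/m

# REML averages near the true 2.0; ML is systematically smaller
print(np.mean(reml_est), np.mean(ml_est))
```

The downward ML bias comes from not accounting for the degree of freedom lost to estimating the grand mean, which is exactly what REML corrects automatically.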

522 citations


Book
13 Jul 2006
TL;DR: The authors develop an extended likelihood framework in which marginal and adjusted profile likelihoods are used for inference about fixed parameters, while the h-likelihood supports joint inference about fixed and random parameters in hierarchical generalized linear models.
Abstract (table of contents): LIST OF NOTATIONS; PREFACE; INTRODUCTION.
CLASSICAL LIKELIHOOD THEORY: Definition; Quantities derived from the likelihood; Profile likelihood; Distribution of the likelihood-ratio statistic; Distribution of the MLE and the Wald statistic; Model selection; Marginal and conditional likelihoods; Higher-order approximations; Adjusted profile likelihood; Bayesian and likelihood methods; Jacobian in likelihood methods.
GENERALIZED LINEAR MODELS: Linear models; Generalized linear models; Model checking; Examples.
QUASI-LIKELIHOOD: Examples; Iterative weighted least squares; Asymptotic inference; Dispersion models; Extended quasi-likelihood; Joint GLM of mean and dispersion; Joint GLMs for quality improvement.
EXTENDED LIKELIHOOD INFERENCES: Two kinds of likelihood; Inference about the fixed parameters; Inference about the random parameters; Optimality in random-parameter estimation; Canonical scale, h-likelihood and joint inference; Statistical prediction; Regression as an extended model; Missing or incomplete-data problems; Is marginal likelihood enough for inference about fixed parameters?; Summary: likelihoods in extended framework.
NORMAL LINEAR MIXED MODELS: Developments of normal mixed linear models; Likelihood estimation of fixed parameters; Classical estimation of random effects; H-likelihood approach; Example; Invariance and likelihood inference.
HIERARCHICAL GLMS: HGLMs; H-likelihood; Inferential procedures using h-likelihood; Penalized quasi-likelihood; Deviances in HGLMs; Examples; Choice of random-effect scale.
HGLMS WITH STRUCTURED DISPERSION: HGLMs with structured dispersion; Quasi-HGLMs; Examples.
CORRELATED RANDOM EFFECTS FOR HGLMS: HGLMs with correlated random effects; Random effects described by fixed L matrices; Random effects described by a covariance matrix; Random effects described by a precision matrix; Fitting and model-checking; Examples; Twin and family data; Ascertainment problem.
SMOOTHING: Spline models; Mixed model framework; Automatic smoothing; Non-Gaussian smoothing.
RANDOM-EFFECT MODELS FOR SURVIVAL DATA: Proportional-hazard model; Frailty models and the associated h-likelihood; *Mixed linear models with censoring; Extensions; Proofs.
DOUBLE HGLMS: DHGLMs; Models for finance data; H-likelihood procedure for fitting DHGLMs; Random effects in the dispersion component; Examples.
FURTHER TOPICS: Model for multivariate responses; Joint model for continuous and binary data; Joint model for repeated measures and survival time; Missing data in longitudinal studies; Denoising signals by imputation.
REFERENCES; DATA INDEX; AUTHOR INDEX; SUBJECT INDEX.

495 citations


Journal ArticleDOI
TL;DR: New methodology is presented that generalizes the linear mixed model to the functional mixed model framework, with model fitting done by using a Bayesian wavelet-based approach, which is flexible, allowing functions of arbitrary form and the full range of fixed effects structures and between-curve covariance structures that are available in the Mixed model framework.
Abstract: Increasingly, scientific studies yield functional data, in which the ideal units of observation are curves and the observed data consist of sets of curves that are sampled on a fine grid. We present new methodology that generalizes the linear mixed model to the functional mixed model framework, with model fitting done by using a Bayesian wavelet-based approach. This method is flexible, allowing functions of arbitrary form and the full range of fixed effects structures and between-curve covariance structures that are available in the mixed model framework. It yields nonparametric estimates of the fixed and random-effects functions as well as the various between-curve and within-curve covariance matrices. The functional fixed effects are adaptively regularized as a result of the non-linear shrinkage prior that is imposed on the fixed effects' wavelet coefficients, and the random-effect functions experience a form of adaptive regularization because of the separately estimated variance components for each wavelet coefficient. Because we have posterior samples for all model quantities, we can perform pointwise or joint Bayesian inference or prediction on the quantities of the model. The adaptiveness of the method makes it especially appropriate for modelling irregular functional data that are characterized by numerous local features like peaks.

408 citations


Journal ArticleDOI
01 Jun 2006-Test
TL;DR: The authors review major research developments in the classical inferential approach for linear and generalized linear mixed models that are relevant to small area estimation and related problems.
Abstract: Over the last three decades, mixed models have been frequently used in a wide range of small area applications. Such models offer great flexibilities in combining information from various sources, and thus are well suited for solving most small area estimation problems. The present article reviews major research developments in the classical inferential approach for linear and generalized linear mixed models that are relevant to different issues concerning small area estimation and related problems.

317 citations


Journal ArticleDOI
TL;DR: In this article, the authors extend the mixed logit model to account for this heterogeneity and illustrate the implications this has on the moments of the willingness to pay for travel time savings in the context of commuter choice of mode.
Abstract: The growing popularity of mixed logit to obtain estimates of willingness to pay (WTP) has focussed on the distribution of the random parameters and the possibility of estimating deep parameters to account for heterogeneity around the mean of the distribution. However the possibility exists to add further behavioural information associated with the variance of the random parameter distribution, through parameterisation of its heterogeneity (or heteroskedasticity). In this paper we extend the mixed logit model to account for this heterogeneity and illustrate the implications this has on the moments of the willingness to pay for travel time savings in the context of commuter choice of mode. The empirical study highlights the statistical and behavioural gains but warns of the potential downside of exposing the distribution of the parameterised numerator and/or denominator of the more complex WTP function to a sign change and extreme values over the range of the distribution.
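The warning about the WTP ratio can be made concrete with a small Monte Carlo sketch. The coefficient means and spreads below are hypothetical, not estimates from the paper's commuter-mode data; the point is only that a ratio of two random parameters has heavy tails and can change sign when the denominator's distribution puts mass near zero.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 100_000

# Hypothetical random parameters (illustrative values, not from the paper):
beta_time = rng.normal(-1.5, 0.4, N)    # time coefficient
beta_cost = rng.normal(-0.10, 0.04, N)  # cost coefficient, sd large vs mean

wtp = beta_time / beta_cost             # WTP for travel time savings

# A ratio of two normals has no finite moments: draws where the denominator
# is near zero produce extreme values, and draws where exactly one
# coefficient flips sign give a "wrong-signed" WTP.
q = np.percentile(wtp, [1, 50, 99])
print("1%/50%/99% quantiles:", q)
print("share with wrong sign:", np.mean(wtp < 0))
```

The 99th percentile sits far from the median even though both coefficients are well-behaved normals, which is exactly the downside the abstract warns about when parameterising the numerator and/or denominator of the WTP function.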

239 citations


01 Jan 2006
TL;DR: The authors describe the REML-E-BLUP method and illustrate it with data on soil water content that exhibit a pronounced spatial trend; the underlying geostatistical model is a special case of the linear mixed model in which the data are the additive combination of fixed effects (e.g. the unknown mean or coefficients of a trend model), spatially dependent random effects, and independent random error (nugget variation in geostatistics).
Abstract: Geostatistical estimates of a soil property by kriging are equivalent to the best linear unbiased predictions (BLUPs). Universal kriging is BLUP with a fixed-effect model that is some linear function of spatial coordinates, or more generally a linear function of some other secondary predictor variable when it is called kriging with external drift. A problem in universal kriging is to find a spatial variance model for the random variation, since empirical variograms estimated from the data by method-of-moments will be affected by both the random variation and that variation represented by the fixed effects. The geostatistical model of spatial variation is a special case of the linear mixed model where our data are modelled as the additive combination of fixed effects (e.g. the unknown mean, coefficients of a trend model), random effects (the spatially dependent random variation in the geostatistical context) and independent random error (nugget variation in geostatistics). Statisticians use residual maximum likelihood (REML) to estimate variance parameters, i.e. to obtain the variogram in a geostatistical context. REML estimates are consistent (they converge in probability to the parameters that are estimated) with less bias than both maximum likelihood estimates and method-of-moment estimates obtained from residuals of a fitted trend. If the estimate of the random effects variance model is inserted into the BLUP we have the empirical BLUP or E-BLUP. Despite representing the state of the art for prediction from a linear mixed model in statistics, the REML-E-BLUP has not been widely used in soil science, and in most studies reported in the soils literature the variogram is estimated with methods that are seriously biased if the fixed-effect structure is more complex than just an unknown constant mean (ordinary kriging). 
In this paper we describe the REML-E-BLUP and illustrate the method with some data on soil water content that exhibit a pronounced spatial trend.
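With variogram parameters taken as given (standing in for the REML estimates), the E-BLUP step itself is a short linear-algebra computation: a GLS fit of the fixed-effect trend followed by kriging of the residual. A minimal numpy sketch on synthetic 1-D data, using an assumed exponential covariance with no nugget (so the predictor interpolates the data exactly):

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic transect: linear trend (fixed effects) plus spatially
# correlated random variation; sill and range are assumed values that
# stand in for REML estimates.
s = np.linspace(0.0, 10.0, 30)               # sample locations
X = np.column_stack([np.ones_like(s), s])    # trend design: intercept + coord
sill, rang = 1.0, 2.0

def cov(h):                                   # exponential covariance, no nugget
    return sill * np.exp(-np.abs(h) / rang)

V = cov(s[:, None] - s[None, :])
L = np.linalg.cholesky(V)
y = X @ np.array([1.0, 0.5]) + L @ rng.normal(size=s.size)

# E-BLUP: GLS estimate of the trend, then kriging of the residual
Vi = np.linalg.inv(V)
beta = np.linalg.solve(X.T @ Vi @ X, X.T @ Vi @ y)

def predict(s0):
    c0 = cov(s0 - s)                          # covariances to the data
    x0 = np.array([1.0, s0])
    return x0 @ beta + c0 @ Vi @ (y - X @ beta)

# With zero nugget the predictor reproduces an observed datum exactly
print(predict(s[7]), y[7])
```

This is universal kriging written in mixed-model form: the first term is the fitted fixed-effect trend, the second is the BLUP of the spatially dependent random effect at the prediction location.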

237 citations




Journal ArticleDOI
TL;DR: The M-quantile approach models quantile-like parameters of the conditional distribution of the target variable given the covariates, avoiding the problems associated with specifying random effects and allowing inter-domain differences to be characterized by the variation of area-specific M-quantile coefficients.
Abstract: Small area estimation techniques are employed when sample data are insufficient for acceptably precise direct estimation in domains of interest. These techniques typically rely on regression models that use both covariates and random effects to explain variation between domains. However, such models also depend on strong distributional assumptions, require a formal specification of the random part of the model and do not easily allow for outlier robust inference. We describe a new approach to small area estimation that is based on modelling quantile-like parameters of the conditional distribution of the target variable given the covariates. This avoids the problems associated with specification of random effects, allowing inter-domain differences to be characterized by the variation of area-specific M-quantile coefficients. The proposed approach is easily made robust against outlying data values and can be adapted for estimation of a wide range of area specific parameters, including that of the quantiles of the distribution of the target variable in the different small areas. Results from two simulation studies comparing the performance of the M-quantile modelling approach with more traditional mixed model approaches are also provided.

233 citations


Journal ArticleDOI
TL;DR: A pairwise approach is proposed in which all possible bivariate models are fitted and inference follows from pseudo-likelihood arguments; it is applicable to linear, generalized linear, and nonlinear mixed models, or combinations of these.
Abstract: A mixed model is a flexible tool for joint modeling purposes, especially when the gathered data are unbalanced. However, computational problems due to the dimension of the joint covariance matrix of the random effects arise as soon as the number of outcomes and/or the number of used random effects per outcome increases. We propose a pairwise approach in which all possible bivariate models are fitted, and where inference follows from pseudo-likelihood arguments. The approach is applicable for linear, generalized linear, and nonlinear mixed models, or for combinations of these. The methodology will be illustrated for linear mixed models in the analysis of 22-dimensional, highly unbalanced, longitudinal profiles of hearing thresholds.
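The pairwise idea can be sketched in a simplified setting. Rather than the paper's mixed-model pseudo-likelihood, the toy below fits every bivariate Gaussian sub-model and averages the duplicated parameter estimates; for a plain multivariate Gaussian the pairwise fits reassemble the full covariance estimate exactly, which conveys why combining bivariate fits is a reasonable surrogate when the joint fit is computationally out of reach.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(3)

d, N = 6, 500                                 # outcomes, subjects (assumed)
A = rng.normal(size=(d, d))
Sigma = A @ A.T + d * np.eye(d)               # true joint covariance
Y = rng.multivariate_normal(np.zeros(d), Sigma, size=N)

est = np.zeros((d, d))
cnt = np.zeros((d, d))
for j, k in combinations(range(d), 2):        # all d*(d-1)/2 bivariate fits
    S2 = np.cov(Y[:, [j, k]], rowvar=False)   # 2x2 bivariate Gaussian fit
    for (a, b), v in [((j, j), S2[0, 0]), ((k, k), S2[1, 1]),
                      ((j, k), S2[0, 1]), ((k, j), S2[1, 0])]:
        est[a, b] += v                        # accumulate duplicated params
        cnt[a, b] += 1
est = est / cnt                               # average over the pairs

# The averaged pairwise estimates equal the full-sample covariance estimate
print(np.allclose(est, np.cov(Y, rowvar=False)))
```

In the actual mixed-model setting the bivariate fits are not exactly consistent with each other, which is why the paper's inference rests on pseudo-likelihood theory rather than on this exact reassembly.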

229 citations


Journal ArticleDOI
TL;DR: The authors show that conditional maximum likelihood can eliminate this bias; conditional likelihood leads naturally to partitioning the covariate into between- and within-cluster components, and models that include separate terms for these components also eliminate the source of the bias.
Abstract: Summary. We consider the situation where the random effects in a generalized linear mixed model may be correlated with one of the predictors, which leads to inconsistent estimators. We show that conditional maximum likelihood can eliminate this bias. Conditional likelihood leads naturally to the partitioning of the covariate into between- and within-cluster components and models that include separate terms for these components also eliminate the source of the bias. Another viewpoint that we develop is the idea that many violations of the assumptions (including correlation between the random effects and a covariate) in a generalized linear mixed model may be cast as misspecified mixing distributions. We illustrate the results with two examples and simulations.
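A small simulation (hypothetical values, not from the paper's examples) makes the bias and its fix visible: when the random intercepts are correlated with the cluster means of a covariate, the pooled slope is inconsistent, while the within-cluster (demeaned) slope, i.e. the term isolated by the between/within partition, is not.

```python
import numpy as np

rng = np.random.default_rng(4)
m, n, beta = 200, 5, 1.0              # clusters, cluster size, true slope

# Random intercepts u_i correlated with the cluster means of x:
xbar = rng.normal(size=m)
u = 0.8 * xbar + 0.2 * rng.normal(size=m)
x = xbar[:, None] + rng.normal(size=(m, n))
y = u[:, None] + beta * x + rng.normal(size=(m, n))

# Naive pooled OLS slope: absorbs cov(x, u) and is biased upward here
xc, yc = x.ravel() - x.mean(), y.ravel() - y.mean()
naive = (xc @ yc) / (xc @ xc)

# Within-cluster estimator: demeaning by cluster removes u_i entirely
xw = (x - x.mean(axis=1, keepdims=True)).ravel()
yw = (y - y.mean(axis=1, keepdims=True)).ravel()
within = (xw @ yw) / (xw @ xw)

print(naive, within)   # naive drifts above 1.0; within stays near 1.0
```

Including both the cluster mean and the deviation as separate regressors gives the same within-cluster slope for the deviation term, which is the model-based version of the fix described in the abstract.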

193 citations


Journal ArticleDOI
TL;DR: This paper describes the fitting of general design generalized linear mixed models; a Bayesian approach is taken, Markov chain Monte Carlo (MCMC) is used for estimation and inference, and the MCMC package WinBUGS is shown to facilitate sound fitting of such models in practice.
Abstract: Linear mixed models are able to handle an extraordinary range of complications in regression-type analyses. Their most common use is to account for within-subject correlation in longitudinal data analysis. They are also the standard vehicle for smoothing spatial count data. However, when treated in full generality, mixed models can also handle spline-type smoothing and closely approximate kriging. This allows for nonparametric regression models (e.g., additive models and varying coefficient models) to be handled within the mixed model framework. The key is to allow the random effects design matrix to have general structure; hence our label general design. For continuous response data, particularly when Gaussianity of the response is reasonably assumed, computation is now quite mature and supported by the R, SAS and S-PLUS packages. Such is not the case for binary and count responses, where generalized linear mixed models (GLMMs) are required, but are hindered by the presence of intractable multivariate integrals. Software known to us supports special cases of the GLMM (e.g., PROC NLMIXED in SAS or glmmML in R) or relies on the sometimes crude Laplace-type approximation of integrals (e.g., the SAS macro glimmix or glmmPQL in R). This paper describes the fitting of general design generalized linear mixed models. A Bayesian approach is taken and Markov chain Monte Carlo (MCMC) is used for estimation and inference. In this generalized setting, MCMC requires sampling from nonstandard distributions. In this article, we demonstrate that the MCMC package WinBUGS facilitates sound fitting of general design Bayesian generalized linear mixed models in practice.
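The kind of nonstandard sampling involved can be sketched without WinBUGS: a random-walk Metropolis sampler for the coefficients of a plain logistic regression (flat priors, no random effects, so this is a simplified stand-in for a full GLMM posterior, whose normalizing constant is similarly intractable).

```python
import numpy as np

rng = np.random.default_rng(7)

# Simulated Bernoulli data (assumed values for illustration)
n = 800
x = rng.normal(size=n)
beta_true = np.array([-0.5, 1.0])
p = 1.0 / (1.0 + np.exp(-(beta_true[0] + beta_true[1] * x)))
y = rng.binomial(1, p)

def loglik(b):
    # Bernoulli log-likelihood: sum y*eta - log(1 + exp(eta))
    eta = b[0] + b[1] * x
    return y @ eta - np.logaddexp(0.0, eta).sum()

beta = np.zeros(2)
ll = loglik(beta)
draws = []
for t in range(4000):
    prop = beta + 0.1 * rng.normal(size=2)    # random-walk proposal
    llp = loglik(prop)
    if np.log(rng.uniform()) < llp - ll:      # Metropolis accept/reject
        beta, ll = prop, llp
    if t >= 1000:                             # discard burn-in
        draws.append(beta.copy())

print(np.mean(draws, axis=0))   # posterior means near (-0.5, 1.0)
```

WinBUGS automates exactly this kind of sampling (and the trickier conditional updates a GLMM requires), which is why the paper can treat quite general random-effects design matrices with the same machinery.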

01 Jan 2006
TL;DR: This monograph gives a thorough treatment of methods for solving over- and under-determined systems of equations, e.g., the minimum norm solution method with respect to weighted norms.
Abstract: This monograph contains a thorough treatment of methods for solving over- and under-determined systems of equations, e.g. the minimum norm solution method with respect to weighted norms. The considered equations can be nonlinear or linear, and deterministic models as well as probabilistic ones are considered. An extensive appendix provides all necessary prerequisites like matrix algebra, matrix analysis and Lagrange multipliers, and a long list of references is also included.

Journal ArticleDOI
TL;DR: In this article, the authors assume that the unknown mixed distribution is symmetric and obtain the identifiability of this model, which is defined by four unknown parameters: the mixing proportion, two location parameters and the cumulative distribution function of the symmetric mixed distribution.
Abstract: Suppose that univariate data are drawn from a mixture of two distributions that are equal up to a shift parameter. Such a model is known to be nonidentifiable from a nonparametric viewpoint. However, if we assume that the unknown mixed distribution is symmetric, we obtain the identifiability of this model, which is then defined by four unknown parameters: the mixing proportion, two location parameters and the cumulative distribution function of the symmetric mixed distribution. We propose estimators for these four parameters when no training data is available. Our estimators are shown to be strongly consistent under mild regularity assumptions and their convergence rates are studied. Their finite-sample properties are illustrated by a Monte Carlo study and our method is applied to real data.

Book
27 Oct 2006
TL;DR: This book explains the residual maximum likelihood (REML) criterion for fitting mixed models and covers estimation of the variances of random-effect terms and best linear unbiased predictors of random effects.
Abstract: Preface. 1. The need for more than one random-effect term when fitting a regression line. 2. The need for more than one random-effect term in a designed experiment. 3. Estimation of the variances of random-effect terms. 4. Interval estimates for fixed-effect terms in mixed models. 5. Estimation of random effects in mixed models: best linear unbiased predictors. 6. More advanced mixed models for more elaborate data sets. 7. Two case studies. 8. The use of mixed models for the analysis of unbalanced experimental designs. 9. Beyond mixed modelling. 10. Why is the criterion for fitting mixed models called residual maximum likelihood? References. Index.

Journal ArticleDOI
TL;DR: This work estimates the correlation coefficient between two variables with repeated observations on each variable using a linear mixed effects (LME) model, and describes how to select the correlation structure on the repeated measures using Proc Mixed of SAS.
Abstract: We estimate the correlation coefficient between two variables with repeated observations on each variable, using linear mixed effects (LME) model. The solution to this problem has been studied by many authors. Bland and Altman (1995) considered the problem in many ad hoc methods. Lam, Webb and O'Donnell (1999) solved the problem by considering different correlation structures on the repeated measures. They assumed that the repeated measures are linked over time but their method needs specialized software. However, they never addressed the question of how to choose the correlation structure on the repeated measures for a particular data set. Hamlett et al. (2003) generalized this model and used Proc Mixed of SAS to solve the problem. Unfortunately, their method also cannot implement the correlation structure on the repeated measures that is present in the data. We also assume that the repeated measures are linked over time and generalize all the previous models, and can account for the correlation structure on the repeated measures that is present in the data. We study how the correlation coefficient between the variables gets affected by incorrect assumption of the correlation structure on the repeated measures itself by using Proc Mixed of SAS, and describe how to select the correlation structure on the repeated measures. We also extend the model by including random intercept and random slope over time for each subject. Our model will also be useful when some of the repeated measures are missing at random.

Journal ArticleDOI
TL;DR: In this article, a general class of structured additive regression models for categorical responses, allowing for a flexible semiparametric predictor, is proposed for forest health with damage state of trees as the response.
Abstract: Motivated by a space-time study on forest health with damage state of trees as the response, we propose a general class of structured additive regression models for categorical responses, allowing for a flexible semiparametric predictor. Nonlinear effects of continuous covariates, time trends, and interactions between continuous covariates are modeled by penalized splines. Spatial effects can be estimated based on Markov random fields, Gaussian random fields, or two-dimensional penalized splines. We present our approach from a Bayesian perspective, with inference based on a categorical linear mixed model representation. The resulting empirical Bayes method is closely related to penalized likelihood estimation in a frequentist setting. Variance components, corresponding to inverse smoothing parameters, are estimated using (approximate) restricted maximum likelihood. In simulation studies we investigate the performance of different choices for the spatial effect, compare the empirical Bayes approach to competing methodology, and study the bias of mixed model estimates. As an application we analyze data from the forest health survey.

Journal ArticleDOI
TL;DR: For an otherwise valid model, the authors show that the success of a transformation may be judged solely in terms of how closely the total error follows a Gaussian distribution, avoiding the complexity of separately evaluating pure errors and random effects.
Abstract: Summary. For a univariate linear model, the Box–Cox method helps to choose a response transformation to ensure the validity of a Gaussian distribution and related assumptions. The desire to extend the method to a linear mixed model raises many vexing questions. Most importantly, how do the distributions of the two sources of randomness (pure error and random effects) interact in determining the validity of assumptions? For an otherwise valid model, we prove that the success of a transformation may be judged solely in terms of how closely the total error follows a Gaussian distribution. Hence the approach avoids the complexity of separately evaluating pure errors and random effects. The extension of the transformation to the mixed model requires an exploration of its potential effect on estimation and inference of the model parameters. Analysis of longitudinal pulmonary function data and Monte Carlo simulations illustrate the methodology discussed.
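The criterion of judging a transformation by the Gaussianity of the total error can be sketched for the univariate case with the standard Box-Cox profile log-likelihood (the mixed-model extension in the paper reduces to this when there are no random effects). For lognormal data the maximizing lambda should sit near zero, i.e. the log transformation:

```python
import numpy as np

rng = np.random.default_rng(5)
y = np.exp(rng.normal(0.0, 0.5, 400))    # lognormal: log transform is ideal

def boxcox(y, lam):
    # Box-Cox family: log at lambda = 0, power transform otherwise
    return np.log(y) if abs(lam) < 1e-12 else (y**lam - 1.0) / lam

def profile_loglik(y, lam):
    # Gaussian profile log-likelihood of the transformed data plus the
    # Jacobian term (lam - 1) * sum(log y) of the transformation
    z = boxcox(y, lam)
    return -0.5 * y.size * np.log(z.var()) + (lam - 1.0) * np.log(y).sum()

grid = np.linspace(-1.0, 1.5, 101)
lam_hat = grid[np.argmax([profile_loglik(y, l) for l in grid])]
print(lam_hat)   # close to 0, i.e. the log transformation
```

In the paper's mixed-model setting the same profile is computed with the total error (pure error plus random effects) in place of the univariate residual, which is the content of the result quoted in the TL;DR.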

Journal ArticleDOI
TL;DR: The present study investigates the direct use of the variance-covariance matrix of all observations in AIREML for LD mapping with a general complex pedigree; the method is more efficient than the usual approach based on mixed model equations and robust to the numerical problems caused by near-singularity due to closely linked markers.
Abstract: Variance component (VC) approaches based on restricted maximum likelihood (REML) have been used as an attractive method for positioning of quantitative trait loci (QTL). Linkage disequilibrium (LD) information can be easily implemented in the covariance structure among QTL effects (e.g. genotype relationship matrix) and mapping resolution appears to be high. Because of the use of LD information, the covariance structure becomes much richer and denser compared to the use of linkage information alone. This makes an average information (AI) REML algorithm based on mixed model equations and sparse matrix techniques less useful. In addition, (near-) singularity problems often occur with high marker densities, which is common in fine-mapping, causing numerical problems in AIREML based on mixed model equations. The present study investigates the direct use of the variance covariance matrix of all observations in AIREML for LD mapping with a general complex pedigree. The method presented is more efficient than the usual approach based on mixed model equations and robust to numerical problems caused by near-singularity due to closely linked markers. It is also feasible to fit multiple QTL simultaneously in the proposed method whereas this would drastically increase computing time when using mixed model equation-based methods.

Journal ArticleDOI
TL;DR: An overview of the various modeling frameworks for non-Gaussian longitudinal data is provided, and a focus on generalized linear mixed-effects models, on the one hand, of which the parameters can be estimated using full likelihood, and on generalized estimating equations, which is a nonlikelihood method and hence requires a modification to be valid under MAR.
Abstract: Commonly used methods to analyze incomplete longitudinal clinical trial data include complete case analysis (CC) and last observation carried forward (LOCF). However, such methods rest on strong assumptions, including missing completely at random (MCAR) for CC and unchanging profile after dropout for LOCF. Such assumptions are too strong to generally hold. Over the last decades, a number of full longitudinal data analysis methods have become available, such as the linear mixed model for Gaussian outcomes, that are valid under the much weaker missing at random (MAR) assumption. Such a method is useful, even if the scientific question is in terms of a single time point, for example, the last planned measurement occasion, and it is generally consistent with the intention-to-treat principle. The validity of such a method rests on the use of maximum likelihood, under which the missing data mechanism is ignorable as soon as it is MAR. In this paper, we will focus on non-Gaussian outcomes, such as binary, categorical or count data. This setting is less straightforward since there is no unambiguous counterpart to the linear mixed model. We first provide an overview of the various modeling frameworks for non-Gaussian longitudinal data, and subsequently focus on generalized linear mixed-effects models, on the one hand, of which the parameters can be estimated using full likelihood, and on generalized estimating equations, on the other hand, which is a nonlikelihood method and hence requires a modification to be valid under MAR. We briefly comment on the position of models that assume missingness not at random and argue they are most useful to perform sensitivity analysis. Our developments are underscored using data from two studies. While the case studies feature binary outcomes, the methodology applies equally well to other discrete-data settings, hence the qualifier "discrete" in the title.

Journal ArticleDOI
TL;DR: A new method is presented for stochastically imputing missing data, allowing incomplete daily profiles to be incorporated in the analysis of accelerometer data and revealing some interesting insights into children's activity patterns.
Abstract: We present a case study illustrating the challenges of analyzing accelerometer data taken from a sample of children participating in an intervention study designed to increase physical activity. An accelerometer is a small device worn on the hip that records the minute-by-minute activity levels of the child throughout the day for each day it is worn. The resulting data are irregular functions characterized by many peaks representing short bursts of intense activity. We model these data using the wavelet-based functional mixed model. This approach incorporates multiple fixed effect and random effect functions of arbitrary form, the estimates of which are adaptively regularized using wavelet shrinkage. The method yields posterior samples for all functional quantities of the model, which can be used to perform various types of Bayesian inference and prediction. In our case study, a high proportion of the daily activity profiles are incomplete, i.e. have some portion of the profile missing, so cannot be directly modeled using the previously described method. We present a new method for stochastically imputing the missing data that allows us to incorporate these incomplete profiles in our analysis. Our approach borrows strength from both the observed measurements within the incomplete profiles and from other profiles, from the same child as well as other children with similar covariate levels, while appropriately propagating the uncertainty of the imputation throughout all subsequent inference. We apply this method to our case study, revealing some interesting insights into children's activity patterns. We point out some strengths and limitations of using this approach to analyze accelerometer data.

Journal ArticleDOI
TL;DR: In this paper, a general methodology for producing a model-assisted empirical best predictor (EBP) of a finite population domain mean using data from a complex survey was introduced, which converges in probability to the customary design-consistent estimator as the domain and sample sizes increase.
Abstract: In this article we introduce a general methodology for producing a model-assisted empirical best predictor (EBP) of a finite population domain mean using data from a complex survey. Our method improves on the commonly used design-consistent survey estimator by using a suitable mixed model. Such a model combines information from related sources, such as census and administrative data. Unlike a purely model-based EBP, the proposed model-assisted EBP converges in probability to the customary design-consistent estimator as the domain and sample sizes increase. The convergence in probability is shown to hold with respect to the sampling design, irrespective of the assumed mixed model, a property commonly known as design consistency. This property ensures robustness of the proposed predictor against possible model failures. In addition, the convergence in probability is shown to be valid with respect to the assumed mixed model (model consistency). A new mean squared prediction error (MSPE) estimator is proposed...
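The design-consistency property described above can be caricatured with a composite estimator whose weight on the direct, design-based estimator tends to one as the domain sample size grows. The weighting rule and the constant `k` below are illustrative only, not the paper's actual EBP.

```python
def composite_predictor(direct_mean, synthetic_mean, n, k=10.0):
    """Toy model-assisted predictor: a convex combination of a design-based
    direct estimator and a model-based synthetic estimator. The weight gamma
    tends to 1 as the domain sample size n grows, so the predictor converges
    to the design-consistent direct estimator; k is an illustrative constant."""
    gamma = n / (n + k)
    return gamma * direct_mean + (1 - gamma) * synthetic_mean

direct, synthetic = 12.0, 10.0
small = composite_predictor(direct, synthetic, n=5)     # borrows strength from the model
large = composite_predictor(direct, synthetic, n=5000)  # essentially the direct estimator
```

For a small domain the prediction sits between the two sources; for a large domain it is indistinguishable from the direct estimator, so a failure of the model cannot dominate asymptotically.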

Journal ArticleDOI
TL;DR: In this article, the generalized information criterion (GIC) was extended to select linear mixed-effects models that are widely applied in analyzing longitudinal data, and the procedure for selecting fixed effects and random effects based on the extended GIC was provided.

Journal ArticleDOI
TL;DR: In this article, the authors proposed a robust estimator for very general mixed linear models that include covariates, which belongs to the class of S-estimators, from which they can derive asymptotic properties for inference.
Abstract: Mixed linear models are used to analyze data in many settings. These models have a multivariate normal formulation in most cases. The maximum likelihood estimator (MLE) or the residual MLE (REML) is usually chosen to estimate the parameters. However, the latter are based on the strong assumption of exact multivariate normality. Welsh and Richardson have shown that these estimators are not robust to small deviations from multivariate normality. This means that in practice a small proportion of data (even only one) can drive the value of the estimates on their own. Because the model is multivariate, we propose a high-breakdown robust estimator for very general mixed linear models that include, for example, covariates. This robust estimator belongs to the class of S-estimators, from which we can derive asymptotic properties for inference. We also use it as a diagnostic tool to detect outlying subjects. We discuss the advantages of this estimator compared with other robust estimators proposed previously and i...
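The non-robustness being addressed, that a single observation can drive an estimate on its own, is easy to demonstrate for a scale estimate. The sketch below contrasts the sample standard deviation with a MAD-based robust scale on data containing one gross outlier; this illustrates the motivation only, not the paper's S-estimator for mixed models, and the numbers are invented.

```python
import statistics

def mad_scale(xs):
    """Median absolute deviation, rescaled by 1.4826 (the usual factor that
    makes it consistent for the standard deviation under normality)."""
    med = statistics.median(xs)
    return 1.4826 * statistics.median(abs(x - med) for x in xs)

clean = [4.8, 5.1, 5.0, 4.9, 5.2, 5.0, 4.9, 5.1]
contaminated = clean + [50.0]              # one gross outlier

sd_clean = statistics.stdev(clean)         # ~0.13
sd_bad = statistics.stdev(contaminated)    # exploded by a single point
robust_bad = mad_scale(contaminated)       # barely moves
```

A high-breakdown estimator such as the S-estimator in the abstract behaves like the MAD here: its value changes only slightly under contamination, which is also what makes it usable as a diagnostic for outlying subjects.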

22 Feb 2006
TL;DR: Several possibilities for modeling non-standard covariate effects, such as nonlinear effects of continuous covariates, temporal effects, spatial effects, interaction effects or unobserved heterogeneity, are reviewed and embedded in the general framework of structured additive regression.
Abstract: Due to the increasing availability of spatial or spatio-temporal regression data, models that can incorporate the special structure of such data sets in an appropriate way are highly desirable in practice. A flexible modeling approach should not only be able to account for spatial and temporal correlations, but also model further covariate effects in a semi- or nonparametric fashion. In addition, regression models for different types of responses are available, and extensions require special attention in each of these cases. Within this thesis, numerous possibilities for modeling non-standard covariate effects, such as nonlinear effects of continuous covariates, temporal effects, spatial effects, interaction effects or unobserved heterogeneity, are reviewed and embedded in the general framework of structured additive regression. Beginning with exponential family regression, extensions to several types of multicategorical responses and to the analysis of continuous survival times are described. A new inferential procedure based on mixed model methodology is introduced, allowing for a unified treatment of the different regression problems. Estimation of the regression coefficients is based on penalized likelihood, whereas smoothing parameters are estimated using restricted maximum likelihood or marginal likelihood. In several applications and simulation studies, the new approach turns out to be a promising alternative to competing methodology, especially estimation based on Markov chain Monte Carlo simulation techniques.
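The penalized-likelihood core of this approach reduces, for a Gaussian response, to penalized least squares, with the smoothing parameter playing the role of a variance ratio in the mixed-model representation (which is what makes it estimable by REML). The sketch below shows that core with a crude truncated-line basis standing in for B-splines; the basis, knots, and fixed smoothing parameters are illustrative, and the REML step is omitted.

```python
import numpy as np

def penalized_fit(B, y, lam):
    """Penalized least squares: minimize ||y - B c||^2 + lam * ||D c||^2,
    where D takes second differences of adjacent basis coefficients.
    In the mixed-model representation, lam is the ratio of the error
    variance to the random-effect variance, which REML can estimate."""
    D = np.diff(np.eye(B.shape[1]), n=2, axis=0)   # second-difference penalty matrix
    coef = np.linalg.solve(B.T @ B + lam * (D.T @ D), B.T @ y)
    return B @ coef

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 50)
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.2, size=x.size)

# Crude spline-like basis: intercept, slope, and truncated lines at interior knots.
knots = np.linspace(0.1, 0.9, 8)
B = np.column_stack([np.ones_like(x), x] + [np.clip(x - k, 0.0, None) for k in knots])

smooth = penalized_fit(B, y, lam=1.0)    # heavy penalty: smooth curve
rough = penalized_fit(B, y, lam=1e-8)    # negligible penalty: near least squares
```

Large `lam` shrinks adjacent coefficients toward a linear trend and trades some fit for smoothness; a tiny `lam` essentially reproduces the unpenalized least-squares fit.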

Journal ArticleDOI
TL;DR: In this article, the authors explored the interpretation of the available individual data in the framework of longitudinal data analysis, making use of the theory of linear mixed models, a flexible model for loss reserving is built.
Abstract: Traditional claims-reserving techniques are based on so-called run-off triangles containing aggregate claim figures. Such a triangle provides a summary of an underlying data set with individual claim figures. This contribution explores the interpretation of the available individual data in the framework of longitudinal data analysis. Making use of the theory of linear mixed models, a flexible model for loss reserving is built. Whereas traditional claims-reserving techniques don’t lead directly to predictions for individual claims, the mixed model enables such predictions on a sound statistical basis with, for example, confidence regions. Both a likelihood-based as well as a Bayesian approach are considered. In the frequentist approach, expressions for the mean squared error of prediction of an individual claim reserve, origin year reserves, and the total reserve are derived. Using MCMC techniques, the Bayesian approach allows simulation from the complete predictive distribution of the reserves an...
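For readers unfamiliar with the run-off triangles mentioned above, the sketch below builds one and applies the classical chain-ladder development factors, the traditional aggregate technique the mixed-model approach is contrasted with. The claim figures are invented for illustration.

```python
# Cumulative paid claims by origin year (rows) and development year (columns);
# None marks future, not-yet-observed cells. Figures are invented.
triangle = [
    [100, 160, 185, 195],
    [110, 175, 205, None],
    [120, 190, None, None],
    [130, None, None, None],
]

def chain_ladder_factors(tri):
    """Volume-weighted development factors f_j = sum C_{i,j+1} / sum C_{i,j},
    taken over origin years where both cells are observed."""
    factors = []
    for j in range(len(tri) - 1):
        num = sum(row[j + 1] for row in tri if row[j + 1] is not None)
        den = sum(row[j] for row in tri if row[j + 1] is not None)
        factors.append(num / den)
    return factors

def complete_triangle(tri):
    """Fill the unobserved lower-right cells by applying the factors forward."""
    factors = chain_ladder_factors(tri)
    filled = [row[:] for row in tri]
    for row in filled:
        for j in range(len(row) - 1):
            if row[j + 1] is None:
                row[j + 1] = row[j] * factors[j]
    return filled
```

This produces only aggregate origin-year predictions; the point of the paper is that a linear mixed model fitted to the underlying individual claims additionally yields predictions, and prediction uncertainty, at the individual-claim level.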

Journal ArticleDOI
TL;DR: A method is proposed to take left-censored values into account when estimating the parameters of nonlinear mixed models, and its impact is demonstrated through a simulation study and an actual clinical trial of anti-HCV drugs.
Abstract: Mathematical models are widely used for studying the dynamics of infectious agents such as hepatitis C virus (HCV). Most often, model parameters are estimated using standard least-squares procedures for each individual. Hierarchical models have been proposed in such applications. However, another issue is the left-censoring (undetectable values) of plasma viral load due to the lack of sensitivity of the assays used for quantification. A method is proposed to take left-censored values into account when estimating the parameters of nonlinear mixed models, and its impact is demonstrated through a simulation study and an actual clinical trial of anti-HCV drugs. The method consists of a full likelihood approach distinguishing the contributions of observed and left-censored measurements, assuming a lognormal distribution of the outcome. Parameters of the analytical solution of the system of differential equations, taking left-censoring into account, are estimated using standard software. A simulation study with only 14% of measurements left-censored showed that model parameters were largely biased (from -55% to +133% according to the parameter), with the exception of the estimate of the initial outcome value, when left-censored viral load values were replaced by the value of the threshold. When left-censoring was taken into account, the relative bias on fixed effects was 2% or less. Parameters were then estimated using the 100 measurements of HCV RNA available (with 12% of left-censored values) during the first 4 weeks following treatment initiation in the 17 patients included in the trial. Differences between estimates according to the method used were clinically significant, particularly for the death rate of infected cells. With the crude approach the estimate was 0.13 day-1 (95% confidence interval [CI]: 0.11; 0.17), compared to 0.19 day-1 (CI: 0.14; 0.26) when taking left-censoring into account. The relative differences between estimates of individual treatment efficacy according to the method used varied from 0.001% to 37%. We proposed a method that gives unbiased estimates if the assumed distribution is correct (e.g. lognormal) and that is easy to use with standard software.
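The likelihood construction described above is simple to state: under a lognormal outcome, an observed log viral load contributes the normal density, while a left-censored one contributes the normal CDF at the log detection threshold. The toy sketch below applies it to a plain Gaussian mean (no mixed effects, no ODE model, known variance, grid search instead of a real optimizer) to show how it removes the bias of substituting the threshold for censored values.

```python
import math
import random
from statistics import NormalDist

def censored_loglik(mu, sigma, observed, n_censored, log_threshold):
    """Full-likelihood contributions: observed log-values contribute the
    normal density, left-censored ones the normal CDF at the log threshold."""
    dist = NormalDist(mu, sigma)
    ll = sum(math.log(dist.pdf(x)) for x in observed)
    ll += n_censored * math.log(dist.cdf(log_threshold))
    return ll

random.seed(1)
true_mu, sigma, log_threshold = 2.0, 1.0, 1.0
sample = [random.gauss(true_mu, sigma) for _ in range(2000)]
observed = [x for x in sample if x > log_threshold]       # detectable values
n_cens = len(sample) - len(observed)                      # below the threshold

grid = [i / 100 for i in range(100, 301)]                 # candidate values for mu
mu_hat = max(grid, key=lambda m: censored_loglik(m, sigma, observed, n_cens, log_threshold))

# Crude alternative: replace censored values by the threshold, then average.
mu_naive = (sum(observed) + n_cens * log_threshold) / len(sample)
```

The censoring-aware estimate lands near the true mean, while the threshold-substitution average is biased upward, mirroring the bias pattern reported in the simulation study.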

Journal ArticleDOI
TL;DR: The objective of this technical note is to outline an applied method for estimating the statistical power of a dairy nutrition experiment that employs a Latin square as the experimental design.

Journal ArticleDOI
TL;DR: In this article, a height increment model is developed and evaluated for individual trees of ponderosa pine throughout the species' range in the western United States, using long-term permanent research plots in even-aged, pure stands both planted and of natural origin.

Journal ArticleDOI
TL;DR: In this article, a non-linear hierarchical mixed model approach is used to describe height growth of Norway spruce from longitudinal measurements and the parameter variation in the model was divided into unknown random effects, fixed effects and covariate-dependent effects in order to model tree height growth.
Abstract: A non-linear hierarchical mixed model approach is used to describe height growth of Norway spruce from longitudinal measurements. The parameter variation in the model was divided into unknown random effects, fixed effects and covariate-dependent effects in order to model tree height growth. The values for fixed effect parameters and the variance–covariance matrix of random effects were estimated. Covariates could only explain up to 10% of parameter variability. Height curves were calibrated by means of BLUPs for the unknown random effects using prior height measurements and evaluated using a separate dataset. The resulting curves had a small error variance and plausible shapes.
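The calibration step described above, predicting a tree's random effects from a few prior height measurements via BLUP, is easiest to see in the simplest case: a random intercept around a known population curve, where the BLUP is the subject's mean residual shrunk toward zero. The sketch below shows that special case; the variance components and measurements are invented, and the paper's actual model is nonlinear with several random effects.

```python
def blup_intercept(residuals, var_u, var_e):
    """BLUP of a subject-specific random intercept u given prior residuals
    (observed height minus population mean curve). With n measurements,
    u_hat = var_u / (var_u + var_e / n) * mean(residuals): the subject's mean
    residual, shrunk toward zero more strongly when data are scarce or noisy."""
    n = len(residuals)
    mean_res = sum(residuals) / n
    return var_u / (var_u + var_e / n) * mean_res

# A tree measured three times, running about 1 m above the population curve.
prior_residuals = [0.9, 1.1, 1.0]
u_hat = blup_intercept(prior_residuals, var_u=0.5, var_e=0.25)
```

With more prior measurements the shrinkage factor approaches one and the calibrated curve tracks the individual tree; with a single noisy measurement the prediction stays closer to the population curve, which is exactly the borrowing of strength that gives calibrated curves their small error variance.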

Journal ArticleDOI
TL;DR: A small example describing the mixed model equations for genetic evaluations and two simulated examples to illustrate the Bayesian variance component estimation are presented.
Abstract: An equivalent model for multibreed variance covariance estimation is presented. It considers the additive case including or not the segregation variances. The model is based on splitting the additive genetic values in several independent parts depending on their genetic origin. For each part, it expresses the covariance between relatives as a partial numerator relationship matrix times the corresponding variance component. Estimation of fixed effects, random effects or variance components provided by the model are as simple as any model including several random factors. We present a small example describing the mixed model equations for genetic evaluations and two simulated examples to illustrate the Bayesian variance component estimation.
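The mixed model equations referred to above are Henderson's: for y = Xb + Zu + e with relationship matrix A and variance ratio lambda = sigma_e^2 / sigma_u^2, solving one linear system yields the BLUE of the fixed effects and the BLUPs of the breeding values jointly. The sketch below sets them up for a toy single-breed case with unrelated animals (so A is the identity); the records and variance ratio are invented, and the paper's multibreed extension with segregation variances is not shown.

```python
import numpy as np

# Toy data: five records, an overall mean as the only fixed effect, three animals.
y = np.array([4.0, 5.0, 3.5, 4.5, 5.5])
X = np.ones((5, 1))                       # fixed-effect design matrix (mean only)
Z = np.array([[1, 0, 0],                  # incidence of records on animals
              [1, 0, 0],
              [0, 1, 0],
              [0, 0, 1],
              [0, 0, 1]], dtype=float)
A = np.eye(3)          # numerator relationship matrix (animals unrelated here)
lam = 2.0              # variance ratio sigma_e^2 / sigma_u^2, assumed known

# Henderson's mixed model equations:
#   [ X'X   X'Z             ] [b]   [X'y]
#   [ Z'X   Z'Z + lam*A^-1  ] [u] = [Z'y]
lhs = np.block([[X.T @ X, X.T @ Z],
                [Z.T @ X, Z.T @ Z + lam * np.linalg.inv(A)]])
rhs = np.concatenate([X.T @ y, Z.T @ y])
sol = np.linalg.solve(lhs, rhs)
b_hat, u_hat = sol[0], sol[1:]            # BLUE of the mean, BLUPs of breeding values
```

Adding further random factors, as in the multibreed model of the abstract, only appends blocks of the same form to the left-hand side, which is why estimation there remains "as simple as any model including several random factors."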