
Showing papers on "Mixed model published in 1995"


Journal ArticleDOI
TL;DR: In this article, an algorithm is described to estimate variance components for a univariate animal model using sparse matrix techniques, where residuals and fitted values for random effects are used to derive additional right-hand sides for which the mixed model equations can be solved in turn to yield an average of the observed and expected second derivatives of the likelihood function.

347 citations


Journal ArticleDOI
TL;DR: In this paper, the authors present simple hierarchical centring reparametrisations that often give improved convergence for a broad class of normal linear mixed models, including the Laird-Ware model, and a general structure for hierarchically nested linear models.
Abstract: SUMMARY The generality and easy programmability of modern sampling-based methods for maximisation of likelihoods and summarisation of posterior distributions have led to a tremendous increase in the complexity and dimensionality of the statistical models used in practice. However, these methods can often be extremely slow to converge, due to high correlations between, or weak identifiability of, certain model parameters. We present simple hierarchical centring reparametrisations that often give improved convergence for a broad class of normal linear mixed models. In particular, we study the two-stage hierarchical normal linear model, the Laird-Ware model for longitudinal data, and a general structure for hierarchically nested linear models. Using analytical arguments, simulation studies, and an example involving clinical markers of acquired immune deficiency syndrome (AIDS), we indicate when reparametrisation is likely to provide substantial gains in efficiency.
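The centred and uncentred parametrisations discussed above describe the same marginal model; the paper's point is that samplers applied to the centred form often mix better. A minimal sketch (with hypothetical parameter values) verifying by simulation that the two parametrisations are equivalent:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, tau2, sigma2 = 5.0, 4.0, 1.0     # hypothetical values, for illustration only
n_groups = 2000

# Uncentred: y_i = mu + b_i + e_i with b_i ~ N(0, tau2)
b = rng.normal(0.0, np.sqrt(tau2), n_groups)
y_unc = mu + b + rng.normal(0.0, np.sqrt(sigma2), n_groups)

# Centred: eta_i ~ N(mu, tau2), then y_i ~ N(eta_i, sigma2)
eta = rng.normal(mu, np.sqrt(tau2), n_groups)
y_cen = rng.normal(eta, np.sqrt(sigma2))

# Both parametrisations imply the same marginal law: y ~ N(mu, tau2 + sigma2)
```

The reparametrisation changes only the conditional structure a Gibbs sampler must traverse, not the marginal model being fitted.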

318 citations


Journal ArticleDOI
TL;DR: In this paper, the authors describe a model for estimation of effect size when there is selection based on one-tailed p-values, i.e. when the process of publication favors studies with small p-values and hence large effect estimates.
Abstract: When the process of publication favors studies with small p-values, and hence large effect estimates, combined estimates from many studies may be biased. This paper describes a model for estimation of effect size when there is selection based on one-tailed p-values. The model employs the method of maximum likelihood in the context of a mixed (fixed and random) effects general linear model for effect sizes. It offers a test for the presence of publication bias, and corrected estimates of the parameters of the linear model for effect magnitude. The model is illustrated using a well-known data set on the benefits of psychotherapy.

259 citations


Journal ArticleDOI
TL;DR: Four categories of model, simple interpolation, thin plate splines, multiple linear regression and mixed spline-regression, were tested for their ability to predict the spatial distribution of temperature on the British mainland.
Abstract: 1. The prediction and mapping of climate in areas between climate stations is of increasing importance in ecology. 2. Four categories of model, simple interpolation, thin plate splines, multiple linear regression and mixed spline-regression, were tested for their ability to predict the spatial distribution of temperature on the British mainland. The models were tested by external cross-verification. 3. The British distribution of mean daily temperature was predicted with the greatest accuracy by using a mixed model: a thin plate spline fitted to the surface of the country, after correction of the data by a selection from 16 independent topographical variables (such as altitude, distance from the sea, slope and topographic roughness), chosen by multiple regression from a digital terrain model (DTM) of the country. 4. The next most accurate method was a pure multiple regression model using the DTM. Both regression and thin plate spline models based only on a few variables (latitude, longitude and altitude) were comparatively unsatisfactory, but some rather simple methods of surface interpolation (such as bilinear interpolation after correction to sea level) gave moderately satisfactory results. Differences between the methods seemed to depend largely on their ability to model the effect of the sea on land temperatures. 5. Prediction of temperature by the best methods was greater than 95% accurate in all months of the year, as shown by the correlation between the predicted and actual values. The predicted temperatures were calculated at real altitudes, not subject to sea-level correction. 6. A minimum of just over 30 temperature recording stations would generate a satisfactory surface, provided the stations were well spaced. 7. Maps of mean daily temperature, using the best overall methods, are provided; further important variables, such as continentality and length of growing season, were also mapped.
Many of these are believed to be the first detailed representations at real altitude. 8. The interpolated monthly temperature surfaces are available on disk
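The mixed spline-regression strategy described above can be sketched in a few lines: regress temperature on topographic covariates, then fit a thin plate spline to the spatial residuals. The following sketch uses synthetic data and a single covariate (altitude), so the variable names and the assumed lapse rate are illustrative only, not the authors' actual model:

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

rng = np.random.default_rng(1)
n = 200
xy = rng.uniform(0.0, 100.0, size=(n, 2))   # station coordinates (km), synthetic
alt = rng.uniform(0.0, 1000.0, size=n)      # altitude (m), one topographic covariate
# synthetic "temperature": a lapse-rate trend plus a smooth spatial field
temp = 15.0 - 0.0065 * alt + 2.0 * np.sin(xy[:, 0] / 30.0) + rng.normal(0.0, 0.1, n)

# Step 1: multiple regression on topographic covariates (here altitude only)
X = np.column_stack([np.ones(n), alt])
beta, *_ = np.linalg.lstsq(X, temp, rcond=None)
resid = temp - X @ beta

# Step 2: thin plate spline fitted to the spatial pattern of the residuals
tps = RBFInterpolator(xy, resid, kernel="thin_plate_spline")

def predict(xy_new, alt_new):
    Xn = np.column_stack([np.ones(len(alt_new)), alt_new])
    return Xn @ beta + tps(xy_new)
```

At the observed stations the combined predictor reproduces the data, since the spline interpolates the regression residuals exactly; accuracy between stations is what cross-verification would measure.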

95 citations


Journal ArticleDOI
TL;DR: In this paper, a unified derivation of BLUP, ML and REML estimation procedures for normally distributed response variables with possibly correlated random components occurring in the mixed model for the mean is presented.
Abstract: This paper presents a unified derivation of BLUP, ML and REML estimation procedures for normally distributed response variables with possibly correlated random components occurring in the mixed model for the mean. The theory is extended to generalised linear mixed models, where the response variable is not necessarily normally distributed but the model may be fitted using a penalised quasi-likelihood approach which mirrors the development in normal theory models. The method is applied to binomially distributed response variables with logit link to a mixed model containing a random component distributed as an AR(1) process.
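For the normal linear mixed model y = Xb + Zu + e with u ~ N(0, σ_u²I) and e ~ N(0, σ_e²I), the BLUP solutions come from Henderson's mixed model equations. A minimal numpy sketch of that normal-theory core (generic, not the paper's penalised quasi-likelihood extension):

```python
import numpy as np

def blup(X, Z, y, sigma_u2, sigma_e2):
    """Solve Henderson's mixed model equations for y = X b + Z u + e,
    with u ~ N(0, sigma_u2 * I) and e ~ N(0, sigma_e2 * I):
        [ X'X        X'Z       ] [b]   [X'y]
        [ Z'X   Z'Z + lam * I  ] [u] = [Z'y],   lam = sigma_e2 / sigma_u2
    """
    lam = sigma_e2 / sigma_u2
    p, q = X.shape[1], Z.shape[1]
    C = np.block([[X.T @ X, X.T @ Z],
                  [Z.T @ X, Z.T @ Z + lam * np.eye(q)]])
    rhs = np.concatenate([X.T @ y, Z.T @ y])
    sol = np.linalg.solve(C, rhs)
    return sol[:p], sol[p:]
```

The same (b, u) can be obtained from generalised least squares with V = σ_u²ZZ' + σ_e²I, which makes the solution easy to check.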

74 citations


Journal ArticleDOI
TL;DR: A method for predicting substitution rates at nucleotide sites by using homologous DNA sequences is presented, which is unbiased and "best" in the sense that it minimizes the mean squared error and maximizes the correlation between the predictor and the true value.
Abstract: Nucleotides in a DNA sequence may be changing at different rates, because they are located in different structural and functional regions of the gene, and are thus subject to different mutational pressures or selective restrictions. Knowledge of substitution rates at specific sites is important for understanding the forces and mechanisms that have shaped the evolution of the DNA sequences. The gamma distribution has previously been proposed to model such rate variation among nucleotide sites. Based on mixed model methodology we present in this paper a method for predicting substitution rates at nucleotide sites by using homologous DNA sequences. The predictor is unbiased and "best" in the sense that it minimizes the mean squared error and maximizes the correlation between the predictor and the true value. It is also quite robust to errors in estimates of parameters in the model. A numerical example is given, with guidelines for the practical use of the approach. The most influential factor affecting the accuracy of prediction is the number of sequences; to get a correlation of over .7 between the predictor and the true value, about six to seven sequences are needed, depending on the overall similarity of the sequences.
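A stripped-down illustration of the underlying empirical-Bayes idea: if the number of changes k at a site is Poisson with mean rT (T a known total evolutionary time) and rates vary across sites as r ~ Gamma(shape a, rate b), the best predictor of the site's rate is the posterior mean. This is a simplified stand-in for the paper's mixed model predictor, with hypothetical parameter values:

```python
def posterior_mean_rate(k, T, a, b):
    """E[r | k] when k ~ Poisson(r * T) and r ~ Gamma(shape=a, rate=b).
    By gamma-Poisson conjugacy the posterior is Gamma(a + k, b + T)."""
    return (a + k) / (b + T)
```

The predictor shrinks the per-site estimate k/T towards the prior mean a/b, with less shrinkage as more sequence information accumulates; that shrinkage is why the number of sequences drives prediction accuracy.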

49 citations


Journal ArticleDOI
TL;DR: The kanban-based operation of a mixed model manufacturing line is studied; experimental design features are discussed with respect to simulation-related issues, performance measures, statistical analysis and experimental design clusters, and statistical findings are summarized in tabular format.
Abstract: The kanban-based operation of a mixed model manufacturing line is studied. Features of the hypothetical manufacturing line modelled are presented in terms of general structure, major components and operational characteristics. The simulation model developed is described and the parameters of the base model are given. Experimental design features are discussed with respect to simulation-related issues, performance measures, statistical analysis and experimental design clusters. Statistical findings are summarized in tabular format. Non-intuitive behaviour observed in each experiment set is interpreted.

46 citations


Journal ArticleDOI
TL;DR: In this article, nonparametric effects and hypotheses for general models are defined and hypotheses are formulated in a design where treatment, centers (strata), and interactions are assumed to be fixed factors, and their properties are analyzed in corresponding linear models and in models with Lehmann alternatives.
Abstract: Motivated by some problems arising from multiclinic trials, we consider stratified two-sample designs. Nonparametric effects are defined and nonparametric hypotheses are formulated in a design where treatment, centers (strata), and interactions are assumed to be fixed factors. The interpretation of the nonparametric effects and hypotheses is analyzed in two classes of semiparametric models: the linear models and models with Lehmann alternatives. The case where centers and interactions are assumed to be random factors, the so-called mixed model, is also considered. Nonparametric effects and hypotheses are defined for general models, and their properties are analyzed in corresponding linear models and in models with Lehmann alternatives. The nonparametric effects are estimated by linear rank statistics where the ranks over all centers are used. The mixed model for repeated (baseline and endpoint) observations is briefly considered, and rank procedures are also proposed for this model. All procedure...

45 citations


Journal ArticleDOI
TL;DR: An EM algorithm is developed to compute the maximum likelihood estimates of regression coefficients of the fixed effects and random effects, and variance components, and the likelihood ratio test is used for the preliminary testing of batch-to-batch variation.
Abstract: This paper proposes a normal mixed effects model for stability analysis. An EM algorithm is developed to compute the maximum likelihood estimates of regression coefficients of the fixed effects and random effects, and variance components. The likelihood ratio test is used for the preliminary testing of batch-to-batch variation. An example from a marketing stability study is given to illustrate the proposed procedure.
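As a hedged sketch of how such an EM iteration looks, here is a generic ML version for a balanced one-way normal mixed model (batch as the random effect); this is not the paper's exact REML equations, but the E- and M-steps have the same flavour:

```python
import numpy as np

# EM for ML in the balanced one-way mixed model
#   y[i, j] = mu + b[i] + e[i, j],  b[i] ~ N(0, tau2),  e[i, j] ~ N(0, sigma2)

def em_one_way(y, n_iter=200):
    a, n = y.shape
    ybar_i = y.mean(axis=1)
    mu, tau2, sigma2 = y.mean(), 1.0, 1.0
    for _ in range(n_iter):
        # E-step: posterior of each b[i] given the data and current parameters
        m = (n * tau2 / (sigma2 + n * tau2)) * (ybar_i - mu)  # posterior means
        v = tau2 * sigma2 / (sigma2 + n * tau2)               # posterior variance
        # M-step: maximise the expected complete-data log-likelihood
        mu = (y - m[:, None]).mean()
        tau2 = np.mean(m ** 2 + v)
        sigma2 = np.mean((y - mu - m[:, None]) ** 2 + v)
    return mu, tau2, sigma2

def loglik(y, mu, tau2, sigma2):
    """Exact marginal log-likelihood, for checking that EM never decreases it."""
    a, n = y.shape
    ybar_i = y.mean(axis=1)
    ssw = ((y - ybar_i[:, None]) ** 2).sum()
    lam = sigma2 + n * tau2
    return -0.5 * (a * n * np.log(2 * np.pi) + a * (n - 1) * np.log(sigma2)
                   + a * np.log(lam) + ssw / sigma2
                   + n * ((ybar_i - mu) ** 2).sum() / lam)
```

Each iteration provably does not decrease the marginal likelihood, which is a useful correctness check for any EM implementation of this kind.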

26 citations


Journal ArticleDOI
01 Dec 1995-Genetics
TL;DR: The finite polygenic mixed model, extended for linkage analysis, leads to a likelihood that can be calculated using efficient algorithms developed for oligogenic models; its maximum likelihood estimates appeared closer to the simulated values in these pedigrees.
Abstract: This paper presents an extension of the finite polygenic mixed model of Fernando et al. (1994) to linkage analysis. The finite polygenic mixed model, extended for linkage analysis, leads to a likelihood that can be calculated using efficient algorithms developed for oligogenic models. For comparison, linkage analysis of 5 simulated 4021-member pedigrees was performed using the usual mixed model of inheritance, approximated as in Hasstedt (1982), and the finite polygenic mixed model extended for linkage analysis presented here. Maximum likelihood estimates under the finite polygenic mixed model appeared closer to the simulated values in these pedigrees.

24 citations


Journal Article
TL;DR: This work has developed a predictor for the unknown part of the stem under the mixed model for repeated measurements that provides an eminently satisfactory solution to this important marking for bucking problem under incomplete information.
Abstract: The problem of predicting future observations on a statistical unit given past measurements on the same and other similar units is frequently encountered in practical applications. When computer-based marking for bucking routines is used in a forest processor, it is usually not feasible to run the whole tree stem through the measuring device before the first cutting decisions have to be made. However, for optimal conversion of single stems into smaller logs, the whole stem should be measured in advance. To this end we have developed a predictor for the unknown part of the stem under the mixed model for repeated measurements. Our prediction-based approach provides an eminently satisfactory solution to this important marking for bucking problem under incomplete information.

Journal ArticleDOI
TL;DR: Hasse diagrams summarize the structure of mixed models and can be used by a statistical consultant to help design a complicated experiment or to help clarify the data structure to be analyzed.
Abstract: Hasse diagrams summarize the structure of mixed models and can be used by a statistical consultant to help design a complicated experiment or to help clarify the structure of data to be analyzed. They are also useful in the classroom as an aid for obtaining expected mean squares or deciding which denominator should be used in an F statistic.

Journal ArticleDOI
TL;DR: In this article, the authors summarize different estimation methods, examine some of their properties, and give an example showing that most of the commonly used estimators are inconsistent.

Journal ArticleDOI
TL;DR: In this article, a hierarchical mixed effects model is proposed to model the correlations between yield responses at the same site, and a number of measures of the quality of this strategy are proposed.
Abstract: Utilization of regional data for optimizing fertilizer strategy is considered. It is argued that a hierarchical mixed effects model is a reasonable, parsimonious way to model the data. Such models take into account the correlations between yield responses at the same site. Also, they conveniently let one model the differences between years, or between sites, as a function of year and site characteristics. We specifically develop the case of a linear (in the parameters) response model, but hierarchical models could also be based on nonlinear response functions. Given the estimated parameters, the calculation of an optimal fertilizer strategy (which depends on the site and year characteristics) is usually straightforward. A number of measures of the quality of this strategy are proposed. Among these are the risk, which measures yield loss caused by using estimated rather than true parameter values. Another measure of interest is the potential gain from having a model that explains all the between year, or all the between site variability. The effect of potassium on sown prairies in France is treated as an example.
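The "straightforward" optimal-rate calculation mentioned above can be made concrete for a quadratic response y = b0 + b1·x + b2·x² (b2 < 0): equate the marginal value product to the input cost, price·(b1 + 2·b2·x) = cost, and solve for x. All numbers below are hypothetical, for illustration only:

```python
def optimal_rate(b1, b2, price, cost):
    """Profit-maximizing input rate for the quadratic response
    y = b0 + b1*x + b2*x**2 (with b2 < 0), given output price and input cost.
    Solves price * (b1 + 2*b2*x) = cost for x."""
    return (cost / price - b1) / (2.0 * b2)
```

For example, with b1 = 0.5, b2 = -0.005, price = 2.0 and cost = 0.3 the optimum is x* = 35; in the hierarchical setting, b1 and b2 would themselves depend on site and year characteristics.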

Journal ArticleDOI
TL;DR: In this paper, a mixed linear model with two variance components is considered and a class of estimators improving on ANOVA estimators for the variance components and the ratio of variances are constructed on the basis of the invariant statistics.

Journal ArticleDOI
TL;DR: In this paper, the problem of assessing deviations from the assumptions of mixed-model analysis of variance is considered and the use of easily computed residual analysis techniques, similar to methods used in fixed effects models, is illustrated and shown to give informative results.

Book ChapterDOI
01 Jan 1995
TL;DR: This paper proposes the use of the parametric bootstrap as a practical tool for addressing problems associated with inference from GLMMs, and shows the power of the bootstrap approach in two small area estimation examples.
Abstract: Generalized linear mixed models (GLMMs) provide a unified framework for analyzing relationships between binary, count or continuous response variables and predictors with either fixed or random effects. Recent advances in approximate fitting procedures and Markov chain Monte Carlo techniques, as well as the widespread availability of high-speed computers, suggest that GLMM software will soon be a standard feature of many statistical packages. Although the difficulty of fitting GLMMs has to a large extent been overcome, there are still many unresolved problems, particularly with regard to inference. For example, analytical formulas for standard errors and confidence intervals for linear combinations of fixed and random effects are often unreliable or not available, even in the classical case with normal errors. In this paper we propose the use of the parametric bootstrap as a practical tool for addressing problems associated with inference from GLMMs. The power of the bootstrap approach is illustrated in two small area estimation examples. In the first example, it is shown that the bootstrap reproduces complicated analytical formulas for the standard errors of estimates of small area means based on a normal theory mixed linear model. In the second example, involving a logistic-normal model, the bootstrap produces sensible estimates for standard errors, even though no analytical formulas are available.
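The parametric bootstrap recipe is simple: fit the model, simulate new data from the fitted model, re-fit, and use the spread of the re-estimates as a standard error. A toy sketch for the sample mean of normal data, where the bootstrap answer can be checked against the textbook formula s/√n (illustrative only; the paper applies the same recipe to small area mixed models, where no such formula may exist):

```python
import numpy as np

rng = np.random.default_rng(3)
y = rng.normal(10.0, 2.0, size=50)   # observed data (synthetic)

# "Fit" the model: here just a normal mean/sd, standing in for a fitted GLMM
mu_hat, sd_hat = y.mean(), y.std(ddof=1)

# Parametric bootstrap: simulate from the fitted model and re-estimate
B = 2000
boot = np.array([rng.normal(mu_hat, sd_hat, size=y.size).mean() for _ in range(B)])
se_boot = boot.std(ddof=1)

# Check against the analytical standard error s / sqrt(n)
se_analytic = sd_hat / np.sqrt(y.size)
```

In a real GLMM application the re-estimation step is a full model fit, so the cost is B refits; the logic is otherwise unchanged.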

Journal ArticleDOI
TL;DR: In this paper, several new empirical models based on water chemistry variables, on map parameters of the lake and its catchment, and combinations of such variables are presented to predict annual mean values of total phosphorus (TP) in small glacial lakes.
Abstract: A lake is a product of processes in its watershed, and these relationships should be empirically quantifiable. Yet few studies have made that attempt. This study quantifies and ranks variables of significance to predict annual mean values of total phosphorus (TP) in small glacial lakes. Several new empirical models based on water chemistry variables, on ‘map parameters’ of the lake and its catchment, and combinations of such variables are presented. Each variable provides only a limited (statistical) explanation of the variation in annual mean values of TP among lakes. The models are markedly improved by accounting for the distribution of the characteristics (e.g., the mires) in the watershed. The most important map parameters were the proportion of the watershed lying close to the lake covered by rocks and open land (as determined with the drainage area zonation method), relief of the drainage area, lake area and mean depth. These empirical models can be used to predict annual mean TP but only for lakes of the same type. The model based on ‘map parameters’ (r² = 0.56) appears stable. The effects of other factors/variables not accounted for in the model (like redox-induced internal loading and anthropogenic sources) on the variation in annual mean TP may then be estimated quantitatively by residual analysis. A new mixed model (which combines a dynamic mass-balance approach with empirical knowledge) was also developed. The basic objective was to put the empirical results into a dynamic framework, thereby increasing predictive accuracy. Sensitivity tests of the mixed model indicate that it works as intended. However, comparisons against independent data for annual mean TP show that the predictive power of the mixed model is low, likely because crucial model variables, like sedimentation rate, runoff rate, diffusion rate and precipitation factor, cannot be accurately predicted. These model variables vary among lakes, but this mixed model, like most dynamic models, assumes that they are constant.

Journal ArticleDOI
TL;DR: In this article, Box's correction of the ANOVA degrees of freedom is suggested for the case of heterogeneous interaction and error variances, and different estimators of the correction factors are compared via Monte Carlo simulation.
Abstract: Data from yield trials conducted in different environments are frequently analysed by ordinary analysis of variance. One of the preconditions for such analyses is that error variances be homogeneous across treatments and replicates. If a mixed model with fixed genotypes and random environments is assumed, genotype-environmental interaction is a random effect. For the ordinary ANOVA to be valid in this case, it is then also required that interaction variances be homogeneous for different genotypes. This paper suggests Box's correction of the ANOVA degrees of freedom for the case of heterogeneous interaction and error variances. Different estimators of the correction factors are compared via Monte Carlo simulation.
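Box's correction multiplies the ANOVA degrees of freedom by a factor ε computed from the covariance matrix of the measurements: ε = 1 under homogeneity (sphericity) and drops towards its lower bound 1/(k−1) as heterogeneity grows. A sketch of the standard ε computation, assuming a known k×k covariance matrix S (the estimators compared in the paper replace S by different estimates):

```python
import numpy as np

def contrast_basis(k):
    """Orthonormal basis of the contrast space (orthogonal to the ones vector)."""
    A = np.eye(k) - np.ones((k, k)) / k      # centering matrix, rank k - 1
    w, V = np.linalg.eigh(A)
    return V[:, w > 0.5].T                   # eigenvectors with eigenvalue 1

def box_epsilon(S):
    """Box's epsilon for a k x k covariance matrix S.
    epsilon = 1 under sphericity; the lower bound is 1/(k-1)."""
    k = S.shape[0]
    C = contrast_basis(k)
    lam = np.linalg.eigvalsh(C @ S @ C.T)    # eigenvalues on the contrast space
    return lam.sum() ** 2 / ((k - 1) * (lam ** 2).sum())
```

A compound-symmetric S (equal variances, equal covariances) gives ε = 1, while strongly unequal variances push ε towards 1/(k−1); the corrected F test uses ε·df in place of df.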

Book ChapterDOI
01 Jan 1995

Journal ArticleDOI
TL;DR: The Mantel-Haenszel mean score statistic, which can be used for continuous or ordered categorical response variables, is shown to be a useful nonparametric alternative to standard linear model methods for testing the significance of the average treatment difference.
Abstract: SUMMARY This paper studies randomization model methods for analyzing data from a multicenter study comparing the effectiveness of two treatments. The Mantel-Haenszel mean score statistic, which can be used for continuous or ordered categorical response variables, is shown to be a useful nonparametric alternative to standard linear model methods for testing the significance of the average treatment difference. In an extensive simulation study, the mean score test performs nearly as well as the optimal linear model methods when the normal-theory assumptions are satisfied. A related estimator of the average treatment difference is also studied. This estimator, which is a weighted average of the center-specific mean differences, is analogous to the commonly used Mantel-Haenszel estimator of the average odds ratio in stratified 2 x 2 contingency tables. The proposed estimator is equivalent to the fixed-effects analysis of variance estimator from the main effects model, and valid estimation of its variance is feasible under very general assumptions. A frequently occurring problem in medical research is evaluation of the effectiveness of a new treatment in patients with a specified disease or condition. One commonly used design involves selection of eligible patients, randomization to one of two groups (new therapy, standard treatment), measurement of an appropriate outcome variable following treatment, and statistical comparison of the distribution of the outcome in the two groups. Since the number of eligible patients at any one center may be relatively small, such clinical trials often involve the participation of multiple investigators. In this case, patients are randomized within each center. The objectives of the data analysis are to test if there is a statistically significant difference between the two treatments and to estimate the magnitude of the average treatment difference. 
For both objectives, it is important to control for the fact that the data are from multiple centers. If the outcome variable is normally distributed, standard linear model methods are often used to test and estimate the treatment difference. One issue that arises in such an analysis, however, concerns whether the center effect should be treated as fixed or random. In the fixed-effects model, both treatment group and center are considered to be fixed effects, and the assumption is made that subjects are randomly selected from the center-specific patient populations. This model permits statistical inferences to the patient populations associated with the participating centers. The mixed-model approach treats the treatment group as a fixed effect and the center as a random effect. Although statistical inferences to a larger population of centers are permitted, it makes the strong assumption that centers are selected at random. In practice, very few, if any, study protocols incorporate either random selection of centers or random selection of patients within centers. Thus, neither the mixed model nor the fixed-effects linear model can be strictly justified. In addition, although the fixed-effects two-way analysis of variance (ANOVA) model with effects for center and treatment is commonly used, there is no consensus among statisticians as to whether the model should include the center-by-treatment interaction. This is partially due to the fact that patient enrollment often varies considerably among centers. Whereas the estimator of the average treatment difference from the main effects model
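The weighted estimator described above can be sketched directly: each center contributes its mean treatment difference with the Mantel-Haenszel-style weight n_i1·n_i0/(n_i1 + n_i0). A minimal Python version (illustrative; variance estimation is omitted):

```python
import numpy as np

def mh_mean_difference(centers):
    """Weighted average of center-specific mean differences with
    Mantel-Haenszel-style weights w_i = n_i1 * n_i0 / (n_i1 + n_i0).
    centers: list of (y_treatment, y_control) arrays, one pair per center."""
    num = den = 0.0
    for y1, y0 in centers:
        n1, n0 = len(y1), len(y0)
        w = n1 * n0 / (n1 + n0)
        num += w * (np.mean(y1) - np.mean(y0))
        den += w
    return num / den
```

With a single center this reduces to the plain difference of means; with equal-sized centers it is the simple average of the center-specific differences.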

DOI
01 Jan 1995
TL;DR: In this article, dynamic generalized linear mixed models are proposed as a regression tool for non-normal longitudinal data; the framework is an interesting combination of dynamic models (also known as state space models) and mixed models (also known as random effect models).
Abstract: Dynamic generalized linear mixed models are proposed as a regression tool for nonnormal longitudinal data. This framework is an interesting combination of dynamic models, also known as state space models, and mixed models, also known as random effect models. The main feature is that both time- and unit-specific parameters are allowed, which is especially attractive if a considerable number of units is observed over a longer period. Statistical inference is done by means of Markov chain Monte Carlo techniques in a full Bayesian setting. The algorithm is based on iterative updating using full conditionals. Due to the hierarchical structure of the model and the extensive use of Metropolis-Hastings steps for updating, this algorithm mainly evaluates (log-)likelihoods at multivariate normally distributed proposals. It is derivative-free and covers a wide range of different models, including dynamic and mixed models, the latter with slight modifications. The methodology is illustrated through an analysis of artificial binary data and multicategorical business test data.
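The derivative-free Metropolis-Hastings machinery described above — propose from a normal distribution, evaluate log-likelihoods, accept or reject — can be illustrated on a deliberately tiny target: the posterior of a single normal mean with flat prior and known unit variance, where the exact answer is N(ȳ, 1/n). All settings here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)
y = rng.normal(1.0, 1.0, size=100)   # synthetic data, known sigma = 1

# Flat prior on mu -> posterior is exactly N(mean(y), 1/n); MH should recover it
def logpost(mu):
    return -0.5 * np.sum((y - mu) ** 2)

chain = np.empty(20000)
mu, lp = 0.0, logpost(0.0)
for t in range(chain.size):
    prop = mu + rng.normal(0.0, 0.3)          # random-walk normal proposal
    lp_prop = logpost(prop)
    if np.log(rng.uniform()) < lp_prop - lp:  # Metropolis accept/reject
        mu, lp = prop, lp_prop
    chain[t] = mu

posterior_mean = chain[5000:].mean()          # discard burn-in
```

In the paper's models the scalar mu becomes a block of time- or unit-specific parameters and the proposal becomes multivariate normal, but the accept/reject logic is the same.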

Journal ArticleDOI
TL;DR: Using the same preliminary transformation for both testing and estimation gives us an integrated set of procedures for the full analysis of some widely used mixed linear models.

Journal ArticleDOI
TL;DR: A Monte Carlo simulation revealed that the low empirical rank correlation among S_i^2 and W_i is most likely due to sampling errors, and it was concluded that the observed low rank correlation does not invalidate the two-way model.
Abstract: Stability analysis of multilocation trials is often based on a mixed two-way model. Two stability measures in frequent use are the environmental variance (S_i^2) and the ecovalence (W_i). Under the two-way model the rank orders of the expected values of these two statistics are identical for a given set of genotypes. By contrast, empirical rank correlations among these measures are consistently low. This suggests that the two-way mixed model may not be appropriate for describing real data. To check this hypothesis, a Monte Carlo simulation was conducted. It revealed that the low empirical rank correlation among S_i^2 and W_i is most likely due to sampling errors. It is concluded that the observed low rank correlation does not invalidate the two-way model. The paper also discusses tests for homogeneity of S_i^2 as well as implications of the two-way model for the classification of stability statistics.
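The two stability measures can be computed directly from a genotype × environment table of means: S_i^2 is the variance of genotype i across environments, and the ecovalence W_i sums its squared interaction terms. A short sketch:

```python
import numpy as np

def stability_measures(Y):
    """Y: genotype x environment matrix of trait means.
    Returns the environmental variances S2_i and the ecovalences W_i."""
    S2 = Y.var(axis=1, ddof=1)
    # interaction terms: Y_ij - row mean_i - column mean_j + grand mean
    inter = Y - Y.mean(axis=1, keepdims=True) - Y.mean(axis=0) + Y.mean()
    W = (inter ** 2).sum(axis=1)
    return S2, W
```

Ranking genotypes by each measure and correlating the two rankings reproduces the kind of empirical rank correlation the simulation study examines.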

Journal ArticleDOI
TL;DR: In this paper, necessary and sufficient conditions are derived under which the minimum norm invariant quadratic unbiased estimators have uniformly minimum variance among all unbiased invariants for data with arbitrary kurtosis.

Book ChapterDOI
01 Jan 1995
TL;DR: In this paper, the forms and some properties of the information matrix for fixed treatment parameters and for strata variances, in case of generally balanced block designs, are shown, and a short discussion on optimality criteria is also presented.
Abstract: Information matrices are arguments of most of optimality criteria defined under fixed linear models and also for fixed effects in mixed linear models. However, in the context of mixed models interest often lies on variances of random effects as well as on fixed effects. In the paper the forms and some properties of the information matrix for fixed treatment parameters and for strata variances, in case of generally balanced block designs, are shown. A short discussion on optimality criteria is also presented.

Journal ArticleDOI
TL;DR: In this paper, it is shown that the UMVU estimator of the treatment mean vector is inadmissible with respect to quadratic loss in balanced mixed linear models, and shrinkage estimators are obtained which improve uniformly over the UMVU estimator.

Journal ArticleDOI
TL;DR: In this paper, a necessary and sufficient condition for the Satterthwaite approximation to be exact is presented for the case of a general balanced mixed model, and a test is subsequently developed for detecting any significant departure from this condition using the data under consideration.
Abstract: Satterthwaite's approximation of the distribution of a nonnegative linear combination of independent mean squares is addressed in this article. A necessary and sufficient condition for the approximation to be exact is presented for the case of a general balanced mixed model. A test is subsequently developed for detecting any significant departure from this condition using the data under consideration. An example is given to illustrate the proposed methodology.
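Satterthwaite's approximation matches a linear combination L = Σ c_i·MS_i of independent mean squares to a scaled chi-square with effective degrees of freedom df = L² / Σ (c_i·MS_i)²/df_i. A direct implementation of that formula:

```python
def satterthwaite_df(coefs, mean_squares, dfs):
    """Satterthwaite effective degrees of freedom for L = sum(c_i * MS_i),
    where MS_i are independent mean squares with df_i degrees of freedom."""
    L = sum(c * ms for c, ms in zip(coefs, mean_squares))
    denom = sum((c * ms) ** 2 / d for c, ms, d in zip(coefs, mean_squares, dfs))
    return L ** 2 / denom
```

For a single mean square the formula returns that mean square's own degrees of freedom, the one case where the approximation is trivially exact; the paper characterises exactly when it is exact for genuine combinations.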

Journal ArticleDOI
TL;DR: In this article, a Monte Carlo-based method is proposed to approximate the increment of the logarithm of the determinant of this coefficient matrix that corresponds to increments of the ratio between residual and additive genetic variance.

Journal ArticleDOI
TL;DR: In this article, an expectation-maximization algorithm is used for the restricted maximum likelihood estimation of variance components in a Sire and Dam Model, which leads to the same estimates of additive genetic and environmental variances as those under the individual animal model.
Abstract: Assuming a specific type of data in the field of animal breeding, the iteration equations based on the expectation-maximization algorithm are derived for the restricted maximum likelihood estimation of variance components in a Sire and Dam Model. The application of the iteration equations to the data leads to the same estimates of additive genetic and environmental variances as those under the Individual Animal Model. With the procedure using the iteration equations, the coefficient matrix of the mixed model equations that must be inverted is relatively small compared to the Individual Animal Model case, and the estimates converge rather quickly. Consequently, the total computational burden to obtain the proper estimates of additive genetic and environmental variances is expected to be considerably reduced in the proposed procedure. A numerical illustration, comparing the proposed procedure with the Individual Animal Model procedure, is given using simulated carcass data on beef cattle.