
Showing papers on "Random effects model" published in 2002


Journal ArticleDOI
TL;DR: It is concluded that H and I2, which can usually be calculated for published meta-analyses, are particularly useful summaries of the impact of heterogeneity, and that one or both should be presented in published meta-analyses in preference to the test for heterogeneity.
Abstract: The extent of heterogeneity in a meta-analysis partly determines the difficulty in drawing overall conclusions. This extent may be measured by estimating a between-study variance, but interpretation is then specific to a particular treatment effect metric. A test for the existence of heterogeneity exists, but depends on the number of studies in the meta-analysis. We develop measures of the impact of heterogeneity on a meta-analysis, from mathematical criteria, that are independent of the number of studies and the treatment effect metric. We derive and propose three suitable statistics: H is the square root of the chi2 heterogeneity statistic divided by its degrees of freedom; R is the ratio of the standard error of the underlying mean from a random effects meta-analysis to the standard error of a fixed effect meta-analytic estimate, and I2 is a transformation of (H) that describes the proportion of total variation in study estimates that is due to heterogeneity. We discuss interpretation, interval estimates and other properties of these measures and examine them in five example data sets showing different amounts of heterogeneity. We conclude that H and I2, which can usually be calculated for published meta-analyses, are particularly useful summaries of the impact of heterogeneity. One or both should be presented in published meta-analyses in preference to the test for heterogeneity.
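A rough sense of these quantities can be had from a few lines of code. The sketch below (invented study effects and standard errors, not data from the paper) computes Q, H, I2 and R using inverse-variance weights and a DerSimonian-Laird estimate of the between-study variance for the random effects standard error.

    import numpy as np

    # Hypothetical study-level treatment effects (e.g. log odds ratios) and standard errors
    theta = np.array([0.30, 0.45, 0.10, 0.62, 0.25])
    se = np.array([0.12, 0.20, 0.15, 0.25, 0.10])

    k = len(theta)
    w = 1.0 / se**2                           # fixed effect (inverse variance) weights
    theta_fe = np.sum(w * theta) / np.sum(w)  # fixed effect pooled estimate
    Q = np.sum(w * (theta - theta_fe)**2)     # chi-square heterogeneity statistic

    H = np.sqrt(Q / (k - 1))                  # H: sqrt of Q divided by its degrees of freedom
    I2 = max(0.0, (Q - (k - 1)) / Q)          # I2: proportion of total variation due to heterogeneity

    # DerSimonian-Laird between-study variance, used for the random effects standard error
    tau2 = max(0.0, (Q - (k - 1)) / (np.sum(w) - np.sum(w**2) / np.sum(w)))
    w_re = 1.0 / (se**2 + tau2)
    se_fe = 1.0 / np.sqrt(np.sum(w))          # SE of the fixed effect estimate
    se_re = 1.0 / np.sqrt(np.sum(w_re))       # SE of the random effects estimate
    R = se_re / se_fe                         # R: ratio of random effects to fixed effect SE

    print(f"Q = {Q:.2f}, H = {H:.2f}, I2 = {100 * I2:.1f}%, R = {R:.2f}")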

25,460 citations


BookDOI
08 Jul 2002
TL;DR: This book discusses the design of Diagnostic Accuracy Studies, the construction of a Smooth ROC Curve, and how to select a Sampling Plan for Readers based on Sensitivity and Specificity.
Abstract: Preface. Acknowledgments. 1. Introduction. 1.1 Why This Book? 1.2 What Is Diagnostic Accuracy? 1.3 Landmarks in Statistical Methods for Diagnostic Medicine. 1.4 Software. 1.5 Topics not Covered in This Book. 1.6 Summary. I BASIC CONCEPTS AND METHODS. 2. Measures of Diagnostic Accuracy. 2.1 Sensitivity and Specificity. 2.2 The Combined Measures of Sensitivity and Specificity. 2.3 The ROC Curve. 2.4 The Area Under the ROC Curve. 2.5 The Sensitivity at a Fixed FPR. 2.6 The Partial Area Under the ROC Curve. 2.7 Likelihood Ratios. 2.8 Other ROC Curve Indices. 2.9 The Localization and Detection of Multiple Abnormalities. 2.10 Interpretation of Diagnostic Tests. 2.11 Optimal Decision Threshold on the ROC Curve. 2.12 Multiple Tests. 3. The Design of Diagnostic Accuracy Studies. 3.1 Determining the Objective of the Study. 3.2 Identifying the Target Patient Population. 3.3 Selecting a Sampling Plan for Patients. 3.3.1 Phase I: Exploratory Studies. 3.3.2 Phase II: Challenge Studies. 3.3.3 Phase III: Clinical Studies. 3.4 Selecting the Gold Standard. 3.5 Choosing a Measure of Accuracy. 3.6 Identifying the Target Reader Population. 3.7 Selecting a Sampling Plan for Readers. 3.8 Planning the Data Collection. 3.8.1 Format for the Test Results. 3.8.2 Data Collection for the Reader Studies. 3.8.3 Reader Training. 3.9 Planning the Data Analyses. 3.9.1 Statistical Hypotheses. 3.9.2 Reporting the Test Results. 3.10 Determining the Sample Size. 4. Estimation and Hypothesis Testing in a Single Sample. 4.1 Binary Scale Data. 4.1.1 Sensitivity and Specificity. 4.1.2 The Sensitivity and Specificity of Clustered Binary Data. 4.1.3 The Likelihood Ratio (LR). 4.1.4 The Odds Ratio. 4.2 Ordinal Scale Data. 4.2.1 The Empirical ROC Curve. 4.2.2 Fitting a Smooth Curve (Parametric Model). 4.2.3 Estimation of Sensitivity at a Particular FPR. 4.2.4 The Area and Partial Area Under the ROC Curve (Parametric Model). 4.2.5 The Area Under the Curve (Nonparametric Method). 4.2.6 Nonparametric Analysis of Clustered Data. 4.2.7 The Degenerate Data. 4.2.8 Choosing Between Parametric and Nonparametric Methods. 4.3 Continuous Scale Data. 4.3.1 The Empirical ROC Curve. 4.3.2 Fitting a Smooth ROC Curve (Parametric and Nonparametric Methods). 4.3.3 Area Under the ROC Curve (Parametric and Nonparametric). 4.3.4 Fixed FPR The Sensitivity and Decision Threshold. 4.3.5 Choosing the Optimal Operating Point. 4.3.6 Choosing Between Parametric and Nonparametric Techniques. 4.4 Hypothesis Testing About the ROC Area. 5. Comparing the Accuracy of Two Diagnostic Tests. 5.1 Binary Scale Data. 5.1.1 Sensitivity and Specificity. 5.1.2 Sensitivity and Specificity of Clustered Binary Data. 5.2 Ordinal and Continuous Scale Data. 5.2.1 Determining the Equality of Two ROC Curves. 5.2.2 Comparing ROC Curves at a Particular Point. 5.2.3 Determining the Range of FPR for Which TPR Differ. 5.2.4 A Comparison of the Area or Partial Area. 5.3 Tests of Equivalence. 6. Sample Size Calculation. 6.1 The Sample Size for Accuracy Studies of a Single Test. 6.1.1 Sensitivity and Specificity. 6.1.2 The Area Under the ROC Curve. 6.1.3 The Sensitivity at a Fixed FPR. 6.1.4 The Partial Area Under the ROC Curve. 6.2 The Sample Size for the Accuracy of Two Tests. 6.2.1 Sensitivity and Specificity. 6.2.2 The Area Under the ROC Curve. 6.2.3 The Sensitivity at a Fixed FPR. 6.2.4 The Partial Area Under the ROC Curve. 6.3 The Sample Size for Equivalent Studies of Two Tests. 6.4 The Sample Size for Determining a Suitable Cutoff Value. 7. 
Issues in Meta Analysis for Diagnostic Tests. 7.1 Objectives. 7.2 Retrieval of the Literature. 7.3 Inclusion Exclusion Criteria. 7.4 Extracting Information From the Literature. 7.5 Statistical Analysis. 7.6 Public Presentation. II ADVANCED METHODS. 8. Regression Analysis for Independent ROC Data. 8.1 Four Clinical Studies. 8.1.1 Surgical Lesion in a Carotid Vessel Example. 8.1.2 Pancreatic Cancer Example. 8.1.3 Adult Obesity Example. 8.1.4 Staging of Prostate Cancer Example. 8.2 Regression Models for Continuous Scale Tests. 8.2.1 Indirect Regression Models for Smooth ROC Curves. 8.2.2 Direct Regression Models for Smooth ROC Curves. 8.2.3 MRA Use for Surgical Lesion Detection in the Carotid Vessel. 8.2.4 Biomarkers for the Detection of Pancreatic Cancer. 8.2.5 Prediction of Adult Obesity by Using Childhood BMI Measurements. 8.3 Regression Models for Ordinal Scale Tests. 8.3.1 Indirect Regression Models for Latent Smooth ROC Curves. 8.3.2 Direct Regression Model for Latent Smooth ROC Curves. 8.3.3 Detection of Periprostatic Invasion With US. 9. Analysis of Correlated ROC Data. 9.1 Studies With Multiple Test Measurements of the Same Patient. 9.1.1 Indirect Regression Models for Ordinal Scale Tests. 9.1.2 Neonatal Examination Example. 9.1.3 Direct Regression Models for Continuous Scale Tests. 9.2 Studies With Multiple Readers and Tests. 9.2.1 A Mixed Effects ANOVA Model for Summary Measures of Diagnostic Accuracy. 9.2.2 Detection of TAD Example. 9.2.3 The Mixed Effects ANOVA Model for Jackknife Pseudovalues. 9.2.4 Neonatal Examination Example. 9.2.5 A Bootstrap Method. 9.3 Sample Size Calculation for Multireader Studies. 10. Methods for Correcting Verification Bias. 10.1 A Single Binary Scale Test. 10.1.1 Correction Methods With the MAR Assumption. 10.1.2 Correction Methods Without the MAR Assumption. 10.1.3 Hepatic Scintigraph Example. 10.2 Correlated Binary Scale Tests. 10.2.1 An ML Approach Without Covariates. 10.2.2 An ML Approach With Covariates. 10.2.3 Screening Tests for Dementia Disorder Example. 10.3 A Single Ordinal Scale Test. 10.3.1 An ML Approach Without Covariates. 10.3.2 Fever of Uncertain Origin Example. 10.3.3 An ML Approach With Covariates. 10.3.4 Screening Test for Dementia Disorder Example. 10.4 Correlated Ordinal Scale Tests. 10.4.1 The Weighted GEE Approach for Latent Smooth ROC Curves. 10.4.2 A Likelihood Based Approach for ROC Areas. 10.4.3 Use of CT and MRI for Staging Pancreatic Cancer Example. 11. Methods for Correcting Imperfect Standard Bias. 11.1 One Single Test in a Single Population. 11.1.1 Hypothetical and Strongyloides Infection Examples. 11.2 One Single Test in G Populations. 11.2.1 Tuberculosis Example. 11.3 Multiple Tests in One Single Population. 11.3.1 MLEs Under the CIA. 11.3.2 Assessment of Pleural Thickening Example. 11.3.3 ML Approaches Without the CIA. 11.3.4 Bioassays for HIV Example. 11.4 Multiple Binary Tests in G Populations. 11.4.1 ML Approaches Under the CIA. 11.4.2 ML Approaches Without the CIA. 12. Statistical Methods for Meta Analysis. 12.1 Sensitivity and Specificity Pairs. 12.1.1 One Common SROC Curve. 12.1.2 Study Specific SROC Curve. 12.1.3 Evaluation of Duplex Ultrasonography, With and Without Color Guidance. 12.2 ROC Curve Areas. 12.2.1 Fixed Effects Models. 12.2.2 Random Effects Models. 12.2.3 Evaluation of the Dexamethasone Suppression Test. Index.

2,003 citations


Posted Content
TL;DR: In this paper, a true fixed effects model is extended to the stochastic frontier model using results that specifically employ the nonlinear specification, and the random effects model is reformulated as a special case of the random parameters model that retains the fundamental structure of the stochastic frontier model.
Abstract: Received analyses based on stochastic frontier modeling with panel data have relied primarily on results from traditional linear fixed and random effects models. This paper examines extensions of these models that circumvent two important shortcomings of the existing fixed and random effects approaches. The conventional panel data stochastic frontier estimators both assume that technical or cost inefficiency is time invariant. In a lengthy panel, this is likely to be a particularly strong assumption. Second, as conventionally formulated, the fixed and random effects estimators force any time invariant cross unit heterogeneity into the same term that is being used to capture the inefficiency. Thus, measures of inefficiency in these models may be picking up heterogeneity in addition to or even instead of technical or cost inefficiency. In this paper, a true fixed effects model is extended to the stochastic frontier model using results that specifically employ the nonlinear specification. The random effects model is reformulated as a special case of the random parameters model that retains the fundamental structure of the stochastic frontier model. The techniques are illustrated through two applications, a large panel from the U.S. banking industry and a cross country comparison of the efficiency of health care delivery.

838 citations


Journal ArticleDOI
TL;DR: A series of models that exemplify the diversity of problems that can be addressed within the empirical Bayesian framework are presented, using PET data to show how priors can be derived from the between-voxel distribution of activations over the brain.

744 citations


Book
01 Mar 2002
TL;DR: A practical treatment of linear models covering regression, analysis of variance, random effects, unbalanced data, analysis of covariance, repeated measures, multivariate and generalized linear models, and examples of special applications.
Abstract: Acknowledgments. Chapter 1. Introduction. Chapter 2. Regression. Chapter 3. Analysis of Variance for Balanced Data. Chapter 4. Analyzing Data with Random Effects. Chapter 5. Unbalanced Data Analysis: Basic Methods. Chapter 6. Understanding Linear Models Concepts. Chapter 7. Analysis of Covariance. Chapter 8. Repeated-Measures Analysis. Chapter 9. Multivariate Linear Models. Chapter 10. Generalized Linear Models. Chapter 11. Examples of Special Applications. References. Index.

742 citations


Book
01 Jan 2002
TL;DR: In this book, the authors present an introduction to the analysis of variance, regression and general linear models, including random effects, with appendices on the meaning of p-values and confidence intervals and on analytical results about variances of sample means.
Abstract: Why use this book 1. An introduction to the analysis of variance 2. Regression 3. Models, parameters and GLMs 4. Using more than one explanatory variable 5. Designing experiments - keeping it simple 6. Combining continuous and categorical variables 7. Interactions - getting more complex 8. Checking the models A: Independence 9. Checking the models B: The other three assumptions 10. Model selection I: Principles of model choice and designed experiments 11. Model selection II: Data sets with several explanatory variables 12. Random effects 13. Categorical data 14. What lies beyond? Answers to exercises Revision section: The basics Appendix I: The meaning of p-values and confidence intervals Appendix II: Analytical results about variances of sample means Appendix III: Probability distributions Bibliography

597 citations


BookDOI
28 Mar 2002
TL;DR: In this book, the authors address the design and analysis of health-related quality of life (HRQoL) studies in clinical trials, with emphasis on longitudinal models, the characterization and handling of missing data through imputation, mixture and selection models, and on multiple endpoints and summary measures.
Abstract: Introduction and Examples Health-related quality of life (HRQoL) Measuring health-related quality of life Study 1: Adjuvant breast cancer trial Study 2: Migraine prevention trial Study 3: Advanced lung cancer trial Study 4: Renal cell carcinoma trial Study 5: Chemoradiation (CXRT) trial Study 6: Osteoarthritis trial Study Design and Protocol Development Introduction Background and rationale Research objectives and goals Selection of subjects Longitudinal designs Selection of measurement instrument(s) Conduct of HRQoL assessments Scoring instruments Models for Longitudinal Studies I Introduction Building models for longitudinal studies Building repeated measures models: The mean structure Building repeated measures models: The covariance structure Estimation and hypothesis testing Models for Longitudinal Studies II Introduction Building growth curve models: The mean (fixed effects) structure Building growth curve models: The covariance structure Model reduction Hypothesis testing and estimation An alternative growth-curve model Moderation and Mediation Introduction Moderation Mediation Other exploratory analyses Characterization of Missing Data Introduction Patterns and causes of missing data Mechanisms of missing data Missing completely at random (MCAR) Missing at random (MAR) Missing not at random (MNAR) Example for trial with variation in timing of assessments Example with different patterns across treatment arms Analysis of Studies with Missing Data Introduction MCAR Ignorable missing data Non-ignorable missing data Simple Imputation Introduction to imputation Missing items in a multi-item questionnaire Regression-based methods Other simple imputation methods Imputing missing covariates Underestimation of variance Final comments Multiple Imputation Introduction Overview of multiple imputation Explicit univariate regression Closest neighbor and predictive mean matching Approximate Bayesian bootstrap (ABB) Multivariate procedures for non-monotone missing data Analysis of the M data sets Miscellaneous issues Pattern Mixture and Other Mixture Models Introduction Pattern mixture models Restrictions for growth curve models Restrictions for repeated measures models Variance estimation for mixture models Random Effects Dependent Dropout Introduction Conditional linear model Varying coefficient models Joint models with shared parameters Selection Models Introduction Outcome selection model for monotone dropout Multiple Endpoints Introduction General strategies for multiple endpoints Background concepts and definitions Single step procedures Sequentially rejective methods Closed testing and gatekeeper procedures Composite Endpoints and Summary Measures Introduction Choosing a composite or summary measure Summarizing across HRQoL domains or subscales Summary measure across time Composite endpoints across time Quality Adjusted Life-Years (QALYs) and Q-TWiST Introduction QALYs Q-TWiST Analysis Plans and Reporting Results Introduction General analysis plan Sample size and power Reporting results Appendix C: Cubic Smoothing Splines Appendix P: PAWS/SPSS Notes Appendix R: R Notes Appendix S: SAS Notes References A Summary appears at the end of each chapter.

516 citations


Journal ArticleDOI
TL;DR: In this paper, a transformed likelihood approach is suggested for estimating fixed effects dynamic panel data models, and conditions on the data-generating process of the exogenous variables are given to get around the "incidental parameters" problem.

413 citations


Journal ArticleDOI
TL;DR: In this article, the authors show how to fit parametric frailty and shared frailty models in Stata via the streg command, compare the two classes of models, and show that they are equivalent in certain situations.
Abstract: Frailty models are the survival data analog to regression models, which account for heterogeneity and random effects. A frailty is a latent multiplicative effect on the hazard function and is assumed to have unit mean and variance θ, which is estimated along with the other model parameters. A frailty model is a heterogeneity model where the frailties are assumed to be individual- or spell-specific. A shared frailty model is a random effects model where the frailties are common (or shared) among groups of individuals or spells and are randomly distributed across groups. Parametric frailty models were made available in Stata with the release of Stata 7, while parametric shared frailty models were made available in a recent series of updates. This article serves as a primer to those fitting parametric frailty models in Stata via the streg command. Frailty models are compared to shared frailty models, and both are shown to be equivalent in certain situations. The user-specified form of the distribution of the frailties (whether gamma or inverse Gaussian) is shown to subtly affect the interpretation of the results. Methods for obtaining predictions that are either conditional or unconditional on the frailty are discussed. An example that analyzes the time to recurrence of infection after catheter insertion in kidney patients is studied.
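The frailty idea is easy to picture by simulation. The sketch below (hypothetical parameter values; a data-generating illustration only, not a replacement for fitting with streg) draws shared gamma frailties with unit mean and variance theta and lets them act multiplicatively on an exponential baseline hazard.

    import numpy as np

    rng = np.random.default_rng(0)

    n_groups, group_size = 50, 4          # e.g. patients nested within clinics
    theta = 0.5                           # frailty variance (unit mean by construction)
    beta = 0.7                            # effect of a binary covariate on the log-hazard
    h0 = 0.1                              # constant (exponential) baseline hazard

    # Shared gamma frailty: mean 1, variance theta
    alpha = rng.gamma(shape=1.0 / theta, scale=theta, size=n_groups)

    frailty = np.repeat(alpha, group_size)            # same frailty within a group
    x = rng.binomial(1, 0.5, size=n_groups * group_size)
    hazard = frailty * h0 * np.exp(beta * x)          # multiplicative effect on the hazard
    time = rng.exponential(1.0 / hazard)              # exponential survival times

    print("frailty mean ~ 1:", alpha.mean().round(2), " variance ~ theta:", alpha.var().round(2))
    print("median survival, x=0 vs x=1:", np.median(time[x == 0]).round(1), np.median(time[x == 1]).round(1))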

394 citations


Journal ArticleDOI
TL;DR: In this paper, the authors present new computational techniques for multivariate longitudinal or clustered data with missing values by applying a multivariate extension of a popular linear mixed-effects model, creating multiple imputations of missing values for subsequent analyses by a straightforward and effective Markov chain Monte Carlo procedure.
Abstract: This article presents new computational techniques for multivariate longitudinal or clustered data with missing values. Current methodology for linear mixed-effects models can accommodate imbalance or missing data in a single response variable, but it cannot handle missing values in multiple responses or additional covariates. Applying a multivariate extension of a popular linear mixed-effects model, we create multiple imputations of missing values for subsequent analyses by a straightforward and effective Markov chain Monte Carlo procedure. We also derive and implement a new EM algorithm for parameter estimation which converges more rapidly than traditional EM algorithms because it does not treat the random effects as “missing data,” but integrates them out of the likelihood function analytically. These techniques are illustrated on models for adolescent alcohol use in a large school-based prevention trial.

310 citations


Journal ArticleDOI
TL;DR: A model for repeated measures data with clumping at zero, using a mixed-effects mixed-distribution model with correlated random effects, is presented and the proposed methods are illustrated with analyses of effects of several covariates on medical expenditures in 1996 for subjects clustered within households using data from the Medical Expenditure Panel Survey.
Abstract: Longitudinal or repeated measures data with clumping at zero occur in many applications in biometrics, including health policy research, epidemiology, nutrition, and meteorology. These data exhibit correlation because they are measured on the same subject over time or because subjects may be considered repeated measures within a larger unit such as a family. They present special challenges because of the extreme non-normality of the distributions involved. A model for repeated measures data with clumping at zero, using a mixed-effects mixed-distribution model with correlated random effects, is presented. The model contains components to model the probability of a nonzero value and the mean of nonzero values, allowing for repeated measurements using random effects and allowing for correlation between the two components. Methods for describing the effect of predictor variables on the probability of nonzero values, on the mean of nonzero values, and on the overall mean amount are given. This interpretation also applies to the mixed-distribution model for cross-sectional data. The proposed methods are illustrated with analyses of effects of several covariates on medical expenditures in 1996 for subjects clustered within households using data from the Medical Expenditure Panel Survey.
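A much-simplified, cross-sectional version of the two-part idea, without the random effects or the correlation between components, can be sketched on simulated data: a logistic model for whether the outcome is nonzero, a log-linear model for the amount when it is, and the overall mean recovered as the product of the two pieces.

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(1)
    n = 2000
    x = rng.normal(size=n)

    # Simulate clumping at zero: a logistic part and a lognormal part for positive amounts
    p_pos = 1 / (1 + np.exp(-(-0.3 + 0.8 * x)))          # P(Y > 0)
    positive = rng.binomial(1, p_pos)
    log_amount = 2.0 + 0.5 * x + rng.normal(scale=0.7, size=n)
    y = positive * np.exp(log_amount)

    X = sm.add_constant(x)

    # Part 1: probability of a nonzero outcome
    part1 = sm.Logit(positive, X).fit(disp=0)

    # Part 2: mean of log(amount) among nonzero outcomes
    pos = y > 0
    part2 = sm.OLS(np.log(y[pos]), X[pos]).fit()

    # Overall mean at a covariate value: P(Y>0) * E[Y | Y>0] (lognormal retransformation)
    x0 = np.array([[1.0, 0.5]])
    p_hat = part1.predict(x0)[0]
    mean_pos = np.exp(part2.predict(x0)[0] + part2.scale / 2)  # .scale is the residual variance
    print(f"P(Y>0)={p_hat:.2f}, E[Y|Y>0]={mean_pos:.1f}, E[Y]={p_hat * mean_pos:.1f}")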

Book ChapterDOI
01 Jan 2002
TL;DR: In this chapter, the authors consider models with mixed effects, which contain both fixed and random effects, in contrast to the models considered up to that point, in which the only source of randomness arises from regarding the cases as independent random samples.
Abstract: Models with mixed effects contain both fixed and random effects. Fixed effects are what we have been considering up to now; the only source of randomness in our models arises from regarding the cases as independent random samples. Thus in regression we have an additive measurement error that we assume is independent between cases, and in a GLM we observe independent binomial, Poisson, gamma ... random variates whose mean is a deterministic function of the explanatory variables.
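A minimal random-intercept example (simulated data, not taken from the chapter) shows the extra source of randomness being added: a group-level effect on top of the usual independent measurement error.

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(2)
    groups, per_group = 30, 10
    g = np.repeat(np.arange(groups), per_group)

    u = rng.normal(scale=1.0, size=groups)           # random effect: one draw per group
    x = rng.normal(size=groups * per_group)
    y = 1.0 + 0.5 * x + u[g] + rng.normal(scale=0.8, size=groups * per_group)  # fixed + random + error

    df = pd.DataFrame({"y": y, "x": x, "g": g})
    fit = smf.mixedlm("y ~ x", df, groups=df["g"]).fit()
    print(fit.summary())   # fixed effect for x plus an estimated between-group variance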

Journal ArticleDOI
TL;DR: The random preference, Fechner, and constant error (tremble) models of stochastic choice under risk are compared in this paper, and various combinations of these approaches are used with expected utility and rank-dependent theory.
Abstract: The random preference, Fechner (or ‘white noise’), and constant error (or ‘tremble’) models of stochastic choice under risk are compared. Various combinations of these approaches are used with expected utility and rank-dependent theory. The resulting models are estimated in a random effects framework using experimental data from two samples of 46 subjects who each faced 90 pairwise choice problems. The best fitting model uses the random preference approach with a tremble mechanism, in conjunction with rank-dependent theory. As subjects gain experience, trembles become less frequent and there is less deviation from behaviour consistent with expected utility theory.

Journal ArticleDOI
TL;DR: This work proposes a likelihood-based approach that requires only the assumption that the random effects have a smooth density; implementation via the EM algorithm is described, and the method's performance and its benefits for uncovering noteworthy features are illustrated.
Abstract: Joint models for a time-to-event (e.g., survival) and a longitudinal response have generated considerable recent interest. The longitudinal data are assumed to follow a mixed effects model, and a proportional hazards model depending on the longitudinal random effects and other covariates is assumed for the survival endpoint. Interest may focus on inference on the longitudinal data process, which is informatively censored, or on the hazard relationship. Several methods for fitting such models have been proposed, most requiring a parametric distributional assumption (normality) on the random effects. A natural concern is sensitivity to violation of this assumption; moreover, a restrictive distributional assumption may obscure key features in the data. We investigate these issues through our proposal of a likelihood-based approach that requires only the assumption that the random effects have a smooth density. Implementation via the EM algorithm is described, and performance and the benefits for uncovering noteworthy features are illustrated by application to data from an HIV clinical trial and by simulation.

Journal ArticleDOI
TL;DR: This paper demonstrates how a fully Bayesian approach to random effects meta-analysis of binary outcome data, previously developed on the log-odds scale, can be extended to perform analyses on the absolute risk and relative risk scales.
Abstract: When conducting a meta-analysis of clinical trials with binary outcomes, a normal approximation for the summary treatment effect measure in each trial is inappropriate in the common situation where some of the trials in the meta-analysis are small, or the observed risks are close to 0 or 1. This problem can be avoided by making direct use of the binomial distribution within trials. A fully Bayesian method has already been developed for random effects meta-analysis on the log-odds scale using the BUGS implementation of Gibbs sampling. In this paper we demonstrate how this method can be extended to perform analyses on both the absolute and relative risk scales. Within each approach we exemplify how trial-level covariates, including underlying risk, can be considered. Data from 46 trials of the effect of single-dose ibuprofen on post-operative pain are analysed and the results contrasted with those derived from classical and Bayesian summary statistic methods. The clinical interpretation of the odds ratio scale is not straightforward. The advantages and flexibility of a fully Bayesian approach to meta-analysis of binary outcome data, considered on an absolute risk or relative risk scale, are now available.

Journal ArticleDOI
TL;DR: This paper discusses an alternative simple approach for constructing the confidence interval, based on the t-distribution, which has improved coverage probability and is easy to calculate, and unlike some methods suggested in the statistical literature, no iterative computation is required.
Abstract: In the context of a random effects model for meta-analysis, a number of methods are available to estimate confidence limits for the overall mean effect. A simple and commonly used method is the DerSimonian and Laird approach. This paper discusses an alternative simple approach for constructing the confidence interval, based on the t-distribution. This approach has improved coverage probability compared to the DerSimonian and Laird method. Moreover, it is easy to calculate, and unlike some methods suggested in the statistical literature, no iterative computation is required.
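The contrast between the two intervals can be sketched on invented study data. The t-based interval below uses the weighted-variance form with k - 1 degrees of freedom, which is in the spirit of the approach described; the paper's exact formula may differ in detail.

    import numpy as np
    from scipy import stats

    # Hypothetical study effects and standard errors
    theta = np.array([0.20, 0.55, 0.10, 0.40, 0.35, 0.05])
    se = np.array([0.15, 0.22, 0.18, 0.30, 0.12, 0.20])
    k = len(theta)

    # DerSimonian-Laird between-study variance and random effects pooled estimate
    w = 1 / se**2
    theta_fe = np.sum(w * theta) / np.sum(w)
    Q = np.sum(w * (theta - theta_fe)**2)
    tau2 = max(0.0, (Q - (k - 1)) / (np.sum(w) - np.sum(w**2) / np.sum(w)))
    w_re = 1 / (se**2 + tau2)
    mu = np.sum(w_re * theta) / np.sum(w_re)

    # Standard DerSimonian and Laird interval: normal approximation
    se_dl = 1 / np.sqrt(np.sum(w_re))
    z = stats.norm.ppf(0.975)
    ci_dl = (mu - z * se_dl, mu + z * se_dl)

    # t-based interval: weighted variance of the effects around mu, t with k-1 df
    q = np.sum(w_re * (theta - mu)**2) / ((k - 1) * np.sum(w_re))
    t = stats.t.ppf(0.975, df=k - 1)
    ci_t = (mu - t * np.sqrt(q), mu + t * np.sqrt(q))

    print("DL normal CI:", np.round(ci_dl, 3))
    print("t-based  CI:", np.round(ci_t, 3))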

Journal ArticleDOI
TL;DR: In this paper, the authors employ zero-inflated Poisson regression models for spatial count data and propose fitting these models within a Bayesian framework, considering issues of posterior propriety, informative prior specification and well-behaved simulation-based model fitting.
Abstract: Count data arises in many contexts. Here our concern is with spatial count data which exhibit an excessive number of zeros. Using the class of zero-inflated count models provides a flexible way to address this problem. Available covariate information suggests formulation of such modeling within a regression framework. We employ zero-inflated Poisson regression models. Spatial association is introduced through suitable random effects yielding a hierarchical model. We propose fitting this model within a Bayesian framework considering issues of posterior propriety, informative prior specification and well-behaved simulation based model fitting. Finally, we illustrate the model fitting with a data set involving counts of isopod nest burrows for 1649 pixels over a portion of the Negev desert in Israel.
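The non-spatial core of such a model is just the zero-inflated Poisson likelihood, written out below for hypothetical values; the paper's full model adds regression covariates and spatially correlated random effects on top of this.

    import numpy as np
    from scipy import stats

    def zip_logpmf(y, lam, pi):
        """Zero-inflated Poisson: with probability pi the count is a structural zero,
        otherwise it is Poisson(lam)."""
        y = np.asarray(y)
        pois = stats.poisson.logpmf(y, lam)
        # P(Y=0) = pi + (1-pi) * exp(-lam); P(Y=y) = (1-pi) * Poisson pmf for y > 0
        log_p0 = np.log(pi + (1 - pi) * np.exp(-lam))
        return np.where(y == 0, log_p0, np.log(1 - pi) + pois)

    counts = np.array([0, 0, 0, 1, 3, 0, 2, 0])
    print(zip_logpmf(counts, lam=1.2, pi=0.4).sum())   # log-likelihood at (lam, pi)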

Journal ArticleDOI
TL;DR: The fixed effects OR, random effects OR and random effects RR appear to be reasonably constant across different baseline risks, and clinicians may wish to rely on the random effects model RR and use the PEER to individualize NNT when they apply the results of a meta-analysis in their practice.
Abstract: Background Meta-analyses summarize the magnitude of treatment effect using a number of measures of association, including the odds ratio (OR), risk ratio (RR), risk difference (RD) and/or number needed to treat (NNT). In applying the results of a meta-analysis to individual patients, some textbooks of evidence-based medicine advocate individualizing NNT, based on the RR and the patient's expected event rate (PEER). This approach assumes constant RR but no empirical study to date has examined the validity of this assumption. Methods We randomly selected a subset of meta-analyses from a recent issue of the Cochrane Library (1998, Issue 3). When a meta-analysis pooled more than three randomized controlled trials (RCT) to produce a summary measure for an outcome, we compared the OR, RR and RD of each RCT with the corresponding pooled OR, RR and RD from the meta-analysis of all the other RCT. Using the conventional P-value of 0.05, we calculated the percentage of comparisons in which there were no statistically significant differences in the estimates of OR, RR or RD, and refer to this percentage as the 'concordance rate'. Results For each effect measure, we made 1843 comparisons, extracted from 55 meta-analyses. The random effects model OR had the highest concordance rate, closely followed by the fixed effects model OR and random effects model RR. The minimum concordance rate for these indices was 82%, even when the baseline risk differed substantially. The concordance rates for RD, either fixed effects or random effects model, were substantially lower (54-65%). Conclusions The fixed effects OR, random effects OR and random effects RR appear to be reasonably constant across different baseline risks. Given the interpretational and arithmetic ease of RR, clinicians may wish to rely on the random effects model RR and use the PEER to individualize NNT when they apply the results of a meta-analysis in their practice.
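The individualization referred to uses the usual evidence-based-medicine arithmetic, NNT = 1 / (PEER x (1 - RR)) for a beneficial treatment with constant RR; a small worked example with made-up numbers:

    def nnt_from_rr(peer, rr):
        """Number needed to treat for a patient with expected event risk `peer`,
        assuming the pooled relative risk `rr` is constant across baseline risks."""
        arr = peer * (1 - rr)          # absolute risk reduction for this patient
        return 1.0 / arr

    # e.g. pooled random effects RR = 0.75; two patients with different baseline risks
    for peer in (0.05, 0.20):
        print(f"PEER={peer:.0%}: NNT ~ {nnt_from_rr(peer, 0.75):.0f}")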

Journal ArticleDOI
TL;DR: In this paper, the authors evaluate the efficiency of conjoint choice designs based on the mixed multinomial logit model and derive an expression for the information matrix for that purpose.
Abstract: A computationally attractive model for the analysis of conjoint choice experiments is the mixed multinomial logit model, a multinomial logit model in which it is assumed that the coefficients follow a (normal) distribution across subjects. This model offers the advantage over the standard multinomial logit model of accommodating heterogeneity in the coefficients of the choice model across subjects, a topic that has received considerable interest recently in the marketing literature. With the advent of such powerful models, the conjoint choice design deserves increased attention as well. Unfortunately, if one wants to apply the mixed logit model to the analysis of conjoint choice experiments, the problem arises that nothing is known about the efficiency of designs based on the standard logit for parameters of the mixed logit. The development of designs that are optimal for mixed logit models or other random effects models has not been previously addressed and is the topic of this paper. The development of efficient designs requires the evaluation of the information matrix of the mixed multinomial logit model. We derive an expression for the information matrix for that purpose. The information matrix of the mixed logit model does not have closed form, since it involves integration over the distribution of the random coefficients. In evaluating it we approximate the integrals through repeated samples from the multivariate normal distribution of the coefficients. Since the information matrix is not a scalar we use the determinant scaled by its dimension as a measure of design efficiency. This enables us to apply heuristic search algorithms to explore the design space for highly efficient designs. We build on previously published heuristics based on relabeling, swapping, and cycling of the attribute levels in the design. Designs with a base alternative are commonly used and considered to be important in conjoint choice analysis, since they provide a way to compare the utilities of profiles in different choice sets. A base alternative is a product profile that is included in all choice sets of a design. There are several types of base alternatives, examples being a so-called outside alternative or an alternative constructed from the attribute levels in the design itself. We extend our design construction procedures for mixed logit models to include designs with a base alternative and investigate and compare four design classes: designs with two alternatives, with two alternatives plus a base alternative, and designs with three and with four alternatives. Our study provides compelling evidence that each of these mixed logit designs provides more efficient parameter estimates for the mixed logit model than its standard logit counterpart and yields higher predictive validity. As compared to designs with two alternatives, designs that include a base alternative are more robust to deviations from the parameter values assumed in the designs, while that robustness is even higher for designs with three and four alternatives, even if those have 33% and 50% fewer choice sets, respectively. Those designs yield higher efficiency and better predictive validity at lower burden to the respondent.
It is noteworthy that our "best" choice designs, the 3- and 4-alternative designs, resulted not only in a substantial improvement in efficiency over the standard logit design but also in an expected predictive validity that is over 50% higher in most cases, a number that pales the increases in predictive validity achieved by refined model specifications.
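A very rough surrogate for this kind of design evaluation can be sketched by averaging the conditional multinomial logit information matrix over draws of the random coefficients and taking a D-error; the paper derives the exact mixed logit information matrix, which this simplification only approximates, and the design and parameter values below are hypothetical.

    import numpy as np

    rng = np.random.default_rng(3)

    # A tiny hypothetical design: S choice sets, J alternatives, K attributes (effects coded)
    S, J, K = 6, 3, 2
    X = rng.choice([-1.0, 0.0, 1.0], size=(S, J, K))

    mu = np.array([0.5, -0.5])            # assumed means of the random coefficients
    sigma = np.array([0.3, 0.3])          # assumed standard deviations

    def mnl_information(X, beta):
        """Fisher information of a multinomial logit model for one coefficient vector."""
        info = np.zeros((K, K))
        for Xs in X:
            u = Xs @ beta
            p = np.exp(u - u.max())
            p /= p.sum()
            info += Xs.T @ (np.diag(p) - np.outer(p, p)) @ Xs
        return info

    # Rough surrogate for mixed logit efficiency: average the conditional MNL
    # information over draws of the random coefficients, then take the D-error.
    draws = mu + sigma * rng.normal(size=(500, K))
    avg_info = np.mean([mnl_information(X, b) for b in draws], axis=0)
    d_error = np.linalg.det(np.linalg.inv(avg_info)) ** (1.0 / K)
    print(f"approximate D-error of this design: {d_error:.3f}")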

Posted Content
TL;DR: In this paper, Monte Carlo methods are used to examine the small-sample bias of the fixed effects estimator in the binary probit and logit models, the ordered probit model, the tobit model, the Poisson regression model for count data, and the exponential regression model.
Abstract: The nonlinear fixed effects model in econometrics has often been avoided for two reasons, one practical, one methodological. The practical obstacle relates to the difficulty of estimating nonlinear models with possibly thousands of coefficients. In fact, in a large number of models of interest to practitioners, estimation of the fixed effects model is feasible even in panels with very large numbers of groups. The more difficult, methodological question centers on the incidental parameters problem that raises questions about the statistical properties of the estimator. There is very little empirical evidence on the behavior of the fixed effects estimator. In this note, we use Monte Carlo methods to examine the small sample bias in the binary probit and logit models, the ordered probit model, the tobit model, the Poisson regression model for count data and the exponential regression model for a nonnegative random variable. We find three results of note: A widely accepted result that suggests that the probit estimator is actually relatively well behaved appears to be incorrect. Perhaps to some surprise, the tobit model, unlike the others, appears largely to be unaffected by the incidental parameters problem, save for a surprising result related to the disturbance variance estimator. Third, as apparently unexamined previously, the estimated asymptotic standard errors for the fixed effects estimators appear uniformly to be downward biased.
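A stripped-down version of such an experiment for the fixed effects logit (arbitrary parameter values, small N and T, group dummies estimated directly, groups without within-group variation in the outcome dropped) can be run as follows; it typically shows the slope estimate biased away from zero when T is small.

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(4)
    N, T, true_beta, reps = 100, 4, 1.0, 50
    biases = []

    for _ in range(reps):
        alpha = rng.normal(size=N)                          # incidental parameters, one per group
        x = rng.normal(size=(N, T)) + 0.5 * alpha[:, None]  # covariate correlated with the group effects
        p = 1 / (1 + np.exp(-(alpha[:, None] + true_beta * x)))
        y = rng.binomial(1, p)

        keep = (y.sum(axis=1) > 0) & (y.sum(axis=1) < T)    # drop groups with all-0 or all-1 outcomes
        yk, xk = y[keep], x[keep]
        D = np.kron(np.eye(keep.sum()), np.ones((T, 1)))    # one dummy per remaining group
        Xmat = np.column_stack([xk.reshape(-1), D])
        try:
            fit = sm.Logit(yk.reshape(-1), Xmat).fit(disp=0, maxiter=200)
            biases.append(fit.params[0] - true_beta)
        except Exception:
            continue                                        # skip the occasional separated sample

    print(f"mean bias of the slope estimate over {len(biases)} replications: {np.mean(biases):+.2f}")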

Journal ArticleDOI
TL;DR: In this article, a new class of functional models in which smoothing splines are used to model fixed effects as well as random effects is introduced, which inherit the flexibility of the linear mixed effects models in handling complex designs and correlation structures.
Abstract: In this article, a new class of functional models in which smoothing splines are used to model fixed effects as well as random effects is introduced. The linear mixed effects models are extended to nonparametric mixed effects models by introducing functional random effects, which are modeled as realizations of zero-mean stochastic processes. The fixed functional effects and the random functional effects are modeled in the same functional space, which guarantees that the population-average and subject-specific curves have the same smoothness property. These models inherit the flexibility of the linear mixed effects models in handling complex designs and correlation structures, can include continuous covariates as well as dummy factors in both the fixed and random design matrices, and include the nested curves models as special cases. Two estimation procedures are proposed. The first estimation procedure exploits the connection between linear mixed effects models and smoothing splines and can be fitted using existing software. The second procedure is a sequential estimation procedure using Kalman filtering. This algorithm avoids inversion of large dimensional matrices and therefore can be applied to large data sets. A generalized maximum likelihood (GML) ratio test is proposed for inference and model selection. An application to comparison of cortisol profiles is used as an illustration.

Journal ArticleDOI
TL;DR: In this paper, a random effects model for ring recovery and recapture data analysis is proposed, where the temporal variation in survival probability is treated as random with average value E( k 2 ) = † 2.
Abstract: Existing models for ring recovery and recapture data analysis treat temporal variations in annual survival probability (S) as fixed effects. Often there is no explainable structure to the temporal variation in S1, …, Sk; random effects can then be a useful model: Si = E(S) + εi. Here, the temporal variation in survival probability is treated as random, with average value E(εi²) = σ². This random effects model can now be fit in program MARK. Resultant inferences include point and interval estimation for the process variation σ², and estimation of E(S) and var(Ê(S)), where the latter includes a component for σ² as well as the traditional component for var(Ŝ | S). Furthermore, the random effects model leads to shrinkage estimates, S̃i, as improved (in mean squared error) estimators of Si compared to the MLE, Ŝi, from the unrestricted time-effects model. Appropriate confidence intervals based on the S̃i are also provided. In addition, AIC has been generalized to random effects models. This paper presents...
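The shrinkage step can be sketched with a simple method-of-moments calculation on hypothetical annual estimates; this is only a rough analogue of the idea, not the MARK implementation.

    import numpy as np

    # Hypothetical annual survival MLEs and their sampling standard errors
    S_hat = np.array([0.62, 0.71, 0.55, 0.68, 0.60, 0.74])
    se = np.array([0.04, 0.06, 0.05, 0.07, 0.04, 0.06])

    E_S = np.mean(S_hat)                                   # simple estimate of E(S)
    # Method-of-moments process variance: total spread minus average sampling variance
    sigma2 = max(0.0, np.var(S_hat, ddof=1) - np.mean(se**2))

    # Shrinkage: pull each MLE toward E(S) by the share of its variation that is process variation
    shrink = sigma2 / (sigma2 + se**2)
    S_tilde = E_S + shrink * (S_hat - E_S)

    print("process variance estimate:", round(sigma2, 4))
    print("shrunk estimates:", np.round(S_tilde, 3))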

Journal ArticleDOI
TL;DR: In this article, the authors adopt a Bayesian approach to sample size determination in hierarchical models and provide theoretical tools for studying performance as a function of sample size, with a variety of illustrative results.
Abstract: Sample size determination (SSD) is a crucial aspect of experimental design. Two SSD problems are considered here. The first concerns how to select a sample size to achieve specified performance with regard to one or more features of a model. Adopting a Bayesian perspective, we move the Bayesian SSD problem from the rather elementary models addressed in the literature to date in the direction of the wide range of hierarchical models which dominate the current Bayesian landscape. Our approach is generic and thus, in principle, broadly applicable. However, it requires full model specification and computationally intensive simulation, perhaps limiting it practically to simple instances of such models. Still, insight from such cases is of useful design value. In addition, we present some theoretical tools for studying performance as a function of sample size, with a variety of illustrative results. Such results provide guidance with regard to what is achievable. We also offer two examples, a survival model with censoring and a logistic regression model. The second problem concerns how to select a sample size to achieve specified separation of two models. We approach this problem by adopting a screening criterion which in turn forms a model choice criterion. This criterion is set up to choose model 1 when the value is large, model 2 when the value is small. The SSD problem then requires choosing $n_{1}$ to make the probability of selecting model 1 when model 1 is true sufficiently large and choosing $n_{2}$ to make the probability of selecting model 2 when model 2 is true sufficiently large. The required n is $\max(n_{1}, n_{2})$. Here, we again provide two illustrations. One considers separating normal errors from t errors, the other separating a common growth curve model from a model with individual growth curves.

Journal ArticleDOI
TL;DR: A Monte Carlo version of the EM gradient algorithm is developed for maximum likelihood estimation of model parameters, and it is shown that minimum mean-squared error (MMSE) prediction of the random effects can be done in a linear fashion in spatial GLMMs, analogous to linear kriging.
Abstract: We use spatial generalized linear mixed models (GLMM) to model non-Gaussian spatial variables that are observed at sampling locations in a continuous area. In many applications, prediction of random effects in a spatial GLMM is of great practical interest. We show that the minimum mean-squared error (MMSE) prediction can be done in a linear fashion in spatial GLMMs analogous to linear kriging. We develop a Monte Carlo version of the EM gradient algorithm for maximum likelihood estimation of model parameters. A by-product of this approach is that it also produces the MMSE estimates for the realized random effects at the sampled sites. This method is illustrated through a simulation study and is also applied to a real data set on plant root diseases to obtain a map of disease severity that can facilitate the practice of precision agriculture.

Book
01 Jan 2002
TL;DR: Comparison of Two Samples, Linear Regression Model, Single-Factor Experiments with Fixed and Random Effects, and Statistical Analysis of Incomplete Data.
Abstract: Comparison of Two Samples - The Linear Regression Model - Single-Factor Experiments with Fixed and Random Effects - More Restrictive Designs - Incomplete Block Designs - Multifactor Experiments - Models for Categorical Response Variables - Repeated Measures Model - Cross-Over Design - Statistical Analysis of Incomplete Data

Journal ArticleDOI
TL;DR: For a single time-dependent covariate, Tsiatis and Davidian (2001) have proposed an approach that is easily implemented and does not require an assumption on the distribution of the random effects; as demonstrated here, it may be generalized to multiple, possibly correlated, time-dependent covariates.
Abstract: In many longitudinal studies, it is of interest to characterize the relationship between a time-to-event (e.g. survival) and several time-dependent and time-independent covariates. Time-dependent covariates are generally observed intermittently and with error. For a single time-dependent covariate, a popular approach is to assume a joint longitudinal data-survival model, where the time-dependent covariate follows a linear mixed effects model and the hazard of failure depends on random effects and time-independent covariates via a proportional hazards relationship. Regression calibration and likelihood or Bayesian methods have been advocated for implementation; however, generalization to more than one time-dependent covariate may become prohibitive. For a single time-dependent covariate, Tsiatis and Davidian (2001) have proposed an approach that is easily implemented and does not require an assumption on the distribution of the random effects. This technique may be generalized to multiple, possibly correlated, time-dependent covariates, as we demonstrate. We illustrate the approach via simulation and by application to data from an HIV clinical trial.

Journal ArticleDOI
TL;DR: In this paper, a Bayesian modelling approach is used for a non-linear random effects estimation problem arising from a mathematical model of HIV infection; the model and data exhibit features that make an ordinary non-linear mixed effects analysis intractable: data from two compartments are fitted simultaneously against the implicit numerical solution of a system of ordinary differential equations, data from one compartment are subject to censoring, and random effects for one variable are assumed to follow a beta distribution.
Abstract: In the context of a mathematical model describing HIV infection, we discuss a Bayesian modelling approach to a non-linear random effects estimation problem. The model and the data exhibit a number of features that make the use of an ordinary non-linear mixed effects model intractable: (i) the data are from two compartments fitted simultaneously against the implicit numerical solution of a system of ordinary differential equations; (ii) data from one compartment are subject to censoring; (iii) random effects for one variable are assumed to be from a beta distribution. We show how the Bayesian framework can be exploited by incorporating prior knowledge on some of the parameters, and by combining the posterior distributions of the parameters to obtain estimates of quantities of interest that follow from the postulated model.

Journal ArticleDOI
TL;DR: In this paper, a general semiparametric Bayesian model is developed that contains potential outcomes and subject-level, outcome-specific random effects, and the model is subjected to a fully Bayesian analysis based on Markov chain Monte Carlo simulation methods.

Journal ArticleDOI
TL;DR: Although the concept of individual frailty can be of value when thinking about how data arise or when interpreting parameter estimates in the context of a fitted model, it is argued that, if 'frailty' is understood as referring to individual random effects, the concept is of limited practical value.
Abstract: We discuss some of the fundamental concepts underlying the development of frailty and random effects models in survival. One of these fundamental concepts was the idea of a frailty model where each subject has his or her own disposition to failure, their so-called frailty, additional to any effects we wish to quantify via regression. Although the concept of individual frailty can be of value when thinking about how data arise or when interpreting parameter estimates in the context of a fitted model, we argue that the concept is of limited practical value. Individual random effects (frailties), whenever detected, can be made to disappear by elementary model transformation. In consequence, unless we are to take some model form as unassailable, beyond challenge and carved in stone, and if we are to understand the term 'frailty' as referring to individual random effects, then frailty models have no value. Random effects models on the other hand, in which groups of individuals share some common effect, can be used to advantage. Even in this case however, if we are prepared to sacrifice some efficiency, we can avoid complex modelling by using the considerable power already provided by the stratified proportional hazards model. Stratified models and random effects models can both be seen to be particular cases of partially proportional hazards models, a view that gives further insight. The added structure of a random effects model, viewed as a stratified proportional hazards model with some added distributional constraints, will, for group sizes of five or more, provide no more than modest efficiency gains, even when the additional assumptions are exactly true. On the other hand, for moderate to large numbers of very small groups, of sizes two or three, the study of twins being a well known example, the efficiency gains of the random effects model can be far from negligible. For such applications, the case for using random effects models rather than the stratified model is strong. This is especially so in view of the good robustness properties of random effects models. Nonetheless, the simpler analysis, based upon the stratified model, remains valid, albeit making a less efficient use of resources.

Journal ArticleDOI
TL;DR: This study compared different methods for assigning confidence intervals to the analysis of variance estimator of the intraclass correlation coefficient (rho) using Monte Carlo simulations of unbalanced clustered data and data from a cluster randomized trial of an intervention to improve the management of asthma in a general practice setting.
Abstract: A Correction has been published for this article in Statistics in Medicine 23(18) 2004, 2935. This study compared different methods for assigning confidence intervals to the analysis of variance estimator of the intraclass correlation coefficient (ρ). The context of the comparison was the use of ρ to estimate the variance inflation factor when planning cluster randomized trials. The methods were compared using Monte Carlo simulations of unbalanced clustered data and data from a cluster randomized trial of an intervention to improve the management of asthma in a general practice setting. The coverage and precision of the intervals were compared for data with different numbers of clusters, mean numbers of subjects per cluster and underlying values of ρ. The performance of the methods was also compared for data with Normal and non-Normally distributed cluster specific effects. Results of the simulations showed that methods based upon the variance ratio statistic provided greater coverage levels than those based upon large sample approximations to the standard error of ρ. Searle's method provided close to nominal coverage for data with Normally distributed random effects. Adjusted versions of Searle's method to allow for lack of balance in the data generally did not improve upon it either in terms of coverage or precision. Analyses of the trial data, however, showed that limits provided by Thomas and Hultquist's method may differ from those of the other variance ratio statistic methods when the arithmetic mean differs markedly from the harmonic mean cluster size. The simulation results demonstrated that marked non-Normality in the cluster level random effects compromised the performance of all methods. Confidence intervals for the methods were generally wide relative to the underlying size of ρ, suggesting that there may be great uncertainty associated with sample size calculations for cluster trials where large clusters are randomized. Data from cluster based studies with sample sizes much larger than those typical of cluster randomized trials are required to estimate ρ with a reasonable degree of precision. Copyright © 2002 John Wiley & Sons, Ltd.
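For the balanced case, the ANOVA estimator and an exact variance-ratio (F-based) interval of the type examined can be written down directly; the sketch below uses simulated balanced clusters with hypothetical values, and unbalanced data would need the adjustments discussed in the paper.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(5)
    k, m, rho = 30, 8, 0.05                       # clusters, cluster size, true ICC

    # Simulate one-way random effects data with the requested ICC
    b = rng.normal(scale=np.sqrt(rho), size=k)
    y = b[:, None] + rng.normal(scale=np.sqrt(1 - rho), size=(k, m))

    msb = m * np.var(y.mean(axis=1), ddof=1)      # between-cluster mean square
    msw = np.mean(np.var(y, axis=1, ddof=1))      # pooled within-cluster mean square (balanced)
    F = msb / msw

    icc = (msb - msw) / (msb + (m - 1) * msw)     # ANOVA estimator of rho

    # Exact variance-ratio interval for balanced data (Searle-type)
    f_hi = stats.f.ppf(0.975, k - 1, k * (m - 1))
    f_lo = stats.f.ppf(0.025, k - 1, k * (m - 1))
    lower = (F / f_hi - 1) / (F / f_hi + m - 1)
    upper = (F / f_lo - 1) / (F / f_lo + m - 1)

    print(f"ICC estimate {icc:.3f}, 95% CI ({lower:.3f}, {upper:.3f})")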