Book

Models for Discrete Longitudinal Data

TL;DR: This book develops model families for discrete (non-Gaussian) longitudinal data: marginal models fitted by likelihood, generalized estimating equations, or pseudo-likelihood; conditional models; and random-effects models such as the generalized linear mixed model, together with SAS implementations and methods for incomplete data.
Abstract: Introduction.- Motivating Studies.- Generalized Linear Models.- Linear Mixed Models for Gaussian Longitudinal Data.- Model Families.- The Strength of Marginal Models.- Likelihood-based Models.- Generalized Estimating Equations.- Pseudo-likelihood.- Fitting Marginal Models with SAS.- Conditional Models.- Pseudo-likelihood.- From Subject-Specific to Random-Effects Models.- Generalized Linear Mixed Models (GLMM).- Fitting Generalized Linear Mixed Models with SAS.- Marginal Versus Random-Effects Models.- Ordinal Data.- The Epilepsy Data.- Non-linear Models.- Pseudo-likelihood for a Hierarchical Model.- Random-effects Models with Serial Correlation.- Non-Gaussian Random Effects.- Joint Continuous and Discrete Responses.- High-dimensional Multivariate Repeated Measurements.- Missing Data Concepts.- Simple Methods, Direct Likelihood and WGEE.- Multiple Imputation and the Expectation-Maximization Algorithm.- Selection Models.- Pattern-mixture Models.- Sensitivity Analysis.- Incomplete Data and SAS.
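The book's marginal-model chapters (generalized estimating equations, pseudo-likelihood) are illustrated there with SAS. As a rough orientation only, the sketch below fits a population-averaged model for correlated binary responses with GEE in Python's statsmodels; the simulated data, variable names, and working correlation are hypothetical and not taken from the book.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Hypothetical longitudinal binary data: 100 subjects, 4 visits each.
rng = np.random.default_rng(0)
n_subj, n_visits = 100, 4
df = pd.DataFrame({
    "id": np.repeat(np.arange(n_subj), n_visits),
    "visit": np.tile(np.arange(n_visits), n_subj),
    "trt": np.repeat(rng.integers(0, 2, n_subj), n_visits),
})
# Subject-level shift to induce within-subject correlation in the simulated responses.
u = np.repeat(rng.normal(0.0, 1.0, n_subj), n_visits)
eta = -0.5 + 0.4 * df["trt"] + 0.2 * df["visit"] + u
df["y"] = rng.binomial(1, 1.0 / (1.0 + np.exp(-eta)))

# Marginal (population-averaged) model: logistic GEE with an exchangeable
# working correlation; robust (sandwich) standard errors come with the fit.
gee = smf.gee(
    "y ~ trt + visit",
    groups="id",
    data=df,
    family=sm.families.Binomial(),
    cov_struct=sm.cov_struct.Exchangeable(),
)
print(gee.fit().summary())
```

The corresponding subject-specific analysis would replace the working correlation with explicit random effects, in the spirit of the GLMM chapters.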
Citations
Journal ArticleDOI
TL;DR: Most patients with anti-NMDAR encephalitis respond to immunotherapy; second-line immunotherapy is usually effective when first-line treatments fail, and outcomes continued to improve for up to 18 months after symptom onset.
Abstract: Background: Anti-NMDA receptor (NMDAR) encephalitis is an autoimmune disorder in which the use of immunotherapy and the long-term outcome have not been defined. We aimed to assess the presentation of the disease, the spectrum of symptoms, immunotherapies used, timing of improvement, and long-term outcome.
Methods: In this multi-institutional observational study, we tested for the presence of NMDAR antibodies in serum or CSF samples of patients with encephalitis between Jan 1, 2007, and Jan 1, 2012. All patients who tested positive for NMDAR antibodies were included in the study; patients were assessed at symptom onset and at months 4, 8, 12, 18, and 24, by use of the modified Rankin scale (mRS). Treatment included first-line immunotherapy (steroids, intravenous immunoglobulin, plasmapheresis), second-line immunotherapy (rituximab, cyclophosphamide), and tumour removal. Predictors of outcome were determined at the Universities of Pennsylvania (PA, USA) and Barcelona (Spain) by use of a generalised linear mixed model with binary distribution.
Results: We enrolled 577 patients (median age 21 years, range 8 months to 85 years), 211 of whom were children (...).
Interpretation: Most patients with anti-NMDAR encephalitis respond to immunotherapy. Second-line immunotherapy is usually effective when first-line treatments fail. In this cohort, the recovery of some patients took up to 18 months.
Funding: The Dutch Cancer Society, the National Institutes of Health, the McKnight Neuroscience of Brain Disorders award, the Fondo de Investigaciones Sanitarias, and Fundació la Marató de TV3.
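The outcome analysis above relies on a generalised linear mixed model with a binary response. The sketch below is a generic, hypothetical illustration of such a model (a logistic GLMM with a patient-level random intercept) using statsmodels' variational-Bayes fitter; it is not the authors' analysis, and all variables and effects are simulated.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Simulated repeated binary assessments (e.g., "improved" at each follow-up visit)
# for 150 patients; all names and effects here are made up for illustration.
rng = np.random.default_rng(1)
n_pat, n_visits = 150, 5
df = pd.DataFrame({
    "patient": np.repeat(np.arange(n_pat), n_visits),
    "visit": np.tile(np.arange(1, n_visits + 1), n_pat),
    "second_line": np.repeat(rng.integers(0, 2, n_pat), n_visits),
})
b = np.repeat(rng.normal(0.0, 1.2, n_pat), n_visits)  # patient-specific random intercept
eta = -1.0 + 0.5 * df["second_line"] + 0.3 * df["visit"] + b
df["improved"] = rng.binomial(1, 1.0 / (1.0 + np.exp(-eta)))

# Logistic GLMM: fixed effects for treatment line and visit, plus a variance
# component (random intercept) per patient, estimated by variational Bayes.
glmm = sm.BinomialBayesMixedGLM.from_formula(
    "improved ~ second_line + visit",
    {"patient": "0 + C(patient)"},
    data=df,
)
print(glmm.fit_vb().summary())
```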

2,226 citations

Book
29 Mar 2012
TL;DR: This book covers the problem of missing data, the concepts of MCAR, MAR and MNAR, simple solutions that do not (always) work, multiple imputation in a nutshell, and some dangers, some do's and some don'ts.
Abstract: Basics Introduction The problem of missing data Concepts of MCAR, MAR and MNAR Simple solutions that do not (always) work Multiple imputation in a nutshell Goal of the book What the book does not cover Structure of the book Exercises Multiple imputation Historic overview Incomplete data concepts Why and when multiple imputation works Statistical intervals and tests Evaluation criteria When to use multiple imputation How many imputations? Exercises Univariate missing data How to generate multiple imputations Imputation under the normal linear model Imputation under non-normal distributions Predictive mean matching Categorical data Other data types Classification and regression trees Multilevel data Non-ignorable methods Exercises Multivariate missing data Missing data pattern Issues in multivariate imputation Monotone data imputation Joint Modeling Fully Conditional Specification FCS and JM Conclusion Exercises Imputation in practice Overview of modeling choices Ignorable or non-ignorable? Model form and predictors Derived variables Algorithmic options Diagnostics Conclusion Exercises Analysis of imputed data What to do with the imputed data? Parameter pooling Statistical tests for multiple imputation Stepwise model selection Conclusion Exercises Case studies Measurement issues Too many columns Sensitivity analysis Correct prevalence estimates from self-reported data Enhancing comparability Exercises Selection issues Correcting for selective drop-out Correcting for non-response Exercises Longitudinal data Long and wide format SE Fireworks Disaster Study Time raster imputation Conclusion Exercises Extensions Conclusion Some dangers, some do's and some don'ts Reporting Other applications Future developments Exercises Appendices: Software R S-Plus Stata SAS SPSS Other software References Author Index Subject Index
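For orientation, the sketch below runs a small fully conditional specification (MICE-style) analysis with statsmodels in Python and pools the per-imputation estimates with Rubin's rules; the book's own software appendix centres on R (mice), so this is only an illustrative stand-in with simulated data and made-up variable names.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.imputation import mice

# Simulated data with roughly 30% of x2 missing; names are illustrative only.
rng = np.random.default_rng(2)
n = 500
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)
y = 1.0 + 0.8 * x1 - 0.5 * x2 + rng.normal(size=n)
df = pd.DataFrame({"y": y, "x1": x1, "x2": x2})
df.loc[rng.random(n) < 0.3, "x2"] = np.nan

# Fully conditional specification: each incomplete variable is imputed from the
# others, the substantive model is refitted on every completed data set, and the
# per-imputation estimates are pooled with Rubin's rules.
imp = mice.MICEData(df)
fit = mice.MICE("y ~ x1 + x2", sm.OLS, imp).fit(n_burnin=10, n_imputations=20)
print(fit.summary())
```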

2,156 citations


Cites background from "Models for Discrete Longitudinal Da..."

  • ...There is an extensive literature which often concentrates on the longitudinal case (Verbeke and Molenberghs, 2000; Molenberghs and Verbeke, 2005; Daniels and Hogan, 2008)....


Journal ArticleDOI
TL;DR: In this article, the authors challenge fixed effects (FE) modelling as the default for time-series cross-sectional and panel data, and argue not simply for technical solutions to endogeneity but for the substantive importance of context and heterogeneity, modelled using random effects (RE).
Abstract: This article challenges Fixed Effects (FE) modelling as the ‘default’ for time-series-cross-sectional and panel data. Understanding different within- and between-effects is crucial when choosing modelling strategies. The downside of Random Effects (RE) modelling – correlated lower-level covariates and higher-level residuals – is omitted-variable bias, solvable with Mundlak’s (1978a) formulation. Consequently, RE can provide everything FE promises and more, as confirmed by Monte-Carlo simulations, which additionally show problems with Plumper and Troeger’s FE Vector Decomposition method when data are unbalanced. As well as incorporating time-invariant variables, RE models are readily extendable, with random coefficients, cross-level interactions, and complex variance functions. We argue not simply for technical solutions to endogeneity, but the substantive importance of context/heterogeneity, modelled using RE. The implications extend beyond political science, to all multilevel datasets. However, omitted variables could still bias estimated higher-level variable effects; as with any model, care is required in interpretation.
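As a concrete, hypothetical illustration of the within-between (Mundlak) formulation argued for above, the sketch below splits a covariate into its unit mean and the deviation from that mean and fits a random-intercept model with statsmodels; the simulated data and variable names are not from the article.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated unbalanced panel in which the covariate is correlated with the unit
# effect, the textbook case where a naive RE model is biased; names are made up.
rng = np.random.default_rng(3)
rows = []
for unit in range(60):
    a = rng.normal(0.0, 1.0)                    # unit effect
    for t in range(int(rng.integers(4, 9))):    # unbalanced: 4-8 periods per unit
        x = 0.7 * a + rng.normal(0.0, 1.0)      # covariate correlated with the unit effect
        y = 1.0 + 0.5 * x + a + rng.normal(0.0, 1.0)
        rows.append((unit, t, x, y))
df = pd.DataFrame(rows, columns=["unit", "t", "x", "y"])

# Within-between (Mundlak) formulation: decompose x into its unit mean and the
# deviation from that mean, so the random-effects fit recovers the within effect
# that FE would give while also estimating the between effect.
df["x_mean"] = df.groupby("unit")["x"].transform("mean")
df["x_dev"] = df["x"] - df["x_mean"]
re_wb = smf.mixedlm("y ~ x_dev + x_mean", df, groups="unit").fit()
print(re_wb.summary())
```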

1,036 citations

Journal Article
TL;DR: This paper surveys recent developments in the theory and application of composite likelihood, building on the review paper of Varin (2008). It considers a range of application areas, including geostatistics, spatial extremes, space-time models, clustered and longitudinal data, and time series.
Abstract: A survey of recent developments in the theory and application of composite likelihood is provided, building on the review paper of Varin (2008). A range of application areas, including geostatistics, spatial extremes, and space-time models, as well as clustered and longitudinal data and time series are considered. The important area of applications to statistical genetics is omitted, in light of Larribe and Fearnhead (2011). Emphasis is given to the development of the theory, and the current state of knowledge on efficiency and robustness of composite likelihood inference.
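For orientation alongside the quoted passages below, the generic composite likelihood construction surveyed in the paper can be restated as follows; the notation (the events A_k, weights w_k, and matrices H, J, G) follows common usage in this literature rather than anything specific to this page.

```latex
% Composite likelihood built from marginal or conditional events A_1, ..., A_K
% with nonnegative weights w_1, ..., w_K:
\[
  L_C(\theta; y) \;=\; \prod_{k=1}^{K} f(y \in A_k;\, \theta)^{w_k},
  \qquad
  c\ell(\theta; y) \;=\; \log L_C(\theta; y).
\]
% The maximum composite likelihood estimator is asymptotically normal with
% variance given by the inverse of the Godambe (sandwich) information, which
% underlies the composite-likelihood Wald and score statistics cited below:
\[
  G(\theta) \;=\; H(\theta)\, J(\theta)^{-1} H(\theta),
  \qquad
  H(\theta) = \mathrm{E}\{-\nabla^2 c\ell(\theta; Y)\},
  \quad
  J(\theta) = \mathrm{var}\{\nabla c\ell(\theta; Y)\}.
\]
```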

1,034 citations


Cites background from "Models for Discrete Longitudinal Da..."

  • ...Molenberghs and Verbeke (2005) in the context of longitudinal studies, and Mardia et al. (2008) in bioinformatics, construct composite likelihoods by pooling pairwise conditional densities $L_C(\theta; y) = \prod_{r=1}^{m} \prod_{s=1}^{m} f(y_r \mid y_s; \theta)$, or by pooling full conditional densities $L_C(\theta; y) = \prod_{r=1}^{m} f(y_r \mid y_{(-r)}; \theta)$....


  • ...This pseudolikelihood is the product of the conditional densities of a single observation given its neighbours, $L_C(\theta; y) = \prod_{r=1}^{m} f(y_r \mid \{y_s : y_s \text{ is a neighbour of } y_r\}; \theta)$....


  • ...Terminology: Composite likelihoods are referred to with several different names, including pseudolikelihood (Molenberghs and Verbeke (2005)), approximate likelihood (Stein, Chi and Welty (2004)), and quasi-likelihood (Hjort and Omre (1994); Glasbey (2001); Hjort and Varin (2008))....


  • ...Composite likelihood versions of Wald and score statistics for testing $H_0 : \psi = \psi_0$ are easily constructed, and have the usual asymptotic $\chi^2_q$ distribution, see Molenberghs and Verbeke (2005)....


Journal ArticleDOI
TL;DR: A multiple imputation model is built that allows smooth time trends, shifts across cross‐sectional units, and correlations over time and space, resulting in far more accurate imputations, and enables analysts to incorporate knowledge from area studies experts via priors on individual missing cell values, rather than on difficult‐to‐interpret model parameters.
Abstract: Applications of modern methods for analyzing data with missing values, based primarily on multiple imputation, have in the last half-decade become common in American politics and political behavior. Scholars in this subset of political science have thus increasingly avoided the biases and inefficiencies caused by ad hoc methods like listwise deletion and best guess imputation. However, researchers in much of comparative politics and international relations, and others with similar data, have been unable to do the same because the best available imputation methods work poorly with the time-series cross-section data structures common in these fields. We attempt to rectify this situation with three related developments. First, we build a multiple imputation model that allows smooth time trends, shifts across cross-sectional units, and correlations over time and space, resulting in far more accurate imputations. Second, we enable analysts to incorporate knowledge from area studies experts via priors on individual missing cell values, rather than on difficult-to-interpret model parameters. Third, because these tasks could not be accomplished within existing imputation algorithms, in that they cannot handle as many variables as needed even in the simpler cross-sectional data for which they were designed, we also develop a new algorithm that substantially expands the range of computationally feasible data types and sizes for which multiple imputation can be used. These developments also make it possible to implement the methods introduced here in freely available open source software that is considerably more reliable than existing algorithms.
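As a very loose, hypothetical illustration of letting imputations follow smooth time trends and unit-level shifts, the sketch below draws several completed data sets for a simulated country-year panel with scikit-learn's IterativeImputer; this is a simplified stand-in, not the authors' bootstrap-EM algorithm or its Amelia implementation, and every variable name is made up.

```python
import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

# Simulated country-year panel with about 25% of one variable missing.
rng = np.random.default_rng(4)
units, years = 20, 30
df = pd.DataFrame({
    "unit": np.repeat(np.arange(units), years),
    "year": np.tile(np.arange(years), units),
})
df["gdp"] = 2.0 + 0.05 * df["year"] + 0.1 * df["unit"] + rng.normal(0.0, 0.3, len(df))
df["polity"] = 0.3 * df["gdp"] + rng.normal(0.0, 1.0, len(df))
df.loc[rng.random(len(df)) < 0.25, "polity"] = np.nan

# Enrich the imputation model with polynomial time terms and unit indicators so
# the imputations can follow smooth time trends and unit-level shifts.
design = pd.get_dummies(
    df.assign(year2=df["year"] ** 2, year3=df["year"] ** 3),
    columns=["unit"],
    drop_first=True,
)

# Draw several completed data sets (posterior sampling makes them differ); any
# downstream analysis would be run on each and pooled with Rubin's rules.
completed_sets = []
for m in range(5):
    imputer = IterativeImputer(sample_posterior=True, random_state=m)
    completed_sets.append(
        pd.DataFrame(imputer.fit_transform(design), columns=design.columns)
    )
```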

901 citations