TL;DR: This paper presents, extends, and studies a model for repeated, overdispersed time-to-event outcomes, subject to censoring, and two estimation methods are presented.
Abstract: This paper presents, extends, and studies a model for repeated, overdispersed time-to-event outcomes, subject to censoring. Building upon work by Molenberghs, Verbeke, and Demetrio (2007) and Molenberghs et al. (2010), gamma and normal random effects are included in a Weibull model, to account for overdispersion and between-subject effects, respectively. Unlike these authors, censoring is allowed for, and two estimation methods are presented. The partial marginalization approach to full maximum likelihood of Molenberghs et al. (2010) is contrasted with pseudo-likelihood estimation. A limited simulation study is conducted to examine the relative merits of these estimation methods. The modeling framework is employed to analyze data on recurrent asthma attacks in children on the one hand and on survival in cancer patients on the other.
Time-to-event data are prominent in contemporary statistical analysis, not only for univariate outcomes but also in hierarchical settings.
The time-to-event case is but one of the applications of their framework.
The basic ingredients for their modeling framework (standard generalized linear models, extensions for overdispersion, and the generalized linear mixed model) are the subject of Section 3.
Avenues for parameter estimation and ensuing inferences are explored in Section 5, with particular emphasis on so-called partial marginalization and pseudo-likelihood estimation.
2.1 Recurrent asthma attacks in children
These data have been studied in Duchateau and Janssen.
A prevention trial is set up with such children randomized to placebo or drug, and the asthma events that developed over time are recorded in a diary.
The different events are thus clustered within a patient and ordered in time.
The data are presented in calendar time format, where the time at risk for a particular event is the time from the end of the previous event (asthma attack) to the start of the next event (start of the next asthma attack).
Data for the first two patients are listed in Table 1.
2.2 Survival in cancer patients
Hand et al.11 presented data on patients with advanced cancer of the stomach, bronchus, colon, ovary, or breast, who were treated, in addition to standard treatment, with ascorbate.
The outcome of interest, survival time in days, is recorded to address the question as to whether survival times differ with the organ being affected.
Individual-patient data are listed in Table 2.
There are no censored observations in this case.
3 Background
The authors' model is based upon the generalized linear model and two of its extensions, the first to accommodate overdispersion and the second to account for data hierarchies, such as in longitudinal data.
The density is of the exponential-family form f(y) = exp{ φ⁻¹ [ yθ − ψ(θ) ] + c(y, φ) }. Often, θ and φ are termed "natural parameter" (or "canonical parameter") and "dispersion parameter", respectively.
Because standard exponential-family models constrain the mean–variance relationship, so-called overdispersion models are introduced to accommodate variability in excess of what the family allows.
In their exponential and Weibull cases, it is in line with the data range to assume such a random effect to follow a gamma distribution, giving rise to the exponential-gamma and Weibull-gamma models.
The model elements are listed in Table 3.
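The overdispersion mechanism described above can be illustrated with a small Monte Carlo sketch (parameter values and variable names are illustrative, not taken from the paper): a plain exponential outcome fixes the variance-to-squared-mean ratio at 1, whereas mixing the rate with a mean-one gamma frailty inflates it.

```python
import numpy as np

rng = np.random.default_rng(2024)
n = 200_000
lam = 1.0          # baseline exponential rate (illustrative)
alpha = 6.0        # gamma shape; rate = alpha, so E[theta] = 1

# Plain exponential: the model constrains variance / mean^2 to 1.
y_exp = rng.exponential(1.0 / lam, size=n)

# Exponential-gamma: a mean-one gamma frailty multiplies the rate,
# inflating the marginal variance beyond the exponential constraint.
theta = rng.gamma(alpha, 1.0 / alpha, size=n)
y_mix = rng.exponential(1.0 / (lam * theta))

ratio_exp = y_exp.var() / y_exp.mean() ** 2   # close to 1
ratio_mix = y_mix.var() / y_mix.mean() ** 2   # about 1.5 for alpha = 6
print(round(ratio_exp, 2), round(ratio_mix, 2))
```

For alpha = 6 the theoretical ratio is 1.5, visibly above the exponential benchmark of 1; smaller alpha values inflate it further.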
4.1 General model formulation
The authors now need two different notations, θ_ij and κ_ij, to refer to the linear predictor and/or the natural parameter.
It is convenient, but not strictly necessary, to assume that the two sets of random effects, θ_i and b_i, are independent of each other.
Obviously, parameterization (12) allows for random effects θ_ij capturing overdispersion, formulated directly at the mean scale, whereas κ_ij could be considered the GLMM component.
They considered conjugacy, conditional upon the normally distributed random effect b_i.
Fortunately, the Weibull and exponential cases satisfy this property, with gamma random effects.
4.2 Weibull- and exponential-type models for time-to-event data
The general Weibull model for repeated measures, with both gamma and normal random effects, can be expressed via the conditional density f(y_ij | θ_i, b_i). First, it is implicit that the gamma random effects are independent.
This need not be the case and, like in the Poisson case, extension via multivariate gamma distributions is possible.
Third, it is evident that the classical gamma frailty model (i.e., no normal random effects) and the Weibull-based GLMM (i.e., no gamma random effects) follow as special cases.
This is typically considered for the exponential model, but it holds for the Weibull model too, by observing that the Weibull model is nothing but an exponential model for the power-transformed random variable Y_ij^ρ, with ρ the Weibull shape parameter.
Fifth, the above expressions are derived for a two-parameter gamma density.
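The combined model just described can be sketched as a short simulation. All names and parameter values here are illustrative stand-ins (the paper's exact notation and link specification are not reproduced): each measurement gets its own mean-one gamma overdispersion effect, each subject a normal random intercept, and times are drawn by inverting the conditional Weibull survivor function.

```python
import numpy as np

rng = np.random.default_rng(7)

def simulate_combined(n_subj=500, n_rep=4, beta=(0.5, -0.3),
                      rho=1.2, lam=1.0, alpha=2.0, sigma=0.5):
    """Sketch: repeated Weibull times with a gamma overdispersion effect
    per measurement and a normal random intercept per subject.
    Parameter names are hypothetical, not the paper's notation."""
    b = rng.normal(0.0, sigma, size=n_subj)                      # normal random effects
    x = rng.binomial(1, 0.5, size=(n_subj, n_rep))               # e.g. a treatment indicator
    theta = rng.gamma(alpha, 1.0 / alpha, size=(n_subj, n_rep))  # mean-one gamma effects
    # conditional Weibull "rate": lam * theta_ij * exp(beta0 + beta1 * x_ij + b_i)
    mu = lam * theta * np.exp(beta[0] + beta[1] * x + b[:, None])
    # invert S(y) = exp(-mu * y^rho) at a uniform draw to get the event time
    u = rng.uniform(size=(n_subj, n_rep))
    y = (-np.log(u) / mu) ** (1.0 / rho)
    return y, x

y, x = simulate_combined()
print(y.shape)
```

Setting sigma = 0 recovers a conventional Weibull-gamma (frailty) model, and alpha → ∞ recovers a Weibull-based GLMM, mirroring the special cases noted above.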
5 Estimation
The key problem in maximizing (23) is the presence of N integrals over the random effects bi and hi.
The authors consider so-called partial marginalization, in agreement with Molenberghs et al.,5 but, unlike these authors, also allowing for censoring.
5.1 Partial marginalization
While closed-form expressions can be used to implement maximum likelihood estimation, with numerical accuracy governed by the number of terms included in the Taylor series, one can also proceed by what Molenberghs et al.5 termed partial marginalization.
By this the authors refer to integrating the conditional density over the gamma random effects only, leaving the normal random effects untouched.
Now, in the survival case it is evidently very likely that censoring occurs.
Focusing on right-censored data, it is then necessary to integrate the marginal density over the survival time within the interval [0, C_i].
Recall that, while expressions of the type (16) appear to be for the univariate case, they extend without problem to the longitudinal setting as well.
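A minimal numerical sketch of this scheme follows, under stated assumptions: a mean-one gamma(alpha) frailty on the Weibull rate integrates out analytically (yielding a Burr-type density and survivor function), censored observations contribute the survivor term, and the remaining normal random intercept is integrated by Gauss-Hermite quadrature. Function and parameter names are hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

def subject_loglik(y, delta, x, beta, rho, lam, alpha, sigma, n_quad=15):
    """One subject's marginal log-likelihood contribution: gamma frailty
    integrated analytically, normal intercept integrated numerically."""
    t, w = hermgauss(n_quad)                   # nodes/weights for exp(-t^2)
    b = np.sqrt(2.0) * sigma * t               # change of variable to N(0, sigma^2)
    kappa = np.exp(x @ beta)[:, None] * np.exp(b)[None, :]   # (n_j, n_quad)
    u = lam * kappa * (y[:, None] ** rho)
    # gamma frailty integrated out analytically (Burr-type forms):
    log_f = (np.log(lam * rho * kappa) + (rho - 1) * np.log(y)[:, None]
             - (alpha + 1) * np.log1p(u / alpha))            # event density
    log_S = -alpha * np.log1p(u / alpha)                     # censored survivor
    contrib = np.where(delta[:, None] == 1, log_f, log_S).sum(axis=0)
    return np.log(np.sum(w * np.exp(contrib)) / np.sqrt(np.pi))

# toy subject: two events and one right-censored time
y = np.array([0.8, 1.5, 0.4]); delta = np.array([1, 0, 1])
x = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 1.0]])
val = subject_loglik(y, delta, x, beta=np.array([0.2, -0.4]),
                     rho=1.2, lam=1.0, alpha=2.0, sigma=0.5)
print(val)
```

Summing such contributions over subjects and handing the result to a generic optimizer is the essence of partial marginalization: only the normal random effects require numerical integration.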
5.2 Pseudo-likelihood
Pseudo-likelihood,10,18 like generalized estimating equations,19 is useful when full likelihood becomes computationally burdensome and/or when robustness against misspecification of higher-order moments is desirable.
Essentially then, the joint distribution is replaced with a product of factors of marginal and/or conditional distributions of lower dimensions.
Pseudo-likelihood is first defined generally and formally, after which the authors turn to the special case of pairwise likelihood. The outcomes for subject i are grouped into a vector.
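The pairwise version of this idea can be sketched generically: the subject's full joint log-density is replaced by the sum of bivariate log-densities over all pairs of outcomes. The bivariate model below is a toy exchangeable bivariate normal, a hypothetical stand-in, not the paper's Weibull-based pairwise factor.

```python
import numpy as np
from itertools import combinations

def pairwise_loglik(y_i, biv_loglik, params):
    """Pairwise pseudo-likelihood sketch: sum a user-supplied bivariate
    log-density over all pairs (j, k) of subject i's outcomes."""
    return sum(biv_loglik(y_i[j], y_i[k], params)
               for j, k in combinations(range(len(y_i)), 2))

def biv_normal_loglik(yj, yk, params):
    # toy stand-in: exchangeable bivariate normal with correlation r
    m, s, r = params
    q = (yj - m) ** 2 - 2 * r * (yj - m) * (yk - m) + (yk - m) ** 2
    q /= s ** 2 * (1 - r ** 2)
    return -np.log(2 * np.pi * s ** 2 * np.sqrt(1 - r ** 2)) - 0.5 * q

y_i = np.array([0.3, 1.1, 0.7, 0.2])
pl = pairwise_loglik(y_i, biv_normal_loglik, (0.5, 1.0, 0.3))
print(pl)   # a single number built from the 6 pairs of 4 outcomes
```

Maximizing the sum of such pairwise contributions over subjects yields consistent estimates under the usual pseudo-likelihood conditions, at the cost of some efficiency relative to full likelihood.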
6 Marginal distributions and moments
Along the lines of Molenberghs et al.5 and Molenberghs and Verbeke,29 the marginal density and moments are derived.
Molenberghs and Verbeke29 showed that only a finite number of moments is finite.
This holds not only for the combined model, but as soon as gamma random effects are combined with Weibull outcomes, i.e., it also applies to the conventional Weibull-gamma model.
Because it is possible that even the first and second moments may be infinite, it is wise to check the number of finite moments.
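The check admits a simple closed-form sketch under one common parameterization (a mean-one gamma frailty of shape alpha on the Weibull rate, Weibull shape rho; both names are illustrative): E[Y^k] involves E[theta^(-k/rho)], which is finite only when k/rho < alpha, so exactly the integer moments k < alpha * rho are finite.

```python
import math

def n_finite_moments(alpha, rho):
    """Count the finite integer moments of a Weibull(rho) outcome with a
    gamma(alpha) frailty on its rate: E[Y^k] is finite iff k < alpha*rho.
    (Sketch under the mean-one gamma parameterization assumed above.)"""
    bound = alpha * rho
    if math.isinf(bound):
        return math.inf            # e.g. alpha -> infinity: pure Weibull GLMM
    return max(0, math.ceil(bound) - 1)   # largest integer strictly below bound

print(n_finite_moments(2.0, 1.2))   # mean and variance exist, third moment does not
print(n_finite_moments(0.5, 1.0))   # not even the mean is finite
```

This mirrors the data analyses below, where models with no finite moments, a few finite moments, and all moments finite each occur.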
7.1 Recurrent asthma attacks in children
The authors will analyze the times-to-event, introduced in Section 2.1.
The treatment effect parameter is stably identifiable in all four models.
As a result, the apparent overdispersion disappears, and the standard error values are more trustworthy.
Re-fitted results for all four models in this way are reported in Table 7.
A related finding was reported in Geys et al.26 where excessive computational requirements could be avoided when using pseudo-likelihood.
7.2 Survival in cancer patients
For the generalized Cauchy model, a different specification of the predictor function is used instead.
Parameter estimates are presented in Table 9.
The key scientific question is directed toward the difference in survival across cancer types.
Thus, their analysis illustrates the occurrence, in real life, of distributions without finite moments, with all moments finite, and with a finite number of finite moments.
The generalized Cauchy model has a finite mean and finite variance and provides a parsimonious description, unlike the generalized logistic model, in spite of its full series of finite moments.
8 Simulation study
The authors aim to evaluate the performance of the combined model, a Weibull model with gamma frailties and normal random effects, under full likelihood and pseudo-likelihood.
The simulation study was designed under different settings, to investigate the impact of sample size, censoring percentage, and method of estimation.
The authors used two sets of true parameters, similar in spirit to the ones in Table 7, without and with censoring (full likelihood).
The Mahalanobis distance has the advantage of taking the variance-covariance structure into account.
With an increasing proportion of censored observations, within the same sample size, the relative distance seems to be stable when using the pseudo-likelihood estimation method.
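The distance criterion itself is straightforward; a minimal sketch follows, with an illustrative diagonal covariance (the study's actual estimator covariance is not reproduced here):

```python
import numpy as np

def mahalanobis(est, truth, cov):
    """Mahalanobis distance between an estimated parameter vector and the
    simulation truth; unlike Euclidean distance, it weights each component
    by the (co)variability of the estimator."""
    d = np.asarray(est) - np.asarray(truth)
    return float(np.sqrt(d @ np.linalg.solve(cov, d)))

truth = np.array([0.50, -0.30, 1.20])
est = np.array([0.55, -0.25, 1.10])
cov = np.diag([0.01, 0.01, 0.04])       # illustrative estimator covariance
print(round(mahalanobis(est, truth, cov), 3))   # -> 0.866
```

Averaging this distance over simulation replicates gives a single scalar summary per design cell, which makes the full-likelihood and pseudo-likelihood methods directly comparable.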
9 Concluding remarks
Building upon work by Molenberghs et al.,5 the authors have studied the combination of normal and nonnormal random effects in the time-to-event case.
The loss of statistical efficiency of pseudo-likelihood is relatively small, although the consistency behavior is better in the maximum-likelihood case.
The gamma and normal random effects play distinct roles.
The model can be extended further and/or adapted to specific cases.
TL;DR: The data suggest that high AG, even after adjusting for serum bicarbonate, is a contributing acid-base mechanism to CKD progression in adults with moderate chronic kidney disease.
Abstract: Acid retention associated with reduced glomerular filtration rate (GFR) exacerbates nephropathy progression in partial nephrectomy models of chronic kidney disease (CKD) and might be reflected in p...
TL;DR: It is shown that the Weibull model constitutes a conjugate model for the gamma frailty, leading to explicit expressions for the moments, survival functions, hazard functions, quantiles, and mean residual lifetimes, which facilitate the parameter interpretation of prognostic inference.
Abstract: In meta-analysis of individual patient data with semi-competing risks, the joint frailty–copula model has been proposed, where frailty terms account for the between-study heterogeneity and copulas account for dependence between terminal and nonterminal event times. In the previous works, the baseline hazard functions in the joint frailty–copula model are estimated by the nonparametric model or the penalized spline model, which requires complex maximization schemes and resampling-based interval estimation. In this article, we propose the Weibull distribution for the baseline hazard functions under the joint frailty–copula model. We show that the Weibull model constitutes a conjugate model for the gamma frailty, leading to explicit expressions for the moments, survival functions, hazard functions, quantiles, and mean residual lifetimes. These results facilitate the parameter interpretation of prognostic inference. We propose a maximum likelihood estimation method and make our computer programs available in the R package, joint.Cox. We also show that the delta method is feasible to calculate interval estimates, which is a useful alternative to the resampling-based method. We conduct simulation studies to examine the accuracy of the proposed methods. Finally, we use the data on ovarian cancer patients to illustrate the proposed method.
TL;DR: This paper proposes a so-called marginalized joint model for longitudinal continuous and repeated time-to-event outcomes on the one hand, and a marginalized joint model for bivariate repeated time-to-event outcomes on the other, both of which can be fitted relatively easily using standard statistical software.
Abstract: Joint modeling of various longitudinal sequences has received quite a bit of attention in recent times. This paper proposes a so-called marginalized joint model for longitudinal continuous and repeated time-to-event outcomes on the one hand and a marginalized joint model for bivariate repeated time-to-event outcomes on the other. The model has several appealing features. It flexibly allows for association among measurements of the same outcome at different occasions as well as among measurements on different outcomes recorded at the same time. The model also accommodates overdispersion. The time-to-event outcomes are allowed to be censored. While the model builds upon the generalized linear mixed model framework, it is such that model parameters enjoy a direct marginal interpretation. All of these features have been considered before, but here we bring them together in a unified, flexible framework. The model framework's properties are scrutinized using a simulation study. The models are applied to data from a chronic heart failure study and to a so-called comet assay, encountered in preclinical research. Almost surprisingly, the models can be fitted relatively easily using standard statistical software.
14 citations
Cites background or result from "A combined gamma frailty and normal..."
...While these ideas apply to a realm of data types, here, in line with Molenberghs et al. (2012), we focus on time-to-event outcomes, thereby allowing for the possibility of right censoring....
[...]
...Our work builds upon and extend work of Molenberghs et al. (2010, 2012), Heagerty (1999), and Njagi et al. (2012)....
[...]
...This concept was formulated by Molenberghs et al. (2010, 2012) for single sequences and by Njagi et al. (2012) for various sequences simultaneously....
TL;DR: The conclusions are as follows: time has no statistically significant effect on microbiome composition, the correlation between subjects is statistically significant, and treatment has a significant impact on the microbiome composition only in infected subjects who remained infected.
Abstract: Clustered overdispersed multivariate count data are challenging to model due to the presence of correlation within and between samples. Typically, the first source of correlation needs to be addressed but its quantification is of less interest. Here, we focus on the correlation between time points. In addition, the effects of covariates on the multivariate counts distribution need to be assessed. To fulfill these requirements, a regression model based on the Dirichlet-multinomial distribution for association between covariates and the categorical counts is extended by using random effects to deal with the additional clustering. This model is the Dirichlet-multinomial mixed regression model. Alternatively, a negative binomial regression mixed model can be deployed where the corresponding likelihood is conditioned on the total count. It appears that these two approaches are equivalent when the total count is fixed and independent of the random effects. We consider both subject-specific and categorical-specific random effects. However, the latter has a larger computational burden when the number of categories increases. Our work is motivated by microbiome data sets obtained by sequencing of the amplicon of the bacterial 16S rRNA gene. These data have a compositional structure and are typically overdispersed. The microbiome data set is from an epidemiological study carried out in a helminth-endemic area in Indonesia. The conclusions are as follows: time has no statistically significant effect on microbiome composition, the correlation between subjects is statistically significant, and treatment has a significant effect on the microbiome composition only in infected subjects who remained infected.
TL;DR: In this paper, a generalization of the analysis of variance is given for these models using log- likelihoods, illustrated by examples relating to four distributions; the Normal, Binomial (probit analysis, etc.), Poisson (contingency tables), and gamma (variance components).
Abstract: The technique of iterative weighted linear regression can be used to obtain maximum likelihood estimates of the parameters with observations distributed according to some exponential family and systematic effects that can be made linear by a suitable transformation. A generalization of the analysis of variance is given for these models using log- likelihoods. These generalized linear models are illustrated by examples relating to four distributions; the Normal, Binomial (probit analysis, etc.), Poisson (contingency tables) and gamma (variance components).
23,215 citations
"A combined gamma frailty and normal..." refers background or methods in this paper
...Statistical Methods in Medical Research 24(4): 434–452, © The Author(s) 2014. Reprints and permissions: sagepub....
[...]
...implying that the gamma density is reduced to an exponential one, of the form (4)....
[...]
...Parameter estimates and standard errors for the regression coefficients in (1) the exponential model, (2) the exponential-gamma model, (3) the exponential-normal model, and (4) the combined model....
TL;DR: In this article, an extension of generalized linear models to the analysis of longitudinal data is proposed, which gives consistent estimates of the regression parameters and of their variance under mild assumptions about the time dependence.
Abstract: SUMMARY This paper proposes an extension of generalized linear models to the analysis of longitudinal data. We introduce a class of estimating equations that give consistent estimates of the regression parameters and of their variance under mild assumptions about the time dependence. The estimating equations are derived without specifying the joint distribution of a subject's observations yet they reduce to the score equations for multivariate Gaussian outcomes. Asymptotic theory is presented for the general class of estimators. Specific cases in which we assume independence, m-dependence and exchangeable correlation structures from each subject are discussed. Efficiency of the proposed estimators in two simple situations is considered. The approach is closely related to quasi-likelihood. Some key words: Estimating equation; Generalized linear model; Longitudinal data; Quasi-likelihood; Repeated measures.
TL;DR: In this paper, the authors used iterative weighted linear regression to obtain maximum likelihood estimates of the parameters with observations distributed according to some exponential family and systematic effects that can be made linear by a suitable transformation.
Abstract: SUMMARY The technique of iterative weighted linear regression can be used to obtain maximum likelihood estimates of the parameters with observations distributed according to some exponential family and systematic effects that can be made linear by a suitable transformation. A generalization of the analysis of variance is given for these models using log-likelihoods. These generalized linear models are illustrated by examples relating to four distributions; the Normal, Binomial (probit analysis, etc.), Poisson (contingency tables) and gamma (variance components). The implications of the approach in designing statistics courses are discussed.
TL;DR: In this paper, generalized linear mixed models (GLMM) are used to estimate the marginal quasi-likelihood for the mean parameters and the conditional variance for the variances, and the dispersion matrix is specified in terms of a rank deficient inverse covariance matrix.
Abstract: Statistical approaches to overdispersion, correlated errors, shrinkage estimation, and smoothing of regression relationships may be encompassed within the framework of the generalized linear mixed model (GLMM). Given an unobserved vector of random effects, observations are assumed to be conditionally independent with means that depend on the linear predictor through a specified link function and conditional variances that are specified by a variance function, known prior weights and a scale factor. The random effects are assumed to be normally distributed with mean zero and dispersion matrix depending on unknown variance components. For problems involving time series, spatial aggregation and smoothing, the dispersion may be specified in terms of a rank deficient inverse covariance matrix. Approximation of the marginal quasi-likelihood using Laplace's method leads eventually to estimating equations based on penalized quasi-likelihood or PQL for the mean parameters and pseudo-likelihood for the variances.
4,317 citations
"A combined gamma frailty and normal..." refers methods in this paper
...Then, the marginal model, in analogy with (8), equals...
TL;DR: This reference presents a comprehensive treatment of models for non-Gaussian longitudinal data, spanning marginal, conditional, and random-effects (generalized linear mixed) model families.
Abstract: Introduction.- Motivating Studies.- Generalized Linear Models.- Linear Mixed Models for Gaussian Longitudinal Data.- Model Families.- The Strength of Marginal Models.- Likelihood-based Models.- Generalized Estimating Equations.- Pseudo-likelihood.- Fitting Marginal Models with SAS.- Conditional Models.- Pseudo-likelihood.- From Subject-Specific to Random-Effects Models.- Generalized Linear Mixed Models (GLMM).- Fitting Generalized Linear Mixed Models with SAS.- Marginal Versus Random-Effects Models.- Ordinal Data.- The Epilepsy Data.- Non-linear Models.- Pseudo-likelihood for a Hierarchical Model.- Random-effects Models with Serial Correlation.- Non-Gaussian Random Effects.- Joint Continuous and Discrete Responses.- High-dimensional Multivariate Repeated Measurements.- Missing Data Concepts.- Simple Methods, Direct Likelihood and WGEE.- Multiple Imputation and the Expectation-Maximization Algorithm.- Selection Models.- Pattern-mixture Models.- Sensitivity Analysis.- Incomplete Data and SAS.
Q1. What are the contributions mentioned in the paper "A combined gamma frailty and normal random-effects model for repeated, overdispersed time-to-event data" ?
This paper presents, extends, and studies a model for repeated, overdispersed time-to-event outcomes, subject to censoring. A limited simulation study is conducted to examine the relative merits of these estimation methods.