# A latent-class mixture model for incomplete longitudinal Gaussian data

## Summary

### 1 Introduction

- Repeated-measures data are often incomplete, with dropout being the most common form of missingness.
- Since one can never be certain about the dropout mechanism, certain assumptions have to be made.
- A non-response process is missing completely at random (MCAR) if the missingness is independent of both unobserved and observed data, and missing at random (MAR) if, conditional on the observed data, the missingness is independent of the unobserved measurements.
- Information from the location and evolution of the response profiles (a selection-model concept) and from the dropout patterns (a pattern-mixture idea) is used simultaneously to define latent groups and variables (a shared-parameter feature).
- Apart from providing a more flexible modeling tool, the model also offers room for use as a sensitivity-analysis instrument.

### 2 Latent-Class Mixture Models

- In principle, one would like to consider the density of the full data f(y_i, d_i | θ, ψ), where the parameter vectors θ and ψ describe the measurement and missingness processes, respectively.
- Both the measurement process and the dropout process depend on a latent class variable, directly and through the subject-specific effects b_i.
- The authors then assume that g_ij(w_ij, b_i, q_ik) satisfies logit[g_ij(w_ij, b_i, q_ik)] = w_ij γ_k + λ b_i.
- Not all models that can be formulated in this way are identified, so restrictions might be needed.
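The class-specific logistic dropout model above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation; the covariate vector and parameter values below are placeholders.

```python
import numpy as np

def dropout_probability(w_ij, gamma_k, lam, b_i):
    """Conditional dropout probability at occasion j for a subject in
    latent class k, following logit[g_ij] = w_ij' gamma_k + lambda * b_i."""
    eta = np.dot(w_ij, gamma_k) + lam * b_i
    return 1.0 / (1.0 + np.exp(-eta))

# Illustration: intercept-only dropout covariate, class-specific intercept,
# and a subject-specific effect b_i shared with the measurement model.
w = np.array([1.0])                                  # intercept only
p = dropout_probability(w, np.array([-2.5]), 0.5, b_i=0.0)
```

A positive λ means subjects with larger random effects (higher underlying response levels) are more likely to drop out, which is how the shared parameter links the two processes.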

### 3 Likelihood Function and Estimation

- Estimation of the unknown parameters in the latent-class mixture model described in the previous section will be based on maximum likelihood.
- Böhning (1999) shows that a mixture of two normals with simultaneously different means and different variances is not identifiable.
- In line with Böhning (1999) and McLachlan and Peel (2000), one could consider several variations to the target model.
- Maximizing ℓ(Ω | y^o, d, q), the corresponding complete-data log-likelihood, is easier than maximizing the observed-data log-likelihood ℓ(Ω | y^o, d).
- Denote the expected log-likelihood function, the so-called objective function, by O. The EM algorithm is initiated by means of an initial value Ω(0), after which one oscillates between the E- and M-steps, until convergence.
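The E/M oscillation can be illustrated on a stripped-down version of the problem: a two-component univariate Gaussian mixture with a common variance and no dropout part. This is a sketch of the EM mechanics only, not the full estimation procedure of the paper.

```python
import numpy as np

def em_two_gaussians(y, n_iter=200):
    """Minimal EM for a two-component Gaussian mixture with common variance.
    E-step: posterior class probabilities given current parameters.
    M-step: weighted updates maximizing the expected complete-data
    log-likelihood (the objective function O)."""
    mu = np.array([y.min(), y.max()])        # crude starting values
    sigma2, pi = y.var(), 0.5
    for _ in range(n_iter):
        # E-step: posterior probability of class 1 for each subject
        d1 = pi * np.exp(-(y - mu[0]) ** 2 / (2 * sigma2))
        d2 = (1 - pi) * np.exp(-(y - mu[1]) ** 2 / (2 * sigma2))
        p1 = d1 / (d1 + d2)
        # M-step: closed-form weighted updates
        pi = p1.mean()
        mu = np.array([np.average(y, weights=p1),
                       np.average(y, weights=1 - p1)])
        sigma2 = (pi * np.average((y - mu[0]) ** 2, weights=p1)
                  + (1 - pi) * np.average((y - mu[1]) ** 2, weights=1 - p1))
    return pi, mu, sigma2
```

Note the common-variance restriction: as the summary points out via Böhning (1999), letting both means and variances differ simultaneously can break identifiability.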

### 4 Classification

- One can also classify the subjects into the different latent subgroups of the fitted model.
- In certain cases such latent groups can have substantive meaning.
- A problematic scenario arises when two or more posterior probabilities are almost equal, one of them being the maximum for that particular subject.
- Classification is then nearly random, and misclassification is likely to occur.
- Therefore, rather than merely considering the classification of subjects into the latent subgroups, it is instructive to inspect the posterior probabilities in full.
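The inspection advocated above can be automated by flagging subjects whose top two posterior probabilities are close. The margin threshold below is an arbitrary illustrative choice, not a value from the paper.

```python
import numpy as np

def classify(posteriors, margin=0.1):
    """Assign each subject to the class with the largest posterior
    probability, and flag subjects whose two largest posteriors differ
    by less than `margin` -- for these, classification is nearly random."""
    posteriors = np.asarray(posteriors)
    labels = posteriors.argmax(axis=1)
    top2 = np.sort(posteriors, axis=1)[:, -2:]
    ambiguous = (top2[:, 1] - top2[:, 0]) < margin
    return labels, ambiguous

labels, amb = classify([[0.90, 0.10],     # clearly class 0
                        [0.52, 0.48]])    # flagged: nearly random
```

Reporting the full posterior matrix alongside the hard labels, as the authors suggest, makes such borderline cases visible rather than hiding them behind a single classification.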

### 5 Simulation Study

- An advantage of the latent-class mixture model is its flexible structure, making the model a helpful tool for analyzing incomplete longitudinal data.
- To assess whether the added modeling complexity counterbalances the advantage of model flexibility, and to assess performance, the authors conduct a simulation study.
- Section 5.1 describes a simplification of the latent-class mixture model used in the simulations and in the application in Section 6.
- The design and results of the simulation study are given in Sections 5.2 and 5.3, respectively.

### 5.1 A Simplification of the Latent-Class Mixture Model

- The covariance matrices Σ_i^(k) usually depend on i only through the dimension of the response vector for subject i, while the parameters are common to all subjects.
- The authors further simplify the model in two steps.
- First, it is assumed that there is only one subject-specific effect bi, a shared intercept, influencing the measurement process, not the dropout process.
- Second, the measurement process is assumed to depend on the latent variable, not in a direct way, but only through the shared intercept.

### 5.2 Design of the Simulation Study

- The authors simulated 250 datasets, each containing measurements and covariate information of 100 subjects.
- In line with Section 5.1, the variances of these two normal distributions are assumed to be equal and are denoted by d².
- While only the measurement error variance is increased in the second setting, σ = 0.75, both variance parameters are increased in the third setting, d = 3.5 and σ = 1.00.
- Up to the third setting, the chosen parameters result in a bimodal, well-separated mixture distribution.
- Finally, in the dropout model the logistic regression contains an intercept only, which differs between the two latent classes: γ1 = −2.5 and γ2 = −1.25, with corresponding probabilities 0.73 and 0.45 of completing the study.
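A single simulated dataset under this design can be sketched as follows. The class means, number of visits, and mixing proportion below are illustrative assumptions (the summary does not report them); d, σ, and the class-specific dropout intercepts follow the values quoted above.

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate_dataset(n=100, n_visits=5, means=(2.0, 8.0), d=1.0, sigma=0.5,
                     gammas=(-2.5, -1.25), prior=0.5):
    """Simulate one dataset in the spirit of Section 5.2: a latent class per
    subject, a shared random intercept b_i ~ N(mean_k, d^2), Gaussian
    measurement error with SD sigma, and intercept-only logistic dropout
    with a class-specific intercept gamma_k."""
    k = rng.binomial(1, 1 - prior, size=n)        # latent class per subject
    b = rng.normal(np.take(means, k), d)          # shared random intercept
    rows = []
    for i in range(n):
        p_drop = 1 / (1 + np.exp(-gammas[k[i]]))  # per-visit dropout prob
        for j in range(n_visits):
            rows.append((i, j, b[i] + rng.normal(0, sigma)))
            if rng.random() < p_drop:             # subject drops out after j
                break
    return k, np.array(rows)                      # (subject, visit, response)
```

Repeating this 250 times and refitting the model to each replicate yields the bias and MSE summaries reported in Section 5.3.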

### 5.3 Results of the Simulation Study

- Table 1 contains the results of the simulation study.
- The authors conclude that the model fits the data well, even with larger within-subject variability.
- In the penultimate simulation setting, not only the measurement error variance is increased, but also the variance in the mixture components.
- Bias and MSE values remain small, with orders of magnitude not exceeding 10⁻¹ and 10⁻³, respectively.
- Thus the latent-class mixture model does fit the simulated data well.

### 6 Analysis of Depression Trial Data

- The authors apply the latent-class mixture model to a depression trial, arising from a randomized, double-blind psychiatric clinical trial, conducted in the United States.
- The primary objective of this trial was to compare the efficacy of an experimental anti-depressant with a nonexperimental one.
- In these retrospective analyses, data from 170 patients are considered.
- The Hamilton Depression Rating Scale (HAMD17) is used to measure the depression status of the patients.
- In the two subsequent sections, a latent-class mixture model is fitted to the depression trial and a sensitivity analysis performed.

### 6.1 Formulating a Latent-Class Mixture Model

- The latent-class mixture model framework is used to analyze the depression trial, assuming the patients can be split into g latent subgroups.
- Table 2 shows that when assuming dropout model (6), AIC opts for the model with two latent subgroups (Model 2), whereas BIC gives preference to the shared-parameter model (Model 1).
- Parameter estimates with corresponding standard errors and p-values of the two-component latent-class mixture model are shown in Table 3.
- A more formal comparison of the two latent groups with respect to their change in HAMD17 score from baseline confirms the association between the classification and the profile over time.
- The first latent group mainly contains patients who complete the study, 62 in total.
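The AIC/BIC disagreement in Table 2 follows directly from the two criteria's penalty terms; with n = 170 subjects, BIC's per-parameter penalty (log n ≈ 5.1) is much heavier than AIC's (2), so it favors the simpler shared-parameter model. These are the standard formulas, not values from the paper.

```python
import numpy as np

def aic(loglik, n_params):
    """Akaike information criterion: -2*loglik + 2*p (smaller is better)."""
    return -2.0 * loglik + 2.0 * n_params

def bic(loglik, n_params, n_obs):
    """Bayesian information criterion: -2*loglik + p*log(n); penalizes
    parameters more heavily than AIC whenever n > e^2 (about 7.4)."""
    return -2.0 * loglik + n_params * np.log(n_obs)
```

A richer latent-class model must therefore gain more than (log n − 2)/2 log-likelihood units per extra parameter before BIC, too, prefers it over the simpler model.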

### 6.2 A Sensitivity Analysis

- In addition to the two-component latent-class mixture model shown in Section 6.1, a classical shared-parameter model will be fitted to the depression trial, as well as a pattern-mixture model, and two selection models, based on the selection models introduced by Diggle and Kenward (1994).
- Next, the Diggle-Kenward (DK) model combines a multivariate normal model for the measurement process with a logistic regression model for the dropout process.
- Since the main interest of the depression trial was in the treatment effect at the last visit, Table 5 shows the estimates, standard errors, and p-values for this effect under the five fitted models.
- Note that with both the two-component latent-class mixture model and the classical shared-parameter model, the standard error is reduced by 0.3 units compared to either selection model or the pattern-mixture model, resulting in a narrower confidence interval for the treatment effect at the last visit.
- The p-values hover around the 0.05 significance level.

### 7 Concluding Remarks

- Through its structure, the model captures unobserved heterogeneity between latent subgroups of the population.
- As shown in the simulation study, the flexibility of such latent-class mixture models outweighs the expected modeling complexity.
- Of course, care has to be taken when interpreting latent classes, since in some applications they may merely be artifacts, without any substantive grounds.
- This is a tricky but well documented problem (McLachlan and Peel 2000).
- Details on starting value selection are embedded in an electronically available companion manual.
