Journal ArticleDOI

A Multivariate Extension of the Dynamic Logit Model for Longitudinal Data Based on a Latent Markov Heterogeneity Structure

01 Jun 2009-Journal of the American Statistical Association (Taylor & Francis)-Vol. 104, Iss: 486, pp 816-831
TL;DR: In this article, an extension of the dynamic logit model is proposed for multivariate categorical longitudinal data, which is based on a marginal parameterization of the conditional distribution of each vector of response variables given the covariates, the lagged response variables, and a set of subject-specific parameters for the unobserved heterogeneity.
Abstract: For the analysis of multivariate categorical longitudinal data, we propose an extension of the dynamic logit model. The resulting model is based on a marginal parameterization of the conditional distribution of each vector of response variables given the covariates, the lagged response variables, and a set of subject-specific parameters for the unobserved heterogeneity. These subject-specific parameters are assumed to follow a first-order Markov chain. For the maximum likelihood estimation of the model parameters, we outline an EM algorithm. The data analysis approach based on the proposed model is illustrated by a simulation study and an application to a dataset which derives from the Panel Study of Income Dynamics and concerns fertility and female participation in the labor market.
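The EM estimation outlined in this abstract rests on forward recursions from the hidden Markov literature. The following is a minimal illustrative sketch, in generic HMM notation with made-up numbers (not the authors' exact marginal parameterization), of how the likelihood of one subject's response sequence is computed under a first-order latent Markov chain:

```python
import numpy as np

def forward_likelihood(pi, Pi, emis):
    """Likelihood of one observation sequence under a latent Markov chain.

    pi   : (k,) initial probabilities of the k latent states
    Pi   : (k, k) transition matrix, rows sum to 1
    emis : (T, k) emis[t, u] = P(observed response at time t | state u)
    """
    alpha = pi * emis[0]                # forward variables at t = 0
    for t in range(1, emis.shape[0]):
        alpha = (alpha @ Pi) * emis[t]  # propagate, then weight by emission
    return alpha.sum()

# Two latent states, three time occasions (illustrative numbers only)
pi = np.array([0.6, 0.4])
Pi = np.array([[0.9, 0.1],
               [0.2, 0.8]])
emis = np.array([[0.7, 0.2],
                 [0.6, 0.3],
                 [0.1, 0.8]])
lik = forward_likelihood(pi, Pi, emis)
```

The same recursion, run for every subject, yields the observed-data log-likelihood that the EM algorithm maximizes; its cost is linear in the number of occasions rather than exponential.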

Citations
Journal ArticleDOI
TL;DR: A latent Markov quantile regression model for longitudinal data with non-informative drop-out is proposed; its time-varying random effects follow a first-order latent Markov chain, which allows exact inference through an ad hoc EM-type algorithm based on appropriate recursions.
Abstract: We propose a latent Markov quantile regression model for longitudinal data with non-informative drop-out. The observations, conditionally on covariates, are modeled through an asymmetric Laplace distribution. Random effects are assumed to be time-varying and to follow a first order latent Markov chain. This latter assumption is easily interpretable and allows exact inference through an ad hoc EM-type algorithm based on appropriate recursions. Finally, we illustrate the model on a benchmark data set.
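The asymmetric Laplace working likelihood used in this citing paper has a well-known link to quantile regression: maximizing the asymmetric Laplace log-density in the location parameter is equivalent to minimizing the quantile check loss. A small sketch of that equivalence, with illustrative data and a simple grid search (not the authors' estimation algorithm):

```python
import numpy as np

def check_loss(u, tau):
    """Quantile check function rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (u < 0))

def ald_neg_loglik(y, mu, tau, sigma=1.0):
    """Negative log-likelihood of the asymmetric Laplace distribution with
    location mu, scale sigma and skewness tau; up to an additive constant
    it equals the sum of check losses divided by sigma."""
    return check_loss(y - mu, tau).sum() / sigma \
        - len(y) * np.log(tau * (1 - tau) / sigma)

y = np.array([0.1, 0.5, 2.3, -1.0, 0.7])   # illustrative data
tau = 0.25
# Minimizing the ALD negative log-likelihood over mu recovers the
# tau-th sample quantile, here located via a crude grid search.
grid = np.linspace(-2, 3, 2001)
mu_hat = grid[np.argmin([ald_neg_loglik(y, m, tau) for m in grid])]
```

For tau = 0.25 the minimizer coincides with the corresponding sample quantile of y, which is exactly why the asymmetric Laplace serves as a convenient working likelihood for quantile regression.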

78 citations


Cites background or methods from "A Multivariate Extension of the Dyn..."

  • ...The resulting model is a latent Markov regression model (Bartolucci and Farcomeni 2009)....

  • ...Bartolucci and Farcomeni (2009), in a different context, report on bias in approximating a latent Gaussian distribution with a similar finite mixture structure....

  • ...See also Dardanoni and Forcina (1998) and Bartolucci and Farcomeni (2009)....

  • ...Keywords Asymmetric Laplace distribution · Hidden Markov model · Longitudinal data · Quantile regression...

  • ...In the latent Markov context, BIC is usually preferred since AIC often leads to overestimation of the number of latent states (see for instance a brief simulation study in Bartolucci and Farcomeni 2009, and Boucheron and Gassiat 2007 for a more general discussion)....

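The point about BIC versus AIC in the last snippet can be made concrete. Below is a hedged sketch with made-up maximized log-likelihoods and a deliberately simplified parameter count for a basic latent Markov model with binary responses (real applications would also count covariate effects):

```python
import numpy as np

def n_params_lm(k, n_cat=2):
    """Free parameters of a basic k-state latent Markov model:
    (k - 1) initial probabilities, k * (k - 1) transition probabilities,
    and k * (n_cat - 1) conditional response probabilities.
    (Simplified count; covariate effects would add more.)"""
    return (k - 1) + k * (k - 1) + k * (n_cat - 1)

def select_states(logliks, n, criterion="bic"):
    """Pick the number of states minimizing AIC or BIC.
    logliks: dict {k: maximized log-likelihood}; n: sample size."""
    penalty = np.log(n) if criterion == "bic" else 2.0
    scores = {k: -2 * ll + penalty * n_params_lm(k)
              for k, ll in logliks.items()}
    return min(scores, key=scores.get), scores

# Illustrative (made-up) maximized log-likelihoods for k = 1, ..., 4
logliks = {1: -1450.2, 2: -1301.8, 3: -1290.5, 4: -1288.9}
k_bic, _ = select_states(logliks, n=500, criterion="bic")
k_aic, _ = select_states(logliks, n=500, criterion="aic")
```

With these illustrative numbers BIC selects two states while AIC selects three, matching the tendency of AIC to favor larger state spaces noted in the snippet above.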
Journal ArticleDOI
21 Aug 2014-Test
TL;DR: A comprehensive overview of latent Markov (LM) models for the analysis of longitudinal categorical data is provided and methods for selecting the number of states and for path prediction are outlined.
Abstract: We provide a comprehensive overview of latent Markov (LM) models for the analysis of longitudinal categorical data. We illustrate the general version of the LM model which includes individual covariates, and several constrained versions. Constraints make the model more parsimonious and allow us to consider and test hypotheses of interest. These constraints may be put on the conditional distribution of the response variables given the latent process (measurement model) or on the distribution of the latent process (latent model). We also illustrate in detail maximum likelihood estimation through the Expectation–Maximization algorithm, which may be efficiently implemented by recursions taken from the hidden Markov literature. We outline methods for obtaining standard errors for the parameter estimates. We also illustrate methods for selecting the number of states and for path prediction. Finally, we mention issues related to Bayesian inference of LM models. Possibilities for further developments are given among the concluding remarks.
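The recursions from the hidden Markov literature mentioned in this abstract are the forward-backward (Baum-Welch) recursions. A compact sketch in generic HMM notation (illustrative numbers; not tied to the authors' covariate structure) of the scaled version producing the posterior state probabilities needed in the E-step:

```python
import numpy as np

def forward_backward(pi, Pi, emis):
    """Posterior probabilities of the latent states given the data.

    pi: (k,) initial law; Pi: (k, k) transition matrix;
    emis: (T, k) state-conditional probabilities of the observed responses.
    Returns a (T, k) matrix of smoothed probabilities P(U_t = u | data).
    Scaled recursions keep the computation numerically stable."""
    T, k = emis.shape
    alpha = np.zeros((T, k)); beta = np.ones((T, k)); c = np.zeros(T)
    alpha[0] = pi * emis[0]; c[0] = alpha[0].sum(); alpha[0] /= c[0]
    for t in range(1, T):
        alpha[t] = (alpha[t-1] @ Pi) * emis[t]
        c[t] = alpha[t].sum(); alpha[t] /= c[t]
    for t in range(T - 2, -1, -1):
        beta[t] = (Pi @ (emis[t+1] * beta[t+1])) / c[t+1]
    post = alpha * beta
    return post / post.sum(axis=1, keepdims=True)

pi = np.array([0.5, 0.5])
Pi = np.array([[0.8, 0.2], [0.3, 0.7]])
emis = np.array([[0.9, 0.4], [0.2, 0.6], [0.7, 0.1]])
post = forward_backward(pi, Pi, emis)  # each row sums to 1
```

These posterior probabilities are exactly the weights that the E-step plugs into the expected complete-data log-likelihood.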

61 citations


Cites background or methods from "A Multivariate Extension of the Dyn..."

  • ...We refer to Bartolucci (2006) and Bartolucci and Farcomeni (2009) for details....

  • ...The multivariate LM model with covariates affecting the manifest probabilities proposed by Bartolucci and Farcomeni (2009) was applied by these authors to data extracted from the “Panel Study of Income Dynamics” database (University of Michigan)....

  • ...The more interesting methods are based on the information matrix obtained from the EM algorithm by the technique of Louis (1982) or related techniques; see for instance Turner et al. (1998) and Bartolucci and Farcomeni (2009)....

  • ...The formulation proposed by Bartolucci and Farcomeni (2009) includes the covariates in the measurement part of the model in the presence of multivariate responses....

  • ...Bartolucci and Farcomeni (2009) used an LM model with covariates affecting the manifest probabilities since they were interested in separately estimating the effect of each covariate on each outcome....

Journal ArticleDOI
TL;DR: This work proposes an event-history (EH) extension of the latent Markov approach that may be used with multivariate longitudinal data, in which one or more outcomes of a different nature are observed at each time occasion, and extends the usual forward-backward recursions of Baum and Welch.
Abstract: Mixed latent Markov (MLM) models represent an important tool for the analysis of longitudinal data when response variables are affected by time-fixed and time-varying unobserved heterogeneity, the latter being accounted for by a hidden Markov chain. In order to avoid bias when using a model of this type in the presence of informative drop-out, we propose an event-history (EH) extension of the latent Markov approach that may be used with multivariate longitudinal data, in which one or more outcomes of a different nature are observed at each time occasion. The EH component of the resulting model refers to the interval-censored drop-out, and bias in MLM modeling is avoided by correlated random effects, included in the different model components, which follow common latent distributions. In order to perform maximum likelihood estimation of the proposed model by the expectation–maximization algorithm, we extend the usual forward–backward recursions of Baum and Welch. The algorithm has the same complexity as the one adopted in the case of non-informative drop-out. We illustrate the proposed approach through simulations and an application based on data coming from a medical study about primary biliary cirrhosis in which there are two outcomes of interest, one continuous and the other binary.

48 citations


Cites background or methods from "A Multivariate Extension of the Dyn..."

  • ...In any case, this assumption can be easily relaxed when all outcomes are categorical (Bartolucci and Farcomeni, 2009), using a marginal parameterization based on logits and log-odds ratios....

  • ...We also pay attention to the computation of the standard errors for the parameter estimates by employing a method proposed in Bartolucci and Farcomeni (2009). The remainder of the article is organized as follows....

  • ..., Creemers et al., 2010; Viviani, Rizopoulos, and Alfó, 2014) consider shared-parameter models as a separate framework. Selection models for discrete longitudinal data have been proposed, among others, by Molenberghs, Kenward, and Lesaffre (1997) and Ten Have et al. (1998). The advantages of shared-parameter models like the one we propose are that a directly interpretable marginal model is obtained for the observed outcomes, even if at the price of a heavier estimation procedure....

  • ...We also pay attention to the computation of the standard errors for the parameter estimates by employing a method proposed in Bartolucci and Farcomeni (2009)....

Journal ArticleDOI
TL;DR: In this paper, a latent process is modelled by a mixture of auto-regressive AR(1) processes with different means and correlation coefficients, but with equal variances.
Abstract: Motivated by an application to a longitudinal data set coming from the Health and Retirement Study about self-reported health status, we propose a model for longitudinal data which is based on a latent process to account for the unobserved heterogeneity between sample units in a dynamic fashion. The latent process is modelled by a mixture of auto-regressive AR(1) processes with different means and correlation coefficients, but with equal variances. We show how to perform maximum likelihood estimation of the proposed model by the joint use of an expectation–maximization algorithm and a Newton–Raphson algorithm, implemented by means of recursions developed in the hidden Markov model literature. We also introduce a simple method to obtain standard errors for the parameter estimates and suggest a strategy to choose the number of mixture components. In the application the response variable is ordinal; however, the approach may also be applied in other settings. Moreover, the application to the self-reported health status data set allows us to show that the model proposed is more flexible than other models for longitudinal data based on a continuous latent process. The model also achieves a goodness of fit that is similar to that of models based on a discrete latent process following a Markov chain, while retaining a reduced number of parameters. The effect of different formulations of the latent structure of the model is evaluated in terms of estimates of the regression parameters for the covariates.
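The latent structure described here, a mixture of AR(1) processes with component-specific means and autocorrelations but a common marginal variance, is easy to simulate. A hedged sketch with hypothetical parameter values (simulation only, not the estimation procedure):

```python
import numpy as np

def simulate_mixture_ar1(n, T, means, rhos, sigma2, weights, rng=None):
    """Simulate latent AR(1) trajectories for n subjects over T occasions.

    Each subject is first assigned to a mixture component with the given
    weights; the component fixes the mean and autocorrelation of that
    subject's AR(1) process, while the marginal variance sigma2 is common
    to all components (as in the model summarized above)."""
    rng = np.random.default_rng(rng)
    comp = rng.choice(len(weights), size=n, p=weights)
    U = np.zeros((n, T))
    for i in range(n):
        mu, rho = means[comp[i]], rhos[comp[i]]
        # stationary start: marginal variance equal to sigma2
        U[i, 0] = mu + rng.normal(0, np.sqrt(sigma2))
        for t in range(1, T):
            # innovation variance sigma2 * (1 - rho^2) keeps the
            # marginal variance constant over time
            U[i, t] = mu + rho * (U[i, t-1] - mu) \
                + rng.normal(0, np.sqrt(sigma2 * (1 - rho ** 2)))
    return U, comp

# Hypothetical two-component example: distinct means and persistence
U, comp = simulate_mixture_ar1(
    n=200, T=6, means=[-1.0, 1.0], rhos=[0.9, 0.3],
    sigma2=1.0, weights=[0.5, 0.5], rng=0)
```

Scaling the innovation standard deviation by sqrt(1 - rho^2) is what enforces the equal-variance restriction across components with different correlation coefficients.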

44 citations


Cites methods from "A Multivariate Extension of the Dyn..."

  • ...Then, maximum likelihood (ML) estimation may be performed by an adaptation of the EM algorithm for the LM model that was described by Bartolucci and Farcomeni (2009); see also Baum et al. (1970) and Dempster et al. (1977)....

  • ...The NR algorithm is based on the observed information matrix which is obtained by the same numerical method as proposed by Bartolucci and Farcomeni (2009)....

  • ...Following Bartolucci and Farcomeni (2009), the score vector is computed as the first derivative of the expected value of the completedata log-likelihood, which is obtained after an E-step....

  • ...This is a formulation of LM type, which was employed by Bartolucci and Farcomeni (2009) to propose a flexible class of models for multivariate categorical longitudinal data....

  • ...These posterior probabilities may be computed by suitable recursions; see Baum et al. (1970) and Bartolucci and Farcomeni (2009) for details....

Journal ArticleDOI
TL;DR: The contaminated Gaussian HMM is introduced and the effectiveness of the proposed model in comparison with HMMs of different elliptical distributions is demonstrated, and the performance of some well-known information criteria in selecting the true number of latent states is evaluated.
Abstract: The Gaussian hidden Markov model (HMM) is widely considered for the analysis of heterogenous continuous multivariate longitudinal data. To robustify this approach with respect to possible elliptical heavy-tailed departures from normality, due to the presence of outliers, spurious points, or noise (collectively referred to as bad points herein), the contaminated Gaussian HMM is here introduced. The contaminated Gaussian distribution represents an elliptical generalization of the Gaussian distribution and allows for automatic detection of bad points in the same natural way as observations are typically assigned to the latent states in the HMM context. Once the model is fitted, each observation has a posterior probability of belonging to a particular state and, inside each state, of being a bad point or not. In addition to the parameters of the classical Gaussian HMM, for each state we have two more parameters, both with a specific and useful interpretation: one controls the proportion of bad points and one ...
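The two extra per-state parameters mentioned at the end of this abstract enter through the contaminated Gaussian density, a two-component scale mixture (1 - alpha) N(mu, sigma2) + alpha N(mu, eta * sigma2) with inflation factor eta > 1. A univariate sketch with illustrative values of the resulting automatic bad-point detection:

```python
import numpy as np

def contaminated_gaussian_pdf(y, mu, sigma2, alpha, eta):
    """Density of a univariate contaminated Gaussian:
    (1 - alpha) * N(mu, sigma2) + alpha * N(mu, eta * sigma2),
    with eta > 1 inflating the variance of the 'bad' component."""
    def npdf(y, m, v):
        return np.exp(-(y - m) ** 2 / (2 * v)) / np.sqrt(2 * np.pi * v)
    return (1 - alpha) * npdf(y, mu, sigma2) + alpha * npdf(y, mu, eta * sigma2)

def bad_point_posterior(y, mu, sigma2, alpha, eta):
    """Posterior probability that y comes from the heavy (bad) component."""
    def npdf(y, m, v):
        return np.exp(-(y - m) ** 2 / (2 * v)) / np.sqrt(2 * np.pi * v)
    bad = alpha * npdf(y, mu, eta * sigma2)
    return bad / (bad + (1 - alpha) * npdf(y, mu, sigma2))

# An observation far from the state mean gets a high bad-point probability
p_near = bad_point_posterior(0.1, mu=0.0, sigma2=1.0, alpha=0.05, eta=10.0)
p_far = bad_point_posterior(6.0, mu=0.0, sigma2=1.0, alpha=0.05, eta=10.0)
```

In the HMM version, the same ratio is computed within each latent state, which is what gives each observation a state-specific posterior probability of being a bad point.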

44 citations


Cites background or methods from "A Multivariate Extension of the Dyn..."

  • ...For multivariate continuous data, attention is commonly focused on Gaussian HMMs (Bartolucci and Farcomeni 2010; Volant et al. 2014; Holzmann and Schwaiger 2015), with few notable exceptions (Bartolucci and Farcomeni 2009; Bulla et al. 2012; Lagona, Maruotti, and Padovano 2015)....

  • ..., 2014; Bartolucci and Farcomeni, 2010), with few notable exceptions (Lagona et al., 2015; Bulla et al., 2012; Bartolucci and Farcomeni, 2009)....

References
BookDOI
28 Jan 2005
TL;DR: The important role of finite mixture models in statistical analysis of data is underscored by the ever-increasing rate at which articles on mixture applications appear in the statistical and geospatial literature.
Abstract: The important role of finite mixture models in the statistical analysis of data is underscored by the ever-increasing rate at which articles on mixture applications appear in the statistical and ge...

8,258 citations

Book
02 Oct 2000
TL;DR: The important role of finite mixture models in the statistical analysis of data is underscored by the ever-increasing rate at which articles on mixture applications appear in the mathematical and statistical literature.
Abstract: The important role of finite mixture models in the statistical analysis of data is underscored by the ever-increasing rate at which articles on mixture applications appear in the statistical and ge...

8,095 citations

Journal ArticleDOI
TL;DR: The upper bound is obtained for a specific probabilistic nonsequential decoding algorithm which is shown to be asymptotically optimum for rates above R_{0} and whose performance bears certain similarities to that of sequential decoding algorithms.
Abstract: The probability of error in decoding an optimal convolutional code transmitted over a memoryless channel is bounded from above and below as a function of the constraint length of the code. For all but pathological channels the bounds are asymptotically (exponentially) tight for rates above R_{0} , the computational cutoff rate of sequential decoding. As a function of constraint length the performance of optimal convolutional codes is shown to be superior to that of block codes of the same length, the relative improvement increasing with rate. The upper bound is obtained for a specific probabilistic nonsequential decoding algorithm which is shown to be asymptotically optimum for rates above R_{0} and whose performance bears certain similarities to that of sequential decoding algorithms.

6,804 citations


"A Multivariate Extension of the Dyn..." refers methods in this paper

  • ...Finally, we deal with prediction of the vector of responses and illustrate the Viterbi algorithm (Viterbi 1967; Juang and Rabiner 1991) for path prediction, i.e., prediction of the sequence of latent states of a given subject on the basis of his/her observable covariates and response variables....

  • ...To predict the entire sequence of latent states, we can use the Viterbi algorithm (Viterbi 1967; Juang and Rabiner 1991)....

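The Viterbi algorithm cited in these snippets finds the single most probable sequence of latent states by dynamic programming. A compact log-space sketch in generic hidden Markov notation (illustrative numbers, not the paper's covariate-dependent version):

```python
import numpy as np

def viterbi(pi, Pi, emis):
    """Most likely sequence of latent states (path prediction).

    pi: (k,) initial probabilities; Pi: (k, k) transition matrix;
    emis: (T, k) state-conditional probabilities of the observed data.
    The log-space recursion avoids underflow for long sequences."""
    T, k = emis.shape
    logPi = np.log(Pi)
    delta = np.log(pi) + np.log(emis[0])
    back = np.zeros((T, k), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + logPi          # (from-state, to-state)
        back[t] = scores.argmax(axis=0)          # best predecessor per state
        delta = scores.max(axis=0) + np.log(emis[t])
    path = np.zeros(T, dtype=int)
    path[-1] = delta.argmax()
    for t in range(T - 2, -1, -1):               # backtrack through argmaxes
        path[t] = back[t + 1, path[t + 1]]
    return path

pi = np.array([0.5, 0.5])
Pi = np.array([[0.9, 0.1], [0.1, 0.9]])
emis = np.array([[0.8, 0.1], [0.7, 0.2], [0.05, 0.95]])
path = viterbi(pi, Pi, emis)  # most likely latent path
```

Unlike the forward-backward posteriors, which are marginal per occasion, Viterbi maximizes the joint probability of the whole path, so the two can disagree.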
Book
25 Jul 1986
TL;DR: In this paper, the authors propose a homogeneity test for linear regression models (analysis of covariance) and show that linear regression with variable intercepts is more consistent than simple regression with simple intercepts.
Abstract: 1. Introduction 2. Homogeneity test for linear regression models (analysis of covariance) 3. Simple regression with variable intercepts 4. Dynamic models with variable intercepts 5. Simultaneous-equations models 6. Variable-coefficient models 7. Discrete data 8. Truncated and censored data 9. Cross-sectional dependent panel data 10. Dynamic system 11. Incomplete panel data 12. Miscellaneous topics 13. A summary view.

6,234 citations