Topic

Latent variable model

About: Latent variable model is a research topic. Over its lifetime, 3,589 publications have been published within this topic, receiving 235,061 citations.


Papers
Proceedings Article
11 Jul 2009
TL;DR: Compared to existing probabilistic models of latent variables, the proposed perceptron-style algorithm lowers the training cost significantly yet with comparable or even superior classification accuracy.
Abstract: We propose a perceptron-style algorithm for fast discriminative training of structured latent variable models and analyze its convergence properties. Our method extends the perceptron algorithm to learning tasks with latent dependencies, which may not be captured by traditional models. It relies on Viterbi decoding over latent variables, combined with simple additive updates. Compared to existing probabilistic models of latent variables, our method lowers the training cost significantly while achieving comparable or even superior classification accuracy.

60 citations
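
The update described in the abstract above (decode over latent variables, then apply simple additive updates) can be sketched on a toy, non-structured problem. This is a minimal illustration, not the authors' implementation: the Viterbi decoder is replaced by a brute-force argmax over a small latent set, and the names `decode`, `best_latent`, and `train_latent_perceptron` are hypothetical.

```python
from typing import List, Sequence, Tuple

Example = Tuple[List[float], int]  # (feature vector, gold label)


def score(w: Sequence[float], x: Sequence[float]) -> float:
    """Linear score of one (label, latent) weight vector on input x."""
    return sum(wi * xi for wi, xi in zip(w, x))


def decode(weights, x) -> Tuple[int, int]:
    """Best (label, latent) pair; stands in for Viterbi decoding (assumption)."""
    pairs = [(y, h) for y in range(len(weights)) for h in range(len(weights[y]))]
    return max(pairs, key=lambda p: score(weights[p[0]][p[1]], x))


def best_latent(weights, x, label: int) -> int:
    """Best latent value when the label is fixed to the gold label."""
    scores = [score(w, x) for w in weights[label]]
    return max(range(len(scores)), key=scores.__getitem__)


def train_latent_perceptron(data: Sequence[Example], n_labels: int,
                            n_latent: int, epochs: int = 10):
    dim = len(data[0][0])
    # weights[label][latent] is one weight vector, initialised to zero.
    weights = [[[0.0] * dim for _ in range(n_latent)] for _ in range(n_labels)]
    for _ in range(epochs):
        mistakes = 0
        for x, gold in data:
            pred_label, pred_latent = decode(weights, x)
            if pred_label == gold:
                continue
            mistakes += 1
            gold_latent = best_latent(weights, x, gold)
            # Additive update: reward the gold pair, penalise the predicted pair.
            for i, xi in enumerate(x):
                weights[gold][gold_latent][i] += xi
                weights[pred_label][pred_latent][i] -= xi
        if mistakes == 0:  # stop once a full pass makes no errors
            break
    return weights


# Toy data: label 0 sits on both sides of label 1, so a single linear
# separator per label fails, but two latent sub-classes per label suffice.
toy = [([-2.0, 1.0], 0), ([0.0, 1.0], 1), ([2.0, 1.0], 0)]
model = train_latent_perceptron(toy, n_labels=2, n_latent=2)
predictions = [decode(model, x)[0] for x, _ in toy]
```

The second feature is a constant bias term. Each label keeps one weight vector per latent value, which lets a label cover several regions of the input space.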

Journal Article
TL;DR: It is concluded that the taxometric method may be an effective approach to distinguishing between dimensional and categorical structure but that other latent modeling procedures may be more effective for specifying the model.
Abstract: Statistical analyses investigating latent structure can be divided into those that estimate structural model parameters and those that detect the structural model type. The most basic distinction among structure types is between categorical (discrete) and dimensional (continuous) models. It is a common, and potentially misleading, practice to apply some method for estimating a latent structural model such as factor analysis without first verifying that the latent structure type assumed by that method applies to the data. The taxometric method was developed specifically to distinguish between dimensional and 2-class models. This study evaluated the taxometric method as a means of identifying categorical structures in general. We assessed the ability of the taxometric method to distinguish between dimensional (1-class) and categorical (2-5 classes) latent structures and to estimate the number of classes in categorical datasets. Based on 50,000 Monte Carlo datasets (10,000 per structure type), and using the comparison curve fit index averaged across 3 taxometric procedures (Mean Above Minus Below A Cut, Maximum Covariance, and Latent Mode Factor Analysis) as the criterion for latent structure, the taxometric method was found superior to finite mixture modeling for distinguishing between dimensional and categorical models. A multistep iterative process of applying taxometric procedures to the data often failed to identify the number of classes in the categorical datasets accurately, however. It is concluded that the taxometric method may be an effective approach to distinguishing between dimensional and categorical structure but that other latent modeling procedures may be more effective for specifying the model.

60 citations

Journal Article
TL;DR: A variational approach is developed for fitting the mixture of latent trait analyzers model, which yields intuitive clustering results and a much better fit than either latent class analysis or latent trait analysis alone.
Abstract: Model-based clustering methods for continuous data are well established and commonly used in a wide range of applications. However, model-based clustering methods for categorical data are less standard. Latent class analysis is a commonly used method for model-based clustering of binary data and/or categorical data, but due to an assumed local independence structure there may not be a correspondence between the estimated latent classes and groups in the population of interest. The mixture of latent trait analyzers model extends latent class analysis by assuming a model for the categorical response variables that depends on both a categorical latent class and a continuous latent trait variable; the discrete latent class accommodates group structure and the continuous latent trait accommodates dependence within these groups. Fitting the mixture of latent trait analyzers model is potentially difficult because the likelihood function involves an integral that cannot be evaluated analytically. We develop a variational approach for fitting the mixture of latent trait models and this provides an efficient model fitting strategy. The mixture of latent trait analyzers model is demonstrated on the analysis of data from the National Long Term Care Survey (NLTCS) and voting in the U.S. Congress. The model is shown to yield intuitive clustering results and it gives a much better fit than either latent class analysis or latent trait analysis alone.

60 citations
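
For orientation, the latent class analysis baseline mentioned in the abstract above, with its local independence assumption, can be sketched with a plain EM loop on binary items. This is not the mixture of latent trait analyzers model and not the variational method the paper develops. The function names, the deterministic starting values, and the toy data are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def posterior(row: Sequence[int], weights: Sequence[float],
              probs: Sequence[Sequence[float]]) -> List[float]:
    """P(class | row), with items independent given the class."""
    joint = []
    for w, p in zip(weights, probs):
        lik = w
        for x, pj in zip(row, p):
            lik *= pj if x == 1 else 1.0 - pj
        joint.append(lik)
    total = sum(joint)
    return [j / total for j in joint]


def fit_lca(data: Sequence[Sequence[int]], n_classes: int = 2,
            n_iter: int = 50, eps: float = 1e-6
            ) -> Tuple[List[float], List[List[float]]]:
    n_items = len(data[0])
    weights = [1.0 / n_classes] * n_classes
    # Deterministic, asymmetric start so the classes can separate (assumption).
    probs = [[(c + 1) / (n_classes + 1)] * n_items for c in range(n_classes)]
    for _ in range(n_iter):
        # E-step: class responsibilities for every row.
        resp = [posterior(row, weights, probs) for row in data]
        # M-step: update class weights and item-endorsement probabilities.
        for c in range(n_classes):
            mass = max(sum(r[c] for r in resp), eps)
            weights[c] = mass / len(data)
            for j in range(n_items):
                p = sum(r[c] * row[j] for r, row in zip(resp, data)) / mass
                probs[c][j] = min(max(p, eps), 1.0 - eps)  # keep away from 0 and 1
    return weights, probs


# Toy binary responses: two groups that mostly answer all-ones or all-zeros.
toy = ([[1, 1, 1]] * 40 + [[1, 1, 0]] * 10 +
       [[0, 0, 0]] * 40 + [[0, 0, 1]] * 10)
class_weights, item_probs = fit_lca(toy)
```

Calling `posterior(row, class_weights, item_probs)` gives the class membership probabilities for a response pattern. The mixture of latent trait analyzers model relaxes the independence assumption inside each class by adding a continuous latent trait.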

Journal Article
TL;DR: Latent variable mixture models are illustrated through a study of the latent structure of psychotic experiences, showing how they address research questions that directly compare the viability of dimensional, categorical and hybrid conceptions of constructs.
Abstract: Latent variable mixture modeling represents a flexible approach to investigating population heterogeneity by sorting cases into latent but non-arbitrary subgroups that are more homogeneous. The purpose of this selective review is to provide a non-technical introduction to mixture modeling in a cross-sectional context. Latent class analysis is used to classify individuals into homogeneous subgroups (latent classes). Factor mixture modeling is a newer approach that fuses latent class analysis and factor analysis. Factor mixture models can represent both categorical and dimensional states of affairs. This article provides an overview of latent variable mixture models and illustrates their application to the study of the latent structure of psychotic experiences. The flexibility of latent variable mixture models makes them adaptable to the study of heterogeneity in complex psychiatric and psychological phenomena. They also allow researchers to address research questions that directly compare the viability of dimensional, categorical and hybrid conceptions of constructs.

60 citations

Posted Content
TL;DR: FlyingSquid is a weak supervision framework that runs orders of magnitude faster than previous weak supervision approaches and requires fewer assumptions; bounds on generalization error are proved without assuming that the latent variable model can exactly parameterize the underlying data distribution.
Abstract: Weak supervision is a popular method for building machine learning models without relying on ground truth annotations. Instead, it generates probabilistic training labels by estimating the accuracies of multiple noisy labeling sources (e.g., heuristics, crowd workers). Existing approaches use latent variable estimation to model the noisy sources, but these methods can be computationally expensive, scaling superlinearly in the data. In this work, we show that, for a class of latent variable models highly applicable to weak supervision, we can find a closed-form solution to model parameters, obviating the need for iterative solutions like stochastic gradient descent (SGD). We use this insight to build FlyingSquid, a weak supervision framework that runs orders of magnitude faster than previous weak supervision approaches and requires fewer assumptions. In particular, we prove bounds on generalization error without assuming that the latent variable model can exactly parameterize the underlying data distribution. Empirically, we validate FlyingSquid on benchmark weak supervision datasets and find that it achieves the same or higher quality compared to previous approaches without the need to tune an SGD procedure, recovers model parameters 170 times faster on average, and enables new video analysis and online learning applications.

60 citations
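
The abstract above does not spell out its closed-form solution. One closed-form estimator of this general kind is a method-of-moments identity over triplets of labeling sources, sketched below. The sketch assumes votes in {-1, +1}, three sources that are conditionally independent given the true label, accuracies that do not depend on the class, and sources that are better than chance. The function names are hypothetical and this is not FlyingSquid's API.

```python
import math
from typing import Sequence, Tuple


def pairwise_moment(votes: Sequence[Sequence[int]], i: int, j: int) -> float:
    """Empirical E[lambda_i * lambda_j]; needs no ground-truth labels."""
    return sum(row[i] * row[j] for row in votes) / len(votes)


def triplet_accuracies(m_ij: float, m_ik: float, m_jk: float
                       ) -> Tuple[float, float, float]:
    """Recover a_s = E[lambda_s * Y] for three sources from pairwise moments.

    Under the stated assumptions, E[lambda_i * lambda_j] = a_i * a_j, so
    a_i**2 = m_ij * m_ik / m_jk. Signs are taken positive (better than chance).
    """
    a_i = math.sqrt(abs(m_ij * m_ik / m_jk))
    a_j = math.sqrt(abs(m_ij * m_jk / m_ik))
    a_k = math.sqrt(abs(m_ik * m_jk / m_ij))
    return a_i, a_j, a_k


def to_probability(a: float) -> float:
    """Convert E[lambda * Y] into P(lambda == Y)."""
    return (1.0 + a) / 2.0


# Exact population moments for sources with a = (0.8, 0.6, 0.4).
a1, a2, a3 = triplet_accuracies(0.8 * 0.6, 0.8 * 0.4, 0.6 * 0.4)
```

No iterative optimizer is involved: the estimates come from three observable agreement rates and a square root, which is the kind of shortcut that removes the need for an SGD loop.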


Network Information
Related Topics (5)
Statistical hypothesis testing: 19.5K papers, 1M citations, 82% related
Inference: 36.8K papers, 1.3M citations, 81% related
Multivariate statistics: 18.4K papers, 1M citations, 80% related
Linear model: 19K papers, 1M citations, 80% related
Estimator: 97.3K papers, 2.6M citations, 78% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    75
2022    143
2021    137
2020    185
2019    142
2018    159