Topic

Latent variable model

About: Latent variable model is a research topic. Over the lifetime, 3589 publications have been published within this topic receiving 235061 citations.


Papers
Posted Content
03 Jul 2017
TL;DR: A recurrent autoencoder model is used to predict the time series one and multiple steps ahead; it significantly improves on the state-of-the-art performance on a chaotic time series benchmark and also performs better on real-world data.
Abstract: In this paper, we use an augmented hierarchical latent variable model to model multi-period time series, where the dynamics of the time series are governed by factors or trends in multiple periods. Previous methods based on stacked recurrent neural network (RNN) and deep belief network (DBN) models cannot model the tendencies in multiple periods, and no models for sequential data pay special attention to redundant input variables which have no, or even a negative, impact on prediction and modeling. Applying a hierarchical latent variable model with multiple transition periods, our proposed algorithm can capture dependencies at different temporal resolutions. Introducing a Bayesian neural network with a Horseshoe prior as the input network, we can discard the redundant input variables during optimization, concurrently with the learning of the other parts of the model. Based on experiments with both synthetic and real-world data, we show that the proposed method significantly improves the modeling and prediction performance on multi-period time series.
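
As an illustration of why a Horseshoe prior can shrink redundant inputs away, the following sketch (not code from the paper; the function name and the global scale tau are assumptions) samples input-layer weights under a Horseshoe prior and shows that most draws concentrate near zero while the heavy Cauchy tails leave room for a few large weights, i.e. for the inputs that genuinely matter.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_horseshoe_weights(n_inputs, tau=0.1, n_draws=10_000):
    """Draw weights w_j ~ N(0, (tau * lambda_j)^2) with lambda_j ~ HalfCauchy(0, 1)."""
    lam = np.abs(rng.standard_cauchy(size=(n_draws, n_inputs)))  # local shrinkage scales
    return rng.normal(loc=0.0, scale=tau * lam)                  # tau is the global scale

w = sample_horseshoe_weights(n_inputs=20)
# Most weights are pulled toward zero, but a small fraction escapes the shrinkage,
# which is what lets the model keep relevant inputs while discarding redundant ones.
print(np.mean(np.abs(w) < 0.05), np.abs(w).max())
```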

37 citations

Journal ArticleDOI
TL;DR: This article discusses a model in which both the dependent variable and the explanatory variables are ordinal with an arbitrary number of categories, and compares estimation based on the joint distribution of the underlying latent variables with estimation based on the conditional distribution.
Abstract: This article discusses a model in which both the dependent variable and the explanatory variables are ordinal and have an arbitrary number of categories. Assuming joint normality of the underlying continuous latent variables, we compare estimation based on the joint distribution to estimation based on the conditional distribution. Because the explanatory variables are not weakly exogenous in this model, the latter approach implies a loss in efficiency that can be substantial in many cases, as shown in detail for the special case of trichotomous data with symmetric thresholds. Therefore, latent variables underlying the observed ordinal variables should always be considered to be jointly endogenous; that is, the joint distribution should be considered.
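
The generative process discussed above can be made concrete with a minimal simulation sketch (not from the article; the correlation rho and the symmetric thresholds are illustrative assumptions): jointly normal latent variables are cut at thresholds to yield trichotomous observed variables, and only the joint distribution retains full information about their dependence.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_trichotomous_pair(n=10_000, rho=0.6, thresholds=(-0.5, 0.5)):
    """Latent (y*, x*) are jointly standard normal with correlation rho;
    observed ordinal variables are obtained by cutting at symmetric thresholds."""
    cov = np.array([[1.0, rho],
                    [rho, 1.0]])
    latent = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=n)
    y = np.digitize(latent[:, 0], thresholds)  # categories 0, 1, 2
    x = np.digitize(latent[:, 1], thresholds)
    return y, x

y, x = simulate_trichotomous_pair()
# Joint estimation models P(y, x) through the bivariate normal and recovers rho;
# conditional estimation of P(y | x) alone discards part of that information.
```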

37 citations

Journal ArticleDOI
TL;DR: This work develops an active learning method that sequentially selects the data with the most significant information to enhance latent variable model (LVM)-based soft sensors, using a Gaussian process model to link the score variables of the LVM to the input process variables.
Abstract: With training data of insufficient information, soft sensor models inevitably show some inaccurate predictions in their industrial applications. This work aims to develop an active learning method to sequentially select a data set with significant information to enhance latent variable model (LVM)-based soft sensors. Using a Gaussian process model to link the score variables of the LVM to the input process variables, the prediction variance can be formulated and an uncertainty index of the LVM is presented. The index combines the variances of the predicted outputs with the changes of the predicted outputs per unit change in the designed inputs. Without any prior knowledge of the process, the index is used sequentially to identify the regions from which new informative data should be collected to enhance the model quality. Additionally, an evaluation criterion is proposed to monitor the active learning procedure. Consequently, the active learning procedures of exploration and exploit...
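
The uncertainty index described above can be illustrated with a short sketch (an illustrative stand-in, not the paper's exact formulation; the equal weighting of the two terms and the finite-difference sensitivity are assumptions) that scores each candidate input by the Gaussian-process predictive standard deviation plus the sensitivity of the predicted score variable to the inputs.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def select_next_sample(gpr, candidates, eps=1e-3):
    """Return the index of the candidate input with the largest combined score."""
    mean, std = gpr.predict(candidates, return_std=True)  # predictive uncertainty
    sens = np.zeros_like(candidates)
    for j in range(candidates.shape[1]):                   # finite-difference sensitivity
        shifted = candidates.copy()
        shifted[:, j] += eps
        sens[:, j] = (gpr.predict(shifted) - mean) / eps
    score = std + np.linalg.norm(sens, axis=1)             # hypothetical combined index
    return int(np.argmax(score))

# Usage: fit a GP from process inputs to an LVM score variable, then query candidates.
X_train = np.random.rand(30, 4)
t_train = X_train @ np.array([1.0, 0.0, -2.0, 0.5])
gpr = GaussianProcessRegressor(kernel=RBF(length_scale=1.0)).fit(X_train, t_train)
best = select_next_sample(gpr, candidates=np.random.rand(200, 4))
```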

37 citations

Proceedings ArticleDOI
Chen Zhu, Hengshu Zhu, Hui Xiong, Pengliang Ding, Fang Xie
13 Aug 2016
TL;DR: The authors propose a new research paradigm for recruitment market analysis that leverages unsupervised learning techniques to automatically discover recruitment market trends from large-scale recruitment data.
Abstract: Recruitment market analysis provides valuable understanding of industry-specific economic growth and plays an important role for both employers and job seekers. With the rapid development of online recruitment services, massive recruitment data have been accumulated and enable a new paradigm for recruitment market analysis. However, traditional methods for recruitment market analysis largely rely on the knowledge of domain experts and classic statistical models, which are usually too general to model large-scale dynamic recruitment data and have difficulty capturing fine-grained market trends. To this end, in this paper, we propose a new research paradigm for recruitment market analysis that leverages unsupervised learning techniques for automatically discovering recruitment market trends based on large-scale recruitment data. Specifically, we develop a novel sequential latent variable model, named MTLVM, which is designed to capture the sequential dependencies of corporate recruitment states and is able to automatically learn the latent recruitment topics within a Bayesian generative framework. In particular, to capture the variability of recruitment topics over time, we design hierarchical Dirichlet processes for MTLVM, which allow the evolving recruitment topics to be generated dynamically. Finally, we implement a prototype system to empirically evaluate our approach based on real-world recruitment data in China. Indeed, by visualizing the results from MTLVM, we can reveal many interesting findings, such as that the popularity of LBS-related jobs peaked in the second half of 2014 and declined in 2015.
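
As a hedged illustration of the Dirichlet-process machinery that MTLVM builds on, the sketch below shows the plain truncated stick-breaking construction of topic weights (not the paper's hierarchical variant; the truncation level and concentration alpha are assumptions). The number of effectively active topics is then determined by the data rather than fixed in advance.

```python
import numpy as np

rng = np.random.default_rng(2)

def stick_breaking_weights(alpha=1.0, truncation=50):
    """Truncated stick-breaking construction of Dirichlet-process topic weights."""
    betas = rng.beta(1.0, alpha, size=truncation)
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - betas[:-1])))
    return betas * remaining

weights = stick_breaking_weights()
# A hierarchical Dirichlet process stacks this construction: a corpus-level draw
# acts as the base measure from which each period's topic distribution is drawn.
print(round(weights.sum(), 3), int((weights > 0.01).sum()))  # near 1.0; few topics dominate
```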

37 citations

Journal ArticleDOI
TL;DR: The proposed Bayesian estimation approach performs very well under the studied conditions; implications of this study, including the misspecified missingness mechanism, the sample size, the sensitivity of the model, the number of latent classes, model comparison, and future directions for the approach, are also discussed.
Abstract: Growth mixture models (GMMs) with nonignorable missing data have drawn increasing attention in research communities but have not been fully studied. The goal of this article is to propose and to evaluate a Bayesian method to estimate the GMMs with latent class dependent missing data. An extended GMM is first presented in which class probabilities depend on some observed explanatory variables and data missingness depends on both the explanatory variables and a latent class variable. A full Bayesian method is then proposed to estimate the model. Through the data augmentation method, conditional posterior distributions for all model parameters and missing data are obtained. A Gibbs sampling procedure is then used to generate Markov chains of model parameters for statistical inference. The application of the model and the method is first demonstrated through the analysis of mathematical ability growth data from the National Longitudinal Survey of Youth 1997 (Bureau of Labor Statistics, U.S. Department of Labor, 1997). A simulation study considering 3 main factors (the sample size, the class probability, and the missing data mechanism) is then conducted and the results show that the proposed Bayesian estimation approach performs very well under the studied conditions. Finally, some implications of this study, including the misspecified missingness mechanism, the sample size, the sensitivity of the model, the number of latent classes, the model comparison, and the future directions of the approach, are discussed.
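
A minimal sketch of one data-augmentation Gibbs step of the kind described above (not the authors' full sampler; the diagonal class-conditional covariance and all variable names are simplifying assumptions): sample each subject's latent class from its posterior responsibilities computed on the observed entries only, then impute the missing entries from the sampled class's normal distribution.

```python
import numpy as np

rng = np.random.default_rng(3)

def gibbs_step(y, mu, sigma, pi):
    """One sweep: y is (n, t) with NaN for missing entries, mu/sigma are (k, t)
    class-conditional means/standard deviations, pi is (k,) class probabilities."""
    n, _ = y.shape
    k = len(pi)
    # 1) Sample each latent class from its posterior responsibilities,
    #    using only the observed entries in the likelihood.
    logp = np.tile(np.log(pi), (n, 1))
    for c in range(k):
        resid = (y - mu[c]) / sigma[c]
        logp[:, c] += np.nansum(-0.5 * resid**2 - np.log(sigma[c]), axis=1)
    probs = np.exp(logp - logp.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)
    z = np.array([rng.choice(k, p=p) for p in probs])
    # 2) Data augmentation: impute missing entries from the sampled class.
    y_new = y.copy()
    missing = np.isnan(y)
    y_new[missing] = rng.normal(mu[z][missing], sigma[z][missing])
    return z, y_new
```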

37 citations


Network Information
Related Topics (5)
Statistical hypothesis testing: 19.5K papers, 1M citations, 82% related
Inference: 36.8K papers, 1.3M citations, 81% related
Multivariate statistics: 18.4K papers, 1M citations, 80% related
Linear model: 19K papers, 1M citations, 80% related
Estimator: 97.3K papers, 2.6M citations, 78% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    75
2022    143
2021    137
2020    185
2019    142
2018    159