
Showing papers on "Latent variable model published in 2018"


Book ChapterDOI
15 Nov 2018
TL;DR: Introduction The Logic of Latent Variables Latent Class Analysis Estimating Latent Categorical Variables Analyzing Scale Response Patterns Comparing Latent Structures Among Groups Conclusions
Abstract: Latent class analysis (LCA) is a statistical method for identifying unobserved groups based on patterns of categorical data. LCA is related to cluster analysis (see Chapter 4, this volume) in that both methods are concerned with the classification of cases (e.g., people or objects) into groups that are not known or specified a priori. In LCA, cases with similar response patterns on a series of manifest variables are classified into the same latent class, with membership in these classes being probabilistic rather than deterministic. LCA can also be viewed as analogous to factor analysis (see Chapter 8, this volume), with the former examining categorical variables and the latter continuous ones; however, this comparison is less direct than in the case of cluster analysis. Nevertheless, both LCA and factor analysis are based on the principle that observed variables are (conditionally) independent given the latent structure. Finally, LCA is related to item response theory (see Chapter 11, this volume) and can be viewed as a generalization of discrete response models such as the Rasch model (Lindsay, Clogg, & Grego, 1991).
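The conditional-independence mechanics described above can be sketched with a toy EM fit of a two-class LCA on binary items (a minimal illustration, not the chapter's own code; the class count, item probabilities, and sample size are all made up):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate two latent classes with distinct response profiles on 4 binary items.
true_p = np.array([[0.9, 0.8, 0.9, 0.7],   # class 0 endorsement probabilities
                   [0.2, 0.3, 0.1, 0.2]])  # class 1
z = rng.integers(0, 2, size=500)
X = (rng.random((500, 4)) < true_p[z]).astype(float)

# EM for a 2-class latent class model (items independent given the class).
pi = np.array([0.5, 0.5])                      # class proportions
p = np.array([[0.6, 0.6, 0.6, 0.6],            # initial item probabilities
              [0.4, 0.4, 0.4, 0.4]])
for _ in range(200):
    # E-step: posterior class membership (probabilistic, not deterministic)
    log_lik = X @ np.log(p).T + (1 - X) @ np.log(1 - p).T + np.log(pi)
    resp = np.exp(log_lik - log_lik.max(axis=1, keepdims=True))
    resp /= resp.sum(axis=1, keepdims=True)
    # M-step: update class proportions and item endorsement probabilities
    pi = resp.mean(axis=0)
    p = np.clip((resp.T @ X) / resp.sum(axis=0)[:, None], 1e-6, 1 - 1e-6)

print(pi)  # recovered class proportions
```

The posterior memberships in `resp` stay probabilistic, which is exactly the "probabilistic rather than deterministic" classification the abstract describes.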

817 citations


Posted Content
TL;DR: This work shows that variational latent variable models that explicitly model underlying stochasticity and adversarially-trained models that aim to produce naturalistic images are in fact complementary, and combines the two to produce predictions that look more realistic to human raters and better cover the range of possible futures.
Abstract: Being able to predict what may happen in the future requires an in-depth understanding of the physical and causal rules that govern the world. A model that is able to do so has a number of appealing applications, from robotic planning to representation learning. However, learning to predict raw future observations, such as frames in a video, is exceedingly challenging -- the ambiguous nature of the problem can cause a naively designed model to average together possible futures into a single, blurry prediction. Recently, this has been addressed by two distinct approaches: (a) variational latent variable models that explicitly model underlying stochasticity and (b) adversarially-trained models that aim to produce naturalistic images. However, a standard latent variable model can struggle to produce realistic results, and a standard adversarially-trained model underutilizes latent variables and fails to produce diverse predictions. We show that these distinct methods are in fact complementary. Combining the two produces predictions that look more realistic to human raters and better cover the range of possible futures. Our method outperforms prior and concurrent work in these aspects.
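The "averaging possible futures into a single, blurry prediction" failure mode can be reproduced in a few lines (a toy illustration with a short 1-D signal standing in for video frames):

```python
import numpy as np

# Two equally likely "futures" for a 1-D signal (stand-ins for video frames).
future_a = np.array([1.0, 0.0, 1.0, 0.0])
future_b = np.array([0.0, 1.0, 0.0, 1.0])

# A deterministic model trained with MSE converges to the per-pixel mean...
candidates = np.linspace(-1, 2, 301)
mse = [(np.mean((c - future_a) ** 2) + np.mean((c - future_b) ** 2)) / 2
       for c in candidates]
best = candidates[int(np.argmin(mse))]

# ...which is the blurry 0.5 everywhere: it matches neither actual future.
print(best)
```

The MSE-optimal constant prediction is 0.5 at every position, far from both sharp futures, which is why the abstract turns to explicit latent stochasticity instead.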

398 citations


Journal ArticleDOI
Zhiqiang Ge1
TL;DR: A tutorial review of probabilistic latent variable models (PLVMs) for process data analytics; detailed illustrations of different kinds of basic PLVMs are provided, along with their current research status.
Abstract: Dimensionality reduction is important for the high-dimensional nature of data in the process industry, which has made latent variable modeling methods popular in recent years. By projecting high-di...

185 citations


Journal ArticleDOI
TL;DR: This article briefly reviews methodological studies in minimal technical detail and provides a demonstration that easily quantifies the large influence measurement quality has on fit index values and how greatly the cutoffs would change if they were derived under an alternative level of measurement quality.
Abstract: Latent variable modeling is a popular and flexible statistical framework. Concomitant with fitting latent variable models is assessment of how well the theoretical model fits the observed data. Alt...

170 citations


Journal ArticleDOI
TL;DR: The aim of this article is to develop a fully Bayesian latent variable method (called iClusterBayes) that can jointly model omics data of continuous and discrete data types for identification of tumor subtypes and relevant omics features.
Abstract: Identification of clinically relevant tumor subtypes and omics signatures is an important task in cancer translational research for precision medicine. Large-scale genomic profiling studies such as The Cancer Genome Atlas (TCGA) Research Network have generated vast amounts of genomic, transcriptomic, epigenomic, and proteomic data. While these studies have provided great resources for researchers to discover clinically relevant tumor subtypes and driver molecular alterations, there are few computationally efficient methods and tools for integrative clustering analysis of these multi-type omics data. Therefore, the aim of this article is to develop a fully Bayesian latent variable method (called iClusterBayes) that can jointly model omics data of continuous and discrete data types for identification of tumor subtypes and relevant omics features. Specifically, the proposed method uses a few latent variables to capture the inherent structure of multiple omics data sets to achieve joint dimension reduction. As a result, the tumor samples can be clustered in the latent variable space and relevant omics features that drive the sample clustering are identified through Bayesian variable selection. This method significantly improves on the existing integrative clustering method iClusterPlus in terms of statistical inference and computational speed. By analyzing TCGA and simulated data sets, we demonstrate the excellent performance of the proposed method in revealing clinically meaningful tumor subtypes and driver omics features.
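iClusterBayes itself is fully Bayesian with variable selection, but the underlying idea of joint dimension reduction through a few shared latent variables can be sketched with a plain SVD on stacked, standardized views (all data simulated; this is a stand-in for the method, not its implementation):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100                                    # "tumor samples"
z = rng.standard_normal((n, 2))            # 2 shared latent variables

# Two "omics" views (e.g. expression and methylation) driven by the same z.
X1 = z @ rng.standard_normal((2, 50)) + 0.1 * rng.standard_normal((n, 50))
X2 = z @ rng.standard_normal((2, 30)) + 0.1 * rng.standard_normal((n, 30))

# Joint dimension reduction: stack standardized views, keep a few factors.
stack = np.hstack([(X - X.mean(0)) / X.std(0) for X in (X1, X2)])
U, S, Vt = np.linalg.svd(stack, full_matrices=False)
latent = U[:, :2] * S[:2]    # samples projected into the shared latent space

print(latent.shape)          # clustering would then proceed in this space
```

Because both views are driven by the same two latent variables, the top two factors of the stacked matrix capture most of the joint variance, which is the structure a subsequent clustering step would exploit.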

149 citations


Journal ArticleDOI
TL;DR: It is argued that focusing on the contrast between dynamic and static modeling approaches has led to an unrealistic view of testing and finding support for the network approach, as well as an oversimplified picture of the relationship between medical diseases and mental disorders.
Abstract: The network approach to psychopathology is becoming increasingly popular. The motivation for this approach is to provide a replacement for the problematic common cause perspective and the associated latent variable model, where symptoms are taken to be mere effects of a common cause (the disorder itself). The idea is that the latent variable model is plausible for medical diseases, but unrealistic for mental disorders, which should rather be conceptualized as networks of directly interacting symptoms. We argue that this rationale for the network approach is misguided. Latent variable (or common cause) models are not inherently problematic, and there is not even a clear boundary where network models end and latent variable (or common cause) models begin. We also argue that focusing on this contrast has led to an unrealistic view of testing and finding support for the network approach, as well as an oversimplified picture of the relationship between medical diseases and mental disorders. As an alternative, we point out more essential contrasts, such as the contrast between dynamic and static modeling approaches that can provide a better framework for conceptualizing mental disorders. Finally, we discuss several topics and open problems that need to be addressed in order to make the network approach more concrete and to move the field of psychological network research forward.

141 citations


Journal ArticleDOI
TL;DR: A broad equivalence is shown between the Ising model and the IRT model, which describes the probability distribution associated with item responses in a psychometric test as a function of a latent variable.
Abstract: In recent years, network models have been proposed as an alternative representation of psychometric constructs such as depression. In such models, the covariance between observables (e.g., symptoms like depressed mood, feelings of worthlessness, and guilt) is explained in terms of a pattern of causal interactions between these observables, which contrasts with classical interpretations in which the observables are conceptualized as the effects of a reflective latent variable. However, few investigations have been directed at the question how these different models relate to each other. To shed light on this issue, the current paper explores the relation between one of the most important network models-the Ising model from physics-and one of the most important latent variable models-the Item Response Theory (IRT) model from psychometrics. The Ising model describes the interaction between states of particles that are connected in a network, whereas the IRT model describes the probability distribution associated with item responses in a psychometric test as a function of a latent variable. Despite the divergent backgrounds of the models, we show a broad equivalence between them and also illustrate several opportunities that arise from this connection.
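The equivalence has a simple concrete core: in an Ising network over {0,1} states, the conditional distribution of one node given the rest is exactly a logistic (IRT-style) response function. A numerical check, with arbitrarily chosen couplings and fields:

```python
import math
import numpy as np

# A tiny Ising network over {0,1} "symptom" states: couplings J, fields tau.
J = np.array([[0.0, 0.8, 0.3],
              [0.8, 0.0, 0.5],
              [0.3, 0.5, 0.0]])
tau = np.array([-0.2, 0.1, 0.4])

def neg_energy(x):
    x = np.asarray(x, dtype=float)
    return tau @ x + 0.5 * x @ J @ x   # 0.5 because J counts each pair twice

# Exact conditional P(x0 = 1 | x1, x2) from the Boltzmann distribution...
x_rest = (1, 0)
num = math.exp(neg_energy((1,) + x_rest))
p_conditional = num / (num + math.exp(neg_energy((0,) + x_rest)))

# ...equals a logistic (IRT-style) response function of the neighbour states,
# with the neighbours playing the role of the latent trait.
eta = tau[0] + J[0, 1] * x_rest[0] + J[0, 2] * x_rest[1]
p_logistic = 1.0 / (1.0 + math.exp(-eta))
print(p_conditional, p_logistic)
```

The two probabilities agree exactly, which is the entry point for the broader Ising-IRT equivalence the paper develops.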

137 citations


Journal ArticleDOI
TL;DR: DMF is compared with state-of-the-art linear and nonlinear matrix completion methods on toy matrix completion, image inpainting, and collaborative filtering tasks; DMF also scales to large matrices.

130 citations


Journal ArticleDOI
TL;DR: A selective review of statistical modeling of dynamic networks with latent variables, specifically, the latent space models and the latent class models (or stochastic blockmodels), which investigate both the observed features and the unobserved structure of networks.
Abstract: We present a selective review of statistical modeling of dynamic networks. We focus on models with latent variables, specifically, the latent space models and the latent class models (or stochastic blockmodels), which investigate both the observed features and the unobserved structure of networks. We begin with an overview of the static models, and then we introduce the dynamic extensions. For each dynamic model, we also discuss its applications that have been studied in the literature, with the data source listed in Appendix. Based on the review, we summarize a list of open problems and challenges in dynamic network modeling with latent variables.
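A minimal generative sketch of the latent space model family reviewed here (in the spirit of Hoff-style models, where edge probability decays with latent distance; all parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
n, alpha = 60, 1.0

# Latent space model: each node gets a latent position; the probability of
# an edge decays with the distance between positions.
z = rng.standard_normal((n, 2))
dist = np.linalg.norm(z[:, None, :] - z[None, :, :], axis=-1)
p_edge = 1.0 / (1.0 + np.exp(-(alpha - dist)))

A = (rng.random((n, n)) < p_edge).astype(int)
A = np.triu(A, 1)
A = A + A.T                                   # undirected, no self-loops

# Nodes that are close in the latent space connect far more often.
near = A[(dist > 0) & (dist < 1.0)].mean()
far = A[dist > 3.0].mean()
print(near, far)
```

The dynamic extensions surveyed in the paper let the positions z evolve over time; the static snapshot above is the building block.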

118 citations


Book ChapterDOI
01 Jan 2018
TL;DR: In this paper, a profile analysis approach is discussed that re-parameterizes the linear latent variable model so that the latent variables can be interpreted in terms of profile patterns rather than factors.
Abstract: MDS is discussed as a profile analysis approach that re-parameterizes the linear latent variable model so that the latent variables can be interpreted in terms of profile patterns rather than factors. It is used to identify major patterns among psychological variables and can serve as the basis for further study of correlates and/or predictors of profiles and other background and external variables. I outline the procedure of MDS profile analysis and discuss issues related to parameter estimation and interpretation of the results.

97 citations


Proceedings ArticleDOI
01 Jun 2018
TL;DR: This work addresses challenges in a Gaussian Latent Variable model for sequence prediction with a "Best of Many" sample objective that leads to more accurate and more diverse predictions that better capture the true variations in real-world sequence data.
Abstract: For autonomous agents to successfully operate in the real world, anticipation of future events and states of their environment is a key competence. This problem has been formalized as a sequence extrapolation problem, where a number of observations are used to predict the sequence into the future. Real-world scenarios demand a model of uncertainty of such predictions, as predictions become increasingly uncertain - in particular on long time horizons. While impressive results have been shown on point estimates, scenarios that induce multi-modal distributions over future sequences remain challenging. Our work addresses these challenges in a Gaussian Latent Variable model for sequence prediction. Our core contribution is a "Best of Many" sample objective that leads to more accurate and more diverse predictions that better capture the true variations in real-world sequence data. Beyond our analysis of improved model fit, our models also empirically outperform prior work on three diverse tasks ranging from traffic scenes to weather data.
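The "Best of Many" objective can be sketched directly: only the closest of the K sampled predictions is penalized, so a model that covers several modes is not punished for its diversity (toy numbers below, not the paper's implementation):

```python
import numpy as np

def best_of_many_loss(samples, target):
    """Penalize only the closest of K sampled predictions to the target."""
    errs = np.mean((samples - target) ** 2, axis=-1)   # one error per sample
    return float(errs.min())

# A bimodal ground truth: the observed future is one of two trajectories.
target = np.array([1.0, 1.0, 1.0])
diverse = np.stack([np.full(3, 1.0), np.full(3, -1.0)])   # covers both modes
averaged = np.stack([np.zeros(3), np.zeros(3)])           # blurry compromise

# The best-of-many objective rewards the diverse predictor...
print(best_of_many_loss(diverse, target),    # one sample nails the target
      best_of_many_loss(averaged, target))   # no sample is close
```

Under a mean-over-samples loss both predictors would look similar; taking the minimum is what lets the latent variable carry genuine multi-modality.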

Journal ArticleDOI
TL;DR: A set of switching ARDLV models are proposed in the probabilistic framework, which extends the original single model to its multimode form and a hierarchical fault detection method is developed for process monitoring in the multimode processes.
Abstract: In most industrial processes, dynamic characteristics are common and deserve close attention for process control and monitoring purposes. As a high-order Bayesian network model, the autoregressive dynamic latent variable (ARDLV) model is able to effectively extract both autocorrelations and cross-correlations from data in a dynamic process. However, operating conditions are frequently changed on a real production line, which means that the measurements cannot be described by a single steady-state model. In this paper, a set of switching ARDLV models is proposed in a probabilistic framework, which extends the original single model to its multimode form. Based on this, a hierarchical fault detection method is developed for process monitoring in multimode processes. Finally, the proposed method is demonstrated on a numerical example and a real predecarburization unit in an ammonia synthesis process.
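Stripped of the switching and Bayesian machinery, the backbone of an ARDLV-style model is a latent autoregression observed through several sensors. A toy simulation (all parameters illustrative) shows how both autocorrelation and cross-correlation in the measurements are inherited from the latent state:

```python
import numpy as np

rng = np.random.default_rng(4)
T = 2000

# Latent dynamics: z_t = a * z_{t-1} + process noise (autocorrelation source)
a = 0.9
z = np.zeros(T)
for t in range(1, T):
    z[t] = a * z[t - 1] + rng.standard_normal()

# Observations: several sensors load on the same latent state
C = np.array([1.0, 0.7, -0.5])
X = z[:, None] * C + 0.2 * rng.standard_normal((T, 3))

# Both autocorrelation and cross-correlation are inherited from z.
auto = np.corrcoef(X[:-1, 0], X[1:, 0])[0, 1]
cross = np.corrcoef(X[:, 0], X[:, 1])[0, 1]
print(auto, cross)
```

The switching extension in the paper would let the parameters (here a and C) change with the operating mode.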

Posted Content
TL;DR: This paper frame meta learning as a hierarchical latent variable model and infer the relationship between tasks automatically from data and shows that this results in up to a 60% reduction in the average interaction time needed to solve tasks compared to strong baselines.
Abstract: Learning from small data sets is critical in many practical applications where data collection is time consuming or expensive, e.g., robotics, animal experiments or drug design. Meta learning is one way to increase the data efficiency of learning algorithms by generalizing learned concepts from a set of training tasks to unseen, but related, tasks. Often, this relationship between tasks is hard coded or relies in some other way on human expertise. In this paper, we frame meta learning as a hierarchical latent variable model and infer the relationship between tasks automatically from data. We apply our framework in a model-based reinforcement learning setting and show that our meta-learning model effectively generalizes to novel tasks by identifying how new tasks relate to prior ones from minimal data. This results in up to a 60% reduction in the average interaction time needed to solve tasks compared to strong baselines.

Journal ArticleDOI
TL;DR: This article proposes a likelihood-based method to estimate the latent structure from the data that outperforms the existing approaches and establishes identifiability conditions that ensure the estimability of the structure matrix.
Abstract: This article focuses on a family of restricted latent structure models with wide applications in psychological and educational assessment, where the model parameters are restricted via a latent structure matrix to reflect prespecified assumptions on the latent attributes. Such a latent matrix is often provided by experts and assumed to be correct upon construction, yet it may be subjective and misspecified. Recognizing this problem, researchers have been developing methods to estimate the matrix from data. However, the fundamental issue of the identifiability of the latent structure matrix has not been addressed until now. The first goal of this article is to establish identifiability conditions that ensure the estimability of the structure matrix. With the theoretical development, the second part of the article proposes a likelihood-based method to estimate the latent structure from the data. Simulation studies show that the proposed method outperforms the existing approaches. We further illustra...
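A concrete instance of such a latent structure matrix is the Q-matrix of the DINA model, where an item's response probability is restricted by the attributes it requires (a hedged toy with made-up slip and guess parameters, for illustration only):

```python
import numpy as np

# Q-matrix: which latent attributes each item requires (rows are items).
Q = np.array([[1, 0],
              [0, 1],
              [1, 1]])
slip, guess = 0.1, 0.2

def response_probs(attributes):
    """DINA: P(correct) is high only if ALL required attributes are mastered."""
    eta = np.all(attributes >= Q, axis=1)          # ideal response per item
    return np.where(eta, 1 - slip, guess)

print(response_probs(np.array([1, 0])))   # masters attribute 1 only
```

Misspecifying a single entry of Q changes which attribute profiles can be told apart, which is why the identifiability conditions the article establishes matter.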

Proceedings Article
06 Aug 2018
TL;DR: In this paper, meta learning is used to increase the data efficiency of RL by generalizing learned concepts from a set of training tasks to unseen, but related, tasks by inferring the relationship between tasks automatically from data.
Abstract: Learning from small data sets is critical in many practical applications where data collection is time consuming or expensive, e.g., robotics, animal experiments or drug design. Meta learning is one way to increase the data efficiency of learning algorithms by generalizing learned concepts from a set of training tasks to unseen, but related, tasks. Often, this relationship between tasks is hard coded or relies in some other way on human expertise. In this paper, we frame meta learning as a hierarchical latent variable model and infer the relationship between tasks automatically from data. We apply our framework in a model-based reinforcement learning setting and show that our meta-learning model effectively generalizes to novel tasks by identifying how new tasks relate to prior ones from minimal data. This results in up to a 60% reduction in the average interaction time needed to solve tasks compared to strong baselines.

Proceedings Article
29 Apr 2018
TL;DR: A cooperative learning algorithm to train both the undirected energy-based model and the directed latent variable model jointly is proposed and it is shown that the cooperativelearning algorithm can learn realistic models of images.
Abstract: This paper proposes a cooperative learning algorithm to train both the undirected energy-based model and the directed latent variable model jointly. The learning algorithm interweaves the maximum likelihood algorithms for learning the two models, and each iteration consists of the following two steps: (1) Modified contrastive divergence for energy-based model: The learning of the energy-based model is based on the contrastive divergence, but the finite-step MCMC sampling of the model is initialized from the synthesized examples generated by the latent variable model instead of being initialized from the observed examples. (2) MCMC teaching of the latent variable model: The learning of the latent variable model is based on how the MCMC in (1) changes the initial synthesized examples generated by the latent variable model, where the latent variables that generate the initial synthesized examples are known so that the learning is essentially supervised. Our experiments show that the cooperative learning algorithm can learn realistic models of images.

Journal ArticleDOI
TL;DR: Estimated standard errors are derived for the two-step estimates of the structural model which account for the uncertainty from both steps of the estimation, and how the method can be implemented in existing software for latent variable modelling is shown.
Abstract: We consider models which combine latent class measurement models for categorical latent variables with structural regression models for the relationships between the latent classes and observed explanatory and response variables. We propose a two-step method of estimating such models. In its first step, the measurement model is estimated alone, and in the second step the parameters of this measurement model are held fixed when the structural model is estimated. Simulation studies and applied examples suggest that the two-step method is an attractive alternative to existing one-step and three-step methods. We derive estimated standard errors for the two-step estimates of the structural model which account for the uncertainty from both steps of the estimation, and show how the method can be implemented in existing software for latent variable modelling.
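The two-step logic can be sketched on a toy mixture: step 1 estimates the measurement model alone, and step 2 freezes it while estimating the structural regression. This is a simplified stand-in for the paper's latent class setting (one continuous indicator instead of categorical items, grid search instead of a proper optimizer; all data simulated):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 2000

# Ground truth: covariate x shifts latent class membership; class shifts y.
x = rng.standard_normal(n)
p1 = 1 / (1 + np.exp(-(0.5 + 1.0 * x)))        # structural model (true)
c = (rng.random(n) < p1).astype(int)
mu_true = np.array([-2.0, 2.0])
y = mu_true[c] + rng.standard_normal(n)        # measurement model (true)

# --- Step 1: estimate the measurement model ALONE (EM, equal priors) ---
mu = np.array([-1.0, 1.0])
for _ in range(100):
    d = np.stack([np.exp(-0.5 * (y - m) ** 2) for m in mu])
    r = d[1] / d.sum(axis=0)                   # P(class 1 | y)
    mu = np.array([np.average(y, weights=1 - r), np.average(y, weights=r)])

# --- Step 2: hold mu FIXED, estimate the structural coefficients (grid) ---
def loglik(b0, b1):
    p = 1 / (1 + np.exp(-(b0 + b1 * x)))
    f = ((1 - p) * np.exp(-0.5 * (y - mu[0]) ** 2)
         + p * np.exp(-0.5 * (y - mu[1]) ** 2))
    return float(np.log(f).sum())

grid = np.linspace(-2, 2, 41)
b0, b1 = max(((a, b) for a in grid for b in grid), key=lambda t: loglik(*t))
print(mu, b0, b1)
```

Because step 2 treats the measurement parameters as known constants, the naive standard errors would understate uncertainty; the paper's contribution is precisely the corrected standard errors that account for both steps.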

Proceedings ArticleDOI
12 Oct 2018
TL;DR: The Variational Shape Learner (VSL) as discussed by the authors learns the underlying structure of voxelized 3D shapes in an unsupervised fashion through the use of skip-connections.
Abstract: We propose the Variational Shape Learner (VSL), a generative model that learns the underlying structure of voxelized 3D shapes in an unsupervised fashion. Through the use of skip-connections, our model can successfully learn and infer a latent, hierarchical representation of objects. Furthermore, realistic 3D objects can be easily generated by sampling the VSL's latent probabilistic manifold. We show that our generative model can be trained end-to-end from 2D images to perform single image 3D model retrieval. Experiments show, both quantitatively and qualitatively, the improved generalization of our proposed model over a range of tasks, performing better or comparable to various state-of-the-art alternatives.

Journal ArticleDOI
TL;DR: In this paper, the authors combine the free energy principle and the ensuing active inference dynamics with recent advances in variational inference in deep generative models, and evolution strategies to introduce the deep active inference agent, which minimises a variational free energy bound on the average surprise of its sensations, motivated by a homeostatic argument.
Abstract: This work combines the free energy principle and the ensuing active inference dynamics with recent advances in variational inference in deep generative models, and evolution strategies to introduce the “deep active inference” agent. This agent minimises a variational free energy bound on the average surprise of its sensations, which is motivated by a homeostatic argument. It does so by optimising the parameters of a generative latent variable model of its sensory inputs, together with a variational density approximating the posterior distribution over the latent variables, given its observations, and by acting on its environment to actively sample input that is likely under this generative model. The internal dynamics of the agent are implemented using deep and recurrent neural networks, as used in machine learning, making the deep active inference agent a scalable and very flexible class of active inference agent. Using the mountain car problem, we show how goal-directed behaviour can be implemented by defining appropriate priors on the latent states in the agent’s model. Furthermore, we show that the deep active inference agent can learn a generative model of the environment, which can be sampled from to understand the agent’s beliefs about the environment and its interaction therewith.

Journal ArticleDOI
TL;DR: The unsupervised method uses a conditional variational autoencoder network and constrain transformations to be symmetric and diffeomorphic by applying a differentiable exponentiation layer with a symmetric loss function and provides multi-scale velocity field estimations.
Abstract: We propose to learn a low-dimensional probabilistic deformation model from data which can be used for registration and the analysis of deformations. The latent variable model maps similar deformations close to each other in an encoding space. It makes it possible to compare deformations, to generate normal or pathological deformations for any new image, and to transport deformations from one image pair to any other image. Our unsupervised method is based on variational inference. In particular, we use a conditional variational autoencoder (CVAE) network and constrain transformations to be symmetric and diffeomorphic by applying a differentiable exponentiation layer with a symmetric loss function. We also present a formulation that includes spatial regularization such as diffusion-based filters. Additionally, our framework provides multi-scale velocity field estimations. We evaluated our method on 3-D intra-subject registration using 334 cardiac cine-MRIs. On this dataset, our method showed state-of-the-art performance with a mean DICE score of 81.2% and a mean Hausdorff distance of 7.3 mm using 32 latent dimensions, compared to three state-of-the-art methods, while also demonstrating more regular deformation fields. The average time per registration was 0.32 s. In addition, we visualize the learned latent space and show that the encoded deformations can be used to transport deformations and to cluster diseases, with a classification accuracy of 83% after applying a linear projection.
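Exponentiation layers of this kind are typically implemented by scaling and squaring of a stationary velocity field: the field is scaled down, applied once, and then composed with itself repeatedly. A 1-D sketch (constant velocity, illustrative grid and step count, not the paper's code):

```python
import numpy as np

x = np.linspace(0, 1, 201)                 # 1-D "image" grid
v = 0.05 * np.ones_like(x)                 # stationary velocity field (constant)

def exponentiate(v, steps=6):
    """Scaling and squaring: phi = exp(v) via repeated self-composition."""
    phi = x + v / (2 ** steps)             # small initial displacement
    for _ in range(steps):
        phi = np.interp(phi, x, phi)       # phi <- phi o phi (linear interp)
    return phi

phi = exponentiate(v)
print(phi[:3])     # approximately the grid shifted by 0.05
```

In the interior of the domain the result is a smooth, strictly increasing map (the 1-D analogue of a diffeomorphism); in the paper the same construction runs on 3-D velocity fields inside the network, with the boundary handled by the image grid.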

Book ChapterDOI
31 Jan 2018
TL;DR: Equivalence studies are coming of age in cross-cultural studies as discussed by the authors. Yet the burgeoning of the field has not led to a convergence in conceptualizations, methods, and analyses.
Abstract: Equivalence studies are coming of age. Thirty years ago there were few conceptual models and statistical techniques to address sources of systematic measurement error in cross-cultural studies (for early examples, see Cleary & Hilton, 1968; Lord, 1977, 1980; Poortinga, 1971). This picture has changed; in the last decades conceptual models and statistical techniques have been developed and refined. Many empirical examples have been published. There is a growing awareness of the importance of the field for the advancement of cross-cultural theorizing. An increasing number of journals require authors who submit manuscripts of cross-cultural studies to present evidence supporting the equivalence of the study measures. Yet the burgeoning of the field has not led to a convergence in conceptualizations, methods, and analyses. For example, educational testing focuses on the analysis of items as sources of problems of cross-cultural comparisons, often using item response theory (e.g., Emenogu & Childs, 2005). In personality psychology, exploratory factor analysis is commonly applied as a tool to examine similarity of factors underlying a questionnaire (e.g., McCrae, 2002). In survey research and marketing, structural equation modeling (SEM) is most frequently employed (e.g., Steenkamp & Baumgartner, 1998). From a theoretical perspective, these models are related; for example, the relationship of item response theory and confirmatory factor analysis (as derived from a general latent variable model) has been described by Brown (2015; see also Glockner-Rist & Hoijtink, 2003). However, from a practical perspective, the models can be seen as relatively independent paradigms; for a critical outsider the link between substantive field and analysis method is rather arbitrary and difficult to comprehend.

Journal ArticleDOI
TL;DR: The paper presents insightful equivalence between the classical multivariate techniques for process monitoring and their probabilistic counterparts, which is obtained by restricting the generalized model.

Journal ArticleDOI
TL;DR: In this article, the interrelation of normative beliefs and modality styles is studied, which are an individual's perception of the beliefs of others regarding a specific behaviour, and modal styles represent the part of an individual’s lifestyle that is characterised by the use of a certain set of modes.
Abstract: We study the interrelation of normative beliefs, which are an individual’s perception of the beliefs of others regarding a specific behaviour, and modality styles, which represent the part of an individual’s lifestyle that is characterised by the use of a certain set of modes. In recent years, travel behaviour research has increasingly sought to understand the effect of social influence on mobility-related behaviour. One stream of literature has adopted representations rooted in social psychology to explain behaviour as a function of latent psycho-social constructs including normative beliefs. Another stream of literature has employed a lifestyle-oriented approach to identify segments or modality styles within a population that differ in terms of their orientation towards different modes of transport. Our study proposes an integrated conceptual framework that combines elements of these two streams of literature. Modality styles are hypothesised to be a function of normative beliefs towards the use of different modes of transport; mobility-related attitudes and behaviours are in turn hypothesised to be functions of modality styles. The conceptual model is operationalised using a latent class and latent variable model and empirically validated using data collected through an Australian consumer panel. We demonstrate how this integrated model framework may be used to understand the relationship between normative beliefs, modality styles and travel behaviour. In addition, we show how the model can be applied to predict how extant modality styles and patterns of travel behaviour may change over time in response to concurrent shifts in normative beliefs.

Proceedings Article
Dinghan Shen1, Yizhe Zhang1, Ricardo Henao1, Qinliang Su1, Lawrence Carin1 
01 Jan 2018
TL;DR: The authors employ deconvolutional networks as the sequence decoder (generator), providing learned latent codes with more semantic information and better generalization, and apply it to text sequence matching problems.
Abstract: A latent-variable model is introduced for text matching, inferring sentence representations by jointly optimizing generative and discriminative objectives. To alleviate typical optimization challenges in latent-variable models for text, we employ deconvolutional networks as the sequence decoder (generator), providing learned latent codes with more semantic information and better generalization. Our model, trained in an unsupervised manner, yields stronger empirical predictive performance than a decoder based on Long Short-Term Memory (LSTM), with less parameters and considerably faster training. Further, we apply it to text sequence-matching problems. The proposed model significantly outperforms several strong sentence-encoding baselines, especially in the semi-supervised setting.

Posted Content
TL;DR: In this paper, the authors study the problem of word ambiguity in definition modeling and propose a possible solution by employing latent variable modeling and soft attention mechanisms; taking word ambiguity and polysemy into account leads to performance improvements.
Abstract: We explore the recently introduced definition modeling technique, which provides a tool for evaluating different distributed vector representations of words by modeling dictionary definitions of words. In this work, we study the problem of word ambiguity in definition modeling and propose a possible solution that employs latent variable modeling and soft attention mechanisms. Our quantitative and qualitative evaluation and analysis of the model show that taking word ambiguity and polysemy into account leads to performance improvements.

Posted Content
TL;DR: In this article, a modification of the VAE is proposed, called a Variational Homoencoder (VHE), which produces a hierarchical latent variable model which better utilises latent variables.
Abstract: Hierarchical Bayesian methods can unify many related tasks (e.g. k-shot classification, conditional and unconditional generation) as inference within a single generative model. However, when this generative model is expressed as a powerful neural network such as a PixelCNN, we show that existing learning techniques typically fail to effectively use latent variables. To address this, we develop a modification of the Variational Autoencoder in which encoded observations are decoded to new elements from the same class. This technique, which we call a Variational Homoencoder (VHE), produces a hierarchical latent variable model which better utilises latent variables. We use the VHE framework to learn a hierarchical PixelCNN on the Omniglot dataset, which outperforms all existing models on test set likelihood and achieves strong performance on one-shot generation and classification tasks. We additionally validate the VHE on natural images from the YouTube Faces database. Finally, we develop extensions of the model that apply to richer dataset structures such as factorial and hierarchical categories.

Posted Content
TL;DR: This tutorial explores, through the lens of variational inference, the difficulties that arise when conditional likelihoods in latent variable models are parameterized with powerful function approximators.
Abstract: There has been much recent, exciting work on combining the complementary strengths of latent variable models and deep learning. Latent variable modeling makes it easy to explicitly specify model constraints through conditional independence properties, while deep learning makes it possible to parameterize these conditional likelihoods with powerful function approximators. While these "deep latent variable" models provide a rich, flexible framework for modeling many real-world phenomena, difficulties exist: deep parameterizations of conditional likelihoods usually make posterior inference intractable, and latent variable objectives often complicate backpropagation by introducing points of non-differentiability. This tutorial explores these issues in depth through the lens of variational inference.
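The variational objective at the center of this line of work is the evidence lower bound (ELBO). A minimal sketch in plain Python, for a diagonal-Gaussian approximate posterior with a standard-normal prior (the standard VAE setting; function names are illustrative):

```python
import math

def kl_diag_gaussian_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) ),
    summed over latent dimensions."""
    return 0.5 * sum(
        math.exp(lv) + m * m - 1.0 - lv
        for m, lv in zip(mu, log_var)
    )

def elbo(log_likelihood, mu, log_var):
    """Single-sample evidence lower bound:
    E_q[log p(x|z)] - KL(q(z|x) || p(z))."""
    return log_likelihood - kl_diag_gaussian_to_standard_normal(mu, log_var)

# When q(z|x) equals the prior (mu = 0, log_var = 0), the KL term vanishes.
print(kl_diag_gaussian_to_standard_normal([0.0, 0.0], [0.0, 0.0]))  # 0.0
```

In a deep latent variable model the reconstruction term `log_likelihood` comes from the neural decoder, and the non-differentiability issues mentioned above arise in estimating its gradient through the sampling of `z`.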

Posted Content
TL;DR: In this paper, the authors combine a standard visual model for object detection, based on convolutional neural networks, with a latent variable model for link prediction, and show that the integration of a semantic model using link prediction methods can significantly improve the results for visual relationship detection.
Abstract: Structured scene descriptions of images are useful for the automatic processing and querying of large image databases. We show how the combination of a semantic and a visual statistical model can improve on the task of mapping images to their associated scene description. In this paper we consider scene descriptions which are represented as a set of triples (subject, predicate, object), where each triple consists of a pair of visual objects, which appear in the image, and the relationship between them (e.g. man-riding-elephant, man-wearing-hat). We combine a standard visual model for object detection, based on convolutional neural networks, with a latent variable model for link prediction. We apply multiple state-of-the-art link prediction methods and compare their capability for visual relationship detection. One of the main advantages of link prediction methods is that they can also generalize to triples, which have never been observed in the training data. Our experimental results on the recently published Stanford Visual Relationship dataset, a challenging real world dataset, show that the integration of a semantic model using link prediction methods can significantly improve the results for visual relationship detection. Our combined approach achieves superior performance compared to the state-of-the-art method from the Stanford computer vision group.
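Link prediction over (subject, predicate, object) triples typically scores a candidate triple from learned embeddings. The abstract does not name the specific methods compared, so the sketch below uses DistMult, one common scoring function, with tiny hand-picked (untrained) embeddings purely for illustration:

```python
def distmult_score(subj_emb, rel_emb, obj_emb):
    """DistMult: score(s, p, o) = sum_k e_s[k] * r_p[k] * e_o[k].
    Higher scores mean the triple is judged more plausible."""
    return sum(s * r * o for s, r, o in zip(subj_emb, rel_emb, obj_emb))

# Illustrative 2-d embeddings (not trained weights).
man = [1.0, 0.2]
riding = [0.9, 0.1]
elephant = [0.8, 0.5]
hat = [0.1, 0.9]

# A semantic model can prefer man-riding-elephant over man-riding-hat
# even if neither triple appeared in the training data.
print(distmult_score(man, riding, elephant) > distmult_score(man, riding, hat))  # True
```

This generalization to unseen triples is exactly the advantage the abstract highlights: the score factorizes over embeddings, so any subject/predicate/object combination can be evaluated.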

Journal ArticleDOI
TL;DR: A novel multiview learning model based on the Gaussian process latent variable model (GPLVM) to learn a set of nonlinear and nonparametric mapping functions and obtain a shared latent variable in the manifold space is proposed.
Abstract: Multiview learning reveals the latent correlation among different modalities and utilizes the complementary information to achieve a better performance in many applications. In this paper, we propose a novel multiview learning model based on the Gaussian process latent variable model (GPLVM) to learn a set of nonlinear and nonparametric mapping functions and obtain a shared latent variable in the manifold space. Different from previous work on the GPLVM, the proposed shared autoencoder Gaussian process (SAGP) latent variable model assumes that there is an additional mapping from the observed data to the shared manifold space. Due to the introduction of the autoencoder framework, both nonlinear projections from and to the observation are considered simultaneously. Additionally, instead of the fully connected layers used in the conventional autoencoder, the SAGP achieves the mappings using the GP, which remarkably reduces the number of estimated parameters and avoids overfitting. To make the proposed method suitable for classification, a discriminative regularization is embedded into it. In the optimization process, an efficient algorithm based on the alternating direction method and gradient descent techniques is designed to solve the encoder and decoder parts alternately. Experimental results on three real-world data sets substantiate the effectiveness and superiority of the proposed approach compared with the state of the art.
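The "remarkably reduced number of estimated parameters" in GP-based mappings comes from the fact that a GP is specified by a kernel with only a handful of hyperparameters, rather than by weight matrices. A minimal sketch of the squared-exponential (RBF) kernel and its Gram matrix, the standard default in GPLVM work (the abstract does not state which kernel SAGP uses, so this is illustrative):

```python
import math

def rbf_kernel(x1, x2, length_scale=1.0, variance=1.0):
    """Squared-exponential (RBF) covariance between two points:
    k(x1, x2) = variance * exp(-||x1 - x2||^2 / (2 * length_scale^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x1, x2))
    return variance * math.exp(-0.5 * sq_dist / length_scale ** 2)

def gram_matrix(points, **kernel_params):
    """Covariance (Gram) matrix K with K[i][j] = k(x_i, x_j); this matrix,
    not a set of layer weights, carries the GP mapping."""
    return [[rbf_kernel(p, q, **kernel_params) for q in points] for p in points]

K = gram_matrix([[0.0], [1.0], [2.0]])
print(K[0][0])  # 1.0 -- a point has unit covariance with itself
```

Only `length_scale` and `variance` are learned per mapping, which is the source of the overfitting resistance the abstract claims relative to fully connected encoders.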

Posted Content
TL;DR: This work proposes to model the interaction between visual and textual features for multi-modal neural machine translation (MMT) through a latent variable model, and shows that the latent variable MMT formulation improves considerably over strong baselines.
Abstract: In this work, we propose to model the interaction between visual and textual features for multi-modal neural machine translation (MMT) through a latent variable model. This latent variable can be seen as a multi-modal stochastic embedding of an image and its description in a foreign language. It is used in a target-language decoder and also to predict image features. Importantly, our model formulation utilises visual and textual inputs during training but does not require that images be available at test time. We show that our latent variable MMT formulation improves considerably over strong baselines, including a multi-task learning approach (Elliott and Kadar, 2017) and a conditional variational auto-encoder approach (Toyama et al., 2016). Finally, we show improvements due to (i) predicting image features in addition to only conditioning on them, (ii) imposing a constraint on the minimum amount of information encoded in the latent variable, and (iii) by training on additional target-language image descriptions (i.e. synthetic data).
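Improvement (ii), a constraint on the minimum amount of information in the latent variable, is commonly implemented as a "free bits" floor on the per-dimension KL term, so the optimizer cannot drive the latent to zero information. A plain-Python sketch of that clamping (the exact mechanism in the paper may differ; names are illustrative):

```python
def free_bits_kl(kl_per_dim, lambda_free_bits):
    """Clamp each latent dimension's KL contribution from below:
    dimensions already above the floor pay their true KL, collapsed
    dimensions pay the floor, removing the incentive to collapse."""
    return sum(max(kl, lambda_free_bits) for kl in kl_per_dim)

# Two near-collapsed dimensions are lifted to the 0.1-nat floor.
print(round(free_bits_kl([0.01, 0.5, 0.0], lambda_free_bits=0.1), 6))  # 0.7
```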