
Showing papers on "Probabilistic latent semantic analysis published in 2017"


Proceedings ArticleDOI
21 Jul 2017
TL;DR: A novel Latent Multi-view Subspace Clustering method that clusters data points with a latent representation and simultaneously explores underlying complementary information from multiple views, making the subspace representation more accurate and robust.
Abstract: In this paper, we propose a novel Latent Multi-view Subspace Clustering (LMSC) method, which clusters data points with latent representation and simultaneously explores underlying complementary information from multiple views. Unlike most existing single view subspace clustering methods that reconstruct data points using original features, our method seeks the underlying latent representation and simultaneously performs data reconstruction based on the learned latent representation. With the complementarity of multiple views, the latent representation could depict data themselves more comprehensively than each single view individually, accordingly making the subspace representation more accurate and robust. The proposed method is intuitive and can be optimized efficiently by using the Augmented Lagrangian Multiplier with Alternating Direction Minimization (ALM-ADM) algorithm. Extensive experiments on benchmark datasets have validated the effectiveness of our proposed method.

357 citations
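The ALM-ADM optimizer named above follows the generic ADMM pattern: alternate closed-form updates of each variable block, then update the Lagrange multiplier. As a minimal illustrative sketch of that pattern only (a toy one-dimensional lasso-style problem, not the LMSC algorithm itself), the loop below minimizes ½(x − a)² + λ|z| subject to x = z:

```python
def soft_threshold(v, t):
    # Proximal operator of t*|.|: shrink v toward zero by t.
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def admm_lasso_1d(a, lam, rho=1.0, iters=100):
    """Toy ALM-ADM loop: min_x 0.5*(x - a)**2 + lam*|z|  s.t.  x = z."""
    x = z = u = 0.0  # primal variables and scaled multiplier
    for _ in range(iters):
        x = (a + rho * (z - u)) / (1.0 + rho)  # x-update (quadratic, closed form)
        z = soft_threshold(x + u, lam / rho)   # z-update (proximal step)
        u = u + x - z                          # multiplier (dual) update
    return x

# The known closed-form solution is soft_threshold(a, lam), i.e. 2.0 here.
print(round(admm_lasso_1d(3.0, 1.0), 4))
```

The same alternate-then-dual-update structure carries over when the blocks are matrices (latent representation, reconstruction, subspace coefficients) rather than scalars.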


Posted Content
TL;DR: In this article, the ambiguity of the mapping is distilled into a low-dimensional latent vector that can be randomly sampled at test time; encouraging an invertible connection between the output and the latent code helps prevent a many-to-one mapping during training.
Abstract: Many image-to-image translation problems are ambiguous, as a single input image may correspond to multiple possible outputs. In this work, we aim to model a \emph{distribution} of possible outputs in a conditional generative modeling setting. The ambiguity of the mapping is distilled in a low-dimensional latent vector, which can be randomly sampled at test time. A generator learns to map the given input, combined with this latent code, to the output. We explicitly encourage the connection between output and the latent code to be invertible. This helps prevent a many-to-one mapping from the latent code to the output during training, also known as the problem of mode collapse, and produces more diverse results. We explore several variants of this approach by employing different training objectives, network architectures, and methods of injecting the latent code. Our proposed method encourages bijective consistency between the latent encoding and output modes. We present a systematic comparison of our method and other variants on both perceptual realism and diversity.

278 citations


Journal ArticleDOI
TL;DR: Zhang et al. as mentioned in this paper proposed a stagewise bidirectional latent embedding framework of two subsequent learning stages for zero-shot visual recognition, where the bottom-up stage explores the topological and labeling information underlying training data of known classes via a proper supervised subspace learning algorithm, and the latent embeddings of training data are used to form landmarks that guide embedding semantics underlying unseen classes into this learned latent space.
Abstract: Zero-shot learning for visual recognition, e.g., object and action recognition, has recently attracted a lot of attention. However, it still remains challenging in bridging the semantic gap between visual features and their underlying semantics and transferring knowledge to semantic categories unseen during learning. Unlike most of the existing zero-shot visual recognition methods, we propose a stagewise bidirectional latent embedding framework of two subsequent learning stages for zero-shot visual recognition. In the bottom-up stage, a latent embedding space is first created by exploring the topological and labeling information underlying training data of known classes via a proper supervised subspace learning algorithm, and the latent embeddings of training data are used to form landmarks that guide embedding semantics underlying unseen classes into this learned latent space. In the top-down stage, semantic representations of unseen-class labels in a given label vocabulary are then embedded to the same latent space to preserve the semantic relatedness between all different classes via our proposed semi-supervised Sammon mapping with the guidance of landmarks. Thus, the resultant latent embedding space allows for predicting the label of a test instance with a simple nearest-neighbor rule. To evaluate the effectiveness of the proposed framework, we have conducted extensive experiments on four benchmark datasets in object and action recognition, i.e., AwA, CUB-200-2011, UCF101 and HMDB51. The experimental results under comparative studies demonstrate that our proposed approach yields the state-of-the-art performance under inductive and transductive settings.

144 citations
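The nearest-neighbor prediction rule in the learned latent space is straightforward: embed the test instance, then return the label of the closest class prototype. A hedged sketch (the 2-D prototype embeddings below are hypothetical stand-ins, not outputs of the paper's pipeline):

```python
import math

def nearest_neighbor_label(test_embedding, class_prototypes):
    """Assign the label of the closest class prototype in the shared latent space."""
    return min(class_prototypes,
               key=lambda label: math.dist(test_embedding, class_prototypes[label]))

# Hypothetical latent embeddings of unseen-class label prototypes.
prototypes = {"zebra": [0.9, 0.1], "whale": [0.1, 0.95]}
print(nearest_neighbor_label([0.8, 0.2], prototypes))  # -> zebra
```

In the transductive setting the same rule applies; only the way the prototypes are placed in the latent space changes.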


Journal ArticleDOI
TL;DR: A novel joint binary codes learning method is proposed to map image features to latent semantic features with minimum encoding loss, referred to as latent semantic minimal hashing, which outperforms most state-of-the-art hashing methods.
Abstract: Hashing-based similarity search is an important technique for large-scale query-by-example image retrieval systems, since it provides fast search with computation and memory efficiency. However, it is challenging to design compact codes that represent original features with good performance. Recently, many unsupervised hashing methods have been proposed that focus on preserving the geometric structure similarity of the data in the original feature space, but they have not yet fully refined image features and explored the latent semantic feature embedding in the data simultaneously. To address the problem, in this paper, a novel joint binary codes learning method is proposed to map image features to latent semantic features with minimum encoding loss, referred to as latent semantic minimal hashing. The latent semantic feature is learned based on matrix decomposition to refine the original feature, thereby making the learned feature more discriminative. Moreover, a minimum encoding loss is combined with the latent semantic feature learning process simultaneously, so as to guarantee that the obtained binary codes are discriminative as well. Extensive experiments on several well-known large databases demonstrate that the proposed method outperforms most state-of-the-art hashing methods.

140 citations


Proceedings ArticleDOI
01 Jul 2017
TL;DR: This work formulates a novel framework to jointly seek a low-rank embedding and semantic dictionary to link visual features with their semantic representations, which manages to capture shared features across different observed classes.
Abstract: Zero-shot learning for visual recognition has received much interest in the most recent years. However, the semantic gap across visual features and their underlying semantics is still the biggest obstacle in zero-shot learning. To fight off this hurdle, we propose an effective Low-rank Embedded Semantic Dictionary learning (LESD) through ensemble strategy. Specifically, we formulate a novel framework to jointly seek a low-rank embedding and semantic dictionary to link visual features with their semantic representations, which manages to capture shared features across different observed classes. Moreover, ensemble strategy is adopted to learn multiple semantic dictionaries to constitute the latent basis for the unseen classes. Consequently, our model could extract a variety of visual characteristics within objects, which can be well generalized to unknown categories. Extensive experiments on several zero-shot benchmarks verify that the proposed model can outperform the state-of-the-art approaches.

129 citations


Proceedings Article
01 Dec 2017
TL;DR: This work proposes a doubly nonlinear latent variable model that can identify low-dimensional structure underlying apparently high-dimensional spike train data and introduces the decoupled Laplace approximation, a fast approximate inference method that allows us to efficiently optimize the latent path while marginalizing over tuning curves.
Abstract: A large body of recent work focuses on methods for extracting low-dimensional latent structure from multi-neuron spike train data. Most such methods employ either linear latent dynamics or linear mappings from latent space to log spike rates. Here we propose a doubly nonlinear latent variable model that can identify low-dimensional structure underlying apparently high-dimensional spike train data. We introduce the Poisson Gaussian-Process Latent Variable Model (P-GPLVM), which consists of Poisson spiking observations and two underlying Gaussian processes: one governing a temporal latent variable and another governing a set of nonlinear tuning curves. The use of nonlinear tuning curves enables discovery of low-dimensional latent structure even when spike responses exhibit high linear dimensionality (e.g., as found in hippocampal place cell codes). To learn the model from data, we introduce the decoupled Laplace approximation, a fast approximate inference method that allows us to efficiently optimize the latent path while marginalizing over tuning curves. We show that this method outperforms previous Laplace-approximation-based inference methods in both speed of convergence and accuracy. We apply the model to spike trains recorded from hippocampal place cells and show that it compares favorably to a variety of previous methods for latent structure discovery, including variational auto-encoder (VAE) based methods that parametrize the nonlinear mapping from latent space to spike rates with a deep neural network.

89 citations


Book ChapterDOI
23 May 2017
TL;DR: This paper proposes the Embedding-based Topic Model (ETM), which learns latent topics from short texts by aggregating short texts into long pseudo-texts and using a Markov Random Field regularized model that gives correlated words a better chance to be put into the same topic.
Abstract: Inferring topics from the overwhelming amount of short texts has become a critical but challenging task for many content analysis applications. Existing methods such as probabilistic latent semantic analysis (PLSA) and latent Dirichlet allocation (LDA) cannot solve this problem very well, since only very limited word co-occurrence information is available in short texts. This paper studies how to incorporate external word correlation knowledge into short texts to improve the coherence of topic modeling. Based on recent results in word embeddings that learn semantic representations for words from a large corpus, we introduce a novel method, the Embedding-based Topic Model (ETM), to learn latent topics from short texts. ETM not only solves the problem of very limited word co-occurrence information by aggregating short texts into long pseudo-texts, but also utilizes a Markov Random Field regularized model that gives correlated words a better chance to be put into the same topic. Experiments on real-world datasets validate the effectiveness of our model compared with state-of-the-art models.

84 citations
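One way to picture the pseudo-text aggregation step is greedy merging of short texts by embedding similarity, so that merged groups carry richer word co-occurrence statistics. This is an illustrative sketch under assumed toy word vectors; ETM's actual aggregation and MRF regularization are more involved:

```python
def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = lambda x: sum(c * c for c in x) ** 0.5
    return dot / (norm(u) * norm(v))

def aggregate_pseudo_texts(texts, embed, threshold=0.8):
    """Greedily merge short texts whose embeddings are cosine-similar,
    producing longer pseudo-texts with richer co-occurrence statistics."""
    groups = []  # list of (centroid_embedding, member_texts)
    for text in texts:
        vec = embed(text)
        for centroid, members in groups:
            if cosine(centroid, vec) >= threshold:
                members.append(text)
                break
        else:
            groups.append((vec, [text]))
    return [" ".join(members) for _, members in groups]

# Hypothetical 2-D word embeddings standing in for vectors trained on a large corpus.
toy_vectors = {"nba": [1.0, 0.0], "game": [0.9, 0.1],
               "stock": [0.0, 1.0], "market": [0.1, 0.9]}

def avg_embed(text):
    vecs = [toy_vectors[w] for w in text.split() if w in toy_vectors]
    return [sum(col) / len(vecs) for col in zip(*vecs)]

pseudo = aggregate_pseudo_texts(["nba game", "stock market", "game nba"], avg_embed)
print(pseudo)  # the two sports texts merge; the finance text stays separate
```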


Journal ArticleDOI
TL;DR: A unified rank-based approach to estimating the correlation matrix of latent variables is proposed, and a concentration inequality for the proposed rank-based estimator is established, showing that the method achieves the same rates of convergence for precision matrix estimation and graph recovery as if the latent variables were observed.
Abstract: Summary We propose a semiparametric latent Gaussian copula model for modelling mixed multivariate data, which contain a combination of both continuous and binary variables. The model assumes that the observed binary variables are obtained by dichotomizing latent variables that satisfy the Gaussian copula distribution. The goal is to infer the conditional independence relationship between the latent random variables, based on the observed mixed data. Our work has two main contributions: we propose a unified rank-based approach to estimate the correlation matrix of latent variables; we establish the concentration inequality of the proposed rank-based estimator. Consequently, our methods achieve the same rates of convergence for precision matrix estimation and graph recovery, as if the latent variables were observed. The methods proposed are numerically assessed through extensive simulation studies, and real data analysis.

76 citations
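Rank-based estimation of a latent Gaussian copula correlation typically bridges Kendall's τ to the latent Pearson correlation; for two continuous margins the standard bridge is r = sin(πτ/2), while the binary and mixed cases require different bridge functions, which is part of what the paper develops. A minimal sketch of the continuous-margin case:

```python
import math
from itertools import combinations

def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) pairs over all pairs."""
    n = len(x)
    s = 0
    for i, j in combinations(range(n), 2):
        prod = (x[i] - x[j]) * (y[i] - y[j])
        s += prod > 0  # concordant pair
        s -= prod < 0  # discordant pair
    return s / (n * (n - 1) / 2)

def latent_correlation(x, y):
    # Bridge function for continuous margins: r = sin(pi/2 * tau).
    return math.sin(math.pi / 2 * kendall_tau(x, y))

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1]  # perfectly concordant with x, so tau = 1
print(latent_correlation(x, y))  # -> 1.0
```

Because the estimator depends on the data only through ranks, it is invariant to the unknown monotone transformations in the semiparametric copula model.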


Proceedings ArticleDOI
15 Jun 2017
TL;DR: Methods of topic modeling are discussed, including the Vector Space Model (VSM), Latent Semantic Indexing (LSI), Probabilistic Latent Semantic Analysis (PLSA), and Latent Dirichlet Allocation (LDA), with their features and limitations.
Abstract: Topic modeling is a powerful technique for the analysis of a huge collection of documents. Topic modeling is used for discovering hidden structure in a collection of documents. A topic is viewed as a recurring pattern of co-occurring words: a topic includes a group of words that often occur together. Topic modeling can link words with the same context and differentiate across uses of words with different meanings. In this paper, we discuss methods of topic modeling, including the Vector Space Model (VSM), Latent Semantic Indexing (LSI), Probabilistic Latent Semantic Analysis (PLSA), and Latent Dirichlet Allocation (LDA), with their features and limitations. After that, we discuss tools available for topic modeling, such as Gensim, the Stanford topic modeling toolbox, MALLET, and BigARTM. Then some applications of topic modeling are covered. Topic models have a wide range of applications, such as tag recommendation, text categorization, keyword extraction, information filtering, and similarity search, in the fields of text mining and information retrieval.

74 citations
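The PLSA model this survey covers can be captured in a few lines of EM: the E-step computes the topic posterior P(z|d,w) ∝ P(z|d)P(w|z), and the M-step re-normalizes the expected counts. The sketch below is a minimal illustrative implementation; the toy corpus, smoothing constant, and iteration count are my own choices, not from the paper:

```python
import random

def plsa(docs, vocab, K, iters=50, seed=0):
    """Minimal PLSA via EM: learn P(z|d) and P(w|z) for K latent topics."""
    rng = random.Random(seed)
    normalize = lambda row: [v / sum(row) for v in row]
    p_z_d = [normalize([rng.random() + 1e-3 for _ in range(K)]) for _ in docs]
    p_w_z = [normalize([rng.random() + 1e-3 for _ in vocab]) for _ in range(K)]
    windex = {w: i for i, w in enumerate(vocab)}
    counts = [{windex[w]: doc.count(w) for w in set(doc)} for doc in docs]
    for _ in range(iters):
        new_zd = [[1e-12] * K for _ in docs]           # expected topic counts per doc
        new_wz = [[1e-12] * len(vocab) for _ in range(K)]  # expected word counts per topic
        for d, cnt in enumerate(counts):
            for w, c in cnt.items():
                # E-step: posterior P(z|d,w) is proportional to P(z|d) * P(w|z).
                post = [p_z_d[d][z] * p_w_z[z][w] for z in range(K)]
                tot = sum(post)
                for z in range(K):
                    r = c * post[z] / tot
                    new_zd[d][z] += r   # M-step accumulators
                    new_wz[z][w] += r
        p_z_d = [normalize(row) for row in new_zd]
        p_w_z = [normalize(row) for row in new_wz]
    return p_z_d, p_w_z

# Toy corpus: two "sports" documents and two "finance" documents.
docs = [["ball", "goal", "ball"], ["goal", "ball", "match"],
        ["stock", "bond", "stock"], ["bond", "stock", "market"]]
vocab = sorted({w for d in docs for w in d})
p_z_d, p_w_z = plsa(docs, vocab, K=2)
print([max(range(2), key=lambda z: row[z]) for row in p_z_d])  # dominant topic per doc
```

Note the limitation the survey points out: PLSA learns P(z|d) only for training documents, with no generative account of new documents, which is what LDA's Dirichlet priors address.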


Proceedings ArticleDOI
02 Feb 2017
TL;DR: This paper addresses the task of document retrieval based on the degree of document relatedness to the meanings of a query by presenting a semantic-enabled language model that adopts a probabilistic reasoning model for calculating the conditional probability of a query concept given values assigned to document concepts.
Abstract: This paper addresses the task of document retrieval based on the degree of document relatedness to the meanings of a query by presenting a semantic-enabled language model. Our model relies on the use of semantic linking systems for forming a graph representation of documents and queries, where nodes represent concepts extracted from documents and edges represent semantic relatedness between concepts. Based on this graph, our model adopts a probabilistic reasoning model for calculating the conditional probability of a query concept given values assigned to document concepts. We present an integration framework for interpolating other retrieval systems with the presented model. Our empirical experiments on a number of TREC collections show that semantic retrieval has a synergetic impact on the results obtained through state-of-the-art keyword-based approaches, and that the semantic information obtained from entity linking on queries and documents can complement and enhance the performance of other retrieval models.

69 citations


Journal ArticleDOI
TL;DR: This work departs from previous work in this area by representing dependence structure between network views through a multivariate Bernoulli likelihood, providing a representation of between-view association and infers correlations between views not explained by the latent space model.
Abstract: Social relationships consist of interactions along multiple dimensions. In social networks, this means that individuals form multiple types of relationships with the same person (e.g., an individual will not trust all of his/her acquaintances). Statistical models for these data require understanding two related types of dependence structure: (i) structure within each relationship type, or network view, and (ii) the association between views. In this paper, we propose a statistical framework that parsimoniously represents dependence between relationship types while also maintaining enough flexibility to allow individuals to serve different roles in different relationship types. Our approach builds on work on latent space models for networks [see, e.g., J. Amer. Statist. Assoc. 97 (2002) 1090–1098]. These models represent the propensity for two individuals to form edges as conditionally independent given the distance between the individuals in an unobserved social space. Our work departs from previous work in this area by representing dependence structure between network views through a multivariate Bernoulli likelihood, providing a representation of between-view association. This approach infers correlations between views not explained by the latent space model. Using our method, we explore 6 multiview network structures across 75 villages in rural southern Karnataka, India [Banerjee et al. (2013)].

Proceedings Article
12 Feb 2017
TL;DR: An unsupervised data reconstruction framework is proposed, which jointly considers the reconstruction for latent semantic space and observed term vector space and can capture the salience of sentences from these two different and complementary vector spaces.
Abstract: We propose a new unsupervised sentence salience framework for Multi-Document Summarization (MDS), which can be divided into two components: latent semantic modeling and salience estimation. For latent semantic modeling, a neural generative model called Variational Auto-Encoders (VAEs) is employed to describe the observed sentences and the corresponding latent semantic representations. Neural variational inference is used for the posterior inference of the latent variables. For salience estimation, we propose an unsupervised data reconstruction framework, which jointly considers the reconstruction for latent semantic space and observed term vector space. Therefore, we can capture the salience of sentences from these two different and complementary vector spaces. Thereafter, the VAEs-based latent semantic model is integrated into the sentence salience estimation component in a unified fashion, and the whole framework can be trained jointly by back-propagation via multi-task learning. Experimental results on the benchmark datasets DUC and TAC show that our framework achieves better performance than the state-of-the-art models.

Journal ArticleDOI
TL;DR: These latent variable methods are extended to modeling high-dimensional time series data, extracting the most dynamic latent time series, whose current values are best predicted from the past values of the extracted latent variables.

Posted Content
TL;DR: KBLRN as discussed by the authors is a framework for end-to-end learning of knowledge base representations from latent, relational, and numerical features, integrating feature types with a novel combination of neural representation learning and probabilistic product of experts models.
Abstract: We present KBLRN, a framework for end-to-end learning of knowledge base representations from latent, relational, and numerical features. KBLRN integrates feature types with a novel combination of neural representation learning and probabilistic product of experts models. To the best of our knowledge, KBLRN is the first approach that learns representations of knowledge bases by integrating latent, relational, and numerical features. We show that instances of KBLRN outperform existing methods on a range of knowledge base completion tasks. We contribute novel data sets that enrich commonly used knowledge base completion benchmarks with numerical features. The data sets are available under a permissive BSD-3 license. We also investigate the impact numerical features have on the KB completion performance of KBLRN.

Journal ArticleDOI
TL;DR: Simulation-based comparisons of the latent class, K-means, and K-median approaches for partitioning dichotomous data found that the 3 approaches can exhibit profound differences when applied to real data.
Abstract: The problem of partitioning a collection of objects based on their measurements on a set of dichotomous variables is a well-established problem in psychological research, with applications including clinical diagnosis, educational testing, cognitive categorization, and choice analysis. Latent class analysis and K-means clustering are popular methods for partitioning objects based on dichotomous measures in the psychological literature. The K-median clustering method has recently been touted as a potentially useful tool for psychological data and might be preferable to its close neighbor, K-means, when the variable measures are dichotomous. We conducted simulation-based comparisons of the latent class, K-means, and K-median approaches for partitioning dichotomous data. Although all 3 methods proved capable of recovering cluster structure, K-median clustering yielded the best average performance, followed closely by latent class analysis. We also report results for the 3 methods within the context of an application to transitive reasoning data, in which it was found that the 3 approaches can exhibit profound differences when applied to real data.
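A concrete way to see why K-median suits dichotomous data: under Hamming (L1) distance, the optimal cluster center is the coordinate-wise median, i.e., a per-variable majority vote, so centers remain binary vectors rather than the fractional means K-means produces. The Lloyd-style loop below is an illustrative sketch with toy data and initial centers of my own choosing, not the study's simulation design:

```python
def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def majority_center(cluster):
    # Coordinate-wise median of 0/1 vectors = majority vote per coordinate.
    n = len(cluster)
    return [1 if 2 * sum(col) >= n else 0 for col in zip(*cluster)]

def kmedian_binary(points, centers, iters=10):
    """Lloyd-style K-median for dichotomous data: Hamming assignment,
    majority-vote (median) center update."""
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in points:
            i = min(range(len(centers)), key=lambda c: hamming(p, centers[c]))
            clusters[i].append(p)
        centers = [majority_center(cl) if cl else centers[i]
                   for i, cl in enumerate(clusters)]
    return centers

points = [[1, 1, 0], [1, 1, 1], [0, 0, 1], [0, 0, 0]]
print(kmedian_binary(points, [[1, 1, 0], [0, 0, 0]]))  # binary centers, one per cluster
```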

Posted Content
TL;DR: The capability of the convolutional VAE model to modify the phonetic content or the speaker identity for speech segments using the derived operations, without the need for parallel supervisory data is demonstrated.
Abstract: An ability to model a generative process and learn a latent representation for speech in an unsupervised fashion will be crucial to process vast quantities of unlabelled speech data. Recently, deep probabilistic generative models such as Variational Autoencoders (VAEs) have achieved tremendous success in modeling natural images. In this paper, we apply a convolutional VAE to model the generative process of natural speech. We derive latent space arithmetic operations to disentangle learned latent representations. We demonstrate the capability of our model to modify the phonetic content or the speaker identity for speech segments using the derived operations, without the need for parallel supervisory data.
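The latent space arithmetic described above can be pictured generically: encode utterances, compute per-attribute centroids in latent space, and shift a code along the difference of centroids before decoding. The 2-D vectors below are hypothetical stand-ins for VAE latent codes, a sketch of the arithmetic pattern rather than the paper's exact derived operations:

```python
def mean_vector(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def modify_attribute(z, source_latents, target_latents):
    """Shift a latent code along the direction from a source-attribute centroid
    to a target-attribute centroid (e.g., one speaker identity to another)."""
    src = mean_vector(source_latents)
    tgt = mean_vector(target_latents)
    return [zi - si + ti for zi, si, ti in zip(z, src, tgt)]

# Hypothetical latents: speaker A codes cluster near (1, 0), speaker B near (0, 1).
speaker_a = [[1.0, 0.1], [0.9, -0.1]]
speaker_b = [[0.0, 1.1], [0.1, 0.9]]
z = [1.2, 0.0]  # an utterance encoded near speaker A
shifted = modify_attribute(z, speaker_a, speaker_b)
print(shifted)  # now lies in speaker B's region of latent space
```

In the paper's setting the shifted code would then be passed through the VAE decoder to synthesize the modified speech segment.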

Journal ArticleDOI
TL;DR: In this paper, the authors describe the general time-intensive longitudinal latent class modeling framework implemented in Mplus and discuss Bayesian estimation based on Markov chain Monte Carlo, which allows modeling with arbitrarily long time series data and many random effects.
Abstract: This article describes the general time-intensive longitudinal latent class modeling framework implemented in Mplus. For each individual a latent class variable is measured at each time point and the latent class changes across time follow a Markov process (i.e., a hidden or latent Markov model), with subject-specific transition probabilities that are estimated as random effects. Such a model for single-subject data has been referred to as the regime-switching state-space model. The latent class variable can be measured by continuous or categorical indicators, under the local independence condition, or more generally by a class-specific structural equation model or a dynamic structural equation model. We discuss Bayesian estimation based on Markov chain Monte Carlo, which allows modeling with arbitrarily long time series data and many random effects. The modeling framework is illustrated with several simulation studies.
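The hidden Markov structure described above can be illustrated with a short simulation of a latent class sequence. The two-class transition matrix below is hypothetical (the framework itself is estimated in Mplus via MCMC, not with code like this); for this chain the stationary probability of class 1 is 0.1/(0.1 + 0.2) = 1/3:

```python
import random

def simulate_latent_classes(transition, start, steps, rng):
    """Simulate a latent (hidden) Markov class sequence from a transition matrix."""
    state, path = start, [start]
    for _ in range(steps):
        r, cum = rng.random(), 0.0
        for nxt, p in enumerate(transition[state]):
            cum += p
            if r < cum:  # inverse-CDF draw of the next latent class
                state = nxt
                break
        path.append(state)
    return path

# Hypothetical subject-specific transition probabilities between 2 latent classes.
P = [[0.9, 0.1],
     [0.2, 0.8]]
path = simulate_latent_classes(P, start=0, steps=1000, rng=random.Random(42))
print(sum(path) / len(path))  # empirical fraction of time in class 1, near 1/3
```

Making the rows of `P` subject-specific random effects is exactly what turns this simple chain into the mixed hidden Markov model of the paper.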

Journal ArticleDOI
TL;DR: In this paper, two topic modeling algorithms are explored, namely LSI with SVD and Mr.LDA, for learning a topic ontology. The objective is to determine the statistical relationship between documents and terms to build a topic ontology and ontology graph with minimum human intervention.

Journal ArticleDOI
TL;DR: An effective person re-identification method with latent variables is proposed, which represents a pedestrian as the mixture of a holistic model and a number of flexible models, and develops a latent metric learning method for learning the effective metric matrix.
Abstract: In this paper, we propose an effective person re-identification method with latent variables, which represents a pedestrian as the mixture of a holistic model and a number of flexible models. Three types of latent variables are introduced to model uncertain factors in the re-identification problem, including vertical misalignments, horizontal misalignments and leg posture variations. The distance between two pedestrians can be determined by minimizing a given distance function with respect to latent variables, and then be used to conduct the re-identification task. In addition, we develop a latent metric learning method for learning the effective metric matrix, which can be solved via an iterative manner: once latent information is specified, the metric matrix can be obtained based on some typical metric learning methods; with the computed metric matrix, the latent variables can be determined by searching the state space exhaustively. Finally, extensive experiments are conducted on seven databases to evaluate the proposed method. The experimental results demonstrate that our method achieves better performance than other competing algorithms.

Journal ArticleDOI
TL;DR: This paper proposes multimodal Similarity Gaussian Process latent variable model (m-SimGP), which learns the mapping functions between the intra-modal similarities and latent representation, and proposes m-DRSimGP, which combines the distance preservation in m-DSimGP and semantic preservation in m-RSim GP to learn the latent representation.
Abstract: Data from real applications involve multiple modalities representing content with the same semantics from complementary aspects. However, relations among heterogeneous modalities are simply treated as observation-to-fit by existing work, and the parameterized modality specific mapping functions lack flexibility in directly adapting to the content divergence and semantic complicacy in multimodal data. In this paper, we build our work based on the Gaussian process latent variable model (GPLVM) to learn the non-parametric mapping functions and transform heterogeneous modalities into a shared latent space. We propose the multimodal Similarity Gaussian Process latent variable model (m-SimGP), which learns the mapping functions between the intra-modal similarities and latent representation. We further propose the multimodal distance-preserved similarity GPLVM (m-DSimGP) to preserve the intra-modal global similarity structure, and the multimodal regularized similarity GPLVM (m-RSimGP), which encourages similar/dissimilar points to be similar/dissimilar in the latent space. We propose m-DRSimGP, which combines the distance preservation in m-DSimGP and semantic preservation in m-RSimGP to learn the latent representation. The overall objective functions of the four models are solved by simple and scalable gradient descent techniques. They can be applied to various tasks to discover the nonlinear correlations and to obtain the comparable low-dimensional representation for heterogeneous modalities. On five widely used real-world data sets, our approaches outperform existing models on cross-modal content retrieval and multimodal classification.

Posted Content
TL;DR: This paper proposed a stochastic recurrent model, where each step in the sequence is associated with a latent variable that is used to condition the recurrent dynamics for future steps, and training is performed with amortized variational inference where the approximate posterior is augmented with a RNN that runs backward through the sequence.
Abstract: Many efforts have been devoted to training generative latent variable models with autoregressive decoders, such as recurrent neural networks (RNN). Stochastic recurrent models have been successful in capturing the variability observed in natural sequential data such as speech. We unify successful ideas from recently proposed architectures into a stochastic recurrent model: each step in the sequence is associated with a latent variable that is used to condition the recurrent dynamics for future steps. Training is performed with amortized variational inference where the approximate posterior is augmented with a RNN that runs backward through the sequence. In addition to maximizing the variational lower bound, we ease training of the latent variables by adding an auxiliary cost which forces them to reconstruct the state of the backward recurrent network. This provides the latent variables with a task-independent objective that enhances the performance of the overall model. We found this strategy to perform better than alternative approaches such as KL annealing. Although conceptually simple, our model achieves state-of-the-art results on standard speech benchmarks such as TIMIT and Blizzard and competitive performance on sequential MNIST. Finally, we apply our model to language modeling on the IMDB dataset, where the auxiliary cost helps in learning interpretable latent variables.

Journal ArticleDOI
TL;DR: This paper proposes a joint factor analysis and latent clustering framework, which aims at learning cluster-aware low-dimensional representations of matrix and tensor data, and leverages matrix and Tensor factorization models that produce essentially unique latent representations of the data.
Abstract: Dimensionality reduction techniques play an essential role in data analytics, signal processing, and machine learning. Dimensionality reduction is usually performed in a preprocessing stage that is separate from subsequent data analysis, such as clustering or classification. Finding reduced-dimension representations that are well-suited for the intended task is more appealing. This paper proposes a joint factor analysis and latent clustering framework, which aims at learning cluster-aware low-dimensional representations of matrix and tensor data. The proposed approach leverages matrix and tensor factorization models that produce essentially unique latent representations of the data to unravel latent cluster structure—which is otherwise obscured because of the freedom to apply an oblique transformation in latent space. At the same time, latent cluster structure is used as prior information to enhance the performance of factorization. Specific contributions include several custom-built problem formulations, corresponding algorithms, and discussion of associated convergence properties. Besides extensive simulations, real-world datasets such as Reuters document data and MNIST image data are also employed to showcase the effectiveness of the proposed approaches.

Proceedings ArticleDOI
07 Aug 2017
TL;DR: This paper first adopts the spectral regression method to learn the optimal latent space shared by data of all modalities based on the orthogonal constraints, then builds a graph model to project the multi-modality data into the latent space.
Abstract: Cross-modal retrieval has received much attention in recent years. It is a commonly used method to project multi-modality data into a common subspace and then retrieve. However, nearly all existing methods directly adopt the space defined by the binary class label information without learning as the shared subspace for regression. In this paper, we first adopt the spectral regression method to learn the optimal latent space shared by data of all modalities based on the orthogonal constraints. Then we construct a graph model to project the multi-modality data into the latent space. Finally, we combine these two processes together to jointly learn the latent space and regress. We conduct extensive experiments on multiple benchmark datasets and our proposed method outperforms the state-of-the-art approaches.

Journal ArticleDOI
TL;DR: Experimental results on visual and sonar imagery show that PM-LDA can produce both crisp and soft semantic image segmentations, a capability that previous topic modeling methods lack.
Abstract: Topic models [e.g., probabilistic latent semantic analysis, latent Dirichlet allocation (LDA), and supervised LDA] have been widely used for segmenting imagery. However, these models are confined to crisp segmentation, forcing a visual word (i.e., an image patch) to belong to one and only one topic. Yet, there are many images in which some regions cannot be assigned a crisp categorical label (e.g., transition regions between a foggy sky and the ground, or between sand and water at a beach). In these cases, a visual word is best represented with partial memberships across multiple topics. To address this, we present a partial membership LDA (PM-LDA) model and an associated parameter estimation algorithm. This model can be useful for imagery where a visual word may be a mixture of multiple topics. Experimental results on visual and sonar imagery show that PM-LDA can produce both crisp and soft semantic image segmentations, a capability that previous topic modeling methods lack.
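The crisp-versus-partial distinction can be made concrete with a small numerical sketch: under a crisp assignment a visual word's distribution comes from exactly one topic, while under a partial-membership view it is a convex combination weighted by a membership vector on the simplex. The two toy topics below are invented for illustration and are not the paper's learned topics or its estimation algorithm.

```python
import numpy as np

# Two "topics" over a tiny visual-word vocabulary (e.g. sky vs. ground patches).
topics = np.array([
    [0.70, 0.20, 0.05, 0.05],   # topic 0: "sky"
    [0.05, 0.05, 0.25, 0.65],   # topic 1: "ground"
])

# Crisp LDA-style assignment: each patch belongs to exactly one topic.
def crisp_word_dist(z):
    return topics[z]

# Partial membership (the PM-LDA idea): a patch carries a membership vector m
# on the simplex, and its word distribution is sum_k m_k * topic_k.
def partial_word_dist(m):
    m = np.asarray(m, float)
    assert np.all(m >= 0) and np.isclose(m.sum(), 1.0)
    return m @ topics

# A transition patch (e.g. a horizon between foggy sky and ground):
# 60% sky membership, 40% ground membership.
mix = partial_word_dist([0.6, 0.4])
```

Setting the membership vector to a one-hot vector recovers the crisp case, which is why the model can produce both crisp and soft segmentations.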

Journal ArticleDOI
TL;DR: The theoretical background of latent variable mixture models (LVMMs) is provided and their exploratory character is emphasized; the general framework, together with its assumptions and necessary constraints, is outlined; the difference between models with and without covariates is highlighted; and the interrelation between the number of classes and the complexity of the within-class model is discussed, as well as the relevance of measurement invariance.
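One concrete instance of the number-of-classes versus within-class-complexity trade-off mentioned in the TL;DR: fit Gaussian mixtures (a simple LVMM without covariates) over a grid of class counts and covariance structures and compare BIC. The simulated data and the model grid below are illustrative assumptions, not from the paper.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(3)

# Simulate 2 well-separated latent classes in 2 dimensions.
X = np.vstack([
    rng.normal(0.0, 1.0, size=(150, 2)),
    rng.normal(5.0, 1.0, size=(150, 2)),
])

# Trade-off grid: a simpler within-class model (spherical covariance) spends
# fewer parameters per class than a full covariance, so BIC may trade class
# count against within-class complexity.
bics = {}
for k in (1, 2, 3, 4):
    for cov in ("spherical", "full"):
        gm = GaussianMixture(n_components=k, covariance_type=cov,
                             random_state=0).fit(X)
        bics[(k, cov)] = gm.bic(X)

best_k, best_cov = min(bics, key=bics.get)
```

On this clearly separated toy data the criterion recovers the generating class count; with real data and weaker separation, the exploratory character stressed in the paper matters far more.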

Proceedings ArticleDOI
Daniel Yue Zhang, Dong Wang, Hao Zheng, Xin Mu, Qi Li, Yang Zhang
01 Jul 2017
TL;DR: A Context-Aware POI Category Prediction (CAP-CP) scheme using Natural Language Processing (NLP) models, with a novel Temporal Adaptive Ngram (TA-Ngram) model that captures the dynamic dependency between check-in points to address the temporal dependency challenge.
Abstract: Point-of-Interest (POI) recommendation is an important application in Location-based Social Networks (LBSN). The category prediction problem is to predict the next POI category that users may visit. The predicted category information is critical in large-scale POI recommendation because it can significantly reduce the prediction space and improve the recommendation accuracy. While efforts have been made to address the POI category prediction problem, several important challenges still exist. First, existing solutions did not fully explore the temporal dependency (e.g., “long range dependency”) of users' check-in traces. Second, the hidden contextual information associated with each check-in point has been underutilized. In this work, we propose a Context-Aware POI Category Prediction (CAP-CP) scheme using Natural Language Processing (NLP) models. In particular, to address the temporal dependency challenge, we develop a novel Temporal Adaptive Ngram (TA-Ngram) model to capture the dynamic dependency between check-in points. To address the challenge of hidden context incorporation, CAP-CP leverages the Probabilistic Latent Semantic Analysis (PLSA) model to infer the semantic implications of the context variables in the prediction model. Empirical results on a real-world dataset show that our scheme can effectively improve the performance of the state-of-the-art POI recommendation solutions.
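To make the temporal-dependency idea concrete, here is a minimal sketch of a bigram category predictor whose transition counts are down-weighted by the time gap between consecutive check-ins. This only illustrates the general idea: the exponential decay, the rate `tau`, and the toy trace are invented here and are not the TA-Ngram model or the PLSA component of CAP-CP.

```python
import math
from collections import defaultdict

# Toy check-in trace: (category, hour-of-day) pairs for one user.
trace = [("food", 12), ("shop", 14), ("food", 19), ("bar", 21),
         ("food", 12), ("shop", 15), ("bar", 22), ("food", 13)]

# Bigram counts weighted by temporal proximity: transitions separated by a
# small time gap count more (decay rate tau is an illustrative choice).
tau = 6.0
counts = defaultdict(float)
context_totals = defaultdict(float)
for (c_prev, t_prev), (c_next, t_next) in zip(trace, trace[1:]):
    w = math.exp(-abs(t_next - t_prev) / tau)
    counts[(c_prev, c_next)] += w
    context_totals[c_prev] += w

def predict_next(category):
    """Most likely next POI category under the time-weighted bigram."""
    cands = {c2: w / context_totals[c1]
             for (c1, c2), w in counts.items() if c1 == category}
    return max(cands, key=cands.get)
```

A plain n-gram would count all transitions equally; the time-gap weight is the simplest way to let recent, tightly spaced check-ins dominate the predicted distribution.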

Journal ArticleDOI
TL;DR: The Max-margin Latent Pattern Learning (MLPL) method is proposed to learn high-level semantic descriptions of latent action patterns as the output of this framework, achieving state-of-the-art performance on the Skoda, WISDM, and OPP datasets.

Posted Content
TL;DR: This work describes and studies a decomposition of predictive uncertainty in BNNs with latent variables into its epistemic and aleatoric components, and uses a similar decomposition to develop a novel risk-sensitive objective for safe reinforcement learning (RL).
Abstract: Bayesian neural networks (BNNs) with latent variables are probabilistic models which can automatically identify complex stochastic patterns in the data. We describe and study a decomposition of the predictive uncertainty of these models into its epistemic and aleatoric components. First, we show how such a decomposition arises naturally in a Bayesian active learning scenario by following an information-theoretic approach. Second, we use a similar decomposition to develop a novel risk-sensitive objective for safe reinforcement learning (RL). This objective minimizes the effect of model bias in environments whose stochastic dynamics are described by BNNs with latent variables. Our experiments illustrate the usefulness of the resulting decomposition in active learning and safe RL settings.
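At its core, the decomposition the abstract refers to is the law of total variance applied across posterior draws: aleatoric uncertainty is the average predicted noise variance, and epistemic uncertainty is the variance of the predicted means. The sketch below uses synthetic stand-ins for the per-draw Gaussian predictions; it is not the paper's BNN or its information-theoretic derivation.

```python
import numpy as np

rng = np.random.default_rng(4)

# Stand-in for an ensemble of BNN posterior samples: each "network" predicts a
# Gaussian (mean, variance) for the same input. Values here are synthetic.
n_models = 200
means = 1.0 + 0.3 * rng.standard_normal(n_models)   # disagreement across models
variances = 0.5 + 0.1 * rng.random(n_models)        # each model's noise estimate

# Law-of-total-variance decomposition of predictive uncertainty:
#   total = E[var]   (aleatoric: irreducible observation noise)
#         + Var[mean] (epistemic: reducible with more data)
aleatoric = variances.mean()
epistemic = means.var()
total = aleatoric + epistemic
```

The practical point is that only the epistemic term should drive active-learning acquisition, while a risk-sensitive RL objective can penalize the two terms differently.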

Journal ArticleDOI
TL;DR: This paper proposes a novel hierarchical video event detection model, which deliberately unifies the processes of underlying semantics discovery and event modeling from video data, and devises an effective model to automatically uncover video semantics by hierarchically capturing latent static-visual concepts at the frame level and latent activity concepts at the segment level.
Abstract: Semantic information is important for video event detection. How to automatically discover, model, and utilize semantic information to facilitate video event detection has been a challenging problem. In this paper, we propose a novel hierarchical video event detection model, which deliberately unifies the processes of underlying semantics discovery and event modeling from video data. Specifically, unlike most approaches based on manually pre-defined concepts, we devise an effective model to automatically uncover video semantics by hierarchically capturing latent static-visual concepts at the frame level and latent activity concepts (i.e., temporal sequence relationships of static-visual concepts) at the segment level. The unified model not only enables a discriminative and descriptive representation for videos, but also alleviates the error propagation problem from video representation to event modeling that exists in previous methods. A max-margin framework is employed to learn the model. Extensive experiments on four challenging video event datasets, i.e., MED11, CCV, UQE50, and FCVID, demonstrate the effectiveness of the proposed method.
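A rough and heavily simplified analogue of the two-level representation: quantize frame features into "static-visual concepts", summarize each video by its concept-transition histogram as a proxy for segment-level activity concepts, and train a max-margin classifier on top. K-means, the transition histogram, and the synthetic videos are illustrative stand-ins; the paper learns both concept levels jointly with the event model rather than in separate stages like this.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import LinearSVC

rng = np.random.default_rng(5)

# Synthetic corpus: 40 videos x 30 frames x 10-dim frame features, 2 events.
n_vid, n_frames, d, k = 40, 30, 10, 4
y = np.repeat([0, 1], 20)
frames = rng.standard_normal((n_vid, n_frames, d)) + y[:, None, None]

# Level 1: quantize frames into latent "static-visual concepts"
# (k-means here, a stand-in for the concepts the paper learns jointly).
km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(frames.reshape(-1, d))
concepts = km.labels_.reshape(n_vid, n_frames)

# Level 2: concept-transition histogram per video, a crude proxy for
# segment-level "activity concepts" (temporal relationships of concepts).
feats = np.zeros((n_vid, k * k))
for v in range(n_vid):
    for a, b in zip(concepts[v], concepts[v][1:]):
        feats[v, a * k + b] += 1

# Max-margin event model on top of the hierarchical representation.
clf = LinearSVC(C=1.0).fit(feats, y)
acc = clf.score(feats, y)
```

The staged pipeline above is exactly what the paper argues against: errors made when quantizing frames cannot be corrected later, which is the error-propagation problem its unified model is designed to alleviate.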

Journal ArticleDOI
TL;DR: In this paper, a multivariate generalized latent variable model is proposed to investigate the effects of observable and latent explanatory variables on multiple responses of interest, such as continuous, count, ordinal, and nominal variables.
Abstract: We consider a multivariate generalized latent variable model to investigate the effects of observable and latent explanatory variables on multiple responses of interest. Various types of correlated responses, such as continuous, count, ordinal, and nominal variables, are considered in the regression. A generalized confirmatory factor analysis model that is capable of managing mixed-type data is proposed to characterize latent variables via correlated observed indicators. In addressing the complicated structure of the proposed model, we introduce continuous underlying measurements to provide a unified model framework for mixed-type data. We develop a multivariate version of the Bayesian adaptive least absolute shrinkage and selection operator procedure, which is implemented with a Markov chain Monte Carlo (MCMC) algorithm in a full Bayesian context, to simultaneously conduct estimation and model selection. The empirical performance of the proposed methodology is demonstrated through a simulation study. An ...
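For the continuous-response special case, the flavor of Bayesian lasso-type shrinkage via MCMC can be sketched as a Gaussian likelihood with a Laplace prior on the coefficients, sampled by random-walk Metropolis. This is a deliberately simplified stand-in: the paper's multivariate Bayesian adaptive lasso, its MCMC scheme, and its latent-variable structure are not reproduced, and `lam`, `sigma`, and the proposal scale are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(6)

# Toy regression: only the first of three predictors matters.
n = 100
X = rng.standard_normal((n, 3))
y = 2.0 * X[:, 0] + 0.5 * rng.standard_normal(n)

# Log posterior: Gaussian likelihood + Laplace (lasso-type) prior on beta.
lam, sigma = 5.0, 0.5
def log_post(beta):
    resid = y - X @ beta
    return -0.5 * resid @ resid / sigma**2 - lam * np.abs(beta).sum()

# Random-walk Metropolis (an illustrative stand-in for a full MCMC scheme).
beta = np.zeros(3)
lp = log_post(beta)
samples = []
for it in range(20000):
    prop = beta + 0.05 * rng.standard_normal(3)
    lp_prop = log_post(prop)
    if np.log(rng.random()) < lp_prop - lp:   # Metropolis accept/reject
        beta, lp = prop, lp_prop
    if it >= 5000:                            # discard burn-in
        samples.append(beta)

post_mean = np.mean(samples, axis=0)
```

The Laplace prior concentrates posterior mass of the two irrelevant coefficients near zero, which is the shrinkage-based model selection effect the paper builds on (there, with adaptive penalties and mixed-type responses).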