
Showing papers on "Probabilistic latent semantic analysis published in 2017"


Proceedings ArticleDOI
21 Jul 2017
TL;DR: A novel Latent Multi-view Subspace Clustering method that clusters data points with a latent representation and simultaneously explores underlying complementary information from multiple views, making the subspace representation more accurate and robust.
Abstract: In this paper, we propose a novel Latent Multi-view Subspace Clustering (LMSC) method, which clusters data points with latent representation and simultaneously explores underlying complementary information from multiple views. Unlike most existing single view subspace clustering methods that reconstruct data points using original features, our method seeks the underlying latent representation and simultaneously performs data reconstruction based on the learned latent representation. With the complementarity of multiple views, the latent representation could depict data themselves more comprehensively than each single view individually, accordingly making the subspace representation more accurate and robust. The proposed method is intuitive and can be optimized efficiently by using the Augmented Lagrangian Multiplier with Alternating Direction Minimization (ALM-ADM) algorithm. Extensive experiments on benchmark datasets have validated the effectiveness of our proposed method.

357 citations
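The ALM-ADM optimizer named above follows the generic ADMM pattern: alternate closed-form updates of each variable block, then update the Lagrange multiplier. As a minimal illustrative sketch of that pattern only (a toy one-dimensional lasso-style problem, not the LMSC algorithm itself), the loop below minimizes ½(x − a)² + λ|z| subject to x = z:

```python
def soft_threshold(v, t):
    # Proximal operator of t*|.|: shrink v toward zero by t.
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def admm_lasso_1d(a, lam, rho=1.0, iters=100):
    """Toy ALM-ADM loop: min_x 0.5*(x - a)**2 + lam*|z|  s.t.  x = z."""
    x = z = u = 0.0  # primal variables and scaled multiplier
    for _ in range(iters):
        x = (a + rho * (z - u)) / (1.0 + rho)  # x-update (quadratic, closed form)
        z = soft_threshold(x + u, lam / rho)   # z-update (proximal step)
        u = u + x - z                          # multiplier (dual) update
    return x

# The known closed-form solution is soft_threshold(a, lam), i.e. 2.0 here.
print(round(admm_lasso_1d(3.0, 1.0), 4))
```

The same alternate-then-dual-update structure carries over when the blocks are matrices (latent representation, reconstruction, subspace coefficients) rather than scalars.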


Posted Content
TL;DR: In this article, the ambiguity of the mapping is distilled into a low-dimensional latent vector that can be randomly sampled at test time; encouraging an invertible connection between the output and the latent code helps prevent a many-to-one mapping during training.
Abstract: Many image-to-image translation problems are ambiguous, as a single input image may correspond to multiple possible outputs. In this work, we aim to model a \emph{distribution} of possible outputs in a conditional generative modeling setting. The ambiguity of the mapping is distilled in a low-dimensional latent vector, which can be randomly sampled at test time. A generator learns to map the given input, combined with this latent code, to the output. We explicitly encourage the connection between output and the latent code to be invertible. This helps prevent a many-to-one mapping from the latent code to the output during training, also known as the problem of mode collapse, and produces more diverse results. We explore several variants of this approach by employing different training objectives, network architectures, and methods of injecting the latent code. Our proposed method encourages bijective consistency between the latent encoding and output modes. We present a systematic comparison of our method and other variants on both perceptual realism and diversity.

278 citations


Journal ArticleDOI
TL;DR: Zhang et al. as mentioned in this paper proposed a stagewise bidirectional latent embedding framework of two subsequent learning stages for zero-shot visual recognition, where the bottom-up stage explores the topological and labeling information underlying training data of known classes via a proper supervised subspace learning algorithm, and the latent embeddings of training data are used to form landmarks that guide embedding semantics underlying unseen classes into this learned latent space.
Abstract: Zero-shot learning for visual recognition, e.g., object and action recognition, has recently attracted a lot of attention. However, it still remains challenging in bridging the semantic gap between visual features and their underlying semantics and transferring knowledge to semantic categories unseen during learning. Unlike most of the existing zero-shot visual recognition methods, we propose a stagewise bidirectional latent embedding framework of two subsequent learning stages for zero-shot visual recognition. In the bottom-up stage, a latent embedding space is first created by exploring the topological and labeling information underlying training data of known classes via a proper supervised subspace learning algorithm, and the latent embeddings of training data are used to form landmarks that guide embedding semantics underlying unseen classes into this learned latent space. In the top-down stage, semantic representations of unseen-class labels in a given label vocabulary are then embedded to the same latent space to preserve the semantic relatedness between all different classes via our proposed semi-supervised Sammon mapping with the guidance of landmarks. Thus, the resultant latent embedding space allows for predicting the label of a test instance with a simple nearest-neighbor rule. To evaluate the effectiveness of the proposed framework, we have conducted extensive experiments on four benchmark datasets in object and action recognition, i.e., AwA, CUB-200-2011, UCF101 and HMDB51. The experimental results under comparative studies demonstrate that our proposed approach yields the state-of-the-art performance under inductive and transductive settings.

144 citations
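The nearest-neighbor prediction rule in the learned latent space is straightforward: embed the test instance, then return the label of the closest class prototype. A hedged sketch (the 2-D prototype embeddings below are hypothetical stand-ins, not outputs of the paper's pipeline):

```python
import math

def nearest_neighbor_label(test_embedding, class_prototypes):
    """Assign the label of the closest class prototype in the shared latent space."""
    return min(class_prototypes,
               key=lambda label: math.dist(test_embedding, class_prototypes[label]))

# Hypothetical latent embeddings of unseen-class label prototypes.
prototypes = {"zebra": [0.9, 0.1], "whale": [0.1, 0.95]}
print(nearest_neighbor_label([0.8, 0.2], prototypes))  # -> zebra
```

In the transductive setting the same rule applies; only the way the prototypes are placed in the latent space changes.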


Journal ArticleDOI
TL;DR: A novel joint binary codes learning method is proposed to map image features to latent semantic features with minimum encoding loss, referred to as latent semantic minimal hashing, which outperforms most state-of-the-art hashing methods.
Abstract: Hashing-based similarity search is an important technique for large-scale query-by-example image retrieval systems, since it provides fast search with computation and memory efficiency. However, it is challenging to design compact codes that represent original features with good performance. Recently, many unsupervised hashing methods have been proposed that focus on preserving the geometric structure similarity of the data in the original feature space, but they have not yet fully refined image features and explored the latent semantic feature embedding in the data simultaneously. To address the problem, in this paper, a novel joint binary codes learning method is proposed to map image features to latent semantic features with minimum encoding loss, referred to as latent semantic minimal hashing. The latent semantic feature is learned based on matrix decomposition to refine the original feature, thereby making the learned feature more discriminative. Moreover, a minimum encoding loss is combined with the latent semantic feature learning process simultaneously, so as to guarantee that the obtained binary codes are discriminative as well. Extensive experiments on several well-known large databases demonstrate that the proposed method outperforms most state-of-the-art hashing methods.

140 citations


Proceedings ArticleDOI
01 Jul 2017
TL;DR: This work formulates a novel framework to jointly seek a low-rank embedding and semantic dictionary to link visual features with their semantic representations, which manages to capture shared features across different observed classes.
Abstract: Zero-shot learning for visual recognition has received much interest in the most recent years. However, the semantic gap across visual features and their underlying semantics is still the biggest obstacle in zero-shot learning. To fight off this hurdle, we propose an effective Low-rank Embedded Semantic Dictionary learning (LESD) through ensemble strategy. Specifically, we formulate a novel framework to jointly seek a low-rank embedding and semantic dictionary to link visual features with their semantic representations, which manages to capture shared features across different observed classes. Moreover, ensemble strategy is adopted to learn multiple semantic dictionaries to constitute the latent basis for the unseen classes. Consequently, our model could extract a variety of visual characteristics within objects, which can be well generalized to unknown categories. Extensive experiments on several zero-shot benchmarks verify that the proposed model can outperform the state-of-the-art approaches.

129 citations


Proceedings Article
01 Dec 2017
TL;DR: This work proposes a doubly nonlinear latent variable model that can identify low-dimensional structure underlying apparently high-dimensional spike train data and introduces the decoupled Laplace approximation, a fast approximate inference method that allows us to efficiently optimize the latent path while marginalizing over tuning curves.
Abstract: A large body of recent work focuses on methods for extracting low-dimensional latent structure from multi-neuron spike train data. Most such methods employ either linear latent dynamics or linear mappings from latent space to log spike rates. Here we propose a doubly nonlinear latent variable model that can identify low-dimensional structure underlying apparently high-dimensional spike train data. We introduce the Poisson Gaussian-Process Latent Variable Model (P-GPLVM), which consists of Poisson spiking observations and two underlying Gaussian processes: one governing a temporal latent variable and another governing a set of nonlinear tuning curves. The use of nonlinear tuning curves enables discovery of low-dimensional latent structure even when spike responses exhibit high linear dimensionality (e.g., as found in hippocampal place cell codes). To learn the model from data, we introduce the decoupled Laplace approximation, a fast approximate inference method that allows us to efficiently optimize the latent path while marginalizing over tuning curves. We show that this method outperforms previous Laplace-approximation-based inference methods in both speed of convergence and accuracy. We apply the model to spike trains recorded from hippocampal place cells and show that it compares favorably to a variety of previous methods for latent structure discovery, including variational auto-encoder (VAE) based methods that parametrize the nonlinear mapping from latent space to spike rates with a deep neural network.

89 citations


Book ChapterDOI
23 May 2017
TL;DR: This paper proposes the Embedding-based Topic Model (ETM), which learns latent topics from short texts by aggregating short texts into long pseudo-texts and using a Markov Random Field regularized model that gives correlated words a better chance to be put into the same topic.
Abstract: Inferring topics from the overwhelming amount of short texts has become a critical but challenging task for many content analysis applications. Existing methods such as probabilistic latent semantic analysis (PLSA) and latent Dirichlet allocation (LDA) cannot solve this problem very well, since only very limited word co-occurrence information is available in short texts. This paper studies how to incorporate external word correlation knowledge into short texts to improve the coherence of topic modeling. Based on recent results in word embeddings that learn semantic representations for words from a large corpus, we introduce a novel method, the Embedding-based Topic Model (ETM), to learn latent topics from short texts. ETM not only solves the problem of very limited word co-occurrence information by aggregating short texts into long pseudo-texts, but also utilizes a Markov Random Field regularized model that gives correlated words a better chance to be put into the same topic. Experiments on real-world datasets validate the effectiveness of our model compared with state-of-the-art models.

84 citations
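One way to picture the pseudo-text aggregation step is greedy merging of short texts by embedding similarity, so that merged groups carry richer word co-occurrence statistics. This is an illustrative sketch under assumed toy word vectors; ETM's actual aggregation and MRF regularization are more involved:

```python
def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = lambda x: sum(c * c for c in x) ** 0.5
    return dot / (norm(u) * norm(v))

def aggregate_pseudo_texts(texts, embed, threshold=0.8):
    """Greedily merge short texts whose embeddings are cosine-similar,
    producing longer pseudo-texts with richer co-occurrence statistics."""
    groups = []  # list of (centroid_embedding, member_texts)
    for text in texts:
        vec = embed(text)
        for centroid, members in groups:
            if cosine(centroid, vec) >= threshold:
                members.append(text)
                break
        else:
            groups.append((vec, [text]))
    return [" ".join(members) for _, members in groups]

# Hypothetical 2-D word embeddings standing in for vectors trained on a large corpus.
toy_vectors = {"nba": [1.0, 0.0], "game": [0.9, 0.1],
               "stock": [0.0, 1.0], "market": [0.1, 0.9]}

def avg_embed(text):
    vecs = [toy_vectors[w] for w in text.split() if w in toy_vectors]
    return [sum(col) / len(vecs) for col in zip(*vecs)]

pseudo = aggregate_pseudo_texts(["nba game", "stock market", "game nba"], avg_embed)
print(pseudo)  # the two sports texts merge; the finance text stays separate
```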


Journal ArticleDOI
TL;DR: A unified rank-based approach to estimating the correlation matrix of latent variables is proposed, and a concentration inequality for the proposed rank-based estimator is established, showing that the method achieves the same rates of convergence for precision matrix estimation and graph recovery as if the latent variables were observed.
Abstract: Summary We propose a semiparametric latent Gaussian copula model for modelling mixed multivariate data, which contain a combination of both continuous and binary variables. The model assumes that the observed binary variables are obtained by dichotomizing latent variables that satisfy the Gaussian copula distribution. The goal is to infer the conditional independence relationship between the latent random variables, based on the observed mixed data. Our work has two main contributions: we propose a unified rank-based approach to estimate the correlation matrix of latent variables; we establish the concentration inequality of the proposed rank-based estimator. Consequently, our methods achieve the same rates of convergence for precision matrix estimation and graph recovery, as if the latent variables were observed. The methods proposed are numerically assessed through extensive simulation studies, and real data analysis.

76 citations
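Rank-based estimation of a latent Gaussian copula correlation typically bridges Kendall's τ to the latent Pearson correlation; for two continuous margins the standard bridge is r = sin(πτ/2), while the binary and mixed cases require different bridge functions, which is part of what the paper develops. A minimal sketch of the continuous-margin case:

```python
import math
from itertools import combinations

def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) pairs over all pairs."""
    n = len(x)
    s = 0
    for i, j in combinations(range(n), 2):
        prod = (x[i] - x[j]) * (y[i] - y[j])
        s += prod > 0  # concordant pair
        s -= prod < 0  # discordant pair
    return s / (n * (n - 1) / 2)

def latent_correlation(x, y):
    # Bridge function for continuous margins: r = sin(pi/2 * tau).
    return math.sin(math.pi / 2 * kendall_tau(x, y))

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1]  # perfectly concordant with x, so tau = 1
print(latent_correlation(x, y))  # -> 1.0
```

Because the estimator depends on the data only through ranks, it is invariant to the unknown monotone transformations in the semiparametric copula model.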


Proceedings ArticleDOI
15 Jun 2017
TL;DR: Methods of topic modeling are discussed, including the Vector Space Model (VSM), Latent Semantic Indexing (LSI), Probabilistic Latent Semantic Analysis (PLSA), and Latent Dirichlet Allocation (LDA), with their features and limitations.
Abstract: Topic modeling is a powerful technique for the analysis of a huge collection of documents. Topic modeling is used for discovering hidden structure in a collection of documents. A topic is viewed as a recurring pattern of co-occurring words: a topic includes a group of words that often occur together. Topic modeling can link words with the same context and differentiate across uses of words with different meanings. In this paper, we discuss methods of topic modeling, including the Vector Space Model (VSM), Latent Semantic Indexing (LSI), Probabilistic Latent Semantic Analysis (PLSA), and Latent Dirichlet Allocation (LDA), with their features and limitations. After that, we discuss tools available for topic modeling, such as Gensim, the Stanford topic modeling toolbox, MALLET, and BigARTM. Then some applications of topic modeling are covered. Topic models have a wide range of applications, such as tag recommendation, text categorization, keyword extraction, information filtering, and similarity search, in the fields of text mining and information retrieval.

74 citations
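The PLSA model this survey covers can be captured in a few lines of EM: the E-step computes the topic posterior P(z|d,w) ∝ P(z|d)P(w|z), and the M-step re-normalizes the expected counts. The sketch below is a minimal illustrative implementation; the toy corpus, smoothing constant, and iteration count are my own choices, not from the paper:

```python
import random

def plsa(docs, vocab, K, iters=50, seed=0):
    """Minimal PLSA via EM: learn P(z|d) and P(w|z) for K latent topics."""
    rng = random.Random(seed)
    normalize = lambda row: [v / sum(row) for v in row]
    p_z_d = [normalize([rng.random() + 1e-3 for _ in range(K)]) for _ in docs]
    p_w_z = [normalize([rng.random() + 1e-3 for _ in vocab]) for _ in range(K)]
    windex = {w: i for i, w in enumerate(vocab)}
    counts = [{windex[w]: doc.count(w) for w in set(doc)} for doc in docs]
    for _ in range(iters):
        new_zd = [[1e-12] * K for _ in docs]           # expected topic counts per doc
        new_wz = [[1e-12] * len(vocab) for _ in range(K)]  # expected word counts per topic
        for d, cnt in enumerate(counts):
            for w, c in cnt.items():
                # E-step: posterior P(z|d,w) is proportional to P(z|d) * P(w|z).
                post = [p_z_d[d][z] * p_w_z[z][w] for z in range(K)]
                tot = sum(post)
                for z in range(K):
                    r = c * post[z] / tot
                    new_zd[d][z] += r   # M-step accumulators
                    new_wz[z][w] += r
        p_z_d = [normalize(row) for row in new_zd]
        p_w_z = [normalize(row) for row in new_wz]
    return p_z_d, p_w_z

# Toy corpus: two "sports" documents and two "finance" documents.
docs = [["ball", "goal", "ball"], ["goal", "ball", "match"],
        ["stock", "bond", "stock"], ["bond", "stock", "market"]]
vocab = sorted({w for d in docs for w in d})
p_z_d, p_w_z = plsa(docs, vocab, K=2)
print([max(range(2), key=lambda z: row[z]) for row in p_z_d])  # dominant topic per doc
```

Note the limitation the survey points out: PLSA learns P(z|d) only for training documents, with no generative account of new documents, which is what LDA's Dirichlet priors address.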


Proceedings ArticleDOI
02 Feb 2017
TL;DR: This paper addresses the task of document retrieval based on the degree of document relatedness to the meanings of a query by presenting a semantic-enabled language model that adopts a probabilistic reasoning model for calculating the conditional probability of a query concept given values assigned to document concepts.
Abstract: This paper addresses the task of document retrieval based on the degree of document relatedness to the meanings of a query by presenting a semantic-enabled language model. Our model relies on the use of semantic linking systems for forming a graph representation of documents and queries, where nodes represent concepts extracted from documents and edges represent semantic relatedness between concepts. Based on this graph, our model adopts a probabilistic reasoning model for calculating the conditional probability of a query concept given values assigned to document concepts. We present an integration framework for interpolating other retrieval systems with the presented model. Our empirical experiments on a number of TREC collections show that semantic retrieval has a synergetic impact on the results obtained through state-of-the-art keyword-based approaches, and that the semantic information obtained from entity linking on queries and documents can complement and enhance the performance of other retrieval models.

69 citations


Journal ArticleDOI
TL;DR: This work departs from previous work in this area by representing dependence structure between network views through a multivariate Bernoulli likelihood, providing a representation of between-view association and infers correlations between views not explained by the latent space model.
Abstract: Social relationships consist of interactions along multiple dimensions. In social networks, this means that individuals form multiple types of relationships with the same person (e.g., an individual will not trust all of his/her acquaintances). Statistical models for these data require understanding two related types of dependence structure: (i) structure within each relationship type, or network view, and (ii) the association between views. In this paper, we propose a statistical framework that parsimoniously represents dependence between relationship types while also maintaining enough flexibility to allow individuals to serve different roles in different relationship types. Our approach builds on work on latent space models for networks [see, e.g., J. Amer. Statist. Assoc. 97 (2002) 1090–1098]. These models represent the propensity for two individuals to form edges as conditionally independent given the distance between the individuals in an unobserved social space. Our work departs from previous work in this area by representing dependence structure between network views through a multivariate Bernoulli likelihood, providing a representation of between-view association. This approach infers correlations between views not explained by the latent space model. Using our method, we explore 6 multiview network structures across 75 villages in rural southern Karnataka, India [Banerjee et al. (2013)].

Proceedings Article
12 Feb 2017
TL;DR: An unsupervised data reconstruction framework is proposed, which jointly considers the reconstruction for latent semantic space and observed term vector space and can capture the salience of sentences from these two different and complementary vector spaces.
Abstract: We propose a new unsupervised sentence salience framework for Multi-Document Summarization (MDS), which can be divided into two components: latent semantic modeling and salience estimation. For latent semantic modeling, a neural generative model called Variational Auto-Encoders (VAEs) is employed to describe the observed sentences and the corresponding latent semantic representations. Neural variational inference is used for the posterior inference of the latent variables. For salience estimation, we propose an unsupervised data reconstruction framework, which jointly considers the reconstruction for latent semantic space and observed term vector space. Therefore, we can capture the salience of sentences from these two different and complementary vector spaces. Thereafter, the VAEs-based latent semantic model is integrated into the sentence salience estimation component in a unified fashion, and the whole framework can be trained jointly by back-propagation via multi-task learning. Experimental results on the benchmark datasets DUC and TAC show that our framework achieves better performance than the state-of-the-art models.

Journal ArticleDOI
TL;DR: These latent variable methods are extended to modeling high-dimensional time series data, extracting the most dynamic latent time series, whose current values are best predicted from the past values of the extracted latent variables.

Posted Content
TL;DR: KBLRN as discussed by the authors is a framework for end-to-end learning of knowledge base representations from latent, relational, and numerical features, integrating feature types with a novel combination of neural representation learning and probabilistic product of experts models.
Abstract: We present KBLRN, a framework for end-to-end learning of knowledge base representations from latent, relational, and numerical features. KBLRN integrates feature types with a novel combination of neural representation learning and probabilistic product of experts models. To the best of our knowledge, KBLRN is the first approach that learns representations of knowledge bases by integrating latent, relational, and numerical features. We show that instances of KBLRN outperform existing methods on a range of knowledge base completion tasks. We contribute novel data sets that enrich commonly used knowledge base completion benchmarks with numerical features. The data sets are available under a permissive BSD-3 license. We also investigate the impact numerical features have on the KB completion performance of KBLRN.

Journal ArticleDOI
TL;DR: Simulation-based comparisons of the latent class, K-means, and K-median approaches for partitioning dichotomous data found that the 3 approaches can exhibit profound differences when applied to real data.
Abstract: The problem of partitioning a collection of objects based on their measurements on a set of dichotomous variables is a well-established problem in psychological research, with applications including clinical diagnosis, educational testing, cognitive categorization, and choice analysis. Latent class analysis and K-means clustering are popular methods for partitioning objects based on dichotomous measures in the psychological literature. The K-median clustering method has recently been touted as a potentially useful tool for psychological data and might be preferable to its close neighbor, K-means, when the variable measures are dichotomous. We conducted simulation-based comparisons of the latent class, K-means, and K-median approaches for partitioning dichotomous data. Although all 3 methods proved capable of recovering cluster structure, K-median clustering yielded the best average performance, followed closely by latent class analysis. We also report results for the 3 methods within the context of an application to transitive reasoning data, in which it was found that the 3 approaches can exhibit profound differences when applied to real data.
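A concrete way to see why K-median suits dichotomous data: under Hamming (L1) distance, the optimal cluster center is the coordinate-wise median, i.e., a per-variable majority vote, so centers remain binary vectors rather than the fractional means K-means produces. The Lloyd-style loop below is an illustrative sketch with toy data and initial centers of my own choosing, not the study's simulation design:

```python
def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def majority_center(cluster):
    # Coordinate-wise median of 0/1 vectors = majority vote per coordinate.
    n = len(cluster)
    return [1 if 2 * sum(col) >= n else 0 for col in zip(*cluster)]

def kmedian_binary(points, centers, iters=10):
    """Lloyd-style K-median for dichotomous data: Hamming assignment,
    majority-vote (median) center update."""
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in points:
            i = min(range(len(centers)), key=lambda c: hamming(p, centers[c]))
            clusters[i].append(p)
        centers = [majority_center(cl) if cl else centers[i]
                   for i, cl in enumerate(clusters)]
    return centers

points = [[1, 1, 0], [1, 1, 1], [0, 0, 1], [0, 0, 0]]
print(kmedian_binary(points, [[1, 1, 0], [0, 0, 0]]))  # binary centers, one per cluster
```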

Posted Content
TL;DR: The capability of the convolutional VAE model to modify the phonetic content or the speaker identity for speech segments using the derived operations, without the need for parallel supervisory data is demonstrated.
Abstract: An ability to model a generative process and learn a latent representation for speech in an unsupervised fashion will be crucial to process vast quantities of unlabelled speech data. Recently, deep probabilistic generative models such as Variational Autoencoders (VAEs) have achieved tremendous success in modeling natural images. In this paper, we apply a convolutional VAE to model the generative process of natural speech. We derive latent space arithmetic operations to disentangle learned latent representations. We demonstrate the capability of our model to modify the phonetic content or the speaker identity for speech segments using the derived operations, without the need for parallel supervisory data.
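The latent space arithmetic described above can be pictured generically: encode utterances, compute per-attribute centroids in latent space, and shift a code along the difference of centroids before decoding. The 2-D vectors below are hypothetical stand-ins for VAE latent codes, a sketch of the arithmetic pattern rather than the paper's exact derived operations:

```python
def mean_vector(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def modify_attribute(z, source_latents, target_latents):
    """Shift a latent code along the direction from a source-attribute centroid
    to a target-attribute centroid (e.g., one speaker identity to another)."""
    src = mean_vector(source_latents)
    tgt = mean_vector(target_latents)
    return [zi - si + ti for zi, si, ti in zip(z, src, tgt)]

# Hypothetical latents: speaker A codes cluster near (1, 0), speaker B near (0, 1).
speaker_a = [[1.0, 0.1], [0.9, -0.1]]
speaker_b = [[0.0, 1.1], [0.1, 0.9]]
z = [1.2, 0.0]  # an utterance encoded near speaker A
shifted = modify_attribute(z, speaker_a, speaker_b)
print(shifted)  # now lies in speaker B's region of latent space
```

In the paper's setting the shifted code would then be passed through the VAE decoder to synthesize the modified speech segment.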

Journal ArticleDOI
TL;DR: In this paper, the authors describe the general time-intensive longitudinal latent class modeling framework implemented in Mplus and discuss Bayesian estimation based on Markov chain Monte Carlo, which allows modeling with arbitrarily long time series data and many random effects.
Abstract: This article describes the general time-intensive longitudinal latent class modeling framework implemented in Mplus. For each individual a latent class variable is measured at each time point and the latent class changes across time follow a Markov process (i.e., a hidden or latent Markov model), with subject-specific transition probabilities that are estimated as random effects. Such a model for single-subject data has been referred to as the regime-switching state-space model. The latent class variable can be measured by continuous or categorical indicators, under the local independence condition, or more generally by a class-specific structural equation model or a dynamic structural equation model. We discuss Bayesian estimation based on Markov chain Monte Carlo, which allows modeling with arbitrarily long time series data and many random effects. The modeling framework is illustrated with several simulation studies.
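The hidden Markov structure described above can be illustrated with a short simulation of a latent class sequence. The two-class transition matrix below is hypothetical (the framework itself is estimated in Mplus via MCMC, not with code like this); for this chain the stationary probability of class 1 is 0.1/(0.1 + 0.2) = 1/3:

```python
import random

def simulate_latent_classes(transition, start, steps, rng):
    """Simulate a latent (hidden) Markov class sequence from a transition matrix."""
    state, path = start, [start]
    for _ in range(steps):
        r, cum = rng.random(), 0.0
        for nxt, p in enumerate(transition[state]):
            cum += p
            if r < cum:  # inverse-CDF draw of the next latent class
                state = nxt
                break
        path.append(state)
    return path

# Hypothetical subject-specific transition probabilities between 2 latent classes.
P = [[0.9, 0.1],
     [0.2, 0.8]]
path = simulate_latent_classes(P, start=0, steps=1000, rng=random.Random(42))
print(sum(path) / len(path))  # empirical fraction of time in class 1, near 1/3
```

Making the rows of `P` subject-specific random effects is exactly what turns this simple chain into the mixed hidden Markov model of the paper.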

Journal ArticleDOI
TL;DR: In this paper, two topic modeling algorithms are explored, namely LSI with SVD and Mr.LDA, for learning a topic ontology. The objective is to determine the statistical relationship between documents and terms to build a topic ontology and ontology graph with minimum human intervention.

Journal ArticleDOI
TL;DR: An effective person re-identification method with latent variables is proposed, which represents a pedestrian as the mixture of a holistic model and a number of flexible models, and develops a latent metric learning method for learning the effective metric matrix.
Abstract: In this paper, we propose an effective person re-identification method with latent variables, which represents a pedestrian as the mixture of a holistic model and a number of flexible models. Three types of latent variables are introduced to model uncertain factors in the re-identification problem, including vertical misalignments, horizontal misalignments and leg posture variations. The distance between two pedestrians can be determined by minimizing a given distance function with respect to latent variables, and then be used to conduct the re-identification task. In addition, we develop a latent metric learning method for learning the effective metric matrix, which can be solved via an iterative manner: once latent information is specified, the metric matrix can be obtained based on some typical metric learning methods; with the computed metric matrix, the latent variables can be determined by searching the state space exhaustively. Finally, extensive experiments are conducted on seven databases to evaluate the proposed method. The experimental results demonstrate that our method achieves better performance than other competing algorithms.

Journal ArticleDOI
TL;DR: This paper proposes multimodal Similarity Gaussian Process latent variable model (m-SimGP), which learns the mapping functions between the intra-modal similarities and latent representation, and proposes m-DRSimGP, which combines the distance preservation in m-DSimGP and semantic preservation in m-RSim GP to learn the latent representation.
Abstract: Data from real applications involve multiple modalities representing content with the same semantics from complementary aspects. However, relations among heterogeneous modalities are simply treated as observation-to-fit by existing work, and the parameterized modality specific mapping functions lack flexibility in directly adapting to the content divergence and semantic complicacy in multimodal data. In this paper, we build our work based on the Gaussian process latent variable model (GPLVM) to learn the non-parametric mapping functions and transform heterogeneous modalities into a shared latent space. We propose the multimodal Similarity Gaussian Process latent variable model (m-SimGP), which learns the mapping functions between the intra-modal similarities and latent representation. We further propose the multimodal distance-preserved similarity GPLVM (m-DSimGP) to preserve the intra-modal global similarity structure, and the multimodal regularized similarity GPLVM (m-RSimGP), which encourages similar/dissimilar points to be similar/dissimilar in the latent space. We propose m-DRSimGP, which combines the distance preservation in m-DSimGP and semantic preservation in m-RSimGP to learn the latent representation. The overall objective functions of the four models are solved by simple and scalable gradient descent techniques. They can be applied to various tasks to discover the nonlinear correlations and to obtain the comparable low-dimensional representation for heterogeneous modalities. On five widely used real-world data sets, our approaches outperform existing models on cross-modal content retrieval and multimodal classification.

Posted Content
TL;DR: This paper proposed a stochastic recurrent model, where each step in the sequence is associated with a latent variable that is used to condition the recurrent dynamics for future steps, and training is performed with amortized variational inference where the approximate posterior is augmented with a RNN that runs backward through the sequence.
Abstract: Many efforts have been devoted to training generative latent variable models with autoregressive decoders, such as recurrent neural networks (RNN). Stochastic recurrent models have been successful in capturing the variability observed in natural sequential data such as speech. We unify successful ideas from recently proposed architectures into a stochastic recurrent model: each step in the sequence is associated with a latent variable that is used to condition the recurrent dynamics for future steps. Training is performed with amortized variational inference where the approximate posterior is augmented with a RNN that runs backward through the sequence. In addition to maximizing the variational lower bound, we ease training of the latent variables by adding an auxiliary cost which forces them to reconstruct the state of the backward recurrent network. This provides the latent variables with a task-independent objective that enhances the performance of the overall model. We found this strategy to perform better than alternative approaches such as KL annealing. Although conceptually simple, our model achieves state-of-the-art results on standard speech benchmarks such as TIMIT and Blizzard and competitive performance on sequential MNIST. Finally, we apply our model to language modeling on the IMDB dataset, where the auxiliary cost helps in learning interpretable latent variables.

Journal ArticleDOI
TL;DR: This paper proposes a joint factor analysis and latent clustering framework, which aims at learning cluster-aware low-dimensional representations of matrix and tensor data, and leverages matrix and Tensor factorization models that produce essentially unique latent representations of the data.
Abstract: Dimensionality reduction techniques play an essential role in data analytics, signal processing, and machine learning. Dimensionality reduction is usually performed in a preprocessing stage that is separate from subsequent data analysis, such as clustering or classification. Finding reduced-dimension representations that are well-suited for the intended task is more appealing. This paper proposes a joint factor analysis and latent clustering framework, which aims at learning cluster-aware low-dimensional representations of matrix and tensor data. The proposed approach leverages matrix and tensor factorization models that produce essentially unique latent representations of the data to unravel latent cluster structure—which is otherwise obscured because of the freedom to apply an oblique transformation in latent space. At the same time, latent cluster structure is used as prior information to enhance the performance of factorization. Specific contributions include several custom-built problem formulations, corresponding algorithms, and discussion of associated convergence properties. Besides extensive simulations, real-world datasets such as Reuters document data and MNIST image data are also employed to showcase the effectiveness of the proposed approaches.

Proceedings ArticleDOI
07 Aug 2017
TL;DR: This paper first adopts the spectral regression method to learn the optimal latent space shared by data of all modalities based on the orthogonal constraints, then builds a graph model to project the multi-modality data into the latent space.
Abstract: Cross-modal retrieval has received much attention in recent years. It is a commonly used method to project multi-modality data into a common subspace and then retrieve. However, nearly all existing methods directly adopt the space defined by the binary class label information without learning as the shared subspace for regression. In this paper, we first adopt the spectral regression method to learn the optimal latent space shared by data of all modalities based on the orthogonal constraints. Then we construct a graph model to project the multi-modality data into the latent space. Finally, we combine these two processes together to jointly learn the latent space and regress. We conduct extensive experiments on multiple benchmark datasets and our proposed method outperforms the state-of-the-art approaches.

Journal ArticleDOI
TL;DR: Experimental results on visual and sonar imagery show that PM-LDA can produce both crisp and soft semantic image segmentations, a capability that previous topic modeling methods lack.
Abstract: Topic models [e.g., probabilistic latent semantic analysis, latent Dirichlet allocation (LDA), and supervised LDA] have been widely used for segmenting imagery. However, these models are confined to crisp segmentation, forcing a visual word (i.e., an image patch) to belong to one and only one topic. Yet, there are many images in which some regions cannot be assigned a crisp categorical label (e.g., transition regions between a foggy sky and the ground, or between sand and water at a beach). In these cases, a visual word is best represented with partial memberships across multiple topics. To address this, we present a partial membership LDA (PM-LDA) model and an associated parameter estimation algorithm. This model can be useful for imagery where a visual word may be a mixture of multiple topics. Experimental results on visual and sonar imagery show that PM-LDA can produce both crisp and soft semantic image segmentations, a capability that previous topic modeling methods lack.
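The crisp-versus-partial distinction can be made concrete with a small numerical sketch: under a crisp assignment a visual word's distribution comes from exactly one topic, while under a partial-membership view it is a convex combination weighted by a membership vector on the simplex. The two toy topics below are invented for illustration and are not the paper's learned topics or its estimation algorithm.

```python
import numpy as np

# Two "topics" over a tiny visual-word vocabulary (e.g. sky vs. ground patches).
topics = np.array([
    [0.70, 0.20, 0.05, 0.05],   # topic 0: "sky"
    [0.05, 0.05, 0.25, 0.65],   # topic 1: "ground"
])

# Crisp LDA-style assignment: each patch belongs to exactly one topic.
def crisp_word_dist(z):
    return topics[z]

# Partial membership (the PM-LDA idea): a patch carries a membership vector m
# on the simplex, and its word distribution is sum_k m_k * topic_k.
def partial_word_dist(m):
    m = np.asarray(m, float)
    assert np.all(m >= 0) and np.isclose(m.sum(), 1.0)
    return m @ topics

# A transition patch (e.g. a horizon between foggy sky and ground):
# 60% sky membership, 40% ground membership.
mix = partial_word_dist([0.6, 0.4])
```

Setting the membership vector to a one-hot vector recovers the crisp case, which is why the model can produce both crisp and soft segmentations.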

Journal ArticleDOI
TL;DR: The theoretical background of latent variable mixture models (LVMMs) is provided and their exploratory character is emphasized; the general framework, together with its assumptions and necessary constraints, is outlined; the difference between models with and without covariates is highlighted; and the interrelation between the number of classes and the complexity of the within-class model is discussed, as well as the relevance of measurement invariance.
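One concrete instance of the number-of-classes versus within-class-complexity trade-off mentioned in the TL;DR: fit Gaussian mixtures (a simple LVMM without covariates) over a grid of class counts and covariance structures and compare BIC. The simulated data and the model grid below are illustrative assumptions, not from the paper.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(3)

# Simulate 2 well-separated latent classes in 2 dimensions.
X = np.vstack([
    rng.normal(0.0, 1.0, size=(150, 2)),
    rng.normal(5.0, 1.0, size=(150, 2)),
])

# Trade-off grid: a simpler within-class model (spherical covariance) spends
# fewer parameters per class than a full covariance, so BIC may trade class
# count against within-class complexity.
bics = {}
for k in (1, 2, 3, 4):
    for cov in ("spherical", "full"):
        gm = GaussianMixture(n_components=k, covariance_type=cov,
                             random_state=0).fit(X)
        bics[(k, cov)] = gm.bic(X)

best_k, best_cov = min(bics, key=bics.get)
```

On this clearly separated toy data the criterion recovers the generating class count; with real data and weaker separation, the exploratory character stressed in the paper matters far more.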

Proceedings ArticleDOI
Daniel Yue Zhang, Dong Wang, Hao Zheng, Xin Mu, Qi Li, Yang Zhang
01 Jul 2017
TL;DR: A Context-Aware POI Category Prediction (CAP-CP) scheme using Natural Language Processing (NLP) models, with a novel Temporal Adaptive Ngram (TA-Ngram) model that captures the dynamic dependency between check-in points to address the temporal dependency challenge.
Abstract: Point-of-Interest (POI) recommendation is an important application in Location-based Social Networks (LBSN). The category prediction problem is to predict the next POI category that users may visit. The predicted category information is critical in large-scale POI recommendation because it can significantly reduce the prediction space and improve the recommendation accuracy. While efforts have been made to address the POI category prediction problem, several important challenges still exist. First, existing solutions did not fully explore the temporal dependency (e.g., “long range dependency”) of users' check-in traces. Second, the hidden contextual information associated with each check-in point has been underutilized. In this work, we propose a Context-Aware POI Category Prediction (CAP-CP) scheme using Natural Language Processing (NLP) models. In particular, to address the temporal dependency challenge, we develop a novel Temporal Adaptive Ngram (TA-Ngram) model to capture the dynamic dependency between check-in points. To address the challenge of hidden context incorporation, CAP-CP leverages the Probabilistic Latent Semantic Analysis (PLSA) model to infer the semantic implications of the context variables in the prediction model. Empirical results on a real-world dataset show that our scheme can effectively improve the performance of the state-of-the-art POI recommendation solutions.
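To make the temporal-dependency idea concrete, here is a minimal sketch of a bigram category predictor whose transition counts are down-weighted by the time gap between consecutive check-ins. This only illustrates the general idea: the exponential decay, the rate `tau`, and the toy trace are invented here and are not the TA-Ngram model or the PLSA component of CAP-CP.

```python
import math
from collections import defaultdict

# Toy check-in trace: (category, hour-of-day) pairs for one user.
trace = [("food", 12), ("shop", 14), ("food", 19), ("bar", 21),
         ("food", 12), ("shop", 15), ("bar", 22), ("food", 13)]

# Bigram counts weighted by temporal proximity: transitions separated by a
# small time gap count more (decay rate tau is an illustrative choice).
tau = 6.0
counts = defaultdict(float)
context_totals = defaultdict(float)
for (c_prev, t_prev), (c_next, t_next) in zip(trace, trace[1:]):
    w = math.exp(-abs(t_next - t_prev) / tau)
    counts[(c_prev, c_next)] += w
    context_totals[c_prev] += w

def predict_next(category):
    """Most likely next POI category under the time-weighted bigram."""
    cands = {c2: w / context_totals[c1]
             for (c1, c2), w in counts.items() if c1 == category}
    return max(cands, key=cands.get)
```

A plain n-gram would count all transitions equally; the time-gap weight is the simplest way to let recent, tightly spaced check-ins dominate the predicted distribution.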

Journal ArticleDOI
TL;DR: The Max-margin Latent Pattern Learning (MLPL) method is proposed to learn high-level semantic descriptions of latent action patterns as the output of this framework, achieving state-of-the-art performance on the Skoda, WISDM, and OPP datasets.

Posted Content
TL;DR: This work describes and studies a decomposition of predictive uncertainty in BNNs with latent variables into its epistemic and aleatoric components, and uses a similar decomposition to develop a novel risk-sensitive objective for safe reinforcement learning (RL).
Abstract: Bayesian neural networks (BNNs) with latent variables are probabilistic models which can automatically identify complex stochastic patterns in the data. We describe and study a decomposition of the predictive uncertainty of these models into its epistemic and aleatoric components. First, we show how such a decomposition arises naturally in a Bayesian active learning scenario by following an information-theoretic approach. Second, we use a similar decomposition to develop a novel risk-sensitive objective for safe reinforcement learning (RL). This objective minimizes the effect of model bias in environments whose stochastic dynamics are described by BNNs with latent variables. Our experiments illustrate the usefulness of the resulting decomposition in active learning and safe RL settings.
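At its core, the decomposition the abstract refers to is the law of total variance applied across posterior draws: aleatoric uncertainty is the average predicted noise variance, and epistemic uncertainty is the variance of the predicted means. The sketch below uses synthetic stand-ins for the per-draw Gaussian predictions; it is not the paper's BNN or its information-theoretic derivation.

```python
import numpy as np

rng = np.random.default_rng(4)

# Stand-in for an ensemble of BNN posterior samples: each "network" predicts a
# Gaussian (mean, variance) for the same input. Values here are synthetic.
n_models = 200
means = 1.0 + 0.3 * rng.standard_normal(n_models)   # disagreement across models
variances = 0.5 + 0.1 * rng.random(n_models)        # each model's noise estimate

# Law-of-total-variance decomposition of predictive uncertainty:
#   total = E[var]   (aleatoric: irreducible observation noise)
#         + Var[mean] (epistemic: reducible with more data)
aleatoric = variances.mean()
epistemic = means.var()
total = aleatoric + epistemic
```

The practical point is that only the epistemic term should drive active-learning acquisition, while a risk-sensitive RL objective can penalize the two terms differently.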

Journal ArticleDOI
TL;DR: This paper proposes a novel hierarchical video event detection model, which deliberately unifies the processes of underlying semantics discovery and event modeling from video data, and devises an effective model to automatically uncover video semantics by hierarchically capturing latent static-visual concepts at the frame level and latent activity concepts at the segment level.
Abstract: Semantic information is important for video event detection. How to automatically discover, model, and utilize semantic information to facilitate video event detection has been a challenging problem. In this paper, we propose a novel hierarchical video event detection model, which deliberately unifies the processes of underlying semantics discovery and event modeling from video data. Specifically, unlike most approaches based on manually pre-defined concepts, we devise an effective model to automatically uncover video semantics by hierarchically capturing latent static-visual concepts at the frame level and latent activity concepts (i.e., temporal sequence relationships of static-visual concepts) at the segment level. The unified model not only enables a discriminative and descriptive representation for videos, but also alleviates the error propagation problem from video representation to event modeling that exists in previous methods. A max-margin framework is employed to learn the model. Extensive experiments on four challenging video event datasets, i.e., MED11, CCV, UQE50, and FCVID, demonstrate the effectiveness of the proposed method.
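A rough and heavily simplified analogue of the two-level representation: quantize frame features into "static-visual concepts", summarize each video by its concept-transition histogram as a proxy for segment-level activity concepts, and train a max-margin classifier on top. K-means, the transition histogram, and the synthetic videos are illustrative stand-ins; the paper learns both concept levels jointly with the event model rather than in separate stages like this.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import LinearSVC

rng = np.random.default_rng(5)

# Synthetic corpus: 40 videos x 30 frames x 10-dim frame features, 2 events.
n_vid, n_frames, d, k = 40, 30, 10, 4
y = np.repeat([0, 1], 20)
frames = rng.standard_normal((n_vid, n_frames, d)) + y[:, None, None]

# Level 1: quantize frames into latent "static-visual concepts"
# (k-means here, a stand-in for the concepts the paper learns jointly).
km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(frames.reshape(-1, d))
concepts = km.labels_.reshape(n_vid, n_frames)

# Level 2: concept-transition histogram per video, a crude proxy for
# segment-level "activity concepts" (temporal relationships of concepts).
feats = np.zeros((n_vid, k * k))
for v in range(n_vid):
    for a, b in zip(concepts[v], concepts[v][1:]):
        feats[v, a * k + b] += 1

# Max-margin event model on top of the hierarchical representation.
clf = LinearSVC(C=1.0).fit(feats, y)
acc = clf.score(feats, y)
```

The staged pipeline above is exactly what the paper argues against: errors made when quantizing frames cannot be corrected later, which is the error-propagation problem its unified model is designed to alleviate.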

Journal ArticleDOI
TL;DR: In this paper, a multivariate generalized latent variable model is proposed to investigate the effects of observable and latent explanatory variables on multiple responses of interest, such as continuous, count, ordinal, and nominal variables.
Abstract: We consider a multivariate generalized latent variable model to investigate the effects of observable and latent explanatory variables on multiple responses of interest. Various types of correlated responses, such as continuous, count, ordinal, and nominal variables, are considered in the regression. A generalized confirmatory factor analysis model that is capable of managing mixed-type data is proposed to characterize latent variables via correlated observed indicators. In addressing the complicated structure of the proposed model, we introduce continuous underlying measurements to provide a unified model framework for mixed-type data. We develop a multivariate version of the Bayesian adaptive least absolute shrinkage and selection operator procedure, which is implemented with a Markov chain Monte Carlo (MCMC) algorithm in a full Bayesian context, to simultaneously conduct estimation and model selection. The empirical performance of the proposed methodology is demonstrated through a simulation study. An ...
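For the continuous-response special case, the flavor of Bayesian lasso-type shrinkage via MCMC can be sketched as a Gaussian likelihood with a Laplace prior on the coefficients, sampled by random-walk Metropolis. This is a deliberately simplified stand-in: the paper's multivariate Bayesian adaptive lasso, its MCMC scheme, and its latent-variable structure are not reproduced, and `lam`, `sigma`, and the proposal scale are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(6)

# Toy regression: only the first of three predictors matters.
n = 100
X = rng.standard_normal((n, 3))
y = 2.0 * X[:, 0] + 0.5 * rng.standard_normal(n)

# Log posterior: Gaussian likelihood + Laplace (lasso-type) prior on beta.
lam, sigma = 5.0, 0.5
def log_post(beta):
    resid = y - X @ beta
    return -0.5 * resid @ resid / sigma**2 - lam * np.abs(beta).sum()

# Random-walk Metropolis (an illustrative stand-in for a full MCMC scheme).
beta = np.zeros(3)
lp = log_post(beta)
samples = []
for it in range(20000):
    prop = beta + 0.05 * rng.standard_normal(3)
    lp_prop = log_post(prop)
    if np.log(rng.random()) < lp_prop - lp:   # Metropolis accept/reject
        beta, lp = prop, lp_prop
    if it >= 5000:                            # discard burn-in
        samples.append(beta)

post_mean = np.mean(samples, axis=0)
```

The Laplace prior concentrates posterior mass of the two irrelevant coefficients near zero, which is the shrinkage-based model selection effect the paper builds on (there, with adaptive penalties and mixed-type responses).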