Topic

Probabilistic latent semantic analysis

About: Probabilistic latent semantic analysis is a research topic. Over its lifetime, 2,884 publications have been published within this topic, receiving 198,341 citations. The topic is also known as PLSA.


Papers
Proceedings ArticleDOI
28 Jun 2009
TL;DR: This work proposes a novel user segmentation algorithm named Probabilistic Latent Semantic User Segmentation (PLSUS), which adopts the probabilistic latent semantic analysis to mine the relationship between users and their behaviors so as to segment users in a semantic manner.
Abstract: Behavioral Targeting (BT), which aims to deliver the most appropriate advertisements to the most appropriate users, is attracting much attention in the online advertising market. A key challenge of BT is how to automatically segment users for ad delivery, and good user segmentation may significantly improve the ad click-through rate (CTR). Unlike classical user segmentation strategies, which rarely take the semantics of user behaviors into consideration, we propose in this paper a novel user segmentation algorithm named Probabilistic Latent Semantic User Segmentation (PLSUS). PLSUS adopts probabilistic latent semantic analysis to mine the relationship between users and their behaviors so as to segment users in a semantic manner. We perform experiments on the real-world ad click-through log of a commercial search engine. Compared with two classical clustering algorithms, K-Means and CLUTO, PLSUS can further improve the ad CTR by up to 100%. To the best of our knowledge, this work is an early semantic user segmentation study for BT in academia.

45 citations
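
The PLSUS abstract above does not spell out its formulation, so the following is only a minimal sketch of standard asymmetric PLSA, P(b|u) = sum over z of P(z|u) P(b|z), fitted with EM on a user-by-behavior count matrix. The function names (`plsa`, `segment_users`), the toy counts, the number of latent topics, and the rule of assigning each user to the most probable topic are all illustrative assumptions, not the authors' implementation.

```python
import random


def normalize(values):
    """Scale non-negative values to sum to 1 (uniform if they are all zero)."""
    total = sum(values)
    if total == 0.0:
        return [1.0 / len(values)] * len(values)
    return [v / total for v in values]


def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit asymmetric PLSA by EM.

    counts[u][b] is how often user u showed behavior b (hypothetical input).
    Returns (p_z_u, p_b_z): P(topic | user) and P(behavior | topic).
    """
    rng = random.Random(seed)
    n_users, n_behaviors = len(counts), len(counts[0])
    # Random initial distributions; the small offset keeps every entry positive.
    p_z_u = [normalize([rng.random() + 1e-3 for _ in range(n_topics)])
             for _ in range(n_users)]
    p_b_z = [normalize([rng.random() + 1e-3 for _ in range(n_behaviors)])
             for _ in range(n_topics)]

    for _ in range(n_iter):
        acc_zu = [[0.0] * n_topics for _ in range(n_users)]
        acc_bz = [[0.0] * n_behaviors for _ in range(n_topics)]
        for u, row in enumerate(counts):
            for b, n_ub in enumerate(row):
                if n_ub == 0:
                    continue
                # E-step: posterior over topics for this (user, behavior) pair.
                post = normalize([p_z_u[u][z] * p_b_z[z][b]
                                  for z in range(n_topics)])
                # Accumulate expected counts for the M-step.
                for z, q in enumerate(post):
                    acc_zu[u][z] += n_ub * q
                    acc_bz[z][b] += n_ub * q
        # M-step: renormalize the expected counts into distributions.
        p_z_u = [normalize(r) for r in acc_zu]
        p_b_z = [normalize(r) for r in acc_bz]
    return p_z_u, p_b_z


def segment_users(p_z_u):
    """Assign each user to its most probable latent topic (an assumed rule)."""
    return [max(range(len(row)), key=row.__getitem__) for row in p_z_u]


# Toy data: users 0-1 favor behaviors 0-1, users 2-3 favor behaviors 2-3.
toy_counts = [[5, 4, 0, 0], [4, 6, 0, 0], [0, 0, 5, 5], [0, 0, 6, 3]]
p_z_u, p_b_z = plsa(toy_counts, n_topics=2)
segments = segment_users(p_z_u)
print(segments)
```

EM only finds a local optimum, so the segment labels depend on the random seed; in practice one would compare several restarts.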

Proceedings Article
01 Jun 2013
TL;DR: A novel method is presented for the computation of compositionality within a distributional framework that is modeled as a multi-way interaction between latent factors, which are automatically constructed from corpus data.
Abstract: In this paper, we present a novel method for the computation of compositionality within a distributional framework. The key idea is that compositionality is modeled as a multi-way interaction between latent factors, which are automatically constructed from corpus data. We use our method to model the composition of subject verb object triples. The method consists of two steps. First, we compute a latent factor model for nouns from standard co-occurrence data. Next, the latent factors are used to induce a latent model of three-way subject verb object interactions. Our model has been evaluated on a similarity task for transitive phrases, in which it exceeds the state of the art.

44 citations

Proceedings ArticleDOI
02 Nov 2007
TL;DR: A method for the mapping of ontologies that discovers and exploits sets of latent features for approximating the intended meaning of ontology elements by applying the reverse generative process of the Latent Dirichlet Allocation model is proposed.
Abstract: This paper proposes a method for the mapping of ontologies that, to a greater extent than other approaches, discovers and exploits sets of latent features for approximating the intended meaning of ontology elements. This is done by applying the reverse generative process of the Latent Dirichlet Allocation model. Similarity between element pairs is computed by means of the Kullback-Leibler divergence measure. Experimental results show the potential of the method.

44 citations
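
As a rough illustration of the similarity step mentioned in the ontology-mapping abstract above, the sketch below computes the Kullback-Leibler divergence between two discrete latent-feature distributions. The abstract does not say how the distributions are obtained or whether the divergence is symmetrized, so the example vectors, the `eps` floor, and the `symmetric_kl` helper are assumptions made for illustration only.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats for two discrete distributions of equal length.

    Entries of q are floored at eps to avoid division by zero (an assumption).
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    return sum(pi * math.log(pi / max(qi, eps))
               for pi, qi in zip(p, q) if pi > 0.0)


def symmetric_kl(p, q):
    """Average of KL(p || q) and KL(q || p); a common symmetrization."""
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))


# Hypothetical latent-feature distributions for two ontology elements.
element_a = [0.5, 0.5]
element_b = [0.25, 0.75]
print(kl_divergence(element_a, element_b))
print(symmetric_kl(element_a, element_b))
```

A smaller divergence means the two elements have more similar latent-feature profiles; identical distributions give a divergence of zero.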

Proceedings ArticleDOI
05 Jul 2010
TL;DR: This paper proposes two techniques: Multi-modal Latent Semantic Indexing (MMLSI) and Multi-Modal Probabilistic Latent Semantic Analysis (MMpLSA), which incorporate visual features and tags by generating simultaneous semantic contexts.
Abstract: Popular image retrieval schemes generally rely on a single mode (either low-level visual features or embedded text) for searching multimedia databases. Many popular image collections (e.g., those emerging on the Internet) have associated tags, often for human consumption. A natural extension is to combine information from multiple modes to enhance retrieval effectiveness. In this paper, we propose two techniques: Multi-modal Latent Semantic Indexing (MMLSI) and Multi-Modal Probabilistic Latent Semantic Analysis (MMpLSA). These methods are obtained by directly extending their traditional single-mode counterparts. Both methods incorporate visual features and tags by generating simultaneous semantic contexts. The experimental results demonstrate improved accuracy over other single- and multi-modal methods.

44 citations

Journal ArticleDOI
TL;DR: In this paper, a latent process is modelled by a mixture of auto-regressive AR(1) processes with different means and correlation coefficients, but with equal variances.
Abstract: Motivated by an application to a longitudinal data set coming from the Health and Retirement Study about self-reported health status, we propose a model for longitudinal data which is based on a latent process to account for the unobserved heterogeneity between sample units in a dynamic fashion. The latent process is modelled by a mixture of auto-regressive AR(1) processes with different means and correlation coefficients, but with equal variances. We show how to perform maximum likelihood estimation of the proposed model by the joint use of an expectation–maximization algorithm and a Newton–Raphson algorithm, implemented by means of recursions developed in the hidden Markov model literature. We also introduce a simple method to obtain standard errors for the parameter estimates and suggest a strategy to choose the number of mixture components. In the application the response variable is ordinal; however, the approach may also be applied in other settings. Moreover, the application to the self-reported health status data set allows us to show that the model proposed is more flexible than other models for longitudinal data based on a continuous latent process. The model also achieves a goodness of fit that is similar to that of models based on a discrete latent process following a Markov chain, while retaining a reduced number of parameters. The effect of different formulations of the latent structure of the model is evaluated in terms of estimates of the regression parameters for the covariates.

44 citations
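
To make the latent structure described in the abstract above more concrete, here is a small simulation sketch of a mixture of AR(1) processes with component-specific means and correlation coefficients. It assumes that "equal variances" refers to a shared stationary variance `sigma**2`, which fixes the innovation standard deviation at `sigma * sqrt(1 - rho**2)`; that reading, the function name, and all parameter values are assumptions, and the paper's estimation procedure (EM combined with Newton-Raphson) is not reproduced here.

```python
import math
import random


def simulate_latent_path(weights, means, rhos, sigma, n_steps, rng):
    """Draw one unit's latent path from a mixture of stationary AR(1) processes.

    weights: mixture probabilities; means/rhos: per-component mean and
    autocorrelation; sigma: stationary standard deviation shared by all
    components (an assumed interpretation of "equal variances").
    """
    # Pick the mixture component for this sample unit.
    k = rng.choices(range(len(weights)), weights=weights)[0]
    mu, rho = means[k], rhos[k]
    # Innovation sd chosen so the marginal variance stays sigma**2.
    innovation_sd = sigma * math.sqrt(1.0 - rho * rho)
    x = rng.gauss(mu, sigma)  # start from the stationary distribution
    path = [x]
    for _ in range(n_steps - 1):
        x = mu + rho * (x - mu) + rng.gauss(0.0, innovation_sd)
        path.append(x)
    return k, path


# Hypothetical two-component mixture observed over six occasions.
component, path = simulate_latent_path(
    weights=[0.6, 0.4], means=[-1.0, 1.0], rhos=[0.3, 0.8],
    sigma=1.0, n_steps=6, rng=random.Random(0),
)
print(component, [round(v, 3) for v in path])
```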


Network Information
Related Topics (5)
Feature extraction
111.8K papers, 2.1M citations
84% related
Feature (computer vision)
128.2K papers, 1.7M citations
84% related
Support vector machine
73.6K papers, 1.7M citations
84% related
Deep learning
79.8K papers, 2.1M citations
83% related
Object detection
46.1K papers, 1.3M citations
82% related
Performance
Metrics
No. of papers in the topic in previous years
Year  Papers
2023  19
2022  77
2021  14
2020  36
2019  27
2018  58