Probabilistic latent semantic analysis

About: Probabilistic latent semantic analysis is a research topic. Over its lifetime, 2,884 publications have appeared on this topic, receiving 198,341 citations in total. The topic is also known as PLSA.
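As background for the papers below: PLSA models each document-word co-occurrence as a mixture over latent topics, P(d, w) = Σ_z P(z) P(w|z) P(d|z), fit by expectation-maximization. The following is a minimal sketch of those EM updates in plain Python; the function name and toy setup are illustrative, not from any specific paper here.

```python
import random

def plsa_em(counts, n_topics, n_iter=50, seed=0):
    """Fit PLSA by EM on a dense document-term count matrix.

    counts[d][w] = n(d, w).  Returns (P(z), P(w|z), P(d|z)),
    so that P(d, w) = sum_z P(z) P(w|z) P(d|z).
    """
    rng = random.Random(seed)
    D, W = len(counts), len(counts[0])

    def normalize(v):
        s = sum(v)
        return [x / s for x in v]

    # random (symmetry-breaking) initialization of all three distributions
    p_z = normalize([rng.random() + 0.1 for _ in range(n_topics)])
    p_wz = [normalize([rng.random() + 0.1 for _ in range(W)]) for _ in range(n_topics)]
    p_dz = [normalize([rng.random() + 0.1 for _ in range(D)]) for _ in range(n_topics)]

    for _ in range(n_iter):
        nz = [0.0] * n_topics
        nwz = [[0.0] * W for _ in range(n_topics)]
        ndz = [[0.0] * D for _ in range(n_topics)]
        for d in range(D):
            for w in range(W):
                n = counts[d][w]
                if n == 0:
                    continue
                # E-step: posterior P(z | d, w) for this cell
                post = normalize([p_z[z] * p_wz[z][w] * p_dz[z][d]
                                  for z in range(n_topics)])
                # accumulate expected counts (M-step numerators)
                for z in range(n_topics):
                    nz[z] += n * post[z]
                    nwz[z][w] += n * post[z]
                    ndz[z][d] += n * post[z]
        # M-step: renormalize expected counts into distributions
        p_z = normalize(nz)
        p_wz = [normalize(row) for row in nwz]
        p_dz = [normalize(row) for row in ndz]
    return p_z, p_wz, p_dz
```

Each EM iteration provably does not decrease the data log-likelihood; in practice a few dozen iterations suffice on small matrices.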


Papers
Journal ArticleDOI
TL;DR: It is proven that there is a direct relationship between the size of the LSA dimension reduction and the LSA self-correlation, and it is shown that by altering the LSA term self-correlations the authors gain a substantial increase in precision, while also reducing the computation required during the information retrieval process.
Abstract: Latent semantic analysis (LSA) is a generalized vector space method that uses dimension reduction to generate term correlations for use during the information retrieval process. We hypothesized that even though the dimension reduction establishes correlations between terms, it also causes a degradation in the correlation of a term to itself (self-correlation). In this article, we prove that there is a direct relationship between the size of the LSA dimension reduction and the LSA self-correlation. We also show that by altering the LSA term self-correlations we gain a substantial increase in precision, while also reducing the computation required during the information retrieval process.
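The degradation the abstract describes can be seen numerically: in the term-correlation matrix A·Aᵀ of a rank-reduced term-document matrix, the diagonal (self-correlation) entries shrink relative to the full-rank matrix. This is a toy illustration of that effect, not the article's algorithm; the rank-1 approximation is computed by power iteration, and the function names are mine.

```python
import math

def rank1_approx(A, n_iter=100):
    """Best rank-1 approximation sigma * u v^T of a small dense matrix,
    via power iteration on A^T A."""
    rows, cols = len(A), len(A[0])
    v = [1.0] * cols
    for _ in range(n_iter):
        # w = A^T (A v), then renormalize: converges to the top right
        # singular vector when the top singular value is unique
        Av = [sum(A[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        w = [sum(A[i][j] * Av[i] for i in range(rows)) for j in range(cols)]
        nrm = math.sqrt(sum(x * x for x in w))
        v = [x / nrm for x in w]
    Av = [sum(A[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    sigma = math.sqrt(sum(x * x for x in Av))
    u = [x / sigma for x in Av]
    return [[sigma * u[i] * v[j] for j in range(cols)] for i in range(rows)]

def self_correlations(A):
    """Diagonal of A A^T: each term's correlation with itself."""
    return [sum(x * x for x in row) for row in A]
```

For any rank-k truncation, each diagonal entry of A_k·A_kᵀ is a partial sum of σ_i² u_i² terms, so it can only be less than or equal to the full-rank value, matching the article's observation.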

22 citations

Proceedings ArticleDOI
09 Aug 2015
TL;DR: A Stochastic Gradient Descent based optimization procedure is developed to fit the model by jointly learning the weight of each context and the latent factors; the resulting model significantly outperforms not only the base model but also representative context-aware recommendation models.
Abstract: In this paper, we propose a generic framework to learn context-aware latent representations for context-aware collaborative filtering. Contextual contents are combined via a function to produce the context influence factor, which is then combined with each latent factor to derive latent representations. We instantiate the generic framework using biased Matrix Factorization as the base model. A Stochastic Gradient Descent (SGD) based optimization procedure is developed to fit the model by jointly learning the weight of each context and latent factors. Experiments conducted over three real-world datasets demonstrate that our model significantly outperforms not only the base model but also the representative context-aware recommendation models.
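The base model the paper instantiates, biased matrix factorization fit by SGD, predicts r̂_ui = μ + b_u + b_i + p_u·q_i and updates the parameters along the error gradient. This is a minimal sketch of that base model only; the paper's context-influence factor is omitted, and all names and hyperparameters are illustrative.

```python
import random

def train_biased_mf(ratings, n_users, n_items, k=4, lr=0.01,
                    reg=0.05, epochs=200, seed=0):
    """SGD for biased matrix factorization:
    r_ui is approximated by mu + b_u + b_i + p_u . q_i."""
    rng = random.Random(seed)
    mu = sum(r for _, _, r in ratings) / len(ratings)   # global mean
    b_u = [0.0] * n_users                               # user biases
    b_i = [0.0] * n_items                               # item biases
    P = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            pred = mu + b_u[u] + b_i[i] + sum(P[u][f] * Q[i][f] for f in range(k))
            e = r - pred
            # gradient step on squared error with L2 regularization
            b_u[u] += lr * (e - reg * b_u[u])
            b_i[i] += lr * (e - reg * b_i[i])
            for f in range(k):
                puf, qif = P[u][f], Q[i][f]
                P[u][f] += lr * (e * qif - reg * puf)
                Q[i][f] += lr * (e * puf - reg * qif)
    return mu, b_u, b_i, P, Q
```

In the paper's framework, the context influence factor would additionally scale or shift each latent factor before the dot product; the SGD loop itself stays the same shape.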

22 citations

Proceedings ArticleDOI
14 May 2006
TL;DR: This paper proposes using a "dynamic key term lexicon", automatically extracted from the ever-changing document archives, as an extra feature set in the retrieval task; the lexicon is compact yet semantically rich, so relevant documents can be retrieved more efficiently.
Abstract: Spoken document retrieval will be very important in the future network era. In this paper, we propose using a "dynamic key term lexicon" automatically extracted from the ever-changing document archives as an extra feature set in the retrieval task. This lexicon is much more compact but semantically rich, so it can retrieve relevant documents more efficiently. The key terms include named entities and others selected by a new metric, referred to here as term entropy, derived from probabilistic latent semantic analysis (PLSA). Various configurations of retrieval models were tested with a broadcast news archive in Mandarin Chinese and significant performance improvements were obtained, especially with the new version of PLSA models based on a key term lexicon rather than the full lexicon.
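One plausible reading of a PLSA-derived term entropy is the entropy of a word's topic posterior P(z|w): a word concentrated on few topics (low entropy) is a better key-term candidate than one spread across all topics. The sketch below computes this quantity from PLSA parameters; it is an illustration of the idea, not necessarily the paper's exact definition.

```python
import math

def term_entropy(p_z, p_w_given_z):
    """For each word w, entropy of the topic posterior
    P(z|w) proportional to P(w|z) P(z).
    Low entropy -> the word concentrates on few topics."""
    n_topics = len(p_z)
    n_words = len(p_w_given_z[0])
    entropies = []
    for w in range(n_words):
        joint = [p_z[z] * p_w_given_z[z][w] for z in range(n_topics)]
        total = sum(joint)
        post = [x / total for x in joint]
        # Shannon entropy in nats, skipping zero-probability topics
        h = -sum(p * math.log(p) for p in post if p > 0)
        entropies.append(h)
    return entropies
```

Selecting the lowest-entropy terms then yields a compact lexicon of topically focused words, consistent with the abstract's motivation.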

22 citations

Book ChapterDOI
08 Sep 2008
TL;DR: A new measure of semantic relatedness between any pair of English terms, using WordNet as the knowledge base, is proposed, and a new WSD method based on this measure is introduced.
Abstract: Word sense disambiguation (WSD) methods evolve towards exploring all of the available semantic information that word thesauri provide. In this scope, the use of semantic graphs and new measures of semantic relatedness may offer better WSD solutions. In this paper we propose a new measure of semantic relatedness between any pair of terms for the English language, using WordNet as our knowledge base. Furthermore, we introduce a new WSD method based on the proposed measure. Experimental evaluation of the proposed method in benchmark data shows that our method matches or surpasses state of the art results. Moreover, we evaluate the proposed measure of semantic relatedness in pairs of terms ranked by human subjects. Results reveal that our measure of semantic relatedness produces a ranking that is more similar to the human generated one, compared to rankings generated by other related measures of semantic relatedness proposed in the past.
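The paper's measure operates over semantic graphs built from WordNet. As a much simpler stand-in for the idea, here is the classic path-based relatedness on a toy is-a graph: shortest-path distance via BFS, converted to similarity as 1/(1 + distance). This is not the paper's measure; the taxonomy and names are invented for illustration.

```python
from collections import deque

def shortest_path(graph, a, b):
    """BFS shortest-path length in an undirected is-a graph
    (dict mapping node -> set of neighbors); None if disconnected."""
    if a == b:
        return 0
    seen, frontier = {a}, deque([(a, 0)])
    while frontier:
        node, d = frontier.popleft()
        for nxt in graph[node]:
            if nxt == b:
                return d + 1
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, d + 1))
    return None

def path_similarity(graph, a, b):
    """Similarity in (0, 1]: closer terms in the taxonomy score higher."""
    d = shortest_path(graph, a, b)
    return None if d is None else 1.0 / (1.0 + d)

# toy is-a taxonomy (hypothetical): dog/cat under animal, car under vehicle
taxonomy = {
    "entity": {"animal", "vehicle"},
    "animal": {"entity", "dog", "cat"},
    "vehicle": {"entity", "car"},
    "dog": {"animal"}, "cat": {"animal"}, "car": {"vehicle"},
}
```

Under this measure, "dog" and "cat" (two hops apart via "animal") come out more related than "dog" and "car" (four hops via the root), which is the qualitative behavior any graph-based relatedness measure should show.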

22 citations

Journal ArticleDOI
TL;DR: In this paper, a multivariate generalized latent variable model is proposed to investigate the effects of observable and latent explanatory variables on multiple responses of interest, such as continuous, count, ordinal, and nominal variables.
Abstract: We consider a multivariate generalized latent variable model to investigate the effects of observable and latent explanatory variables on multiple responses of interest. Various types of correlated responses, such as continuous, count, ordinal, and nominal variables, are considered in the regression. A generalized confirmatory factor analysis model that is capable of managing mixed-type data is proposed to characterize latent variables via correlated observed indicators. In addressing the complicated structure of the proposed model, we introduce continuous underlying measurements to provide a unified model framework for mixed-type data. We develop a multivariate version of the Bayesian adaptive least absolute shrinkage and selection operator procedure, which is implemented with a Markov chain Monte Carlo (MCMC) algorithm in a full Bayesian context, to simultaneously conduct estimation and model selection. The empirical performance of the proposed methodology is demonstrated through a simulation study. An ...
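The paper's estimator is a Bayesian adaptive lasso run via MCMC, which is too involved to sketch here. As a minimal illustration of the lasso-type shrinkage-and-selection idea it builds on, this is plain coordinate descent for a lasso-penalized linear regression: each coefficient update applies the soft-thresholding operator, which can set coefficients exactly to zero. This is not the authors' procedure, and all names are illustrative.

```python
def soft_threshold(x, lam):
    """Soft-thresholding: the shrinkage operator behind lasso selection."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0

def lasso_cd(X, y, lam, n_iter=100):
    """Coordinate descent for (1/2)||y - X beta||^2 + lam * ||beta||_1."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho, z = 0.0, 0.0
            for i in range(n):
                # partial residual: prediction excluding feature j
                pred_minus_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_minus_j)
                z += X[i][j] ** 2
            beta[j] = soft_threshold(rho, lam) / z
    return beta
```

The Bayesian version replaces the fixed penalty with a prior (adaptive, per-coefficient penalties) and explores the posterior by MCMC, but the zeroing-out behavior that performs model selection comes from the same L1 geometry.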

22 citations


Network Information

Related Topics (5)
- Feature extraction: 111.8K papers, 2.1M citations (84% related)
- Feature (computer vision): 128.2K papers, 1.7M citations (84% related)
- Support vector machine: 73.6K papers, 1.7M citations (84% related)
- Deep learning: 79.8K papers, 2.1M citations (83% related)
- Object detection: 46.1K papers, 1.3M citations (82% related)
Performance Metrics

No. of papers in the topic in previous years:

Year  Papers
2023  19
2022  77
2021  14
2020  36
2019  27
2018  58