Topic

Probabilistic latent semantic analysis

About: Probabilistic latent semantic analysis (PLSA) is a research topic. Over its lifetime, 2,884 publications have been published on this topic, receiving 198,341 citations. The topic is also known as: PLSA.
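As background for the papers listed below: PLSA (the "aspect model") explains document-word co-occurrence counts n(d, w) as a mixture over latent topics z, P(d, w) = P(d) sum_z P(z|d) P(w|z), and is fitted with EM. The following is a minimal, illustrative NumPy sketch of that EM loop; the dense-array layout and all names are choices made here for brevity, not taken from any paper on this page.

    import numpy as np

    def plsa(counts, n_topics, n_iter=50, seed=0):
        # counts: (n_docs, n_words) array of term counts n(d, w).
        # Returns P(w|z), shape (n_topics, n_words), and P(z|d), shape (n_docs, n_topics).
        rng = np.random.default_rng(seed)
        n_docs, n_words = counts.shape
        p_w_z = rng.random((n_topics, n_words))
        p_w_z /= p_w_z.sum(axis=1, keepdims=True)
        p_z_d = rng.random((n_docs, n_topics))
        p_z_d /= p_z_d.sum(axis=1, keepdims=True)
        for _ in range(n_iter):
            # E-step: responsibilities P(z|d,w) proportional to P(z|d) * P(w|z).
            resp = p_z_d[:, :, None] * p_w_z[None, :, :]      # (n_docs, n_topics, n_words)
            resp /= resp.sum(axis=1, keepdims=True) + 1e-12
            # Weight responsibilities by the observed counts n(d, w).
            weighted = resp * counts[:, None, :]
            # M-step: re-estimate P(w|z) and P(z|d) from the expected counts.
            p_w_z = weighted.sum(axis=0)
            p_w_z /= p_w_z.sum(axis=1, keepdims=True) + 1e-12
            p_z_d = weighted.sum(axis=2)
            p_z_d /= p_z_d.sum(axis=1, keepdims=True) + 1e-12
        return p_w_z, p_z_d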


Papers
Proceedings ArticleDOI
07 Aug 2017
TL;DR: This paper first adopts the spectral regression method to learn the optimal latent space shared by data of all modalities under orthogonality constraints, then builds a graph model to project the multi-modality data into that latent space.
Abstract: Cross-modal retrieval has received much attention in recent years. A common approach is to project multi-modality data into a shared subspace and then perform retrieval there. However, nearly all existing methods directly adopt, without any learning, the space defined by binary class-label information as the shared subspace for regression. In this paper, we first adopt the spectral regression method to learn the optimal latent space shared by data of all modalities under orthogonality constraints. We then construct a graph model to project the multi-modality data into that latent space. Finally, we combine these two processes to jointly learn the latent space and perform the regression. Extensive experiments on multiple benchmark datasets show that our proposed method outperforms state-of-the-art approaches.
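The paper's exact objective is not reproduced above; the sketch below only illustrates the two ingredients the abstract names, spectral targets derived from a label graph and a regression that projects each modality into that space. The class-affinity graph, the ridge regulariser, and all function names are assumptions made here for illustration, not the authors' formulation.

    import numpy as np

    def spectral_latent_targets(labels, dim):
        # Class-affinity graph: W[i, j] = 1 when samples i and j share a label.
        W = (labels[:, None] == labels[None, :]).astype(float)
        d = W.sum(axis=1)
        # Symmetric normalised Laplacian L = I - D^{-1/2} W D^{-1/2}.
        L = np.eye(len(labels)) - W / np.sqrt(np.outer(d, d))
        _, vecs = np.linalg.eigh(L)                 # eigenvalues in ascending order
        return vecs[:, 1:dim + 1]                   # drop the trivial constant direction

    def ridge_projection(X, Y, reg=1e-2):
        # Spectral-regression step: map features X onto the latent targets Y
        # with a ridge-regularised least-squares projection.
        return np.linalg.solve(X.T @ X + reg * np.eye(X.shape[1]), X.T @ Y)

    # One projection is learned per modality (e.g. image and text features of the
    # same samples); retrieval then compares X_img @ P_img against X_txt @ P_txt
    # by nearest neighbours in the shared latent space.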

33 citations

Proceedings ArticleDOI
28 Jul 2003
TL;DR: By examining the number of times term t is identified in a search on term t' (precision) over differing ranges of dimensions, it is found that lower-ranked dimensions identify related terms while higher-ranked dimensions discriminate between the synonyms.
Abstract: We seek insight into Latent Semantic Indexing by establishing a method to identify the optimal number of factors in the reduced matrix for representing a keyword. The method is demonstrated empirically by duplicating all documents containing a term t and inserting new documents into the database that replace t with t'. By examining the number of times term t is identified in a search on term t' (precision) over differing ranges of dimensions, we find that lower-ranked dimensions identify related terms while higher-ranked dimensions discriminate between the synonyms.
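A rough sketch of the kind of measurement the abstract describes follows: LSI term vectors restricted to a chosen band of SVD dimensions and compared by cosine similarity, so that sweeping the band contrasts lower-ranked with higher-ranked dimensions. The matrix layout, band selection, and scoring function are illustrative assumptions, not the paper's exact protocol.

    import numpy as np

    def lsi_term_vectors(term_doc, dims):
        # Truncated SVD of the term-document matrix; keep only the singular
        # directions whose ranks fall in `dims` (e.g. range(10, 50)).
        U, S, _ = np.linalg.svd(term_doc, full_matrices=False)
        idx = list(dims)
        return U[:, idx] * S[idx]                   # scaled term coordinates

    def term_similarity(term_doc, t, t_prime, dims):
        # Cosine similarity between the LSI vectors of term indices t and t'
        # over the chosen band of dimensions.
        T = lsi_term_vectors(term_doc, dims)
        a, b = T[t], T[t_prime]
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)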

33 citations

Journal ArticleDOI
TL;DR: Experimental results show that semantic-driven super-resolution can significantly improve over the original settings; the benefits and drawbacks of using semantic information are discussed.

32 citations

Proceedings ArticleDOI
04 Oct 2004
TL;DR: The paper introduces the use of confidence scores to weight words in the history, a weighting of the prior topic distribution, and a way of calculating perplexity that accounts for recognition errors in the model context.
Abstract: This paper describes experiments with a PLSA-based language model for conversational telephone speech. The model uses a long-range history and exploits topic information in the test text to adjust the probabilities of test words. The PLSA-based model was found to lower test-set perplexity over a traditional word+class-based 4-gram by 13% (optimistic estimate using a reference transcript as history) or by 6% (realistic estimate using a recognised transcript as history). Moreover, this paper introduces the use of confidence scores to weight words in the history, a weighting of the prior topic distribution, and a way of calculating perplexity that accounts for recognition errors in the model context.
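The paper's exact adaptation scheme is not given above; the sketch below only illustrates the general idea of interpolating a baseline n-gram probability with a PLSA topic mixture, and of weighting an errorful recognised history by word confidences (plus a weighted prior) when estimating the topic posterior. The simple weighted update stands in for proper PLSA folding-in, and all names and the interpolation weight are assumptions.

    import numpy as np

    def history_topic_posterior(history_ids, confidences, p_w_z, prior, prior_weight=1.0):
        # Approximate P(z | history): each history word contributes its topic
        # likelihood raised to its recogniser confidence, on top of a weighted
        # prior topic distribution.
        log_post = prior_weight * np.log(prior + 1e-12)
        for w, c in zip(history_ids, confidences):
            log_post = log_post + c * np.log(p_w_z[:, w] + 1e-12)
        post = np.exp(log_post - log_post.max())
        return post / post.sum()

    def adapted_word_prob(p_ngram, p_w_z, p_z_h, word_id, lam=0.7):
        # Interpolate the baseline n-gram probability with the PLSA mixture
        # P_plsa(w | h) = sum_z P(w|z) P(z|h).
        p_plsa = float(p_w_z[:, word_id] @ p_z_h)
        return lam * p_ngram + (1.0 - lam) * p_plsa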

32 citations

Journal ArticleDOI
TL;DR: This paper proposes a generalization of the bias-corrected 3-step estimation method developed for latent class analysis, which overcomes the downward bias in estimates of covariate effects on initial-state and transition probabilities.
Abstract: Latent Markov models with covariates can be estimated via 1-step maximum likelihood. However, this 1-step approach has various disadvantages: the inclusion of covariates in the model might alter the formation of the latent states, and parameter estimation can become infeasible with large numbers of time points, responses, and covariates. This is why researchers typically prefer performing the analysis in a stepwise manner; that is, they first construct the measurement model, then obtain the latent state classifications, and subsequently study the relationship between covariates and latent state memberships. However, such a stepwise approach yields downward-biased estimates of the covariate effects on initial-state and transition probabilities. This article shows how to overcome this problem using a generalization of the bias-corrected 3-step estimation method proposed for latent class analysis (Asparouhov & Muthen, 2014; Bolck, Croon, & Hagenaars, 2004; Vermunt, 2010). We give a formal...
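For readers unfamiliar with the model class under discussion, a latent Markov model with covariates x factorises roughly as follows (generic notation, not the article's):

    P(y_{1:T} \mid x) \;=\; \sum_{s_1,\dots,s_T}
        P(s_1 \mid x)\,\prod_{t=2}^{T} P(s_t \mid s_{t-1}, x)\,
        \prod_{t=1}^{T} P(y_t \mid s_t)

Covariate effects enter the initial-state and transition probabilities, typically through multinomial logit parameterisations. A stepwise analysis fits the measurement part P(y_t | s_t) first, assigns state classifications, and only afterwards regresses those classifications on x; the classification error introduced in the middle step is what produces the downward bias the article corrects.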

32 citations


Network Information
Related Topics (5)
Feature extraction: 111.8K papers, 2.1M citations, 84% related
Feature (computer vision): 128.2K papers, 1.7M citations, 84% related
Support vector machine: 73.6K papers, 1.7M citations, 84% related
Deep learning: 79.8K papers, 2.1M citations, 83% related
Object detection: 46.1K papers, 1.3M citations, 82% related
Performance
Metrics
No. of papers in the topic in previous years
Year | Papers
2023 | 19
2022 | 77
2021 | 14
2020 | 36
2019 | 27
2018 | 58