Topic

Probabilistic latent semantic analysis

About: Probabilistic latent semantic analysis is a research topic. Over its lifetime, 2,884 publications have been published within this topic, receiving 198,341 citations. The topic is also known as: PLSA.

Papers

Journal Article

More data trumps smarter algorithms: comparing pointwise mutual information with latent semantic analysis.

Gabriel Recchia¹, Michael N. Jones¹

Indiana University¹

01 Aug 2009-Behavior Research Methods

TL;DR: This work evaluates a simple metric of pointwise mutual information and demonstrates that this metric benefits from training on extremely large amounts of data and correlates more closely with human semantic similarity ratings than do publicly available implementations of several more complex models.

Abstract: Computational models of lexical semantics, such as latent semantic analysis, can automatically generate semantic similarity measures between words from statistical redundancies in text. These measures are useful for experimental stimulus selection and for evaluating a model’s cognitive plausibility as a mechanism that people might use to organize meaning in memory. Although humans are exposed to enormous quantities of speech, practical constraints limit the amount of data that many current computational models can learn from. We follow up on previous work evaluating a simple metric of pointwise mutual information. Controlling for confounds in previous work, we demonstrate that this metric benefits from training on extremely large amounts of data and correlates more closely with human semantic similarity ratings than do publicly available implementations of several more complex models. We also present a simple tool for building simple and scalable models from large corpora quickly and efficiently.

153 citations
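
The pointwise mutual information metric discussed in this entry can be illustrated with a minimal sketch. It assumes document-level co-occurrence (two words co-occur if they appear in the same document) and whitespace tokenization; the function name `pmi_scores` and the toy corpus are hypothetical and are not taken from the paper or its tool.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_scores(documents):
    """Return {(w1, w2): PMI in bits} for word pairs sharing at least one document.

    Assumption: probabilities are estimated as document frequencies divided by
    the number of documents. Pairs that never co-occur are omitted.
    """
    n_docs = len(documents)
    word_counts = Counter()
    pair_counts = Counter()
    for doc in documents:
        # Each word counts once per document; sorting gives a stable pair order.
        vocab = sorted(set(doc.lower().split()))
        word_counts.update(vocab)
        pair_counts.update(combinations(vocab, 2))

    scores = {}
    for (w1, w2), joint in pair_counts.items():
        p_joint = joint / n_docs
        p1 = word_counts[w1] / n_docs
        p2 = word_counts[w2] / n_docs
        # PMI(w1, w2) = log2( P(w1, w2) / (P(w1) * P(w2)) )
        scores[(w1, w2)] = math.log2(p_joint / (p1 * p2))
    return scores


# Hypothetical toy corpus, for illustration only.
corpus = [
    "the cat chased the mouse",
    "the dog chased the cat",
    "the stock market fell",
    "the market rallied",
]
scores = pmi_scores(corpus)
print(scores[("cat", "chased")])
print(scores[("cat", "the")])
```

Because the estimate needs only word and pair counts, it can be accumulated in a single pass over a corpus, which is one reason a metric of this kind scales to very large amounts of text.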

Proceedings Article

Topologically-constrained latent variable models

Raquel Urtasun¹, David J. Fleet², Andreas Geiger³, Jovan Popović¹, Trevor Darrell¹, Neil D. Lawrence⁴

Massachusetts Institute of Technology¹, University of Toronto², Karlsruhe Institute of Technology³, University of Manchester⁴

05 Jul 2008

TL;DR: A range of approaches for embedding data in a non-Euclidean latent space for the Gaussian process latent variable model, making it possible to learn transitions between motion styles even though such transitions are not present in the data.

Abstract: In dimensionality reduction approaches, the data are typically embedded in a Euclidean latent space. However, for some data sets this is inappropriate. For example, in human motion data we expect latent spaces that are cylindrical or toroidal, which are poorly captured by a Euclidean space. In this paper, we present a range of approaches for embedding data in a non-Euclidean latent space. Our focus is the Gaussian process latent variable model. In the context of human motion modeling this allows us to (a) learn models with interpretable latent directions enabling, for example, style/content separation, and (b) generalise beyond the data set, enabling us to learn transitions between motion styles even though such transitions are not present in the data.

153 citations

Proceedings Article

Chris Ding¹

Lawrence Berkeley National Laboratory¹

01 Aug 1999

TL;DR: A dual probability model is constructed for the Latent Semantic Indexing using the cosine similarity measure, establishing a statistical framework for LSI and leading to a statistical criterion for the optimal semantic dimensions.

Abstract: A dual probability model is constructed for the Latent Semantic Indexing (LSI) using the cosine similarity measure. Both the document-document similarity matrix and the term-term similarity matrix naturally arise from the maximum likelihood estimation of the model parameters, and the optimal solutions are the latent semantic vectors of LSI. Dimensionality reduction is justified by the statistical significance of latent semantic vectors as measured by the likelihood of the model. This leads to a statistical criterion for the optimal semantic dimensions, answering a critical open question in LSI with practical importance. Thus the model establishes a statistical framework for LSI. Ambiguities related to statistical modeling of LSI are clarified.

152 citations
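
The two similarity matrices named in this abstract can be sketched on a toy example. This is only an illustration of the quantities involved, not the paper's probability model or its maximum likelihood estimation; it assumes a term-document count matrix whose document columns are scaled to unit length, and the helper names and the matrix `X` are hypothetical.

```python
import math


def normalize_columns(matrix):
    """Scale each column (document vector) to unit Euclidean length."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    norms = [
        math.sqrt(sum(matrix[i][j] ** 2 for i in range(n_rows)))
        for j in range(n_cols)
    ]
    return [
        [matrix[i][j] / norms[j] if norms[j] else 0.0 for j in range(n_cols)]
        for i in range(n_rows)
    ]


def transpose(matrix):
    return [list(row) for row in zip(*matrix)]


def matmul(a, b):
    b_t = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_t] for row in a]


# Hypothetical term-document counts: rows are terms, columns are documents.
X = [
    [2, 1, 0],
    [1, 2, 0],
    [0, 0, 3],
]
Xn = normalize_columns(X)

# Document-document matrix Xn^T Xn: entry (i, j) is the cosine similarity
# between documents i and j, because the columns have unit length.
doc_sim = matmul(transpose(Xn), Xn)

# Term-term matrix Xn Xn^T: inner products between term rows under the
# same document normalization (an assumption of this sketch).
term_sim = matmul(Xn, transpose(Xn))

print(doc_sim[0][1], term_sim[0][1])
```

In LSI, the leading eigenvectors of these two matrices correspond to the right and left singular vectors of the normalized term-document matrix, which is why they are referred to as latent semantic vectors.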

Journal Article

Variational inference for latent variables and uncertain inputs in Gaussian processes

Andreas Damianou¹, Michalis K. Titsias², Neil D. Lawrence¹

University of Sheffield¹, Athens University of Economics and Business²

01 Jan 2016-Journal of Machine Learning Research

TL;DR: A Bayesian method for training GP-LVMs that introduces a non-standard variational inference framework, which makes it possible to approximately integrate out the latent variables and then train a GP-LVM by maximising an analytic lower bound on the exact marginal likelihood.

Abstract: The Gaussian process latent variable model (GP-LVM) provides a flexible approach for non-linear dimensionality reduction that has been widely applied. However, the current approach for training GP-LVMs is based on maximum likelihood, where the latent projection variables are maximised over rather than integrated out. In this paper we present a Bayesian method for training GP-LVMs by introducing a non-standard variational inference framework that allows us to approximately integrate out the latent variables and subsequently train a GP-LVM by maximising an analytic lower bound on the exact marginal likelihood. We apply this method for learning a GP-LVM from i.i.d. observations and for learning non-linear dynamical systems where the observations are temporally correlated. We show that a benefit of the variational Bayesian procedure is its robustness to overfitting and its ability to automatically select the dimensionality of the non-linear latent space. The resulting framework is generic, flexible and easy to extend for other purposes, such as Gaussian process regression with uncertain or partially missing inputs. We demonstrate our method on synthetic data and standard machine learning benchmarks, as well as challenging real-world datasets, including high-resolution video data.

151 citations

Journal Article

Dynamic Latent Trait Models for Multidimensional Longitudinal Data

David B. Dunson

01 Sep 2003-Journal of the American Statistical Association

TL;DR: In this paper, a general modeling framework is proposed that allows mixtures of count, categorical, and continuous response variables, and each response is related to age-specific latent traits through a generalized linear model that accommodates item-specific measurement errors.

Abstract: This article presents a new approach for analysis of multidimensional longitudinal data, motivated by studies using an item response battery to measure traits of an individual repeatedly over time. A general modeling framework is proposed that allows mixtures of count, categorical, and continuous response variables. Each response is related to age-specific latent traits through a generalized linear model that accommodates item-specific measurement errors. A transition model allows the latent traits at a given age to depend on observed predictors and on previous latent traits for that individual. Following a Bayesian approach to inference, a Markov chain Monte Carlo algorithm is proposed for posterior computation. The methods are applied to data from a neurotoxicity study of the pesticide methoxychlor, and evidence of a dose-dependent increase in motor activity is presented.

151 citations

Network Information

Performance Metrics

Papers: 2,984
Citations: 212,744

Number of papers in the topic in previous years
Year	Papers
2023	19
2022	77
2021	14
2020	36
2019	27
2018	58
