Topic

Probabilistic latent semantic analysis

About: Probabilistic latent semantic analysis (PLSA) is a research topic. Over its lifetime, 2,884 publications have appeared within this topic, receiving 198,341 citations.


Papers
Journal Article
01 Dec 2012
TL;DR: The proposed method, which learns a probability distribution over a set of latent topics of a given audio clip in an unsupervised manner, is shown to be superior to other latent structure analysis methods, such as latent semantic analysis and probabilistic latent semantic analysis.
Abstract: We propose the notion of latent acoustic topics to capture contextual information embedded within a collection of audio signals. The central idea is to learn a probability distribution over a set of latent topics of a given audio clip in an unsupervised manner, assuming that there exist latent acoustic topics and each audio clip can be described in terms of those latent acoustic topics. In this regard, we use latent Dirichlet allocation (LDA) to implement the acoustic topic models over elemental acoustic units, referred to as acoustic words, and perform text-like audio signal processing. Experiments on audio tag classification with the BBC sound effects library demonstrate the usefulness of the proposed latent audio context modeling schemes. In particular, the proposed method is shown to be superior to other latent structure analysis methods, such as latent semantic analysis and probabilistic latent semantic analysis. We also demonstrate that topic models can be used as complementary features to content-based features and offer about 9% relative improvement in audio classification when combined with the traditional Gaussian mixture model (GMM)–Support Vector Machine (SVM) technique.

17 citations

Book Chapter
01 Aug 1999
TL;DR: A probabilistic approach is presented that combines a statistical, model-based analysis with a topological visualization principle; it can be used to derive topic maps, which represent topical information by characteristic keyword distributions arranged in a two-dimensional spatial layout.
Abstract: The visualization of large text databases and document collections is an important step towards more flexible and interactive types of information retrieval. This paper presents a probabilistic approach which combines a statistical, model-based analysis with a topological visualization principle. Our method can be utilized to derive topic maps which represent topical information by characteristic keyword distributions arranged in a two-dimensional spatial layout. Combined with multi-resolution techniques, this provides a three-dimensional space for interactive information navigation in large text collections.

17 citations

01 Nov 2005
TL;DR: This paper introduces a novel approach to analyze the execution traces of features using Latent Semantic Indexing (LSI), which detects similarities between features based on the content of their traces, and categorizes classes based on the frequency of the outgoing invocations involved in the traces.
Abstract: Recently there has been a revival of interest in feature analysis of software systems. Approaches to feature location have used a wide range of techniques such as dynamic analysis, static analysis, information retrieval and formal concept analysis. In this paper we introduce a novel approach to analyze the execution traces of features using Latent Semantic Indexing (LSI). Our goal is twofold. On the one hand we detect similarities between features based on the content of their traces, and on the other hand we categorize classes based on the frequency of the outgoing invocations involved in the traces. We apply our approach to two case studies and discuss its benefits and drawbacks.

17 citations

Journal Article
TL;DR: Methods for approximate inference, known as variational approximations, which have been developed in the machine learning, graphical modeling and statistical physics literatures are set alongside some social and behavioral science literature to consider their potential for “categorical causal modeling”, using latent class analysis.
Abstract: Latent class models in the social and behavioral sciences have remained structurally simple. One reason for this is that inference in statistical models can be computationally difficult. Methods for approximate inference, known as variational approximations, which have been developed in the machine learning, graphical modeling and statistical physics literatures, can be used to alleviate the computational difficulties of inference for latent variable models. The aim of the present article is to set these methods alongside some social and behavioral science literature to which they are relevant, and in particular to consider their potential for “categorical causal modeling”, using latent class analysis. We have collated a number of popular categorical-data models with latent variables and causal structure, typically incorporating a Markovian structure. The efficacy of the approximation methods has been demonstrated through simulations related to an important behavioral science model.

17 citations

Journal Article
TL;DR: The new method, called Markovian Semantic Indexing (MSI), is presented in the context of an online image retrieval system and is shown to possess certain theoretical advantages and to achieve better Precision versus Recall results than Latent Semantic Indexing (LSI) and probabilistic Latent Semantic Indexing (pLSI) methods in Annotation-Based Image Retrieval (ABIR) tasks.
Abstract: We propose a novel method for automatic annotation, indexing and annotation-based retrieval of images. The new method, which we call Markovian Semantic Indexing (MSI), is presented in the context of an online image retrieval system. Assuming such a system, the users' queries are used to construct an Aggregate Markov Chain (AMC) through which the relevance between the keywords seen by the system is defined. The users' queries are also used to automatically annotate the images. A stochastic distance between images, based on their annotation and the keyword relevance captured in the AMC, is then introduced. Geometric interpretations of the proposed distance are provided and its relation to a clustering in the keyword space is investigated. By means of a new measure of Markovian state similarity, the mean first cross passage time (CPT), optimality properties of the proposed distance are proved. Images are modeled as points in a vector space and their similarity is measured with MSI. The new method is shown to possess certain theoretical advantages and also to achieve better Precision versus Recall results when compared to Latent Semantic Indexing (LSI) and probabilistic Latent Semantic Indexing (pLSI) methods in Annotation-Based Image Retrieval (ABIR) tasks.
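
Illustration: the idea of deriving a Markov chain over keywords from user queries can be sketched roughly as follows. The abstract does not say how the AMC is actually constructed, so this is only an assumption: each query is treated as an ordered list of keywords, transitions between consecutive keywords are counted, and each row is normalized into probabilities. The function name `build_keyword_chain` and the sample queries are hypothetical and do not come from the paper.

```python
from collections import defaultdict


def build_keyword_chain(queries):
    """Estimate keyword-to-keyword transition probabilities from queries.

    Assumption (not from the paper): a transition is counted whenever one
    keyword directly follows another inside the same query.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for query in queries:
        for current, following in zip(query, query[1:]):
            counts[current][following] += 1

    # Normalize each row of counts so that it sums to 1.
    chain = {}
    for current, row in counts.items():
        total = sum(row.values())
        chain[current] = {kw: n / total for kw, n in row.items()}
    return chain


# Hypothetical user queries, each an ordered list of keywords.
queries = [
    ["beach", "sunset", "sea"],
    ["beach", "sea"],
    ["sunset", "sea"],
]
chain = build_keyword_chain(queries)
```

Keywords that never appear before another keyword (here "sea") get no outgoing row in this sketch; a fuller treatment would have to decide how to handle such states.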

17 citations


Network Information
Related Topics (5)
Feature extraction: 111.8K papers, 2.1M citations, 84% related
Feature (computer vision): 128.2K papers, 1.7M citations, 84% related
Support vector machine: 73.6K papers, 1.7M citations, 84% related
Deep learning: 79.8K papers, 2.1M citations, 83% related
Object detection: 46.1K papers, 1.3M citations, 82% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    19
2022    77
2021    14
2020    36
2019    27
2018    58