scispace - formally typeset
Search or ask a question
Topic

Probabilistic latent semantic analysis

About: Probabilistic latent semantic analysis is a research topic. Over the lifetime, 2884 publications have been published within this topic receiving 198341 citations. The topic is also known as: PLSA.


Papers
More filters
Journal ArticleDOI
TL;DR: The experimental results show that the proposed cloud detection method can automatically and accurately detect clouds using the multispectral information of the available four bands.
Abstract: Automatic cloud extraction from satellite imagery is a vital process for many applications in optical remote sensing since clouds can locally obscure the surface features and alter the reflectance. Clouds can be easily distinguished by the human eyes in satellite imagery via remarkable regional characteristics, but finding a way to automatically detect various kinds of clouds by computer programs to speed up the processing efficiency remains a challenge. This paper introduces a new cloud detection method based on probabilistic latent semantic analysis (PLSA) and object-based machine learning. The method begins by segmenting satellite images into superpixels by Simple Linear Iterative Clustering (SLIC) algorithm while also extracting the spectral, texture, frequency and line segment features. Then, the implicit information in each superpixel is extracted from the feature histogram through the PLSA model by which the descriptor of each superpixel can be computed to form a feature vector for classification. Thereafter, the cloud mask is extracted by optimal thresholding and applying the Support Vector Machine (SVM) algorithm at the superpixel level. The GrabCut algorithm is then applied to extract more accurate cloud regions at the pixel level by assuming the cloud mask as the prior knowledge. When compared to different cloud detection methods in the literature, the overall accuracy of the proposed cloud detection method was up to 90 percent for ZY-3 and GF-1 images, which is about a 6.8 percent improvement over the traditional spectral-based methods. The experimental results show that the proposed method can automatically and accurately detect clouds using the multispectral information of the available four bands.

35 citations

Patent
Jun Xu1, Hang Li1, Nick Craswell1
27 Jun 2011
TL;DR: In this article, the Regularized Latent Semantic Indexing (RLSI) approach is applied to a fixed number of documents such that the set of documents is topic modeled.
Abstract: Electronic documents are retrieved from a database and/or from a network of servers. The documents are topic modeled in accordance with a Regularized Latent Semantic Indexing approach. The Regularized Latent Semantic Indexing approach may allow an equation involving an approximation of a term-document matrix to be solved in parallel by multiple calculating units. The equation may include terms that are regularized via either l1 norm and/or via l2 norm. The Regularized Latent Semantic Indexing approach may be applied to a set, or a fixed number, of documents such that the set of documents is topic modeled. Alternatively, the Regularized Latent Semantic Indexing approach may be applied to a variable number of documents such that, over time, the variable of number of documents is topic modeled.

35 citations

Proceedings Article
03 Dec 2012
TL;DR: This paper proposes kernel latent SVM (KLSVM) – a new learning framework that combines latent SVMs and kernel methods and develops an iterative training algorithm to learn the model parameters.
Abstract: Latent SVMs (LSVMs) are a class of powerful tools that have been successfully applied to many applications in computer vision. However, a limitation of LSVMs is that they rely on linear models. For many computer vision tasks, linear models are suboptimal and nonlinear models learned with kernels typically perform much better. Therefore it is desirable to develop the kernel version of LSVM. In this paper, we propose kernel latent SVM (KLSVM) – a new learning framework that combines latent SVMs and kernel methods. We develop an iterative training algorithm to learn the model parameters. We demonstrate the effectiveness of KLSVM using three different applications in visual recognition. Our KLSVM formulation is very general and can be applied to solve a wide range of applications in computer vision and machine learning.

35 citations

Journal ArticleDOI
TL;DR: The experimental results demonstrate that the proposed MPLSA method can achieve better scene classification accuracy than the traditional single-feature PLSA method.
Abstract: Scene classification can mine high-level semantic information scene categories from low-level visual features for high spatial resolution remote sensing images (HSRIs). A multifeature probabilistic latent semantic analysis (MPLSA) algorithm is proposed to perform the task of scene classification for HSRIs. Distinct from the traditional probabilistic latent semantic analysis (PLSA) with a single feature, to utilize the spatial information of the HSRIs, in MPLSA, multiple features, including spectral and texture features, and the scale-invariant feature transform feature, are combined with PLSA. The visual words are characterized by the multifeature descriptor, and an image set is represented by a discriminative word-image matrix. During the training phase, the MPLSA model mines the visual words’ latent semantics. For unknown images, the MPLSA model analyzes their corresponding latent semantic distributions by combining the words’ latent semantics obtained from the training step. The spectral angle mapper classifier is utilized to label the scene class, based on the image’s latent semantic distribution. The experimental results demonstrate that the proposed MPLSA method can achieve better scene classification accuracy than the traditional single-feature PLSA method.

35 citations

Proceedings Article
01 Jan 2005
TL;DR: This paper applies efficient variational inference based on DMA, which replaces the DP prior by a simpler alternative, namely Dirichlet-multinomial allocation (DMA), which maintains the main modelling properties of the DP.
Abstract: This paper describes nonparametric Bayesian treatments for analyzing records containing occurrences of items. The introduced model retains the strength of previous approaches that explore the latent factors of each record (e.g. topics of documents), and further uncovers the clustering structure of records, which reflects the statistical dependencies of the latent factors. The nonparametric model induced by a Dirichlet process (DP) flexibly adapts model complexity to reveal the clustering structure of the data. To avoid the problems of dealing with infinite dimensions, we further replace the DP prior by a simpler alternative, namely Dirichlet-multinomial allocation (DMA), which maintains the main modelling properties of the DP. Instead of relying on Markov chain Monte Carlo (MCMC) for inference, this paper applies efficient variational inference based on DMA. The proposed approach yields encouraging empirical results on both a toy problem and text data. The results show that the proposed algorithm uncovers not only the latent factors, but also the clustering structure.

35 citations


Network Information
Related Topics (5)
Feature extraction
111.8K papers, 2.1M citations
84% related
Feature (computer vision)
128.2K papers, 1.7M citations
84% related
Support vector machine
73.6K papers, 1.7M citations
84% related
Deep learning
79.8K papers, 2.1M citations
83% related
Object detection
46.1K papers, 1.3M citations
82% related
Performance
Metrics
No. of papers in the topic in previous years
YearPapers
202319
202277
202114
202036
201927
201858