Probabilistic latent semantic analysis

About: Probabilistic latent semantic analysis (also known as PLSA) is a research topic. Over its lifetime, 2,884 publications have appeared within this topic, receiving 198,341 citations.


Papers
Journal Article
TL;DR: A new text-mining approach is proposed that predicts student performance by applying LSA (latent semantic analysis) and k-means clustering to free-style comments written by students after each lesson.
Abstract: In this paper we propose a new approach based on text mining techniques for predicting student performance using LSA (latent semantic analysis) and k-means clustering methods. The present study uses free-style comments written by students after each lesson. Because these comments can reflect students' learning attitudes, their understanding of subjects, and the difficulty of the lessons, they enable teachers to grasp the tendencies of student learning activities. To improve our basic approach using LSA and k-means, overlap and similarity measuring methods are proposed. We conducted experiments to validate our proposed methods. The experiments yielded a model that predicts student academic performance using the comment data as predictor variables. Our proposed methods achieved an average prediction accuracy of 66.4% after applying the k-means clustering method, and 73.6% and 78.5% after adding the overlap method and the similarity measuring method, respectively.

39 citations

Journal Article
TL;DR: This paper proposes a new generative model for Multi-view Learning via Probabilistic Latent Semantic Analysis, called MVPLSA, which jointly models the co-occurrences of features and documents from different views.

39 citations

Journal Article
TL;DR: A simple categorization on topic models derived from LDA is made, and representative models of each category are introduced, which help to understand the relationship of works during the development of topic models.
Abstract: Topic models are receiving extensive attention in natural language processing. In this field, a topic is regarded as a probability distribution over terms. Topic models extract semantic topics from document-level term co-occurrence and are used to map documents from term space into topic space, yielding low-dimensional document representations. This paper starts from Latent Semantic Indexing (LSI), the origin of topic models, and describes pLSI and LDA, the fundamental works in the development of topic models, with a focus on the relationships among these works. As a generative model, LDA can be easily extended to other models. This paper makes a simple categorization of topic models derived from LDA and introduces representative models of each category. Furthermore, the EM algorithms used for parameter estimation in topic models are analyzed, which helps in understanding the relationships among works during the development of topic models.
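To make the EM parameter estimation mentioned in this abstract concrete, the following is a minimal sketch of the standard pLSA (pLSI) updates in plain Python. It is not taken from the paper: the function name `plsa`, its parameters, and the toy count matrix are illustrative assumptions, and the code favors readability over speed.

```python
import random

def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit pLSA by EM on a document-term count matrix (list of lists).

    Returns (p_z_d, p_w_z): P(topic | document) and P(word | topic).
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])

    def normalized(row):
        total = sum(row)
        if total == 0:  # e.g. an empty document: fall back to uniform
            return [1.0 / len(row)] * len(row)
        return [x / total for x in row]

    # Random positive initialization; every row is a probability distribution.
    p_z_d = [normalized([rng.random() + 1e-3 for _ in range(n_topics)])
             for _ in range(n_docs)]
    p_w_z = [normalized([rng.random() + 1e-3 for _ in range(n_words)])
             for _ in range(n_topics)]

    for _ in range(n_iter):
        new_z_d = [[0.0] * n_topics for _ in range(n_docs)]
        new_w_z = [[0.0] * n_words for _ in range(n_topics)]
        for d in range(n_docs):
            for w in range(n_words):
                n_dw = counts[d][w]
                if n_dw == 0:
                    continue
                # E-step: P(z | d, w) is proportional to P(z | d) * P(w | z).
                joint = [p_z_d[d][z] * p_w_z[z][w] for z in range(n_topics)]
                total = sum(joint)
                if total == 0:
                    continue
                for z in range(n_topics):
                    resp = n_dw * joint[z] / total
                    new_z_d[d][z] += resp  # expected count of topic z in doc d
                    new_w_z[z][w] += resp  # expected count of word w in topic z
        # M-step: renormalize the expected counts.
        p_z_d = [normalized(row) for row in new_z_d]
        p_w_z = [normalized(row) for row in new_w_z]
    return p_z_d, p_w_z

# Toy corpus: two documents use words 0-1, two use words 2-3.
counts = [[4, 3, 0, 0], [5, 2, 0, 0], [0, 0, 3, 4], [0, 0, 2, 5]]
p_z_d, p_w_z = plsa(counts, n_topics=2, n_iter=30, seed=1)
```

Each row of `p_z_d` is the low-dimensional topic-space representation of one document. Because EM only finds a local optimum, results depend on the random initialization.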

39 citations

Book Chapter
01 Jan 1999
TL;DR: Methods for enhancing the indexing of spoken documents by using latent semantic analysis and self-organizing maps are presented and tested to extract extra information from the structure of the document collection and use it for more accurate indexing by generating new index terms and stochastic index weights.
Abstract: This paper describes an important application for state-of-the-art automatic speech recognition, natural language processing and information retrieval systems. Methods for enhancing the indexing of spoken documents by using latent semantic analysis and self-organizing maps are presented, motivated and tested. The idea is to extract extra information from the structure of the document collection and use it for more accurate indexing by generating new index terms and stochastic index weights. Indexing methods are evaluated for two broadcast news databases (one French and one English) using the average document perplexity defined in this paper and test queries analyzed by human experts.

39 citations

Journal Article
TL;DR: Experimental results show that integrating the proposed feedback paradigm with a state-of-the-art latent matcher improves its identification accuracy by 0.5-3.5 percent for NIST SD27 and WVU latent databases against a background database of 100k exemplars.
Abstract: Latent fingerprints serve as an important source of forensic evidence in a court of law. Automatic matching of latent fingerprints to rolled/plain (exemplar) fingerprints with high accuracy is quite vital for such applications. However, latent impressions are typically of poor quality with complex background noise, which makes feature extraction and matching of latents a significantly challenging problem. We propose incorporating top-down information or feedback from an exemplar to refine the features extracted from a latent for improving latent matching accuracy. The refined latent features (e.g. ridge orientation and frequency), after feedback, are used to re-match the latent to the top $K$ candidate exemplars returned by the baseline matcher and re-sort the candidate list. The contributions of this research include: (i) devising systematic ways to use information in exemplars for latent feature refinement, (ii) developing a feedback paradigm which can be wrapped around any latent matcher for improving its matching performance, and (iii) determining when feedback is actually necessary to improve latent matching accuracy. Experimental results show that integrating the proposed feedback paradigm with a state-of-the-art latent matcher improves its identification accuracy by 0.5-3.5 percent for NIST SD27 and WVU latent databases against a background database of 100k exemplars.
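The re-matching step described in this abstract can be pictured as a generic top-K re-ranking wrapper. The sketch below is only an illustration under stated assumptions: `rerank_top_k` and the `rescore` callback are hypothetical names, and the actual feature refinement and matcher from the paper are not reproduced here.

```python
def rerank_top_k(ranked, rescore, k):
    """Re-score the top-k candidates of a baseline ranking and re-sort them.

    ranked  -- list of (candidate_id, baseline_score), best first
    rescore -- hypothetical callback returning a new score for a candidate id,
               standing in for re-matching with refined features
    k       -- number of leading candidates to re-match
    """
    head, tail = ranked[:k], ranked[k:]
    rescored = [(cid, rescore(cid)) for cid, _ in head]
    # Only the top-k block is reordered; candidates below rank k keep
    # their baseline order and baseline scores.
    rescored.sort(key=lambda pair: pair[1], reverse=True)
    return rescored + tail

# Usage with made-up scores: candidate "b" overtakes "a" after re-matching.
baseline = [("a", 0.9), ("b", 0.8), ("c", 0.7), ("d", 0.1)]
refined = {"a": 0.2, "b": 0.95, "c": 0.5}
result = rerank_top_k(baseline, lambda cid: refined[cid], k=3)
```

Wrapping the baseline matcher this way keeps the expensive re-matching limited to K candidates instead of the whole background database.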

39 citations


Network Information
Related Topics (5)
Feature extraction: 111.8K papers, 2.1M citations, 84% related
Feature (computer vision): 128.2K papers, 1.7M citations, 84% related
Support vector machine: 73.6K papers, 1.7M citations, 84% related
Deep learning: 79.8K papers, 2.1M citations, 83% related
Object detection: 46.1K papers, 1.3M citations, 82% related
Performance Metrics
No. of papers in the topic in previous years
Year    Papers
2023    19
2022    77
2021    14
2020    36
2019    27
2018    58