Topic

Probabilistic latent semantic analysis

About: Probabilistic latent semantic analysis is a research topic. Over its lifetime, 2,884 publications have been published within this topic, receiving 198,341 citations. The topic is also known as: PLSA.


Papers
Book Chapter
TL;DR: This chapter discusses semantic memory from a psycholinguistic perspective, demonstrates how semantic propositions are verified, and applies the basic Feature Comparison model to the verification of simple property statements.
Abstract: This chapter presents a theoretical approach to semantic memory that is applicable to a wide range of semantic phenomena. The chapter discusses semantic memory from a psycholinguistic perspective and then demonstrates how semantic propositions are verified. A semantic feature representation is first assumed, which distinguishes between defining and characteristic features. This representation is then coupled with a two-stage processing model, and the resulting Feature Comparison model is applied to the results of studies requiring the verification of simple subset statements. This model offers an explicit explanation of semantic relatedness and category size effects in this paradigm. The Feature Comparison model is then extended to accommodate findings from recent Same-Different experiments. The extended model proves capable of encompassing a range of semantic relatedness findings, including some newly reported effects which seem problematic for other models. The basic Feature Comparison model is also applied to the verification of simple property statements. While the representation of property information necessitates several new structural and processing considerations, the basic model provides an explanation of various semantic effects on verification.

75 citations

Proceedings Article
31 Jul 1999
TL;DR: This work has used LSA as a mechanism for evaluating the quality of student responses in an intelligent tutoring system, and its performance equals that of human raters with intermediate domain knowledge.
Abstract: Latent Semantic Analysis (LSA) is a statistical, corpus-based text comparison mechanism that was originally developed for the task of information retrieval, but in recent years has produced remarkably human-like abilities in a variety of language tasks. LSA has taken the Test of English as a Foreign Language and performed as well as non-native English speakers who were successful college applicants. It has shown an ability to learn words at a rate similar to humans. It has even graded papers as reliably as human graders. We have used LSA as a mechanism for evaluating the quality of student responses in an intelligent tutoring system, and its performance equals that of human raters with intermediate domain knowledge. It has been claimed that LSA's text-comparison abilities stem primarily from its use of a statistical technique called singular value decomposition (SVD) which compresses a large amount of term and document co-occurrence information into a smaller space. This compression is said to capture the semantic information that is latent in the corpus itself. We test this claim by comparing LSA to a version of LSA without SVD, as well as a simple keyword matching model.

75 citations
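
The abstract above attributes LSA's text-comparison ability to compressing term and document co-occurrence information with singular value decomposition. The following is a minimal, hedged sketch of that idea in Python with scikit-learn: documents are compared by cosine similarity on raw tf-idf vectors (a keyword-style "no SVD" baseline) and again after truncated SVD. The toy corpus and the choice of two latent dimensions are illustrative assumptions, not taken from the paper.

```python
# Minimal LSA sketch: compare documents with and without the SVD step.
# The toy corpus and n_components=2 are illustrative assumptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

corpus = [
    "the student answered the physics question correctly",
    "the tutor rated the response to the physics problem",
    "latent semantic analysis compares texts in a reduced space",
]

# Term-document statistics: plain tf-idf vectors (the "no SVD" baseline).
tfidf = TfidfVectorizer().fit_transform(corpus)
baseline_sim = cosine_similarity(tfidf)

# LSA proper: compress the term-document matrix with truncated SVD,
# then compare documents in the reduced latent space.
lsa = TruncatedSVD(n_components=2, random_state=0).fit_transform(tfidf)
lsa_sim = cosine_similarity(lsa)

print("tf-idf similarity:\n", baseline_sim.round(2))
print("LSA similarity:\n", lsa_sim.round(2))
```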

Proceedings Article
15 Jun 2017
TL;DR: Methods of topic modeling, including the Vector Space Model (VSM), Latent Semantic Indexing (LSI), Probabilistic Latent Semantic Analysis (PLSA), and Latent Dirichlet Allocation (LDA), are discussed with their features and limitations.
Abstract: Topic modeling is a powerful technique for analyzing large collections of documents. It is used to discover hidden structure in a document collection. A topic is viewed as a recurring pattern of co-occurring words; a topic comprises a group of words that often occur together. Topic modeling can link words used in the same context and differentiate between uses of words with different meanings. In this paper, we discuss methods of topic modeling, including the Vector Space Model (VSM), Latent Semantic Indexing (LSI), Probabilistic Latent Semantic Analysis (PLSA), and Latent Dirichlet Allocation (LDA), with their features and limitations. We then discuss tools available for topic modeling, such as Gensim, the Stanford Topic Modeling Toolbox, MALLET, and BigARTM, and cover some applications of topic modeling. Topic models have a wide range of applications, such as tag recommendation, text categorization, keyword extraction, information filtering, and similarity search, in the fields of text mining and information retrieval.

74 citations
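
Since PLSA is the subject of this topic page, a minimal sketch of its estimation may be useful alongside the survey above. PLSA models each document as a mixture of latent topics, P(w|d) = Σ_z P(z|d) P(w|z), and fits the distributions by EM. The sketch below, in Python with NumPy, is illustrative only: the toy count matrix, the number of topics, and the iteration count are all assumptions.

```python
# Minimal PLSA sketch: EM over a small document-word count matrix n(d, w).
# Toy data, topic count, and iteration count are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n_docs, n_words, n_topics = 6, 8, 2

# Toy document-word counts n(d, w).
counts = rng.integers(0, 5, size=(n_docs, n_words)).astype(float)

# Random initialisation of P(z|d) and P(w|z), each row normalised to 1.
p_z_d = rng.random((n_docs, n_topics))
p_z_d /= p_z_d.sum(axis=1, keepdims=True)
p_w_z = rng.random((n_topics, n_words))
p_w_z /= p_w_z.sum(axis=1, keepdims=True)

for _ in range(50):
    # E-step: responsibilities P(z|d,w) ∝ P(z|d) * P(w|z).
    joint = p_z_d[:, :, None] * p_w_z[None, :, :]          # shape (d, z, w)
    joint /= joint.sum(axis=1, keepdims=True) + 1e-12

    # M-step: re-estimate P(w|z) and P(z|d) from expected counts.
    expected = counts[:, None, :] * joint                   # n(d,w) * P(z|d,w)
    p_w_z = expected.sum(axis=0)
    p_w_z /= p_w_z.sum(axis=1, keepdims=True) + 1e-12
    p_z_d = expected.sum(axis=2)
    p_z_d /= p_z_d.sum(axis=1, keepdims=True) + 1e-12

print("P(w|z):\n", p_w_z.round(3))
print("P(z|d):\n", p_z_d.round(3))
```

Roughly speaking, replacing the point-estimated document-topic proportions P(z|d) with a Dirichlet prior is what distinguishes LDA from PLSA.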

Book Chapter
01 Jan 2014
TL;DR: This paper provides a short, concise overview of some selected text mining methods, focusing on statistical methods, i.e. Latent Semantic Analysis, Probabilistic Latent Semantic Analysis, Hierarchical Latent Dirichlet Allocation, Principal Component Analysis, and Support Vector Machines, along with some examples from the biomedical domain.
Abstract: Text is a very important type of data within the biomedical domain. For example, patient records contain large amounts of text that has been entered in a non-standardized format, which poses many challenges to the processing of such data. For the clinical doctor, the written text in the medical findings is still the basis for decision making – not images or multimedia data. However, the steadily increasing volumes of unstructured information call for machine learning approaches to data mining, i.e. text mining. This paper provides a short, concise overview of some selected text mining methods, focusing on statistical methods, i.e. Latent Semantic Analysis, Probabilistic Latent Semantic Analysis, Latent Dirichlet Allocation, Hierarchical Latent Dirichlet Allocation, Principal Component Analysis, and Support Vector Machines, along with some examples from the biomedical domain. Finally, we provide some open problems and future challenges, particularly from the clinical domain, that we expect to stimulate future research.

73 citations
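
The overview above lists Latent Dirichlet Allocation among the statistical methods applied to biomedical text. As a hedged illustration, the sketch below fits a small LDA model to invented "clinical note" snippets with scikit-learn; the snippets, the two-topic setting, and the use of get_feature_names_out() (available in scikit-learn 1.0+) are assumptions, not drawn from the paper.

```python
# Minimal LDA sketch on invented clinical-style snippets (illustrative only).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

notes = [
    "patient reports chest pain and shortness of breath",
    "mri shows lesion in left temporal lobe",
    "chest xray clear no acute cardiopulmonary disease",
    "seizure activity noted on eeg temporal lobe focus",
]

# Bag-of-words counts, since LDA models word counts rather than tf-idf weights.
vectorizer = CountVectorizer(stop_words="english")
counts = vectorizer.fit_transform(notes)

# Fit a two-topic LDA model.
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(counts)

# Show the highest-weighted words per topic.
terms = vectorizer.get_feature_names_out()
for k, weights in enumerate(lda.components_):
    top = weights.argsort()[::-1][:4]
    print(f"topic {k}:", ", ".join(terms[i] for i in top))
```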


Network Information
Related Topics (5)
Feature extraction: 111.8K papers, 2.1M citations, 84% related
Feature (computer vision): 128.2K papers, 1.7M citations, 84% related
Support vector machine: 73.6K papers, 1.7M citations, 84% related
Deep learning: 79.8K papers, 2.1M citations, 83% related
Object detection: 46.1K papers, 1.3M citations, 82% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    19
2022    77
2021    14
2020    36
2019    27
2018    58