Topic

Probabilistic latent semantic analysis

About: Probabilistic latent semantic analysis is a research topic. Over its lifetime, 2884 publications have been published on this topic, receiving 198341 citations. The topic is also known as PLSA.

Papers

Proceedings Article•DOI•

Learning and performing by exploration: label quality measured by latent semantic analysis

Rodolfo Soto¹•Institutions (1)

University of Colorado Boulder¹

01 May 1999

TL;DR: Latent Semantic Analysis (LSA) was used to compute semantic similarity between task descriptions and labels in an application's menu system; when the labels in the menu system were semantically similar to the task descriptions, subjects performed the tasks faster.

Abstract: Models of learning and performing by exploration assume that the semantic similarity between task descriptions and labels on display objects (e.g., menus, tool bars) controls in part the users' search strategies. Nevertheless, none of the models has an objective way to compute semantic similarity. In this study, Latent Semantic Analysis (LSA) was used to compute semantic similarity between task descriptions and labels in an application's menu system. Participants performed twelve tasks by exploration and were tested for recall after a 1-week delay. When the labels in the menu system were semantically similar to the task descriptions, subjects performed the tasks faster. LSA could be incorporated into any of the current models, and it could be used to automate the evaluation of computer applications for ease of learning and performing by exploration.

26 citations

Journal Article•DOI•

A latent variables approach for clustering mixed binary and continuous variables within a Gaussian mixture model

Isabella Morlini¹•Institutions (1)

University of Modena and Reggio Emilia¹

01 Apr 2012-Advanced Data Analysis and Classification

TL;DR: A model-based clustering approach with mixed binary and continuous variables where each binary attribute is generated by a latent continuous variable that is dichotomized with a suitable threshold value, and where the scores of the latent variables are estimated from the binary data is proposed.

Abstract: For clustering objects, we often collect not only continuous variables, but binary attributes as well. This paper proposes a model-based clustering approach with mixed binary and continuous variables where each binary attribute is generated by a latent continuous variable that is dichotomized with a suitable threshold value, and where the scores of the latent variables are estimated from the binary data. In economics, such variables are called utility functions and the assumption is that the binary attributes (the presence or the absence of a public service or utility) are determined by low and high values of these functions. In genetics, the latent response is interpreted as the 'liability' to develop a qualitative trait or phenotype. The estimated scores of the latent variables, together with the observed continuous ones, allow the use of a multivariate Gaussian mixture model for clustering, instead of a mixture of discrete and continuous distributions. After describing the method, this paper presents the results for both simulated and real-case data and compares the performance of the multivariate Gaussian mixture model with that of a mixture of joint multivariate and multinomial distributions. Results show that the former model outperforms the mixture model for variables with different scales, both in terms of classification error rate and reproduction of the cluster means.

26 citations

Proceedings Article•

An Exploration of Features for Recognizing Word Emotion

Changqin Quan¹, Fuji Ren¹•Institutions (1)

University of Tokushima¹

23 Aug 2010

TL;DR: A significant performance improvement over contextual and semantic features was observed after adding word emotion components as a feature, and the experiments demonstrate the effectiveness of using semantic features for word emotion recognition.

Abstract: Emotion words have been widely used as the most obvious choice of feature in textual emotion recognition and automatic emotion lexicon construction. In this work, we explore features for recognizing word emotion. Based on Ren-CECps (an annotated emotion corpus) and a MaxEnt (maximum entropy) model, several contextual features and their combinations were tested. PLSA (probabilistic latent semantic analysis) is then used to obtain semantic features by clustering words and sentences. The experimental results demonstrate the effectiveness of using semantic features for word emotion recognition. After that, "word emotion components" are proposed to describe the combined basic emotions in a word. A significant performance improvement over contextual and semantic features was observed after adding word emotion components as a feature.

26 citations

Proceedings Article•DOI•

Semi-supervised document classification with a mislabeling error model

Anastasia Krithara¹, Massih R. Amini, Jean-Michel Renders¹, Cyril Goutte²•Institutions (2)

Xerox¹, National Research Council²

30 Mar 2008

TL;DR: The proposed approach iteratively labels the unlabeled documents and estimates the probabilities of labeling errors, which are then taken into account in the estimation of the new model parameters before the next round.

Abstract: This paper investigates a new extension of the Probabilistic Latent Semantic Analysis (PLSA) model [6] for text classification where the training set is partially labeled. The proposed approach iteratively labels the unlabeled documents and estimates the probabilities of labeling errors. These probabilities are then taken into account in the estimation of the new model parameters before the next round. Our approach outperforms an earlier semi-supervised extension of PLSA introduced by [9], which is based on the use of fake labels. However, it maintains its simplicity and ability to solve multiclass problems. In addition, it gives valuable information about the most uncertain and difficult classes to label. We perform experiments over the 20Newsgroups, WebKB and Reuters document collections and show the effectiveness of our approach over two other semi-supervised algorithms applied to these text classification problems.
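
For orientation, the basic PLSA model that this abstract extends can be sketched as a short EM loop. The following is a minimal illustration in plain Python, assuming the standard formulation P(w|d) = sum over z of P(z|d) P(w|z). The function name `plsa`, its parameters and the toy counts are hypothetical choices for this sketch, not the authors' code, and the mislabeling error model described in the paper is not implemented here.

```python
import random


def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit a basic, unsupervised PLSA model with EM (illustrative sketch).

    counts[d][w] is the number of times word w occurs in document d.
    Returns (p_z_d, p_w_z): P(z|d) for each document, P(w|z) for each topic.
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])

    def normalize(row):
        total = sum(row)
        if total == 0:  # e.g. an empty document: fall back to uniform
            return [1.0 / len(row)] * len(row)
        return [x / total for x in row]

    # Random positive initialisation; every row is a probability distribution.
    p_z_d = [normalize([rng.random() + 0.1 for _ in range(n_topics)])
             for _ in range(n_docs)]
    p_w_z = [normalize([rng.random() + 0.1 for _ in range(n_words)])
             for _ in range(n_topics)]

    for _ in range(n_iter):
        new_z_d = [[0.0] * n_topics for _ in range(n_docs)]
        new_w_z = [[0.0] * n_words for _ in range(n_topics)]
        for d in range(n_docs):
            for w in range(n_words):
                n = counts[d][w]
                if n == 0:
                    continue
                # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z).
                joint = [p_z_d[d][z] * p_w_z[z][w] for z in range(n_topics)]
                total = sum(joint)
                if total == 0:
                    continue
                for z in range(n_topics):
                    resp = n * joint[z] / total
                    new_z_d[d][z] += resp  # expected count of topic z in doc d
                    new_w_z[z][w] += resp  # expected count of word w in topic z
        # M-step: renormalise the expected counts.
        p_z_d = [normalize(row) for row in new_z_d]
        p_w_z = [normalize(row) for row in new_w_z]
    return p_z_d, p_w_z


if __name__ == "__main__":
    # Toy document-word counts (made up for illustration).
    toy = [[4, 3, 0, 0], [5, 2, 0, 0], [0, 0, 3, 4], [0, 1, 2, 5]]
    doc_topics, topic_words = plsa(toy, n_topics=2)
    for row in doc_topics:
        print([round(p, 3) for p in row])
```

Each row of the two returned tables is a probability distribution. A semi-supervised variant such as the one described above would, presumably, tie topics to class labels and add further parameters, which this sketch does not attempt.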

26 citations

Journal Article•DOI•

A Bayesian Model For The Estimation Of Latent Interaction And Quadratic Effects When Latent Variables Are Non-Normally Distributed

Augustin Kelava¹, Benjamin Nagengast²•Institutions (2)

Technische Universität Darmstadt¹, University of Tübingen²

22 Oct 2012-Multivariate Behavioral Research

TL;DR: A Bayesian model is presented for the estimation of latent nonlinear effects when the latent predictor variables are nonnormally distributed and the nonnormal predictor distribution is approximated by a finite mixture distribution.

Abstract: Structural equation models with interaction and quadratic effects have become a standard tool for testing nonlinear hypotheses in the social sciences. Most of the current approaches assume normally distributed latent predictor variables. In this article, we present a Bayesian model for the estimation of latent nonlinear effects when the latent predictor variables are nonnormally distributed. The nonnormal predictor distribution is approximated by a finite mixture distribution. We conduct a simulation study that demonstrates the advantages of the proposed Bayesian model over contemporary approaches (Latent Moderated Structural Equations [LMS], Quasi-Maximum-Likelihood [QML], and the extended unconstrained approach) when the latent predictor variables follow a nonnormal distribution. The conventional approaches show biased estimates of the nonlinear effects; the proposed Bayesian model provides unbiased estimates. We present an empirical example from work and stress research and provide syntax for substanti...

26 citations

Performance

Metrics

Papers: 2,984
Citations: 212,744

No. of papers in the topic in previous years
Year	Papers
2023	19
2022	77
2021	14
2020	36
2019	27
2018	58
