scispace - formally typeset

Probabilistic latent semantic analysis

About: Probabilistic latent semantic analysis is a research topic. Over its lifetime, 2,884 publications have been published within this topic, receiving 198,341 citations. The topic is also known as: PLSA.
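For orientation, the core PLSA model factors the joint probability of a document d and word w through latent topics z, p(d, w) = p(d) Σ_z p(z|d) p(w|z), and is fit by EM. Below is a minimal illustrative sketch of that EM loop (toy code, not taken from any paper listed here; all names are our own):

```python
import numpy as np

def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit PLSA by EM on a document-term count matrix.

    counts: (n_docs, n_words) array of term frequencies.
    Returns p_z_d (topic given document) and p_w_z (word given topic).
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    # Random initialisation of the two conditional distributions.
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # E-step: responsibilities p(z | d, w), shape (docs, topics, words).
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]
        joint /= joint.sum(axis=1, keepdims=True) + 1e-12
        # M-step: reweight responsibilities by the observed counts n(d, w).
        weighted = counts[:, None, :] * joint
        p_w_z = weighted.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + 1e-12
        p_z_d = weighted.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + 1e-12
    return p_z_d, p_w_z
```

On a small count matrix with two clearly separated word blocks, the fitted conditionals each sum to one and the reconstruction p(w|d) = p(z|d) p(w|z) concentrates on each document's own block.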


Papers
Proceedings Article
07 Aug 2011
TL;DR: A methodology for integrating non-parametric tree methods into probabilistic latent variable models by extending functional gradient boosting is presented in the context of occupancy-detection modeling, where the goal is to model the distribution of a species from imperfect detections.
Abstract: Important ecological phenomena are often observed indirectly. Consequently, probabilistic latent variable models provide an important tool, because they can include explicit models of the ecological phenomenon of interest and the process by which it is observed. However, existing latent variable methods rely on hand-formulated parametric models, which are expensive to design and require extensive preprocessing of the data. Nonparametric methods (such as regression trees) automate these decisions and produce highly accurate models. However, existing tree methods learn direct mappings from inputs to outputs—they cannot be applied to latent variable models. This paper describes a methodology for integrating non-parametric tree methods into probabilistic latent variable models by extending functional gradient boosting. The approach is presented in the context of occupancy-detection (OD) modeling, where the goal is to model the distribution of a species from imperfect detections. Experiments on 12 real and 3 synthetic bird species compare standard and tree-boosted OD models (latent variable models) with standard and tree-boosted logistic regression models (without latent structure). All methods perform similarly when predicting the observed variables, but the OD models learn better representations of the latent process. Most importantly, tree-boosted OD models learn the best latent representations when non-linearities and interactions are present.

51 citations
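The latent structure this paper boosts is the standard occupancy-detection likelihood: a site is occupied with some probability, and an occupied site is detected on each visit with some probability, so an all-zero detection history is ambiguous between "occupied but missed" and "truly unoccupied". A generic sketch of that per-site likelihood (constant probabilities, no false positives; this is the plain OD building block, not the paper's tree-boosted model):

```python
import math

def site_log_likelihood(psi, p, detections):
    """Log-likelihood of one site's 0/1 detection history under a
    basic occupancy-detection model: occupied with probability `psi`,
    detected per visit (if occupied) with probability `p`.
    False positives are assumed impossible.
    """
    # Probability of the observed history given the site is occupied.
    occ = 1.0
    for y in detections:
        occ *= p if y else (1.0 - p)
    if any(detections):
        # At least one detection: the site must be occupied.
        return math.log(psi * occ)
    # All-zero history: occupied-but-missed, or truly unoccupied.
    return math.log(psi * occ + (1.0 - psi))
```

For example, with psi = p = 0.5 a two-visit history [0, 0] has likelihood 0.5 * 0.25 + 0.5 = 0.625, while [1, 0] has likelihood 0.5 * 0.25 = 0.125.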

Journal ArticleDOI
TL;DR: A flexible approach to modeling developmental relations among two or more discrete, multidimensional latent variables, called associative latent transition analysis (ALTA), is presented, based on the general framework of loglinear modeling with latent variables.
Abstract: To understand one developmental process, it is often helpful to investigate its relations with other developmental processes. Statistical methods that model development in multiple processes simultaneously over time include latent growth curve models with time-varying covariates, multivariate latent growth curve models, and dual trajectory models. These models are designed for growth represented by continuous, unidimensional trajectories. The purpose of this article is to present a flexible approach to modeling relations in development among two or more discrete, multidimensional latent variables based on the general framework of loglinear modeling with latent variables called associative latent transition analysis (ALTA). Focus is given to the substantive interpretation of different associative latent transition models, and exactly what hypotheses are expressed in each model. An empirical demonstration of ALTA is presented to examine the association between the development of alcohol use and sexual risk ...

51 citations

13 Dec 2007
TL;DR: This paper presents Generalized Latent Semantic Analysis as a framework for computing semantically motivated term and document vectors and demonstrates that GLSA term vectors efficiently capture semantic relations between terms and outperform related approaches on the synonymy test.
Abstract: Document indexing and the representation of term-document relations are central issues for document clustering and retrieval. In this paper, we present Generalized Latent Semantic Analysis as a framework for computing semantically motivated term and document vectors. Our focus on term vectors is motivated by the recent success of co-occurrence-based measures of semantic similarity obtained from very large corpora. Our experiments demonstrate that GLSA term vectors efficiently capture semantic relations between terms and outperform related approaches on the synonymy test.

50 citations
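The general GLSA idea, deriving low-dimensional term vectors from a pairwise term-similarity matrix rather than from the term-document matrix directly, can be sketched with a PMI matrix and its leading eigenvectors. This is an illustrative simplification under our own naming, not the paper's exact pipeline:

```python
import numpy as np

def glsa_term_vectors(cooc, k=2):
    """Toy GLSA-style term vectors: build a pointwise-mutual-information
    (PMI) matrix from raw term co-occurrence counts, then keep the k
    leading eigenvectors as low-dimensional term coordinates.
    """
    total = cooc.sum()
    p_ij = cooc / total
    p_i = p_ij.sum(axis=1)
    with np.errstate(divide="ignore"):
        pmi = np.log(p_ij / np.outer(p_i, p_i))
    pmi[~np.isfinite(pmi)] = 0.0          # zero out log(0) cells
    # Symmetric eigendecomposition; keep the k largest eigenvalues.
    vals, vecs = np.linalg.eigh(pmi)
    order = np.argsort(vals)[::-1][:k]
    return vecs[:, order] * np.sqrt(np.abs(vals[order]))

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))
```

On a co-occurrence matrix with two blocks of mutually co-occurring terms, cosine similarity between the resulting vectors is high within a block and low across blocks, which is the behavior the synonymy test probes.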

Journal ArticleDOI
TL;DR: The proposed two-phase algorithm evaluates the semantic similarity for two or more sentences via a semantic vector space and has outstanding performance in handling long sentences with complex syntax.
Abstract: Research highlights:
- This research takes advantage of corpus-based ontology and information-retrieval technologies to evaluate the semantic similarity between irregular sentences.
- The part-of-speech concept was taken into account and integrated into the proposed semantic-VSM measure.
- This research tries to quantify the semantic similarity of natural-language sentences.

A novel sentence similarity measure for semantics-based expert systems is presented. A well-known problem in the field of semantic processing, for example in QA systems, is evaluating the semantic similarity between irregular sentences. This paper takes advantage of a corpus-based ontology to overcome this problem, introducing a transformed vector space model. The proposed two-phase algorithm evaluates the semantic similarity of two or more sentences via a semantic vector space: the first phase builds part-of-speech (POS) based subspaces from the raw data, and the second carries out a cosine evaluation and adopts the WordNet ontology to construct the semantic vectors. Unlike related work that focused only on short sentences, the algorithm is applicable to short (4-5 words), medium (8-12 words), and even long sentences (over 12 words). The experiment demonstrates that the proposed algorithm performs particularly well on long sentences with complex syntax. The significance of this research lies in extracting the semantic similarity of sentences with arbitrary structures.

50 citations
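The basic vector-space step that the two-phase algorithm above builds on is cosine similarity between sentence vectors over a shared vocabulary. A minimal sketch of just that step (the paper additionally splits vectors by part of speech and enriches them with WordNet; this sketch omits both):

```python
import math
from collections import Counter

def cosine_similarity(sent_a, sent_b):
    """Cosine similarity of two sentences as bag-of-words vectors
    over their shared vocabulary."""
    va, vb = Counter(sent_a.lower().split()), Counter(sent_b.lower().split())
    vocab = set(va) | set(vb)
    dot = sum(va[w] * vb[w] for w in vocab)   # Counter gives 0 for missing words
    na = math.sqrt(sum(c * c for c in va.values()))
    nb = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (na * nb) if na and nb else 0.0
```

For instance, "the cat sat" and "the cat ran" share two of three terms, giving a cosine of 2/3; identical sentences score 1.0.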

Proceedings ArticleDOI
28 Jun 1992
TL;DR: This article proposes a Generalized Probabilistic Semantic Model (GPSM) for preference assignment in natural language processing, integrating lexical, syntactic, and semantic preference under a uniform formulation and showing substantial improvement in structural disambiguation over a syntax-based approach.
Abstract: In natural language processing, ambiguity resolution is a central issue and can be regarded as a preference assignment problem. In this paper, a Generalized Probabilistic Semantic Model (GPSM) is proposed for preference computation. An effective semantic tagging procedure is proposed for tagging semantic features, and a semantic score is derived from a generalized score function that integrates lexical, syntactic, and semantic preference under a uniform formulation. The semantic score measure shows substantial improvement in structural disambiguation over a syntax-based approach.

50 citations


Network Information
Related Topics (5)
- Feature extraction: 111.8K papers, 2.1M citations, 84% related
- Feature (computer vision): 128.2K papers, 1.7M citations, 84% related
- Support vector machine: 73.6K papers, 1.7M citations, 84% related
- Deep learning: 79.8K papers, 2.1M citations, 83% related
- Object detection: 46.1K papers, 1.3M citations, 82% related
Performance Metrics
No. of papers in the topic in previous years:
Year | Papers
2023 | 19
2022 | 77
2021 | 14
2020 | 36
2019 | 27
2018 | 58