Probabilistic latent semantic analysis

About: Probabilistic latent semantic analysis is a research topic. Over its lifetime, 2,884 publications have been published within this topic, receiving 198,341 citations. The topic is also known as: PLSA.
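
For orientation, pLSA models each word-document co-occurrence as a mixture over latent topics, P(d, w) = P(d) Σ_z P(z|d) P(w|z), and is typically fitted with expectation-maximization (EM). Below is a minimal, self-contained NumPy sketch of that EM loop on a toy term-document count matrix; the function and variable names are illustrative assumptions, not code from any paper indexed here.

```python
# Minimal pLSA sketch (illustrative only): fit P(z|d) and P(w|z) by EM.
import numpy as np

def plsa(counts, n_topics=2, n_iter=50, seed=0):
    """counts: (n_docs, n_words) term-frequency matrix."""
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    # Random, normalised initialisation of P(z|d) and P(w|z).
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # E-step: P(z|d,w) ∝ P(z|d) * P(w|z), normalised over topics.
        post = p_z_d[:, :, None] * p_w_z[None, :, :]   # (docs, topics, words)
        post /= post.sum(axis=1, keepdims=True) + 1e-12
        # M-step: re-estimate both distributions from expected counts.
        expected = counts[:, None, :] * post           # n(d,w) * P(z|d,w)
        p_w_z = expected.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + 1e-12
        p_z_d = expected.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + 1e-12
    return p_z_d, p_w_z

# Toy usage: 4 documents over a 6-word vocabulary.
counts = np.array([[3, 2, 1, 0, 0, 0],
                   [2, 3, 0, 1, 0, 0],
                   [0, 0, 1, 0, 3, 2],
                   [0, 1, 0, 0, 2, 3]], dtype=float)
p_z_d, p_w_z = plsa(counts)
```

Each EM iteration alternates between computing the topic posterior P(z|d,w) and re-estimating P(z|d) and P(w|z) from the expected counts.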


Papers
Journal ArticleDOI
TL;DR: This study proposes three new language modeling techniques that use semantic analysis for spoken dialog systems, and shows that as more semantic information is utilized, and as the integration between lexical and semantic items becomes tighter, the two types of models become more complementary in nature.

44 citations

Journal ArticleDOI
TL;DR: The combination of rotation-based ensemble construction and Latent Semantic Indexing projection is shown to bring about significant improvements in terms of Average Precision, Coverage, Ranking loss and One error compared to five state-of-the-art approaches across 14 real-world textual data sets covering a wide variety of topics, including health, education, business, science and arts.
Abstract: Text categorization has gained increasing popularity in recent years due to the explosive growth of multimedia documents. As a document can be associated with multiple non-exclusive categories simultaneously (e.g., Virus, Health, Sports, and Olympic Games), text categorization provides many opportunities for developing novel multi-label learning approaches devoted specifically to textual data. In this paper, we propose an ensemble multi-label classification method for text categorization based on four key ideas: (1) performing Latent Semantic Indexing based on distinct orthogonal projections on lower-dimensional spaces of concepts; (2) random splitting of the vocabulary; (3) document bootstrapping; and (4) the use of BoosTexter as a powerful multi-label base learner for text categorization, to simultaneously encourage diversity and individual accuracy in the committee. Diversity of the ensemble is promoted through random splits of the vocabulary that lead to different orthogonal projections on lower-dimensional latent concept spaces. Accuracy of the committee members is promoted through the underlying latent semantic structure uncovered in the text. The combination of rotation-based ensemble construction and Latent Semantic Indexing projection is shown to bring about significant improvements in terms of Average Precision, Coverage, Ranking loss and One error compared to five state-of-the-art approaches across 14 real-world textual data sets covering a wide variety of topics including health, education, business, science and arts.

44 citations
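
As a rough illustration of ideas (1)-(3) from the abstract above, here is a hedged scikit-learn sketch: random vocabulary splits, a separate LSI (truncated SVD) projection per split, and document bootstrapping. A one-vs-rest logistic regression stands in for BoosTexter, which has no standard Python implementation; all function names and parameters here are assumptions.

```python
# Illustrative sketch of a rotation-style LSI ensemble; not the authors' code.
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.utils import resample

def fit_lsi_ensemble(X, Y, n_members=5, n_splits=3, n_concepts=20, seed=0):
    """X: (n_docs, n_terms) tf-idf matrix; Y: (n_docs, n_labels) 0/1 indicators."""
    rng = np.random.default_rng(seed)
    members = []
    for m in range(n_members):
        # (2) random split of the vocabulary into disjoint term groups
        groups = np.array_split(rng.permutation(X.shape[1]), n_splits)
        # (3) bootstrap the documents to diversify each committee member
        idx = resample(np.arange(X.shape[0]), random_state=seed + m)
        # (1) a separate LSI (truncated SVD) projection per term group
        svds = [(g, TruncatedSVD(n_components=min(n_concepts, len(g) - 1)))
                for g in groups]
        Z = np.hstack([svd.fit_transform(X[idx][:, g]) for g, svd in svds])
        # (4) multi-label base learner (stand-in for BoosTexter)
        clf = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(Z, Y[idx])
        members.append((svds, clf))
    return members

def predict_lsi_ensemble(members, X):
    """Average the members' per-label probabilities."""
    probs = [clf.predict_proba(np.hstack([svd.transform(X[:, g])
                                          for g, svd in svds]))
             for svds, clf in members]
    return np.mean(probs, axis=0)
```

The split-then-project step is what injects diversity: each member sees the same documents through different orthogonal concept spaces.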

Proceedings Article
01 Jun 2007
TL;DR: This paper explores the predictive power of Latent Semantic Analysis (LSA), a method that has been shown to provide reliable information on long-distance semantic dependencies between words in a context, and presents several methods that integrate LSA-based information with a standard language model: a semantic cache, partial reranking, and different forms of interpolation.
Abstract: Most current word prediction systems make use of n-gram language models (LM) to estimate the probability of the following word in a phrase. In recent years there have been many attempts to enrich such language models with further syntactic or semantic information. We want to explore the predictive power of Latent Semantic Analysis (LSA), a method that has been shown to provide reliable information on long-distance semantic dependencies between words in a context. We present and evaluate here several methods that integrate LSA-based information with a standard language model: a semantic cache, partial reranking, and different forms of interpolation. We found that all methods show significant improvements compared to the 4-gram baseline, and most of them compared to a simple cache model as well.

44 citations
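
A minimal sketch of one integration scheme the abstract names, linear interpolation, follows: the LSA component turns cosine similarity between each candidate word vector and the centroid of the history vectors into a probability, then mixes it with the n-gram probability. The `ngram_prob` callable and the word-vector dictionary are assumed inputs; all names here are hypothetical.

```python
# Hedged sketch of LSA / n-gram interpolation; not the paper's implementation.
import numpy as np

def lsa_prob(word, history, word_vecs, vocab):
    """Turn cosine similarity to the history centroid into a distribution."""
    hist = [word_vecs[w] for w in history if w in word_vecs]
    if not hist:
        return 1.0 / len(vocab)          # no usable context: uniform back-off
    h = np.mean(hist, axis=0)
    def sim(v):
        return max(0.0, float(np.dot(v, h)) /
                   (np.linalg.norm(v) * np.linalg.norm(h) + 1e-12))
    sims = {w: sim(word_vecs[w]) for w in vocab}
    total = sum(sims.values()) + 1e-12
    return sims[word] / total

def interpolated_prob(word, history, ngram_prob, word_vecs, vocab, lam=0.7):
    """P(w|h) = lam * P_ngram(w|h) + (1 - lam) * P_LSA(w|h)."""
    return (lam * ngram_prob(word, history)
            + (1.0 - lam) * lsa_prob(word, history, word_vecs, vocab))
```

The mixing weight lam would in practice be tuned on held-out data.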

Proceedings ArticleDOI
03 Aug 2016
TL;DR: This work proposes an emotion classification method based on multi-scale blocks using Multiple Instance Learning (MIL), which reduces the need for exact labelling and is employed to classify the dominant emotion type of the image.
Abstract: Emotional factors usually affect users' preferences for and evaluations of images. Although affective image analysis attracts increasing attention, three major challenges remain: 1) it is difficult to classify an image into a single emotion type, since different regions within an image can represent different emotions; 2) there is a gap between low-level features and high-level emotions; and 3) it is difficult to collect a training set of reliable emotional image content. To address these three issues, we propose an emotion classification method based on multi-scale blocks using Multiple Instance Learning (MIL). We first extract blocks of an image at multiple scales using two different image segmentation methods, pyramid segmentation and simple linear iterative clustering (SLIC), and represent each block using the bag-of-visual-words (BoVW) method. Then, to bridge the “affective gap”, probabilistic latent semantic analysis (pLSA) is employed to estimate the latent topic distribution as a mid-level representation of each block. Finally, MIL, which reduces the need for exact labelling, is employed to classify the dominant emotion type of the image. Experiments carried out on three widely used datasets demonstrate that our proposed method with SLIC improves the state-of-the-art results of image emotion classification by 5.1% on average.

44 citations
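
To make the pLSA "mid-level representation" step concrete, here is a hedged folding-in sketch: given topic-word probabilities P(w|z) from an already-fitted pLSA model (as in the earlier pLSA sketch), a new block's bag-of-visual-words histogram is mapped to a topic distribution P(z|block) with a partial EM that keeps P(w|z) fixed. Names and parameters are illustrative, not the authors' code.

```python
# Fold a new block's BoVW histogram into a fitted pLSA model (illustrative).
import numpy as np

def fold_in(bovw_hist, p_w_z, n_iter=30, seed=0):
    """bovw_hist: (n_visual_words,) counts; p_w_z: (n_topics, n_visual_words)."""
    rng = np.random.default_rng(seed)
    p_z = rng.random(p_w_z.shape[0])
    p_z /= p_z.sum()
    for _ in range(n_iter):
        # E-step: P(z|block, w) ∝ P(z|block) * P(w|z), normalised over topics.
        post = p_z[:, None] * p_w_z
        post /= post.sum(axis=0, keepdims=True) + 1e-12
        # Partial M-step: update P(z|block) only; P(w|z) stays fixed.
        p_z = (post * bovw_hist[None, :]).sum(axis=1)
        p_z /= p_z.sum() + 1e-12
    return p_z   # mid-level feature handed to the MIL classifier
```

The returned topic distribution is what the paper uses to bridge the gap between low-level BoVW features and high-level emotion labels.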

Proceedings Article
03 Dec 2012
TL;DR: This paper introduces factorial LDA, a multi-dimensional model in which a document is influenced by K different factors and each word token depends on a K-dimensional vector of latent variables; the model incorporates structured word priors and learns a sparse product of factors.
Abstract: Latent variable models can be enriched with a multi-dimensional structure to consider the many latent factors in a text corpus, such as topic, author perspective and sentiment. We introduce factorial LDA, a multi-dimensional model in which a document is influenced by K different factors, and each word token depends on a K-dimensional vector of latent variables. Our model incorporates structured word priors and learns a sparse product of factors. Experiments on research abstracts show that our model can learn latent factors such as research topic, scientific discipline, and focus (methods vs. applications). Our modeling improvements reduce test perplexity and improve human interpretability of the discovered factors.

43 citations
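
The following is a deliberately stripped-down generative sketch of the factorial structure described above: each token draws one latent value per factor, and the word comes from a distribution indexed by the resulting K-tuple. The structured word priors and the sparse product of factors in the actual model are omitted; all names are illustrative.

```python
# Toy generative sketch of a K-factor token model; not the paper's model.
import numpy as np

def generate_doc(theta, phi, n_tokens, seed=0):
    """theta: list of K per-document factor distributions (one per factor);
    phi: dict mapping each K-tuple of factor values to a word distribution."""
    rng = np.random.default_rng(seed)
    doc = []
    for _ in range(n_tokens):
        # One latent draw per factor gives the token's K-dimensional vector.
        z = tuple(rng.choice(len(t), p=t) for t in theta)
        # The word depends jointly on all K latent values.
        word = int(rng.choice(len(phi[z]), p=phi[z]))
        doc.append((z, word))
    return doc

# Toy usage with K = 2 factors (e.g., topic x perspective), 4-word vocabulary.
theta = [np.array([0.7, 0.3]), np.array([0.5, 0.5])]
phi = {(i, j): np.full(4, 0.25) for i in range(2) for j in range(2)}
print(generate_doc(theta, phi, n_tokens=5))
```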


Network Information
Related Topics (5)

Feature extraction: 111.8K papers, 2.1M citations, 84% related
Feature (computer vision): 128.2K papers, 1.7M citations, 84% related
Support vector machine: 73.6K papers, 1.7M citations, 84% related
Deep learning: 79.8K papers, 2.1M citations, 83% related
Object detection: 46.1K papers, 1.3M citations, 82% related
Performance Metrics
No. of papers in the topic in previous years:

Year  Papers
2023  19
2022  77
2021  14
2020  36
2019  27
2018  58