Topic

Probabilistic latent semantic analysis

About: Probabilistic latent semantic analysis is a research topic. Over the lifetime, 2884 publications have been published within this topic receiving 198341 citations. The topic is also known as: PLSA.


Papers
Proceedings ArticleDOI
09 Sep 2010
TL;DR: This paper proposes a two-stage feature selection algorithm that first applies a conventional feature selection method to reduce the term dimensionality and then constructs a reduced semantic space over the remaining terms using latent semantic indexing.
Abstract: Feature selection for text classification is a well-studied problem whose goals are improving classification effectiveness, computational efficiency, or both. In this paper, we propose a two-stage feature selection algorithm that combines a conventional feature selection method with latent semantic indexing. Traditional word-matching based text categorization systems use the vector space model to represent documents. However, this model needs a high-dimensional space to represent documents and does not take into account the semantic relationships between terms, which can lead to poor classification accuracy. Latent semantic indexing can overcome these problems by using statistically derived conceptual indices instead of individual words. It constructs a conceptual vector space in which each term or document is represented as a vector. It not only greatly reduces the dimensionality but also discovers important associative relationships between terms. Because constructing a new semantic space is computationally expensive, our algorithm first applies a feature selection method to reduce the number of term dimensions, and then constructs a reduced semantic space over the remaining terms using latent semantic indexing. In applications involving spam database categorization, we find that our two-stage feature selection method performs better.

15 citations
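The two-stage pipeline described in the abstract above (feature selection, then latent semantic indexing) can be sketched in a few lines. This is a minimal illustration on hypothetical data: a simple class-separation score stands in for stage one (the paper's exact selection criterion is not specified here), and a truncated SVD implements the LSI step.

```python
import numpy as np

# Hypothetical term-document count matrix: 6 documents x 8 terms,
# two classes (e.g. spam / ham). All values are illustrative.
X = np.array([
    [3, 0, 1, 0, 2, 0, 0, 1],
    [2, 1, 0, 0, 3, 0, 1, 0],
    [4, 0, 2, 1, 1, 0, 0, 0],
    [0, 3, 0, 2, 0, 4, 1, 0],
    [1, 2, 0, 3, 0, 2, 0, 1],
    [0, 4, 1, 2, 0, 3, 0, 0],
], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])  # class labels

# Stage 1: score each term by a simple class-separation statistic
# (absolute difference of within-class mean frequencies), keep top k.
k = 4
mean_diff = np.abs(X[y == 0].mean(axis=0) - X[y == 1].mean(axis=0))
keep = np.argsort(mean_diff)[-k:]
X_red = X[:, keep]

# Stage 2: latent semantic indexing on the reduced matrix via truncated
# SVD, projecting documents into an r-dimensional concept space.
r = 2
U, s, Vt = np.linalg.svd(X_red, full_matrices=False)
docs_lsi = U[:, :r] * s[:r]  # document coordinates in concept space

print(X_red.shape, docs_lsi.shape)
```

Because the SVD now runs on a 6x4 matrix instead of 6x8, the expensive LSI step operates in the reduced space, which is the point of the two-stage design.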

Proceedings ArticleDOI
13 Jun 2010
TL;DR: A new low-level analysis based on latent silhouette cues, particularly suited to low-texture and outdoor datasets, is proposed; it uses an EM framework to simultaneously update a set of volumetric voxel occupancy probabilities and retrieve a best estimate of the dense 3D motion field from the last consecutively observed multi-view frame set.
Abstract: In this paper we investigate shape and motion retrieval in the context of multi-camera systems. We propose a new low-level analysis based on latent silhouette cues, particularly suited for low-texture and outdoor datasets. Our analysis does not rely on explicit surface representations, instead using an EM framework to simultaneously update a set of volumetric voxel occupancy probabilities and retrieve a best estimate of the dense 3D motion field from the last consecutively observed multi-view frame set. As the framework uses only latent, probabilistic silhouette information, the method yields a promising 3D scene analysis method robust to many sources of noise and arbitrary scene objects. It can be used as input for higher level shape modeling and structural inference tasks. We validate the approach and demonstrate its practical use for shape and motion analysis experimentally.

15 citations
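The abstract above fuses probabilistic silhouette cues into voxel occupancy probabilities. The sketch below shows only the simplest version of that idea: a per-voxel Bayesian update from noisy binary "projects inside the silhouette" observations across cameras. The paper's full EM framework, which also estimates a dense 3D motion field, is much richer; all rates and priors here are illustrative assumptions.

```python
# Toy per-voxel occupancy fusion from noisy silhouette observations.
# The detector rates below are made-up numbers, not from the paper.
P_IN_GIVEN_OCC = 0.9   # chance an occupied voxel projects inside the silhouette
P_IN_GIVEN_FREE = 0.2  # chance a free voxel does (false positive)

def update_occupancy(prior, observations):
    """Fuse independent per-camera silhouette observations with Bayes' rule."""
    p_occ, p_free = prior, 1.0 - prior
    for inside in observations:
        if inside:
            p_occ *= P_IN_GIVEN_OCC
            p_free *= P_IN_GIVEN_FREE
        else:
            p_occ *= 1.0 - P_IN_GIVEN_OCC
            p_free *= 1.0 - P_IN_GIVEN_FREE
    return p_occ / (p_occ + p_free)

# A voxel seen "inside" by 3 of 4 cameras becomes very likely occupied.
p = update_occupancy(0.5, [True, True, True, False])
print(round(p, 3))  # -> 0.919
```

Because the update never thresholds the silhouettes, a voxel missed by one camera is downweighted rather than carved away outright, which is what makes latent (probabilistic) silhouette cues robust to noise.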

Proceedings ArticleDOI
21 Oct 1985
TL;DR: The proof of the lower bound differs fundamentally from all known lower bounds for LSA's or PLSA's, because it does not reduce the problem to a combinatorial one but argues extensively about, e.g., a non-discrete measure for the similarity of sets in R^n.
Abstract: The "component counting lower bound" known for deterministic linear search algorithms (LSA's) also holds for their probabilistic versions (PLSA's) for many problems, even if two-sided error is allowed, and if one does not charge for probabilistic choice. This implies lower bounds on PLSA's, e.g. Ω(n log n) for the element distinctness problem and Ω(n²) for the knapsack problem. These results yield the first separations between probabilistic and non-deterministic LSA's, because the above problems are non-deterministically much easier. Previous lower bounds for PLSA's either only worked for one-sided error "on the nice side", i.e. on the side where the problems are even non-deterministically hard, or only for probabilistic comparison trees. The proof of the lower bound differs fundamentally from all known lower bounds for LSA's or PLSA's, because it does not reduce the problem to a combinatorial one but argues extensively about, e.g., a non-discrete measure for the similarity of sets in R^n. This lower bound result solves an open problem posed by Manber and Tompa as well as by Snir. Furthermore, a PLSA for n input variables with two-sided error and expected runtime T can be simulated by a (deterministic) LSA in T²n steps. This proves that the gaps between probabilistic and deterministic LSA's shown by Snir cannot be too large. As this simulation even holds for algebraic computation trees, we show that probabilistic and deterministic versions of this model are polynomially related. This is a weaker version of a result due to the author which shows that, in the case of LSA's, even the non-deterministic and deterministic versions are polynomially related.

15 citations

Journal ArticleDOI
Haiyu Song, Pengjie Wang, Jian Yun, Wei Li, Bo Xue, Gang Wu
TL;DR: A novel annotation method based on a topic model, namely local learning-based probabilistic latent semantic analysis (LL-PLSA), is proposed; it significantly outperforms the state of the art, especially in terms of overall metrics.
Abstract: Automatic image annotation plays a significant role in image understanding, retrieval, classification, and indexing. Today, it is becoming increasingly important for annotating large-scale social media images from content-sharing websites and social networks. These social images are usually annotated with user-provided low-quality tags. Topic models are considered a promising way to describe such weakly labeled images by learning latent representations of the training samples. Recent annotation methods based on topic models have two shortcomings: they are difficult to scale to large image datasets, and they cannot be applied to online image repositories, where new images and new tags are continuously added. In this paper, we propose a novel annotation method based on a topic model, namely local learning-based probabilistic latent semantic analysis (LL-PLSA), to address these problems. The key idea is to train a weighted topic model for a given test image on its semantic neighborhood, which consists of a fixed number of semantically and visually similar images. This method scales to large image databases, as the training samples involved in modeling are a few nearest neighbors rather than the entire database. Moreover, the proposed topic model, customized online for each test image, naturally handles the continuous addition of new images and new tags to a database. Extensive experiments on three benchmark datasets demonstrate that the proposed method significantly outperforms the state of the art, especially in terms of overall metrics.

15 citations
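LL-PLSA builds on standard pLSA, which fits P(z|d) and P(w|z) by expectation-maximization on a term-document count matrix. A minimal pure-NumPy pLSA EM loop on a toy matrix might look like the following (in LL-PLSA the "documents" would be a test image's nearest neighbors rather than a whole corpus):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy term-document count matrix (docs x words); values are illustrative.
N = np.array([
    [5, 3, 0, 0],
    [4, 4, 1, 0],
    [0, 1, 4, 5],
    [0, 0, 3, 4],
], dtype=float)
D, W, Z = N.shape[0], N.shape[1], 2  # docs, words, topics

# Random initialisation of P(z|d) and P(w|z), each row normalised.
p_z_d = rng.random((D, Z)); p_z_d /= p_z_d.sum(axis=1, keepdims=True)
p_w_z = rng.random((Z, W)); p_w_z /= p_w_z.sum(axis=1, keepdims=True)

for _ in range(50):
    # E-step: responsibilities P(z|d,w) proportional to P(z|d) * P(w|z).
    joint = p_z_d[:, :, None] * p_w_z[None, :, :]   # shape (D, Z, W)
    resp = joint / joint.sum(axis=1, keepdims=True)
    # M-step: re-estimate both distributions from expected counts.
    nz = N[:, None, :] * resp                       # expected counts (D, Z, W)
    p_w_z = nz.sum(axis=0); p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    p_z_d = nz.sum(axis=2); p_z_d /= p_z_d.sum(axis=1, keepdims=True)

# Each row of p_z_d is the fitted topic mixture for one document.
print(np.round(p_z_d, 2))
```

The per-image weighting that LL-PLSA adds would enter the M-step as document weights on the expected counts; the EM skeleton itself stays the same.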

Journal ArticleDOI
TL;DR: This article proposes a novel method for image annotation that combines feature-word distributions, which map from visual space to word space, and word-topic distributions, which form a structure to capture label relationships for annotation.
Abstract: Image annotation is the process of finding appropriate semantic labels for images in order to provide a more convenient way to index and search images on the Web. This article proposes a novel method for image annotation based on combining feature-word distributions, which map from visual space to word space, and word-topic distributions, which form a structure to capture label relationships for annotation. We refer to this type of model as a Feature-Word-Topic model. The introduction of topics allows us to efficiently take word associations, such as {ocean, fish, coral} or {desert, sand, cactus}, into account for image annotation. Unlike previous topic-based methods, we do not model topics as joint distributions of words and visual features, but as distributions of words only. The feature-word distributions are used to define weights in the computation of topic distributions for annotation. By doing so, topic models from text mining can be applied directly in our method. Our Feature-Word-Topic model, which uses Gaussian mixtures for the feature-word distributions and probabilistic latent semantic analysis (pLSA) for the word-topic distributions, obtains promising results in image annotation and retrieval.

15 citations
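The combination described above, feature-word distributions weighting word-topic distributions, can be illustrated with a toy example. For brevity, each word gets a single hypothetical Gaussian feature model instead of a full mixture, and P(w|z) is a made-up fitted pLSA table; none of the numbers come from the paper.

```python
import numpy as np

# Hypothetical per-word Gaussian feature models over 2-D visual features:
# p(feature | word) scores how well an image feature matches each word.
words = ["ocean", "fish", "desert"]
means = np.array([[0.1, 0.9], [0.2, 0.8], [0.9, 0.1]])
var = 0.05  # shared isotropic variance (illustrative)

def word_weights(feature):
    """Feature-word distribution: Gaussian likelihood per word, normalised."""
    d2 = ((means - feature) ** 2).sum(axis=1)
    lik = np.exp(-d2 / (2 * var))
    return lik / lik.sum()

# Made-up word-topic table P(w|z): topic 0 = sea scene, topic 1 = arid scene.
p_w_z = np.array([[0.50, 0.45, 0.05],
                  [0.05, 0.05, 0.90]])

w = word_weights(np.array([0.15, 0.85]))  # a "sea-like" visual feature
topic_scores = p_w_z @ w                  # topics weighted by matched words
print(int(np.argmax(topic_scores)))       # -> 0 (the sea-scene topic wins)
```

The visual feature never touches the topic model directly; it only reweights words, which is exactly the division of labour the abstract describes (topics remain distributions over words only).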


Network Information
Related Topics (5)
- Feature extraction: 111.8K papers, 2.1M citations (84% related)
- Feature (computer vision): 128.2K papers, 1.7M citations (84% related)
- Support vector machine: 73.6K papers, 1.7M citations (84% related)
- Deep learning: 79.8K papers, 2.1M citations (83% related)
- Object detection: 46.1K papers, 1.3M citations (82% related)
Performance Metrics
No. of papers in the topic in previous years:

Year | Papers
2023 | 19
2022 | 77
2021 | 14
2020 | 36
2019 | 27
2018 | 58