scispace - formally typeset
Search or ask a question
Topic

Probabilistic latent semantic analysis

About: Probabilistic latent semantic analysis is a research topic. Over the lifetime, 2884 publications have been published within this topic receiving 198341 citations. The topic is also known as: PLSA.


Papers
More filters
Proceedings ArticleDOI
01 Nov 2014
TL;DR: This paper proposes an unsupervised image categorisation model in which the semantic content of images are discovered using Probabilistic Latent Semantic Analysis, after which they are clustered into unique groups based on semantic content similarities using K-means algorithm, thereby providing suitable annotation exemplars.
Abstract: -Image annotation has been identified to be a suitable means by which the semantic gap which has made the accuracy of Content-based image retrieval unsatisfactory be eliminated. However existing methods of automatic annotation of images depends on supervised learning, which can be difficult to implement due to the need for manually annotated training samples which are not always readily available. This paper argues that the unsupervised learning via Probabilistic Latent Semantic Analysis provides a more suitable machine learning approach for image annotation especially due to its potential to based categorisation on the latent semantic content of the image samples, which can bridge the semantic gap present in Content Based Image Retrieval. This paper therefore proposes an unsupervised image categorisation model in which the semantic content of images are discovered using Probabilistic Latent Semantic Analysis, after which they are clustered into unique groups based on semantic content similarities using K-means algorithm, thereby providing suitable annotation exemplars. A common problem with categorisation algorithms based on Bag-of-Visual Words modelling is the loss of accuracy due to spatial incoherency of the Bag-of-Visual Word modelling, this paper also examines the effectiveness of Spatial pyramid as a means of eliminating spatial incoherency in Probabilistic Latent Semantic Analysis classification.

15 citations

Journal ArticleDOI
TL;DR: In this paper, a class of probabilistic latent class models with or without response errors and without intrinsically unscalable respondents is described starting from perfectly discriminating nonmonotone dichotomous items.
Abstract: Starting from perfectly discriminating nonmonotone dichotomous items, a class of probabilistic models with or without response errors and with or without intrinsically unscalable respondents is described. All these models can be understood as simply restricted latent class analysis. Thus, the estimation and identifiability of the parameters (class sizes and item latent probabilities) as well as the chi-squared goodness-of-fit tests (Pearson and likelihood-ratio) are free of the problems. The applicability of the proposed variants of latent class models is demonstrated on real attitudinal data.

15 citations

Book ChapterDOI
11 Nov 2012
TL;DR: This work compares the effectiveness of a series of approaches to select the best tags ranging from traditional IR techniques such as TF/IDF weighting to novel techniques based on ontological distances and latent Dirichlet allocation.
Abstract: We tackle the problem of improving the relevance of automatically selected tags in large-scale ontology-based information systems. Contrary to traditional settings where tags can be chosen arbitrarily, we focus on the problem of recommending tags (e.g., concepts) directly from a collaborative, user-driven ontology. We compare the effectiveness of a series of approaches to select the best tags ranging from traditional IR techniques such as TF/IDF weighting to novel techniques based on ontological distances and latent Dirichlet allocation. All our experiments are run against a real corpus of tags and documents extracted from the ScienceWise portal, which is connected to ArXiv.org and is currently used by growing number of researchers. The datasets for the experiments are made available online for reproducibility purposes.

14 citations

Journal ArticleDOI
TL;DR: A latent space random effects model with a covariate-defined social space, where the social space is a linear combination of the covariates as estimated by an MCMC algorithm is proposed.

14 citations

Posted Content
TL;DR: A Dirichlet-multinomial model is presented in which frames are latent categories that explain the linking of verb-subject-object triples, given document-level sparsity, and what the model learns is analyzed.
Abstract: We develop a probabilistic latent-variable model to discover semantic frames---types of events and their participants---from corpora. We present a Dirichlet-multinomial model in which frames are latent categories that explain the linking of verb-subject-object triples, given document-level sparsity. We analyze what the model learns, and compare it to FrameNet, noting it learns some novel and interesting frames. This document also contains a discussion of inference issues, including concentration parameter learning; and a small-scale error analysis of syntactic parsing accuracy.

14 citations


Network Information
Related Topics (5)
Feature extraction
111.8K papers, 2.1M citations
84% related
Feature (computer vision)
128.2K papers, 1.7M citations
84% related
Support vector machine
73.6K papers, 1.7M citations
84% related
Deep learning
79.8K papers, 2.1M citations
83% related
Object detection
46.1K papers, 1.3M citations
82% related
Performance
Metrics
No. of papers in the topic in previous years
YearPapers
202319
202277
202114
202036
201927
201858