Topic

Probabilistic latent semantic analysis

About: Probabilistic latent semantic analysis is a research topic. Over its lifetime, 2,884 publications have appeared within this topic, receiving 198,341 citations. The topic is also known as: PLSA.


Papers
Proceedings Article
Jason Weston, Ron Weiss, Hector Yee
12 Oct 2013
TL;DR: This work proposes to model the user with a richer set of functions, specifically via a set of latent vectors, where each vector captures one of the user's latent interests or tastes, and describes a simple, general and efficient algorithm for learning such a model.
Abstract: Classical matrix factorization approaches to collaborative filtering learn a latent vector for each user and each item, and recommendations are scored via the similarity between two such vectors, which are of the same dimension. In this work, we are motivated by the intuition that a user is a much more complicated entity than any single item, and cannot be well described by the same representation. Hence, the variety of a user's interests could be better captured by a more complex representation. We propose to model the user with a richer set of functions, specifically via a set of latent vectors, where each vector captures one of the user's latent interests or tastes. The overall recommendation model is then nonlinear, where the matching score between a user and a given item is the maximum matching score over each of the user's latent interests with respect to the item's latent representation. We describe a simple, general and efficient algorithm for learning such a model, and apply it to large scale, real-world datasets from YouTube and Google Music, where our approach outperforms existing techniques.

69 citations
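
The scoring rule described in the abstract above (the maximum matching score over a user's latent interest vectors) can be illustrated with a minimal sketch. It assumes the matching score is a plain dot product, which the abstract does not specify; the function names `dot` and `max_interest_score` and the toy vectors are hypothetical, and the learning algorithm itself is not shown.

```python
def dot(a, b):
    """Dot product of two equal-length vectors (assumed matching score)."""
    return sum(x * y for x, y in zip(a, b))


def max_interest_score(user_vectors, item_vector):
    """Score an item as the best match over all of the user's interest vectors."""
    return max(dot(u, item_vector) for u in user_vectors)


# Toy example: a user with two latent interests and one item, all in 2 dimensions.
user_vectors = [[1.0, 0.0], [0.0, 2.0]]
item_vector = [0.5, 1.5]
score = max_interest_score(user_vectors, item_vector)
print(score)
```

Because of the `max`, only the best-matching interest contributes to the score, which is what makes the model nonlinear in the user representation.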

Proceedings Article
13 Mar 2005
TL;DR: The proposed approach consists in identifying important concepts in documents using two criteria, co-occurrence and semantic relatedness, and then disambiguating them via an external general purpose ontology, namely WordNet.
Abstract: This paper deals with the use of ontologies for Information Retrieval. Roughly, the proposed approach consists in identifying important concepts in documents using two criteria, co-occurrence and semantic relatedness, and then disambiguating them via an external general purpose ontology, namely WordNet. Matching the ontology and a document results in a set of scored concept-senses (nodes) with weighted links. This representation, called the semantic core of a document, best reveals the semantic content of the document. We regard our approach, of which the first evaluation results are encouraging, as a short but strong step toward the long term goal of Intelligent Indexing and Semantic Retrieval.

69 citations

Proceedings Article
05 Jun 2000
TL;DR: A new latent semantic indexing (LSI) method for spoken audio documents, in which smoothing by the closest document clusters is important because the documents are often short and have a high word error rate (WER).
Abstract: This paper describes a new latent semantic indexing (LSI) method for spoken audio documents. The framework is indexing broadcast news from radio and TV as a combination of large vocabulary continuous speech recognition (LVCSR), natural language processing (NLP) and information retrieval (IR). For indexing, the documents are presented as vectors of word counts, whose dimensionality is rapidly reduced by random mapping (RM). The obtained vectors are projected into the latent semantic subspace determined by SVD, where the vectors are then smoothed by a self-organizing map (SOM). The smoothing by the closest document clusters is important here, because the documents are often short and have a high word error rate (WER). As the clusters in the semantic subspace reflect the news topics, the SOMs provide an easy way to visualize the index and query results and to explore the database. Test results are reported for TREC's spoken document retrieval databases (www.idiap.ch/kurimo/thisl.html).

67 citations

Proceedings Article
23 Oct 2008
TL;DR: An incremental recommendation algorithm based on Probabilistic Latent Semantic Analysis (PLSA) that can consider not only the users' long-term and short-term interests, but also users' negative and positive feedback is proposed.
Abstract: With the fast development of web 2.0, user-centric publishing and knowledge management platforms, such as Wiki, Blogs, and Q & A systems attract a large number of users. Given the availability of the huge amount of meaningful user generated content, incremental model based recommendation techniques can be employed to improve users' experience using automatic recommendations. In this paper, we propose an incremental recommendation algorithm based on Probabilistic Latent Semantic Analysis (PLSA). The proposed algorithm can consider not only the users' long-term and short-term interests, but also users' negative and positive feedback. We compare the proposed method with several baseline methods using a real-world Question & Answer website called Wenda. Experiments demonstrate both the effectiveness and the efficiency of the proposed methods.

66 citations

Journal Article
TL;DR: This work departs from previous work in this area by representing dependence structure between network views through a multivariate Bernoulli likelihood, providing a representation of between-view association and inferring correlations between views not explained by the latent space model.
Abstract: Social relationships consist of interactions along multiple dimensions. In social networks, this means that individuals form multiple types of relationships with the same person (e.g., an individual will not trust all of his/her acquaintances). Statistical models for these data require understanding two related types of dependence structure: (i) structure within each relationship type, or network view, and (ii) the association between views. In this paper, we propose a statistical framework that parsimoniously represents dependence between relationship types while also maintaining enough flexibility to allow individuals to serve different roles in different relationship types. Our approach builds on work on latent space models for networks [see, e.g., J. Amer. Statist. Assoc. 97 (2002) 1090–1098]. These models represent the propensity for two individuals to form edges as conditionally independent given the distance between the individuals in an unobserved social space. Our work departs from previous work in this area by representing dependence structure between network views through a multivariate Bernoulli likelihood, providing a representation of between-view association. This approach infers correlations between views not explained by the latent space model. Using our method, we explore 6 multiview network structures across 75 villages in rural southern Karnataka, India [Banerjee et al. (2013)].

66 citations


Network Information
Related Topics (5)
Feature extraction: 111.8K papers, 2.1M citations, 84% related
Feature (computer vision): 128.2K papers, 1.7M citations, 84% related
Support vector machine: 73.6K papers, 1.7M citations, 84% related
Deep learning: 79.8K papers, 2.1M citations, 83% related
Object detection: 46.1K papers, 1.3M citations, 82% related
Performance Metrics
No. of papers in the topic in previous years
Year    Papers
2023    19
2022    77
2021    14
2020    36
2019    27
2018    58