Open Access Proceedings Article

Probabilistic latent semantic analysis

TLDR
This work introduces Probabilistic Latent Semantic Analysis, a method based on a mixture decomposition derived from a latent class model, which gives it a solid foundation in statistics. To avoid overfitting, it proposes a widely applicable generalization of maximum likelihood model fitting by tempered EM.
Abstract
Probabilistic Latent Semantic Analysis is a novel statistical technique for the analysis of two-mode and co-occurrence data, with applications in information retrieval and filtering, natural language processing, machine learning from text, and related areas. Compared to standard Latent Semantic Analysis, which stems from linear algebra and performs a Singular Value Decomposition of co-occurrence tables, the proposed method is based on a mixture decomposition derived from a latent class model. This results in a more principled approach that has a solid foundation in statistics. To avoid overfitting, we propose a widely applicable generalization of maximum likelihood model fitting by tempered EM. Our approach yields substantial and consistent improvements over Latent Semantic Analysis in a number of experiments.
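The abstract names the two ingredients of the method, a latent class mixture decomposition and tempered EM, without spelling out the updates. The sketch below is one minimal way to write them down, assuming the usual aspect-model form P(d, w) = sum_z P(z) P(d|z) P(w|z) and an E-step posterior proportional to P(z) [P(d|z) P(w|z)]^beta. The function name `plsa_tempered_em`, the toy count matrix, the fixed value of `beta`, and the iteration count are all illustrative assumptions and are not taken from the abstract; a fuller treatment would typically tune `beta` on held-out data instead of fixing it.

```python
import numpy as np


def plsa_tempered_em(counts, n_topics, beta=1.0, n_iter=50, seed=0):
    """Fit P(d, w) = sum_z P(z) P(d|z) P(w|z) to a document-word count matrix.

    counts   : array of shape (n_docs, n_words) with non-negative counts
    n_topics : number of latent classes z (a hypothetical parameter name)
    beta     : inverse temperature; beta = 1 gives standard EM
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    eps = 1e-12  # guards against division by zero

    # Random, normalized starting point for all three distributions.
    p_z = np.full(n_topics, 1.0 / n_topics)
    p_d_z = rng.random((n_topics, n_docs))
    p_d_z /= p_d_z.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # Tempered E-step: P(z|d,w) proportional to P(z) * [P(d|z) P(w|z)]^beta.
        # The dense (topics, docs, words) array is fine for a toy example only.
        joint = p_z[:, None, None] * (p_d_z[:, :, None] * p_w_z[:, None, :]) ** beta
        post = joint / (joint.sum(axis=0, keepdims=True) + eps)

        # M-step: re-estimate parameters from posterior-weighted counts.
        weighted = counts[None, :, :] * post
        p_w_z = weighted.sum(axis=1)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + eps
        p_d_z = weighted.sum(axis=2)
        p_d_z /= p_d_z.sum(axis=1, keepdims=True) + eps
        totals = weighted.sum(axis=(1, 2))
        p_z = totals / (totals.sum() + eps)

    return p_z, p_d_z, p_w_z


# Toy co-occurrence table (made up): two groups of documents with disjoint vocabularies.
counts = np.array(
    [[4, 3, 0, 0],
     [5, 2, 0, 0],
     [0, 0, 3, 4],
     [0, 0, 2, 5]],
    dtype=float,
)
p_z, p_d_z, p_w_z = plsa_tempered_em(counts, n_topics=2, beta=0.9, n_iter=30)
print(p_z)
print(p_w_z.round(3))
```

Setting `beta` below 1 flattens the posterior over latent classes in the E-step, which is the sense in which tempering is meant to reduce overfitting here. In contrast to an SVD of the count table, every factor in this decomposition is a normalized probability distribution.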


Citations
Proceedings Article

Context sensitive topic models for author influence in document networks

TL;DR: This work proposes novel document generation schemes that incorporate the context while simultaneously modeling the interests of citing authors and influence of the cited authors and shows significant improvements over baseline models for various evaluation criteria.
Journal Article

A hierarchical model for ordinal matrix factorization

TL;DR: The model is evaluated on a collaborative filtering task, where users have rated a collection of movies and the system is asked to predict their ratings for other movies, and shows that the suggested model outperforms alternative factorization techniques.
Journal Article

A hybrid term-term relations analysis approach for topic detection

TL;DR: The approach fuses multiple relations into a term graph and detects topics from the graph using a graph analytical method. It not only detects topics more effectively by combining mutually complementary relations, but also mines important rare topics by leveraging latent co-occurrence relations.
Patent

Indicator-based recommendation system

TL;DR: An indicator-based recommendation system is proposed that captures a greater depth and variety of real-world relationships among items and can handle p-adic systems and systems with ternary or higher relations.
Book Chapter

Using probabilistic latent semantic analysis for personalized web search

TL;DR: This paper presents an approach that mines unseen factors from web logs to personalize web search, based on probabilistic latent semantic analysis, a model-based technique used to analyze co-occurrence data.
References
Journal Article

Indexing by Latent Semantic Analysis

TL;DR: A new method for automatic indexing and retrieval is described that takes advantage of implicit higher-order structure in the association of terms with documents (“semantic structure”) to improve the detection of relevant documents on the basis of terms found in queries.
Book

Introduction to Modern Information Retrieval

Journal Article

A Solution to Plato's Problem: The Latent Semantic Analysis Theory of Acquisition, Induction, and Representation of Knowledge.

TL;DR: A new general theory of acquired similarity and knowledge representation, latent semantic analysis (LSA), is presented and used to successfully simulate such learning and several other psycholinguistic phenomena.
Journal Article

Probabilistic latent semantic indexing

TL;DR: Probabilistic Latent Semantic Indexing is a novel approach to automated document indexing which is based on a statistical latent class model for factor analysis of count data.