scispace - formally typeset

Latent Dirichlet allocation

About: Latent Dirichlet allocation is a research topic. Over its lifetime, 5,351 publications have been published within this topic, receiving 212,555 citations. The topic is also known as: LDA.
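As a concrete illustration of what LDA assumes, its generative process can be sketched in a few lines of NumPy: each topic is a distribution over the vocabulary drawn from a Dirichlet prior, each document draws its own topic proportions, and each token draws a topic and then a word. The sizes, prior values, and variable names below are illustrative choices, not taken from any of the papers listed here.

```python
import numpy as np

rng = np.random.default_rng(0)

n_topics, vocab_size, doc_len = 3, 10, 50
alpha = np.full(n_topics, 0.5)   # document-topic Dirichlet prior (illustrative)
eta = np.full(vocab_size, 0.1)   # topic-word Dirichlet prior (illustrative)

# each topic k is a distribution over the vocabulary: beta_k ~ Dirichlet(eta)
beta = rng.dirichlet(eta, size=n_topics)

# generate one document
theta = rng.dirichlet(alpha)                      # its topic proportions
z = rng.choice(n_topics, size=doc_len, p=theta)   # a topic per token
w = np.array([rng.choice(vocab_size, p=beta[k]) for k in z])  # a word per token

print("topic proportions:", theta.round(2))
```

Inference in LDA runs this process in reverse: given only the words w across many documents, recover beta and the per-document theta, which is what the variational and Gibbs-sampling methods in the papers below do.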


Papers
Proceedings ArticleDOI
10 Oct 2004
TL;DR: A new way of modeling multi-modal co-occurrences is proposed, constraining the definition of the latent space to ensure its consistency in semantic terms (words), while retaining the ability to jointly model visual information.
Abstract: We address the problem of unsupervised image auto-annotation with probabilistic latent space models. Unlike most previous works, which build latent space representations assuming equal relevance for the text and visual modalities, we propose a new way of modeling multi-modal co-occurrences, constraining the definition of the latent space to ensure its consistency in semantic terms (words), while retaining the ability to jointly model visual information. The concept is implemented by a linked pair of Probabilistic Latent Semantic Analysis (PLSA) models. On a 16000-image collection, we show with extensive experiments that our approach significantly outperforms previous joint models.

258 citations

Posted Content
TL;DR: This work presents what is, to the authors' knowledge, the first effective AEVB-based inference method for latent Dirichlet allocation (LDA), which they call Autoencoded Variational Inference For Topic Model (AVITM).
Abstract: Topic models are one of the most popular methods for learning representations of text, but a major challenge is that any change to the topic model requires mathematically deriving a new inference algorithm. A promising approach to address this problem is autoencoding variational Bayes (AEVB), but it has proven difficult to apply to topic models in practice. We present what is to our knowledge the first effective AEVB based inference method for latent Dirichlet allocation (LDA), which we call Autoencoded Variational Inference For Topic Model (AVITM). This model tackles the problems caused for AEVB by the Dirichlet prior and by component collapsing. We find that AVITM matches traditional methods in accuracy with much better inference time. Indeed, because of the inference network, we find that it is unnecessary to pay the computational cost of running variational optimization on test data. Because AVITM is black box, it is readily applied to new topic models. As a dramatic illustration of this, we present a new topic model called ProdLDA, that replaces the mixture model in LDA with a product of experts. By changing only one line of code from LDA, we find that ProdLDA yields much more interpretable topics, even if LDA is trained via collapsed Gibbs sampling.

258 citations
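The "one line of code" difference between LDA and ProdLDA mentioned above is in the decoder: LDA mixes already-normalized topic-word distributions (a mixture of experts), while ProdLDA mixes unnormalized weights and applies the softmax afterwards (a product of experts). A minimal NumPy sketch of the two decoders, with made-up dimensions and random weights rather than the paper's actual trained network:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(1)
n_topics, vocab_size = 4, 12                    # made-up sizes
theta = rng.dirichlet(np.ones(n_topics))        # topic proportions, one document
beta = rng.normal(size=(n_topics, vocab_size))  # unnormalized topic-word weights

# LDA decoder: normalize each topic first, then mix (mixture of experts)
p_lda = theta @ softmax(beta, axis=1)

# ProdLDA decoder: mix in log space, then normalize (product of experts)
p_prod = softmax(theta @ beta)
```

Both produce a distribution over the vocabulary, but the product of experts lets topics sharpen each other, which the paper credits for the more interpretable topics.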

Proceedings ArticleDOI
23 Jun 2008
TL;DR: This work proposes grouping visual objects with a multi-layer hierarchy tree based on common visual elements, achieved by adapting the generative hierarchical latent Dirichlet allocation (hLDA) model, previously used for unsupervised discovery of topic hierarchies in text, to the visual domain.
Abstract: Objects in the world can be arranged into a hierarchy based on their semantic meaning (e.g. organism - animal - feline - cat). What about defining a hierarchy based on the visual appearance of objects? This paper investigates ways to automatically discover a hierarchical structure for the visual world from a collection of unlabeled images. Previous approaches for unsupervised object and scene discovery focused on partitioning the visual data into a set of non-overlapping classes of equal granularity. In this work, we propose to group visual objects using a multi-layer hierarchy tree that is based on common visual elements. This is achieved by adapting to the visual domain the generative hierarchical latent Dirichlet allocation (hLDA) model previously used for unsupervised discovery of topic hierarchies in text. Images are modeled using quantized local image regions as analogues to words in text. Employing the multiple segmentation framework of Russell et al. [22], we show that meaningful object hierarchies, together with object segmentations, can be automatically learned from unlabeled and unsegmented image collections without supervision. We demonstrate improved object classification and localization performance using hLDA over the previous non-hierarchical method on the MSRC dataset [33].

255 citations
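The "quantized local image regions as analogues to words" step can be sketched as a nearest-centroid assignment against a visual codebook. The random descriptors and codebook below are toy stand-ins; a real pipeline would extract local descriptors (e.g. SIFT) and learn the codebook with k-means:

```python
import numpy as np

rng = np.random.default_rng(2)

# toy stand-ins: 200 local descriptors from one image, 8-dim each
descriptors = rng.normal(size=(200, 8))
# a "visual vocabulary" of 50 codewords (in practice learned by k-means)
codebook = rng.normal(size=(50, 8))

# quantize: each descriptor maps to the index of its nearest codeword
d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
visual_words = d2.argmin(axis=1)

# bag-of-visual-words histogram: the "document" a topic model consumes
bow = np.bincount(visual_words, minlength=len(codebook))
```

Once every image is such a histogram, text-oriented models like hLDA apply unchanged, which is exactly the adaptation the paper describes.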

Journal ArticleDOI
TL;DR: This work proposes a reliable and flexible visual analytics system for topic modeling called UTOPIAN (User-driven Topic modeling based on Interactive Nonnegative Matrix Factorization), which enables users to interact with the topic modeling method and steer the result in a user-driven manner.
Abstract: Topic modeling has been widely used for analyzing text document collections. Recently, there have been significant advancements in various topic modeling techniques, particularly in the form of probabilistic graphical modeling. State-of-the-art techniques such as Latent Dirichlet Allocation (LDA) have been successfully applied in visual text analytics. However, most of the widely-used methods based on probabilistic modeling have drawbacks in terms of consistency from multiple runs and empirical convergence. Furthermore, due to the complexity of the formulation and the algorithm, LDA cannot easily incorporate various types of user feedback. To tackle this problem, we propose a reliable and flexible visual analytics system for topic modeling called UTOPIAN (User-driven Topic modeling based on Interactive Nonnegative Matrix Factorization). Centered around its semi-supervised formulation, UTOPIAN enables users to interact with the topic modeling method and steer the result in a user-driven manner. We demonstrate the capability of UTOPIAN via several usage scenarios with real-world document corpora such as the InfoVis/VAST paper data set and product review data sets.

252 citations
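UTOPIAN builds on nonnegative matrix factorization rather than probabilistic inference: the term-document matrix X is factored as X ≈ WH with nonnegative W (document-topic) and H (topic-term). The core factorization can be sketched with the classic multiplicative updates for Frobenius-norm NMF; this toy NumPy version uses made-up data and omits UTOPIAN's semi-supervised constraints and interaction machinery:

```python
import numpy as np

rng = np.random.default_rng(3)

# toy term-document matrix: 20 documents x 30 vocabulary terms (counts)
X = rng.poisson(1.0, size=(20, 30)).astype(float)
n_topics = 4

W = rng.random((20, n_topics))   # document-topic weights
H = rng.random((n_topics, 30))   # topic-term weights

# multiplicative updates for min ||X - WH||_F with W, H >= 0
eps = 1e-9  # guards against division by zero
for _ in range(200):
    H *= (W.T @ X) / (W.T @ W @ H + eps)
    W *= (X @ H.T) / (W @ H @ H.T + eps)

err = np.linalg.norm(X - W @ H)
```

Because the updates are deterministic given the initialization, NMF-based topic models sidestep the run-to-run inconsistency the abstract attributes to probabilistic methods.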

Proceedings Article
01 Jun 2007
TL;DR: A probabilistic posterior inference algorithm for simultaneously disambiguating a corpus and learning the domains in which to consider each word is developed.
Abstract: We develop latent Dirichlet allocation with WORDNET (LDAWN), an unsupervised probabilistic topic model that includes word sense as a hidden variable. We develop a probabilistic posterior inference algorithm for simultaneously disambiguating a corpus and learning the domains in which to consider each word. Using the WORDNET hierarchy, we embed the construction of Abney and Light (1999) in the topic model and show that automatically learned domains improve WSD accuracy compared to alternative contexts.

252 citations


Network Information
Related Topics (5)
- Cluster analysis: 146.5K papers, 2.9M citations, 86% related
- Support vector machine: 73.6K papers, 1.7M citations, 86% related
- Deep learning: 79.8K papers, 2.1M citations, 85% related
- Feature extraction: 111.8K papers, 2.1M citations, 84% related
- Convolutional neural network: 74.7K papers, 2M citations, 83% related
Performance Metrics
No. of papers in the topic in previous years:

Year    Papers
2023    323
2022    842
2021    418
2020    429
2019    473
2018    446