Open Access · Proceedings Article

Lost relatives of the Gumbel trick

Abstract
The Gumbel trick is a method to sample from a discrete probability distribution, or to estimate its normalizing partition function. The method relies on repeatedly applying a random perturbation to the distribution in a particular way, each time solving for the most likely configuration. We derive an entire family of related methods, of which the Gumbel trick is one member, and show that the new methods have superior properties in several settings with minimal additional computational cost. In particular, for the Gumbel trick to yield computational benefits for discrete graphical models, Gumbel perturbations on all configurations are typically replaced with so-called low-rank perturbations. We show how a subfamily of our new methods adapts to this setting, proving new upper and lower bounds on the log partition function and deriving a family of sequential samplers for the Gibbs distribution. Finally, we balance the discussion by showing how the simpler analytical form of the Gumbel trick enables additional theoretical results.
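As a concrete illustration of the trick the abstract describes, here is a minimal NumPy sketch (the potentials `phi` and sample count are illustrative assumptions): perturbing unnormalized log-potentials with i.i.d. Gumbel noise and taking the argmax yields an exact sample, and the maximum itself is Gumbel-distributed around log Z, so averaging repeated maxima estimates the log partition function.

```python
import numpy as np

rng = np.random.default_rng(0)
phi = np.array([1.0, 2.0, 0.5, 3.0])  # unnormalized log-potentials: p(x) ∝ exp(phi[x])

def gumbel_max_sample(phi, rng):
    """Exact sample from p: perturb each log-potential with Gumbel noise, take the argmax."""
    return int(np.argmax(phi + rng.gumbel(size=phi.shape)))

def log_partition_estimate(phi, rng, m=10_000):
    """max_x(phi[x] + g[x]) ~ Gumbel(log Z), whose mean is log Z plus the
    Euler-Mascheroni constant, so averaging repeated maxima and subtracting
    that constant estimates log Z."""
    maxima = [np.max(phi + rng.gumbel(size=phi.shape)) for _ in range(m)]
    return float(np.mean(maxima)) - np.euler_gamma

print(log_partition_estimate(phi, rng), np.log(np.exp(phi).sum()))  # both ≈ 3.46
```

The low-rank perturbations mentioned in the abstract replace this i.i.d. noise on all configurations with Gumbel noise on single-variable potentials, which keeps the MAP subproblems tractable for graphical models.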



Citations
Proceedings Article

Adversarial Filters of Dataset Biases

TL;DR: This work presents extensive supporting evidence that AFLite is broadly applicable for reduction of measurable dataset biases, and that models trained on the filtered datasets yield better generalization to out-of-distribution tasks.
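A schematic sketch of the filtering loop such a description suggests (assuming scikit-learn; the linear probe, partition count, and thresholds are illustrative assumptions, not the paper's exact settings):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def aflite_style_filter(X, y, target_size, n_partitions=64, tau=0.75,
                        cut_size=500, seed=0):
    rng = np.random.default_rng(seed)
    idx = np.arange(len(X))
    while len(idx) > target_size:
        correct = np.zeros(len(idx))
        seen = np.zeros(len(idx))
        for _ in range(n_partitions):
            # Random train/holdout split; linear probes on the holdout measure
            # how "predictable" (bias-exploitable) each instance is.
            perm = rng.permutation(len(idx))
            tr, te = perm[: len(idx) // 2], perm[len(idx) // 2 :]
            clf = LogisticRegression(max_iter=200).fit(X[idx[tr]], y[idx[tr]])
            correct[te] += (clf.predict(X[idx[te]]) == y[idx[te]])
            seen[te] += 1
        score = correct / np.maximum(seen, 1)  # per-instance predictability
        drop = np.argsort(-score)[:cut_size]
        drop = drop[score[drop] > tau]
        if len(drop) == 0:
            break
        idx = np.delete(idx, drop)
    return idx  # indices of the retained, harder-to-predict subset

# Usage (X, y are precomputed feature representations and labels):
# keep = aflite_style_filter(X, y, target_size=len(X) // 2)
```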
Posted Content

Learning Latent Permutations with Gumbel-Sinkhorn Networks

TL;DR: A collection of new methods for end-to-end learning in such models that approximate discrete maximum-weight matching using the continuous Sinkhorn operator are introduced.
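The Sinkhorn operator at the heart of this approach can be sketched in a few lines: alternating row and column normalization in log-space maps a score matrix to a near-doubly-stochastic one, a continuous relaxation of a permutation matrix. SciPy is assumed; the temperature and iteration count are illustrative:

```python
import numpy as np
from scipy.special import logsumexp

def sinkhorn(log_alpha, tau=0.5, n_iters=20):
    log_alpha = log_alpha / tau
    for _ in range(n_iters):
        log_alpha = log_alpha - logsumexp(log_alpha, axis=1, keepdims=True)  # rows
        log_alpha = log_alpha - logsumexp(log_alpha, axis=0, keepdims=True)  # cols
    return np.exp(log_alpha)  # rows and columns each sum to ~1

# Adding Gumbel noise to log_alpha before iterating gives the stochastic
# "Gumbel-Sinkhorn" relaxation of sampling a permutation.
```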
Posted Content

Stochastic Optimization of Sorting Networks via Continuous Relaxations

TL;DR: This work proposes NeuralSort, a general-purpose continuous relaxation of the output of the sorting operator from permutation matrices to the set of unimodal row-stochastic matrices, which permits straight-through optimization of any computational graph involving a sorting operation.
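A minimal sketch of such a relaxation, following the sort-operator construction from the NeuralSort paper (the temperature `tau` and the example scores are illustrative assumptions):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def relaxed_sort(s, tau=0.1):
    """Row i is a softmax that, as tau -> 0, becomes one-hot at the
    i-th largest entry of s, recovering the hard permutation matrix."""
    n = len(s)
    B = np.abs(s[:, None] - s[None, :]).sum(axis=1)  # A_s @ 1, with A_s[i,j] = |s_i - s_j|
    return np.stack([softmax(((n + 1 - 2 * (i + 1)) * s - B) / tau)
                     for i in range(n)])             # unimodal row-stochastic matrix

s = np.array([0.3, 1.2, -0.5])
print(relaxed_sort(s) @ s)  # ≈ s sorted in decreasing order
```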
Proceedings Article

Stochastic Beams and Where To Find Them: The Gumbel-Top-k Trick for Sampling Sequences Without Replacement

TL;DR: It is shown that sequences sampled without replacement can be used to construct low-variance estimators for expected sentence-level BLEU score and model entropy.
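The underlying Gumbel-top-k trick is short enough to sketch directly: perturb log-probabilities with Gumbel noise and keep the k largest, which yields an exact ordered sample of k categories without replacement (the paper extends this to sequence models via a stochastic beam search):

```python
import numpy as np

def gumbel_top_k(log_p, k, rng):
    g = rng.gumbel(size=log_p.shape)
    return np.argsort(-(log_p + g))[:k]  # ordered sample without replacement

rng = np.random.default_rng(0)
log_p = np.log(np.array([0.1, 0.4, 0.3, 0.2]))
print(gumbel_top_k(log_p, 2, rng))
```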

Learning with differentiable perturbed optimizers

TL;DR: This work proposes a systematic method to transform optimizers into operations that are differentiable and never locally constant; the method relies on stochastically perturbed optimizers and can be used readily together with existing solvers.
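A Monte Carlo sketch of a perturbed argmax in this spirit (the sample count and noise scale are illustrative assumptions):

```python
import numpy as np

def perturbed_argmax(theta, rng, n_samples=1000, sigma=1.0):
    """E[one_hot(argmax(theta + sigma * Z))]: averaging argmax solutions under
    injected noise yields an output that is smooth in theta rather than
    piecewise constant."""
    out = np.zeros(len(theta))
    for _ in range(n_samples):
        out[np.argmax(theta + sigma * rng.gumbel(size=len(theta)))] += 1.0
    return out / n_samples

rng = np.random.default_rng(0)
print(perturbed_argmax(np.array([1.0, 1.1, 0.5]), rng))
```

With Gumbel noise and sigma = 1 this expectation is exactly softmax(theta), which connects perturbed optimizers back to the Gumbel trick of the main paper.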
References
Book

Graphical Models, Exponential Families, and Variational Inference

TL;DR: The variational approach provides a complementary alternative to Markov chain Monte Carlo as a general source of approximation methods for inference in large-scale statistical models.
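As a small worked instance of the variational approach, naive mean-field for an Ising model replaces the intractable joint distribution with a fully factorized one and iterates a fixed-point update for each mean (the couplings and fields below are illustrative assumptions):

```python
import numpy as np

def mean_field_ising(J, h, n_iters=100):
    """Naive mean-field for p(x) ∝ exp(h·x + x·J·x / 2), x_i in {-1, +1}:
    iterate m_i = tanh(h_i + sum_j J_ij m_j) to a fixed point."""
    m = np.zeros(len(h))
    for _ in range(n_iters):
        m = np.tanh(h + J @ m)
    return m  # approximate marginals: E[x_i] ≈ m_i

J = np.array([[0.0, 0.3], [0.3, 0.0]])
h = np.array([0.2, -0.1])
print(mean_field_ising(J, h))
```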
Proceedings Article

Fast approximate energy minimization via graph cuts

TL;DR: This paper proposes two algorithms that use graph cuts to compute a local minimum even when very large moves are allowed; the expansion algorithm generates a labeling such that no expansion move decreases the energy.
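The building block behind such moves is the reduction of a binary submodular energy to an s-t min-cut; a minimal sketch for the simplest case of unary costs plus nonnegative Potts pairwise weights (assuming networkx, with illustrative costs):

```python
import networkx as nx

def binary_energy_via_cut(u, edges):
    """Minimize sum_i u[i][x_i] + sum_{(i,j,w)} w * [x_i != x_j], x_i in {0,1}.
    The capacity of every cut equals the energy of the corresponding labeling,
    so the minimum cut is the minimum-energy labeling."""
    G = nx.DiGraph()
    for i, (c0, c1) in enumerate(u):
        G.add_edge("s", i, capacity=c1)  # cut iff x_i = 1
        G.add_edge(i, "t", capacity=c0)  # cut iff x_i = 0
    for i, j, w in edges:
        G.add_edge(i, j, capacity=w)     # cut iff labels disagree
        G.add_edge(j, i, capacity=w)
    value, (S, _) = nx.minimum_cut(G, "s", "t")
    return value, [0 if i in S else 1 for i in range(len(u))]

print(binary_energy_via_cut([(1.0, 2.0), (2.0, 0.5)], [(0, 1, 1.0)]))
```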
Proceedings Article

Convergent tree-reweighted message passing for energy minimization.

TL;DR: This paper develops a modification of the technique proposed by Wainwright et al. (Nov. 2005), called sequential tree-reweighted message passing, which outperforms both ordinary belief propagation and the tree-reweighted algorithm on both synthetic and real problems.
Journal Article

libDAI: A Free and Open Source C++ Library for Discrete Approximate Inference in Graphical Models

TL;DR: This paper describes the software package libDAI, a free and open source C++ library that provides implementations of various exact and approximate inference methods for graphical models with discrete-valued variables.
Posted Content

A* Sampling

TL;DR: A* sampling is a generic sampling algorithm that searches for the maximum of a Gumbel process using A* search; it makes more efficient use of bound and likelihood evaluations than the most closely related adaptive rejection sampling-based algorithms.