Open Access · Proceedings Article

Lost relatives of the Gumbel trick

Abstract
The Gumbel trick is a method to sample from a discrete probability distribution, or to estimate its normalizing partition function. The method relies on repeatedly applying a random perturbation to the distribution in a particular way, each time solving for the most likely configuration. We derive an entire family of related methods, of which the Gumbel trick is one member, and show that the new methods have superior properties in several settings with minimal additional computational cost. In particular, for the Gumbel trick to yield computational benefits for discrete graphical models, Gumbel perturbations on all configurations are typically replaced with so-called low-rank perturbations. We show how a subfamily of our new methods adapts to this setting, proving new upper and lower bounds on the log partition function and deriving a family of sequential samplers for the Gibbs distribution. Finally, we balance the discussion by showing how the simpler analytical form of the Gumbel trick enables additional theoretical results.
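As a concrete illustration of the trick the abstract describes, here is a minimal NumPy sketch (the potentials `phi` and sample count are illustrative assumptions): perturbing unnormalized log-potentials with i.i.d. Gumbel noise and taking the argmax yields an exact sample, and the maximum itself is Gumbel-distributed around log Z, so averaging repeated maxima estimates the log partition function.

```python
import numpy as np

rng = np.random.default_rng(0)
phi = np.array([1.0, 2.0, 0.5, 3.0])  # unnormalized log-potentials: p(x) ∝ exp(phi[x])

def gumbel_max_sample(phi, rng):
    """Exact sample from p: perturb each log-potential with Gumbel noise, take the argmax."""
    return int(np.argmax(phi + rng.gumbel(size=phi.shape)))

def log_partition_estimate(phi, rng, m=10_000):
    """max_x(phi[x] + g[x]) ~ Gumbel(log Z), whose mean is log Z plus the
    Euler-Mascheroni constant, so averaging repeated maxima and subtracting
    that constant estimates log Z."""
    maxima = [np.max(phi + rng.gumbel(size=phi.shape)) for _ in range(m)]
    return float(np.mean(maxima)) - np.euler_gamma

print(log_partition_estimate(phi, rng), np.log(np.exp(phi).sum()))  # both ≈ 3.46
```

The low-rank perturbations mentioned in the abstract replace this i.i.d. noise on all configurations with Gumbel noise on single-variable potentials, which keeps the MAP subproblems tractable for graphical models.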



Citations
Proceedings Article

Adversarial Filters of Dataset Biases

TL;DR: This work presents extensive supporting evidence that AFLite is broadly applicable for reduction of measurable dataset biases, and that models trained on the filtered datasets yield better generalization to out-of-distribution tasks.
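A schematic sketch of the filtering loop such a description suggests (assuming scikit-learn; the linear probe, partition count, and thresholds are illustrative assumptions, not the paper's exact settings):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def aflite_style_filter(X, y, target_size, n_partitions=64, tau=0.75,
                        cut_size=500, seed=0):
    rng = np.random.default_rng(seed)
    idx = np.arange(len(X))
    while len(idx) > target_size:
        correct = np.zeros(len(idx))
        seen = np.zeros(len(idx))
        for _ in range(n_partitions):
            # Random train/holdout split; linear probes on the holdout measure
            # how "predictable" (bias-exploitable) each instance is.
            perm = rng.permutation(len(idx))
            tr, te = perm[: len(idx) // 2], perm[len(idx) // 2 :]
            clf = LogisticRegression(max_iter=200).fit(X[idx[tr]], y[idx[tr]])
            correct[te] += (clf.predict(X[idx[te]]) == y[idx[te]])
            seen[te] += 1
        score = correct / np.maximum(seen, 1)  # per-instance predictability
        drop = np.argsort(-score)[:cut_size]
        drop = drop[score[drop] > tau]
        if len(drop) == 0:
            break
        idx = np.delete(idx, drop)
    return idx  # indices of the retained, harder-to-predict subset

# Usage (X, y are precomputed feature representations and labels):
# keep = aflite_style_filter(X, y, target_size=len(X) // 2)
```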
Posted Content

Learning Latent Permutations with Gumbel-Sinkhorn Networks

TL;DR: A collection of new methods for end-to-end learning in such models that approximate discrete maximum-weight matching using the continuous Sinkhorn operator are introduced.
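The Sinkhorn operator at the heart of this approach can be sketched in a few lines: alternating row and column normalization in log-space maps a score matrix to a near-doubly-stochastic one, a continuous relaxation of a permutation matrix. SciPy is assumed; the temperature and iteration count are illustrative:

```python
import numpy as np
from scipy.special import logsumexp

def sinkhorn(log_alpha, tau=0.5, n_iters=20):
    log_alpha = log_alpha / tau
    for _ in range(n_iters):
        log_alpha = log_alpha - logsumexp(log_alpha, axis=1, keepdims=True)  # rows
        log_alpha = log_alpha - logsumexp(log_alpha, axis=0, keepdims=True)  # cols
    return np.exp(log_alpha)  # rows and columns each sum to ~1

# Adding Gumbel noise to log_alpha before iterating gives the stochastic
# "Gumbel-Sinkhorn" relaxation of sampling a permutation.
```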
Posted Content

Stochastic Optimization of Sorting Networks via Continuous Relaxations

TL;DR: This work proposes NeuralSort, a general-purpose continuous relaxation of the output of the sorting operator from permutation matrices to the set of unimodal row-stochastic matrices, which permits straight-through optimization of any computational graph involving a sorting operation.
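A minimal sketch of such a relaxation, following the sort-operator construction from the NeuralSort paper (the temperature `tau` and the example scores are illustrative assumptions):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def relaxed_sort(s, tau=0.1):
    """Row i is a softmax that, as tau -> 0, becomes one-hot at the
    i-th largest entry of s, recovering the hard permutation matrix."""
    n = len(s)
    B = np.abs(s[:, None] - s[None, :]).sum(axis=1)  # A_s @ 1, with A_s[i,j] = |s_i - s_j|
    return np.stack([softmax(((n + 1 - 2 * (i + 1)) * s - B) / tau)
                     for i in range(n)])             # unimodal row-stochastic matrix

s = np.array([0.3, 1.2, -0.5])
print(relaxed_sort(s) @ s)  # ≈ s sorted in decreasing order
```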
Proceedings Article

Stochastic Beams and Where To Find Them: The Gumbel-Top-k Trick for Sampling Sequences Without Replacement

TL;DR: It is shown that sequences sampled without replacement can be used to construct low-variance estimators for expected sentence-level BLEU score and model entropy.
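The underlying Gumbel-top-k trick is short enough to sketch directly: perturb log-probabilities with Gumbel noise and keep the k largest, which yields an exact ordered sample of k categories without replacement (the paper extends this to sequence models via a stochastic beam search):

```python
import numpy as np

def gumbel_top_k(log_p, k, rng):
    g = rng.gumbel(size=log_p.shape)
    return np.argsort(-(log_p + g))[:k]  # ordered sample without replacement

rng = np.random.default_rng(0)
log_p = np.log(np.array([0.1, 0.4, 0.3, 0.2]))
print(gumbel_top_k(log_p, 2, rng))
```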

Learning with differentiable perturbed optimizers

TL;DR: This work proposes a systematic method to transform optimizers into operations that are differentiable and never locally constant; the method relies on stochastically perturbed optimizers and can be used readily together with existing solvers.
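A Monte Carlo sketch of a perturbed argmax in this spirit (the sample count and noise scale are illustrative assumptions):

```python
import numpy as np

def perturbed_argmax(theta, rng, n_samples=1000, sigma=1.0):
    """E[one_hot(argmax(theta + sigma * Z))]: averaging argmax solutions under
    injected noise yields an output that is smooth in theta rather than
    piecewise constant."""
    out = np.zeros(len(theta))
    for _ in range(n_samples):
        out[np.argmax(theta + sigma * rng.gumbel(size=len(theta)))] += 1.0
    return out / n_samples

rng = np.random.default_rng(0)
print(perturbed_argmax(np.array([1.0, 1.1, 0.5]), rng))
```

With Gumbel noise and sigma = 1 this expectation is exactly softmax(theta), which connects perturbed optimizers back to the Gumbel trick of the main paper.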
References
Book

Graphical Models, Exponential Families, and Variational Inference

TL;DR: The variational approach provides a complementary alternative to Markov chain Monte Carlo as a general source of approximation methods for inference in large-scale statistical models.
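As a small worked instance of the variational approach, naive mean-field for an Ising model replaces the intractable joint distribution with a fully factorized one and iterates a fixed-point update for each mean (the couplings and fields below are illustrative assumptions):

```python
import numpy as np

def mean_field_ising(J, h, n_iters=100):
    """Naive mean-field for p(x) ∝ exp(h·x + x·J·x / 2), x_i in {-1, +1}:
    iterate m_i = tanh(h_i + sum_j J_ij m_j) to a fixed point."""
    m = np.zeros(len(h))
    for _ in range(n_iters):
        m = np.tanh(h + J @ m)
    return m  # approximate marginals: E[x_i] ≈ m_i

J = np.array([[0.0, 0.3], [0.3, 0.0]])
h = np.array([0.2, -0.1])
print(mean_field_ising(J, h))
```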
Proceedings Article

Fast approximate energy minimization via graph cuts

TL;DR: This paper proposes two algorithms that use graph cuts to compute a local minimum even when very large moves are allowed; the expansion algorithm generates a labeling such that no expansion move decreases the energy.
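The building block behind such moves is the reduction of a binary submodular energy to an s-t min-cut; a minimal sketch for the simplest case of unary costs plus nonnegative Potts pairwise weights (assuming networkx, with illustrative costs):

```python
import networkx as nx

def binary_energy_via_cut(u, edges):
    """Minimize sum_i u[i][x_i] + sum_{(i,j,w)} w * [x_i != x_j], x_i in {0,1}.
    The capacity of every cut equals the energy of the corresponding labeling,
    so the minimum cut is the minimum-energy labeling."""
    G = nx.DiGraph()
    for i, (c0, c1) in enumerate(u):
        G.add_edge("s", i, capacity=c1)  # cut iff x_i = 1
        G.add_edge(i, "t", capacity=c0)  # cut iff x_i = 0
    for i, j, w in edges:
        G.add_edge(i, j, capacity=w)     # cut iff labels disagree
        G.add_edge(j, i, capacity=w)
    value, (S, _) = nx.minimum_cut(G, "s", "t")
    return value, [0 if i in S else 1 for i in range(len(u))]

print(binary_energy_via_cut([(1.0, 2.0), (2.0, 0.5)], [(0, 1, 1.0)]))
```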
Proceedings Article

Convergent tree-reweighted message passing for energy minimization.

TL;DR: This paper develops a modification of the technique proposed by Wainwright et al. (Nov. 2005), called sequential tree-reweighted message passing, which outperforms both ordinary belief propagation and the tree-reweighted algorithm on both synthetic and real problems.
Journal Article

libDAI: A Free and Open Source C++ Library for Discrete Approximate Inference in Graphical Models

TL;DR: This paper describes the software package libDAI, a free and open source C++ library that provides implementations of various exact and approximate inference methods for graphical models with discrete-valued variables.
Posted Content

A* Sampling

TL;DR: A* sampling is a generic sampling algorithm that searches for the maximum of a Gumbel process using A* search; it makes more efficient use of bound and likelihood evaluations than the most closely related adaptive rejection sampling-based algorithms.