Probability Functional Descent: A Unifying Perspective on GANs, Variational Inference, and Reinforcement Learning

Open AccessProceedings Article

Probability Functional Descent: A Unifying Perspective on GANs, Variational Inference, and Reinforcement Learning

- pp 1213-1222

TLDR

In this paper, the authors provide a unified view of a wide range of problems of interest in machine learning by framing them as the minimization of functionals defined on the space of probability measures.

Abstract:

This paper provides a unifying view of a wide range of problems of interest in machine learning by framing them as the minimization of functionals defined on the space of probability measures. In particular, we show that generative adversarial networks, variational inference, and actor-critic methods in reinforcement learning can all be seen through the lens of our framework. We then discuss a generic optimization algorithm for our formulation, called probability functional descent (PFD), and show how this algorithm recovers existing methods developed independently in the settings mentioned earlier.

Citations

PDF

Open Access

More filters

Linear And Nonlinear Programming

Phillipp Kaestner

TL;DR: The linear and nonlinear programming is universally compatible with any devices to read and is available in the book collection an online access to it is set as public so you can download it instantly.

...read moreread less

Proceedings Article

Variational Policy Gradient Method for Reinforcement Learning with General Utilities

Junyu Zhang, +4 more

TL;DR: A new Variational Policy Gradient Theorem for RL with general utilities is derived, which establishes that the parametrized policy gradient may be obtained as the solution of a stochastic saddle point problem involving the Fenchel dual of the utility function.

...read moreread less

Posted Content

Smoothness and Stability in GANs

Casey Chu, +2 more

- 11 Feb 2020 -

arXiv: Learning

TL;DR: This work develops a principled theoretical framework for understanding the stability of various types of GANs and derives conditions that guarantee eventual stationarity of the generator when it is trained with gradient descent, conditions that must be satisfied by the divergence that is minimized by the GAN and the generator's architecture.

...read moreread less

Proceedings Article

Smoothness and Stability in GANs

Casey Chu, +2 more

TL;DR: In this paper, the authors derive conditions that guarantee eventual stationarity of the generator when it is trained with gradient descent, conditions that must be satisfied by the divergence that is minimized by the generator and the GAN.

...read moreread less

Proceedings ArticleDOI

Learning from All Types of Experiences: A Unifying Machine Learning Perspective

Zhiting Hu, +1 more

TL;DR: This tutorial presents a systematic, unified blueprint of ML, for both a refreshing holistic understanding of the diverse ML paradigms/algorithms, and guidance of operationalizing ML for creating problem solutions in a composable manner.

...read moreread less

References

PDF

Open Access

More filters

Journal ArticleDOI

Generative Adversarial Nets

Ian Goodfellow, +7 more

TL;DR: A new framework for estimating generative models via an adversarial process, in which two models are simultaneously train: a generative model G that captures the data distribution and a discriminative model D that estimates the probability that a sample came from the training data rather than G.

...read moreread less

Proceedings Article

Auto-Encoding Variational Bayes

Diederik P. Kingma, +1 more

TL;DR: A stochastic variational inference and learning algorithm that scales to large datasets and, under some mild differentiability conditions, even works in the intractable case is introduced.

...read moreread less

Book

Markov Decision Processes: Discrete Stochastic Dynamic Programming

Martin L. Puterman

TL;DR: Puterman as discussed by the authors provides a uniquely up-to-date, unified, and rigorous treatment of the theoretical, computational, and applied research on Markov decision process models, focusing primarily on infinite horizon discrete time models and models with discrete time spaces while also examining models with arbitrary state spaces, finite horizon models, and continuous time discrete state models.

...read moreread less

Journal ArticleDOI

Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning

Ronald J. Williams

- 01 May 1992 -

Machine Learning

TL;DR: This article presents a general class of associative reinforcement learning algorithms for connectionist networks containing stochastic units that are shown to make weight adjustments in a direction that lies along the gradient of expected reinforcement in both immediate-reinforcement tasks and certain limited forms of delayed-reInforcement tasks, and they do this without explicitly computing gradient estimates.

...read moreread less

Proceedings Article

Asynchronous methods for deep reinforcement learning

Volodymyr Mnih, +7 more

TL;DR: A conceptually simple and lightweight framework for deep reinforcement learning that uses asynchronous gradient descent for optimization of deep neural network controllers and shows that asynchronous actor-critic succeeds on a wide variety of continuous motor control problems as well as on a new task of navigating random 3D mazes using a visual input.

...read moreread less