Open Access Proceedings Article

Probability Functional Descent: A Unifying Perspective on GANs, Variational Inference, and Reinforcement Learning

TLDR
In this paper, the authors provide a unified view of a wide range of problems of interest in machine learning by framing them as the minimization of functionals defined on the space of probability measures.
Abstract
This paper provides a unifying view of a wide range of problems of interest in machine learning by framing them as the minimization of functionals defined on the space of probability measures. In particular, we show that generative adversarial networks, variational inference, and actor-critic methods in reinforcement learning can all be seen through the lens of our framework. We then discuss a generic optimization algorithm for our formulation, called probability functional descent (PFD), and show how this algorithm recovers existing methods developed independently in the settings mentioned earlier.
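
As a sketch of the descent step behind PFD (our notation, based on standard von Mises calculus rather than the paper's verbatim formulation): the functional J is linearized around the current measure \mu via an influence function \Psi_{\mu},

J(\nu) \approx J(\mu) + \int \Psi_{\mu} \, d(\nu - \mu),

and a parametrized measure \mu_{\theta} is then updated by gradient descent on the linearization,

\theta \leftarrow \theta - \alpha \, \nabla_{\theta} \, \mathbb{E}_{x \sim \mu_{\theta}}\big[\Psi_{\mu_{\theta'}}(x)\big],

where \theta' is the current iterate held fixed and \alpha is a step size. On this reading, the discriminator in a GAN and the critic in actor-critic methods both play the role of an approximation to \Psi.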



Citations

Book

Linear And Nonlinear Programming

TL;DR: A classic textbook treating the theory and algorithms of linear programming, unconstrained minimization, and constrained nonlinear programming.
Proceedings Article

Variational Policy Gradient Method for Reinforcement Learning with General Utilities

TL;DR: A new Variational Policy Gradient Theorem for RL with general utilities is derived, which establishes that the parametrized policy gradient may be obtained as the solution of a stochastic saddle point problem involving the Fenchel dual of the utility function.
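
A schematic of that saddle point, under our assumption that \lambda_{\theta} denotes the occupancy measure induced by policy \pi_{\theta} and J_{*} the concave Fenchel conjugate of the utility J:

\max_{\theta} \, J(\lambda_{\theta}) = \max_{\theta} \, \min_{z} \, \big[ \langle z, \lambda_{\theta} \rangle - J_{*}(z) \big],

so the policy gradient can be read off from the inner solution z at the current \theta.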
Proceedings Article

Smoothness and Stability in GANs

TL;DR: This work develops a principled theoretical framework for understanding the stability of various types of GANs and derives conditions that guarantee eventual stationarity of the generator when it is trained with gradient descent; the conditions constrain both the divergence minimized by the GAN and the generator's architecture.
Proceedings Article

Learning from All Types of Experiences: A Unifying Machine Learning Perspective

TL;DR: This tutorial presents a systematic, unified blueprint of ML, offering both a holistic understanding of the diverse ML paradigms and algorithms and guidance for operationalizing ML to create problem solutions in a composable manner.
References
Journal Article

Generative Adversarial Nets

TL;DR: A new framework for estimating generative models via an adversarial process, in which two models are simultaneously trained: a generative model G that captures the data distribution and a discriminative model D that estimates the probability that a sample came from the training data rather than G.
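
Concretely, the adversarial process in this paper is the two-player minimax game (standard notation: generator G, discriminator D, data distribution p_{\text{data}}, noise prior p_{z}):

\min_{G} \max_{D} \; \mathbb{E}_{x \sim p_{\text{data}}}[\log D(x)] + \mathbb{E}_{z \sim p_{z}}[\log(1 - D(G(z)))].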
Proceedings Article

Auto-Encoding Variational Bayes

TL;DR: A stochastic variational inference and learning algorithm that scales to large datasets and, under some mild differentiability conditions, even works in the intractable case is introduced.
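
The objective this algorithm maximizes is the evidence lower bound (ELBO); in standard notation, with recognition model q_{\phi} and generative model p_{\theta}:

\log p_{\theta}(x) \ge \mathbb{E}_{q_{\phi}(z \mid x)}\big[\log p_{\theta}(x \mid z)\big] - \mathrm{KL}\big(q_{\phi}(z \mid x) \,\|\, p(z)\big),

optimized with stochastic gradients via the reparametrization z = \mu_{\phi}(x) + \sigma_{\phi}(x) \odot \varepsilon, \; \varepsilon \sim \mathcal{N}(0, I).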
Book

Markov Decision Processes: Discrete Stochastic Dynamic Programming

TL;DR: Puterman provides a uniquely up-to-date, unified, and rigorous treatment of the theoretical, computational, and applied research on Markov decision process models, focusing primarily on infinite horizon discrete time models with discrete state spaces, while also examining models with arbitrary state spaces, finite horizon models, and continuous time discrete state models.
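
The infinite horizon discounted model at the book's core is characterized by the Bellman optimality equation (standard notation, discount factor \gamma \in [0, 1)):

V^{*}(s) = \max_{a} \Big[ r(s, a) + \gamma \sum_{s'} P(s' \mid s, a) \, V^{*}(s') \Big].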
Journal Article

Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning

TL;DR: This article presents a general class of associative reinforcement learning algorithms for connectionist networks containing stochastic units. These algorithms are shown to make weight adjustments in a direction that lies along the gradient of expected reinforcement, in both immediate-reinforcement tasks and certain limited forms of delayed-reinforcement tasks, without explicitly computing gradient estimates.
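
In modern notation, the gradient direction these REINFORCE algorithms follow is

\nabla_{\theta} \, \mathbb{E}_{\pi_{\theta}}[R] = \mathbb{E}_{\pi_{\theta}}\big[(R - b) \, \nabla_{\theta} \log \pi_{\theta}(a \mid s)\big],

where the reinforcement baseline b reduces variance without biasing the estimate.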
Proceedings Article

Asynchronous methods for deep reinforcement learning

TL;DR: A conceptually simple and lightweight framework for deep reinforcement learning that uses asynchronous gradient descent for optimization of deep neural network controllers and shows that asynchronous actor-critic succeeds on a wide variety of continuous motor control problems as well as on a new task of navigating random 3D mazes using a visual input.
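
The update at the heart of the asynchronous advantage actor-critic method is, in its standard presentation, the policy gradient with an n-step advantage estimate built from a learned value function V(s; \theta_{v}):

\nabla_{\theta} \log \pi(a_{t} \mid s_{t}; \theta) \, \hat{A}_{t}, \qquad \hat{A}_{t} = \sum_{i=0}^{k-1} \gamma^{i} r_{t+i} + \gamma^{k} V(s_{t+k}; \theta_{v}) - V(s_{t}; \theta_{v}),

accumulated asynchronously across parallel workers.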