Open Access, Proceedings Article
Probability Functional Descent: A Unifying Perspective on GANs, Variational Inference, and Reinforcement Learning
Casey Chu, Jose Blanchet, Peter W. Glynn
- pp 1213-1222
TLDR
In this paper, the authors provide a unified view of a wide range of problems of interest in machine learning by framing them as the minimization of functionals defined on the space of probability measures.
Abstract:
This paper provides a unifying view of a wide range of problems of interest in machine learning by framing them as the minimization of functionals defined on the space of probability measures. In particular, we show that generative adversarial networks, variational inference, and actor-critic methods in reinforcement learning can all be seen through the lens of our framework. We then discuss a generic optimization algorithm for our formulation, called probability functional descent (PFD), and show how this algorithm recovers existing methods developed independently in the settings mentioned earlier.
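The formulation described in the abstract can be sketched in symbols as follows. This is only an illustrative sketch: the notation ($J$, $\mathcal{P}(X)$, $\Psi_\mu$, $\hat{\Psi}$) is assumed here for exposition and does not appear in the abstract itself.

```latex
% Assumed notation: P(X) is the space of probability measures on X,
% and J is the functional to be minimized.
\min_{\mu \in \mathcal{P}(X)} J(\mu), \qquad J : \mathcal{P}(X) \to \mathbb{R}

% A first-order expansion of J around \mu in the direction of another
% measure \nu, with \Psi_\mu playing the role of a derivative of J at \mu:
J\bigl(\mu + \epsilon(\nu - \mu)\bigr)
  = J(\mu) + \epsilon \int_X \Psi_\mu(x)\, d(\nu - \mu)(x) + o(\epsilon)

% A descent step then alternates between estimating \Psi_\mu by some
% approximation \hat{\Psi} and updating \mu to decrease the linear term:
\mu \leftarrow \text{a step that decreases } \mathbb{E}_{x \sim \mu}\bigl[\hat{\Psi}(x)\bigr]
```

Under this reading, the approximation $\hat{\Psi}$ would correspond to the discriminator in a GAN or the critic in an actor-critic method, which is one way to see how a single descent scheme could recover the separately developed methods the abstract mentions.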
Citations
Linear And Nonlinear Programming
Proceedings Article
Variational Policy Gradient Method for Reinforcement Learning with General Utilities
TL;DR: A new Variational Policy Gradient Theorem for RL with general utilities is derived, which establishes that the parametrized policy gradient may be obtained as the solution of a stochastic saddle point problem involving the Fenchel dual of the utility function.
Posted Content
Smoothness and Stability in GANs
TL;DR: This work develops a principled theoretical framework for understanding the stability of various types of GANs and derives conditions that guarantee eventual stationarity of the generator when it is trained with gradient descent, conditions that must be satisfied by the divergence that is minimized by the GAN and the generator's architecture.
Proceedings Article
Smoothness and Stability in GANs
TL;DR: In this paper, the authors derive conditions that guarantee eventual stationarity of the generator when it is trained with gradient descent; these conditions must be satisfied by the divergence that the GAN minimizes and by the generator's architecture.
Proceedings Article
Learning from All Types of Experiences: A Unifying Machine Learning Perspective
Zhiting Hu, Eric P. Xing
TL;DR: This tutorial presents a systematic, unified blueprint of ML, offering both a holistic understanding of the diverse ML paradigms and algorithms and guidance on operationalizing ML to create problem solutions in a composable manner.
References
Journal Article
Generative Adversarial Nets
Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio
TL;DR: A new framework for estimating generative models via an adversarial process, in which two models are trained simultaneously: a generative model G that captures the data distribution and a discriminative model D that estimates the probability that a sample came from the training data rather than G.
Proceedings Article
Auto-Encoding Variational Bayes
Diederik P. Kingma, Max Welling
TL;DR: A stochastic variational inference and learning algorithm that scales to large datasets and, under some mild differentiability conditions, even works in the intractable case is introduced.
Book
Markov Decision Processes: Discrete Stochastic Dynamic Programming
TL;DR: Puterman provides an up-to-date, unified, and rigorous treatment of the theoretical, computational, and applied research on Markov decision process models, focusing primarily on infinite-horizon discrete-time models and models with discrete state spaces while also examining models with arbitrary state spaces, finite-horizon models, and continuous-time discrete-state models.
Journal Article
Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning
TL;DR: This article presents a general class of associative reinforcement learning algorithms for connectionist networks containing stochastic units. The algorithms are shown to make weight adjustments in a direction that lies along the gradient of expected reinforcement, in both immediate-reinforcement tasks and certain limited forms of delayed-reinforcement tasks, and they do this without explicitly computing gradient estimates.
Proceedings Article
Asynchronous methods for deep reinforcement learning
Volodymyr Mnih, Adrià Puigdomènech Badia, Mehdi Mirza, Alex Graves, Tim Harley, Timothy P. Lillicrap, David Silver, Koray Kavukcuoglu
TL;DR: A conceptually simple and lightweight framework for deep reinforcement learning that uses asynchronous gradient descent for optimization of deep neural network controllers and shows that asynchronous actor-critic succeeds on a wide variety of continuous motor control problems as well as on a new task of navigating random 3D mazes using a visual input.
Related Papers (5)
A formal framework for reinforcement learning with function approximation in learning classifier systems
Jan Drugowitsch, Alwyn M. Barry