John Schulman

Researcher at OpenAI

Publications -  73
Citations -  40785

John Schulman is a researcher at OpenAI. The author has contributed to research on topics including reinforcement learning and benchmarks (computing). The author has an h-index of 48 and has co-authored 67 publications receiving 30168 citations. Previous affiliations of John Schulman include the University of California and the University of California, Berkeley.

Papers
Posted Content

Proximal Policy Optimization Algorithms

TL;DR: This paper proposes a new family of policy gradient methods for reinforcement learning that alternate between sampling data through interaction with the environment and optimizing a "surrogate" objective function using stochastic gradient ascent.
Proceedings Article

Trust Region Policy Optimization

TL;DR: This paper describes a method for optimizing control policies with guaranteed monotonic improvement; making several approximations to the theoretically justified scheme yields a practical algorithm called Trust Region Policy Optimization (TRPO).
Posted Content

Trust Region Policy Optimization

TL;DR: Trust Region Policy Optimization (TRPO) is an iterative procedure for optimizing policies with guaranteed monotonic improvement. It is similar to natural policy gradient methods and is effective for optimizing large nonlinear policies such as neural networks.
Posted Content

InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets

TL;DR: InfoGAN is a generative adversarial network that maximizes the mutual information between a small subset of the latent variables and the observation. Its training procedure can be interpreted as a variation of the Wake-Sleep algorithm.
Proceedings Article

InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets

TL;DR: InfoGAN is an information-theoretic extension to the GAN that learns disentangled representations in a completely unsupervised manner. On the CelebA face dataset, it discovers visual concepts that include hair styles, presence of eyeglasses, and emotions.