Filip Wolski

Researcher at OpenAI

Publications: 7
Citations: 12039

Filip Wolski is an academic researcher from OpenAI. The author has contributed to research on reinforcement learning and hindsight bias, has an h-index of 6, and has co-authored 7 publications receiving 7372 citations.

Papers
Posted Content

Proximal Policy Optimization Algorithms

TL;DR: A new family of policy gradient methods for reinforcement learning is proposed; the methods alternate between sampling data through interaction with the environment and optimizing a "surrogate" objective function using stochastic gradient ascent.
Proceedings Article

Hindsight Experience Replay

TL;DR: A novel technique is presented that allows sample-efficient learning from sparse, binary rewards, thereby avoiding the need for complicated reward engineering; it may be seen as a form of implicit curriculum.
Posted Content

Hindsight Experience Replay

TL;DR: In this paper, a technique called hindsight experience replay is proposed for learning from sparse, binary rewards, thereby avoiding the need for complicated reward engineering. It can be combined with an arbitrary off-policy algorithm and may be seen as a form of implicit curriculum.
Proceedings Article

Evolved Policy Gradients

TL;DR: Empirical results show that the evolved policy gradient algorithm (EPG) learns faster on several randomized environments than an off-the-shelf policy gradient method, that its learned loss can generalize to out-of-distribution test-time tasks, and that it exhibits qualitatively different behavior from other popular meta-learning algorithms.