
Justin Fu

Researcher at University of California, Berkeley

Publications: 32
Citations: 4315

Justin Fu is an academic researcher from the University of California, Berkeley. The author has contributed to research in topics: Reinforcement learning & Artificial neural network. The author has an h-index of 19 and has co-authored 28 publications receiving 2177 citations. Previous affiliations of Justin Fu include Google.

Papers
Posted Content

Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems

TL;DR: This tutorial article aims to provide the reader with the conceptual tools needed to get started on research on offline reinforcement learning algorithms: reinforcement learning algorithms that utilize previously collected data, without additional online data collection.
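A minimal sketch of the setting the tutorial covers: learning a policy purely from a fixed batch of logged transitions, with no further environment interaction. Everything here (the toy data, tabular Q-learning, the learning rate) is illustrative and not taken from the article itself.

```python
# Offline RL in miniature: tabular Q-learning over a static dataset
# of (s, a, r, s') transitions; the environment is never queried.
import numpy as np

n_states, n_actions, gamma = 5, 2, 0.99
rng = np.random.default_rng(0)

# Stand-in for "previously collected data": random toy transitions.
dataset = [(int(rng.integers(n_states)), int(rng.integers(n_actions)),
            float(rng.random()), int(rng.integers(n_states)))
           for _ in range(1000)]

Q = np.zeros((n_states, n_actions))
for _ in range(200):                       # repeated sweeps over the batch
    for s, a, r, s2 in dataset:
        target = r + gamma * Q[s2].max()   # bootstrapped Q target
        Q[s, a] += 0.1 * (target - Q[s, a])

policy = Q.argmax(axis=1)                  # greedy policy from batch data
```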
Journal Article

D4RL: Datasets for Deep Data-Driven Reinforcement Learning

TL;DR: This work introduces benchmarks specifically designed for the offline setting, guided by key properties of datasets relevant to real-world applications of offline RL, and releases benchmark tasks and datasets, a comprehensive evaluation of existing algorithms, an evaluation protocol, and an open-source codebase.
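The released codebase exposes each task's logged data through a gym-style interface; a typical loading pattern is below. The task name is one of the published D4RL tasks, and the exact names and dataset keys follow whichever release of the package is installed.

```python
# Load a D4RL benchmark dataset (https://github.com/rail-berkeley/d4rl).
import gym
import d4rl  # importing registers the offline tasks with gym

env = gym.make('halfcheetah-medium-v0')
data = env.get_dataset()                 # dict of numpy arrays
print(data['observations'].shape,        # one row per logged step
      data['actions'].shape,
      data['rewards'].shape)

# Convenience view with aligned (s, a, r, s', done) arrays:
qdata = d4rl.qlearning_dataset(env)
```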
Proceedings Article

Learning Robust Rewards with Adversarial Inverse Reinforcement Learning.

TL;DR: It is demonstrated that AIRL is able to recover reward functions that are robust to changes in dynamics, enabling us to learn policies even under significant variation in the environment seen during training.
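The robustness claim rests on AIRL's particular discriminator, which at optimality disentangles a reward term from the policy's action probability. A toy numpy rendering of that form; `f_value` and `pi_prob` are illustrative stand-ins for the learned reward network and the current policy, not the paper's code.

```python
# AIRL discriminator: D(s, a) = exp(f(s, a)) / (exp(f(s, a)) + pi(a | s)).
import numpy as np

def airl_discriminator(f_value: float, pi_prob: float) -> float:
    """Probability assigned to (s, a) having come from the expert."""
    ef = np.exp(f_value)
    return ef / (ef + pi_prob)

# The learner's reward signal log D - log(1 - D) simplifies to
# f(s, a) - log pi(a | s), which is where the recovered reward lives.
```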
Proceedings Article

Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction

TL;DR: In this article, the authors identify bootstrapping error as a key source of instability in off-policy reinforcement learning and propose a practical algorithm, bootstrapping error accumulation reduction (BEAR), to mitigate it.
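In practice, BEAR limits bootstrapping error by keeping the learned policy within the support of the data-collecting policy, using a kernel MMD between sampled actions as the constraint. A toy Gaussian-kernel MMD sketch; the kernel width and the sampled action sets are illustrative choices, not values from the paper.

```python
# Squared MMD between two sets of sampled actions (one row per sample).
import numpy as np

def mmd_squared(x: np.ndarray, y: np.ndarray, sigma: float = 1.0) -> float:
    def k(a, b):  # Gaussian kernel matrix between sample sets
        d = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-d / (2 * sigma ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

rng = np.random.default_rng(0)
behavior_actions = rng.normal(0.0, 1.0, size=(32, 4))  # from the dataset
policy_actions = rng.normal(0.5, 1.0, size=(32, 4))    # from the learner
print(mmd_squared(behavior_actions, policy_actions))   # penalized if large
```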
Posted Content

When to Trust Your Model: Model-Based Policy Optimization

TL;DR: This paper first formulates and analyzes a model-based reinforcement learning algorithm with a guarantee of monotonic improvement at each step, and demonstrates that a simple procedure of using short model-generated rollouts branched from real data has the benefits of more complicated model-based algorithms without the usual pitfalls.
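The "short model-generated rollouts branched from real data" procedure reduces to a simple control flow: sample start states from the real replay buffer, unroll a learned model for only k steps, and train the policy on the resulting synthetic transitions. The sketch below shows that flow only; the model, policy, and buffers are stand-ins, not the paper's implementation.

```python
# Branched short-horizon model rollouts, MBPO-style.
import random

def branched_rollouts(real_buffer, model, policy, k=5, n_branches=100):
    model_buffer = []
    for _ in range(n_branches):
        s = random.choice(real_buffer)[0]   # branch from a real state
        for _ in range(k):                  # keep model rollouts short
            a = policy(s)
            s2, r = model(s, a)             # learned dynamics prediction
            model_buffer.append((s, a, r, s2))
            s = s2
    return model_buffer                     # fed to the policy optimizer

# Toy usage with stand-in dynamics and policy:
toy_buffer = [((0.0,), 0.0, 0.0, (0.0,))]
toy_model = lambda s, a: ((s[0] + a,), 1.0)
toy_policy = lambda s: 0.1
print(len(branched_rollouts(toy_buffer, toy_model, toy_policy)))  # 500
```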