Mohammad Gheshlaghi Azar
Researcher at Northwestern University
Publications - 53
Citations - 7837
Mohammad Gheshlaghi Azar is an academic researcher from Northwestern University. His research focuses on reinforcement learning and, more broadly, computer science. He has an h-index of 26 and has co-authored 48 publications receiving 4182 citations. Previous affiliations of Mohammad Gheshlaghi Azar include Radboud University Nijmegen and Google.
Papers
Posted Content
Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning
Jean-Bastien Grill,Florian Strub,Florent Altché,Corentin Tallec,Pierre H. Richemond,Elena Buchatskaya,Carl Doersch,Bernardo Avila Pires,Zhaohan Daniel Guo,Mohammad Gheshlaghi Azar,Bilal Piot,Koray Kavukcuoglu,Rémi Munos,Michal Valko +13 more
TL;DR: This work introduces Bootstrap Your Own Latent (BYOL), a new approach to self-supervised image representation learning that performs on par with or better than the current state of the art on both transfer and semi-supervised benchmarks.
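The core mechanism behind BYOL is that an online network is trained to predict a target network's projection of the same image under a different augmentation, while the target network's weights track the online weights as an exponential moving average. Below is a minimal NumPy sketch of those two pieces (the normalized regression loss and the EMA update), not the authors' implementation; all shapes, parameter names, and the `tau` value are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def l2_normalize(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def byol_loss(online_pred, target_proj):
    # BYOL regresses the online network's prediction onto the target
    # network's projection; with L2-normalized vectors the squared error
    # is equivalent (up to a constant) to a negative cosine similarity.
    p, z = l2_normalize(online_pred), l2_normalize(target_proj)
    return np.mean(np.sum((p - z) ** 2, axis=-1))

def ema_update(target_params, online_params, tau=0.99):
    # The target network is never updated by gradient descent: its
    # parameters are an exponential moving average of the online ones.
    return {k: tau * target_params[k] + (1 - tau) * online_params[k]
            for k in target_params}

# Toy batch: 4 embeddings of dimension 8 standing in for the two views.
pred = rng.normal(size=(4, 8))
proj = rng.normal(size=(4, 8))
loss = byol_loss(pred, proj)  # 0 when pred and proj are aligned
```

Because the target never receives gradients, the EMA update is what prevents the representation from collapsing to a constant without needing negative pairs.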
Posted Content
Rainbow: Combining Improvements in Deep Reinforcement Learning
Matteo Hessel,Joseph Modayil,Hado van Hasselt,Tom Schaul,Georg Ostrovski,Will Dabney,Dan Horgan,Bilal Piot,Mohammad Gheshlaghi Azar,David Silver +9 more
TL;DR: This paper examines six extensions to the DQN algorithm and empirically studies their combination, showing that the combination provides state-of-the-art performance on the Atari 2600 benchmark, both in terms of data efficiency and final performance.
Proceedings Article
Rainbow: Combining Improvements in Deep Reinforcement Learning
Matteo Hessel,Joseph Modayil,Hado van Hasselt,Tom Schaul,Georg Ostrovski,Will Dabney,Dan Horgan,Bilal Piot,Mohammad Gheshlaghi Azar,David Silver +9 more
TL;DR: In this article, the authors examined six extensions to the DQN algorithm and empirically studied their combination, showing that the combination provided state-of-the-art performance on the Atari 2600 benchmark.
Proceedings Article
Minimax regret bounds for reinforcement learning
TL;DR: The problem of provably optimal exploration in reinforcement learning for finite-horizon MDPs is considered, and an optimistic modification to value iteration achieves a regret bound of $\tilde{O}(\sqrt{HSAT} + H^2S^2A + H\sqrt{T})$, where $H$ is the time horizon, $S$ the number of states, $A$ the number of actions, and $T$ the number of time-steps.
Proceedings Article
Noisy Networks For Exploration
Meire Fortunato,Mohammad Gheshlaghi Azar,Bilal Piot,Jacob Menick,Ian Osband,Alex Graves,Vlad Mnih,Rémi Munos,Demis Hassabis,Olivier Pietquin,Charles Blundell,Shane Legg +11 more
TL;DR: It is found that replacing the conventional exploration heuristics for A3C, DQN and dueling agents with NoisyNet yields substantially higher scores for a wide range of Atari games, in some cases advancing the agent from sub-human to super-human performance.
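NoisyNet replaces an agent's deterministic linear layers with noisy ones, $y = (\mu_w + \sigma_w \odot \varepsilon_w)x + \mu_b + \sigma_b \odot \varepsilon_b$, where the $\sigma$ parameters are learned, so the network itself tunes how much exploration noise to inject. A minimal NumPy sketch of such a layer with factorised Gaussian noise follows; it is an illustrative toy, not the paper's implementation, and the dimensions and initial $\sigma$ values are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def factorised_noise(size):
    # Factorised Gaussian noise f(x) = sign(x) * sqrt(|x|), which lets a
    # layer reuse one noise vector per input and one per output instead
    # of sampling an independent value for every weight.
    x = rng.normal(size=size)
    return np.sign(x) * np.sqrt(np.abs(x))

def noisy_linear(x, mu_w, sigma_w, mu_b, sigma_b):
    # y = (mu_w + sigma_w * eps_w) x + mu_b + sigma_b * eps_b.
    # mu_* are the usual weights; sigma_* scale the injected noise and
    # are trained by gradient descent alongside them.
    in_dim, out_dim = mu_w.shape
    eps_in, eps_out = factorised_noise(in_dim), factorised_noise(out_dim)
    eps_w = np.outer(eps_in, eps_out)  # rank-1 factorised weight noise
    return x @ (mu_w + sigma_w * eps_w) + mu_b + sigma_b * eps_out

in_dim, out_dim = 4, 2
mu_w = rng.normal(size=(in_dim, out_dim)) / np.sqrt(in_dim)
sigma_w = np.full((in_dim, out_dim), 0.5 / np.sqrt(in_dim))
mu_b = np.zeros(out_dim)
sigma_b = np.full(out_dim, 0.5 / np.sqrt(in_dim))
y = noisy_linear(np.ones(in_dim), mu_w, sigma_w, mu_b, sigma_b)
```

With all `sigma_*` set to zero the layer reduces to an ordinary linear layer, which is why NoisyNet can simply drop in for the epsilon-greedy or entropy-bonus exploration used by DQN, dueling, and A3C agents.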