Mohammad Ghavamzadeh
Researcher at Google
Publications - 207
Citations - 8517
Mohammad Ghavamzadeh is an academic researcher at Google whose work focuses on reinforcement learning and Markov decision processes. He has an h-index of 45 and has co-authored 186 publications receiving 6307 citations. His previous affiliations include the University of Alberta and the University of Massachusetts Amherst.
Papers
Journal ArticleDOI
A Review of Uncertainty Quantification in Deep Learning: Techniques, Applications and Challenges
Moloud Abdar, Farhad Pourpanah, Sadiq Hussain, Dana Rezazadegan, Li Liu, Mohammad Ghavamzadeh, Paul Fieguth, Xiaochun Cao, Abbas Khosravi, U. Rajendra Acharya, Vladimir Makarenkov, Saeid Nahavandi +13 more
TL;DR: This study reviews recent advances in uncertainty quantification (UQ) methods used in deep learning, investigates the application of these methods in reinforcement learning (RL), and outlines several important applications of UQ methods.
Journal ArticleDOI
Natural actor-critic algorithms
TL;DR: Four new reinforcement learning algorithms based on actor-critic, natural-gradient, and function-approximation ideas are presented. These are the first actor-critic algorithms of this kind with convergence proofs and the first that are fully incremental.
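The actor-critic idea behind these algorithms can be illustrated with a minimal sketch (not the paper's exact algorithms): a softmax policy (the actor) updated along the policy gradient, scaled by a TD error from a value estimate (the critic). Here the "MDP" is a two-armed bandit with hypothetical payoffs, chosen only to keep the example self-contained.

```python
import numpy as np

rng = np.random.default_rng(0)
theta = np.zeros(2)   # policy parameters (actor)
v = 0.0               # value estimate (critic); single state
alpha, beta = 0.1, 0.05  # actor and critic step sizes

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

for _ in range(2000):
    probs = softmax(theta)
    a = rng.choice(2, p=probs)
    # Hypothetical rewards: arm 1 pays ~1.0, arm 0 pays ~0.2.
    r = rng.normal(1.0 if a == 1 else 0.2, 0.1)
    td_error = r - v                        # TD error (bandit: no next state)
    v += beta * td_error                    # critic update
    grad_log = -probs
    grad_log[a] += 1.0                      # grad of log softmax policy
    theta += alpha * td_error * grad_log    # actor update along the gradient

# The policy should come to strongly prefer the better arm.
```
The natural-gradient variants in the paper precondition the actor update with the inverse Fisher information matrix; the sketch above uses the plain ("vanilla") gradient for brevity.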
Proceedings Article
Best Arm Identification: A Unified Approach to Fixed Budget and Fixed Confidence
TL;DR: A performance bound is proved for the two versions of the UGapE algorithm showing that the two problems are characterized by the same notion of complexity.
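A simplified sketch of the gap-based index at the heart of UGapE (not the paper's exact algorithm): each arm k gets an index B_k = max over other arms of their upper confidence bounds, minus arm k's lower confidence bound, and the arm with the smallest index is the current best-arm candidate. The confidence radius formula below is a standard Hoeffding-style choice used here for illustration.

```python
import math

def gap_indices(means, counts, delta=0.05):
    """B_k = max_{i != k} U_i - L_k from empirical means and counts."""
    radius = [math.sqrt(math.log(1.0 / delta) / (2 * n)) for n in counts]
    upper = [m + r for m, r in zip(means, radius)]
    lower = [m - r for m, r in zip(means, radius)]
    indices = []
    for k in range(len(means)):
        best_other_upper = max(u for i, u in enumerate(upper) if i != k)
        indices.append(best_other_upper - lower[k])
    return indices

# Hypothetical data: arm 0 looks clearly best after 100 pulls each.
idx = gap_indices([0.9, 0.5, 0.4], [100, 100, 100])
best = min(range(3), key=idx.__getitem__)
```
The same index drives both settings in the paper: with a fixed budget the algorithm stops when the budget runs out, and with fixed confidence it stops once the smallest index drops below the target accuracy.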
Posted Content
Risk-Constrained Reinforcement Learning with Percentile Risk Criteria
TL;DR: In this paper, the authors present efficient reinforcement learning algorithms for risk-constrained Markov decision processes (MDPs), where risk is represented via a chance constraint or a constraint on the conditional value-at-risk (CVaR) of the cumulative cost.
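The risk measure constrained in this work, CVaR, can be estimated from cost samples as the average of the worst (1 - alpha) fraction of outcomes. A minimal sketch (the function name and data are hypothetical; this is the empirical estimate, not the paper's optimization algorithm):

```python
import numpy as np

def cvar(costs, alpha=0.95):
    """Empirical CVaR: mean cost over the worst (1 - alpha) tail."""
    costs = np.sort(np.asarray(costs, dtype=float))
    var = np.quantile(costs, alpha)   # value-at-risk at level alpha
    tail = costs[costs >= var]        # the worst (1 - alpha) fraction
    return tail.mean()

# Hypothetical cumulative costs: mostly 1.0, with a rare 10.0 outcome.
samples = [1.0] * 95 + [10.0] * 5
tail_risk = cvar(samples, alpha=0.95)
```
A chance constraint bounds the *probability* that the cumulative cost exceeds a threshold, whereas a CVaR constraint bounds the *expected cost in the tail*, which is why the two criteria lead to different algorithms in the paper.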
Proceedings Article
High confidence off-policy evaluation
TL;DR: This paper proposes an off-policy method for computing a lower confidence bound on the expected return of a policy, providing confidence guarantees about the accuracy of the estimate.
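The general recipe can be sketched as follows (an illustrative simplification, not the paper's exact estimator): importance-weight each behavior-policy trajectory's return toward the target policy, then subtract a Hoeffding-style deviation term to get a lower confidence bound. This assumes the weighted returns are bounded in [0, b]; all names and the fake data are hypothetical.

```python
import math
import random

def lower_bound(weighted_returns, b, delta=0.05):
    """Empirical mean minus a Hoeffding deviation for values in [0, b]."""
    n = len(weighted_returns)
    mean = sum(weighted_returns) / n
    dev = b * math.sqrt(math.log(1.0 / delta) / (2 * n))
    return mean - dev

random.seed(0)
# Fake per-trajectory importance-weighted returns, bounded in [0, 2].
ys = [random.uniform(0.4, 1.2) for _ in range(500)]
lb = lower_bound(ys, b=2.0, delta=0.05)
```
The bound holds with probability at least 1 - delta; the paper's contribution includes tighter concentration inequalities than plain Hoeffding, which matter because importance weights can make the range b very large.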