George Tucker
Researcher at Google
Publications - 95
Citations - 12830
George Tucker is an academic researcher at Google. He has contributed to research in topics including Reinforcement learning and Estimator. He has an h-index of 41 and has co-authored 91 publications receiving 8,022 citations. His previous affiliations include the Massachusetts Institute of Technology and FICO.
Papers
Journal Article
Efficient Bayesian mixed-model analysis increases association power in large cohorts
Po-Ru Loh, George Tucker, Brendan Bulik-Sullivan, Bjarni J. Vilhjálmsson, Hilary K. Finucane, Rany M. Salem, Daniel I. Chasman, Paul M. Ridker, Benjamin M. Neale, Bonnie Berger, Nick Patterson, Alkes L. Price +13 more
TL;DR: BOLT-LMM is presented, which requires only a small number of O(MN)-time iterations and increases power by modeling more realistic, non-infinitesimal genetic architectures via a Bayesian mixture prior on marker effect sizes.
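As a rough illustration of the Bayesian mixture prior this summary refers to, here is a minimal Python sketch of drawing marker effect sizes from a two-component Gaussian mixture, where most markers have near-zero effects and a small fraction have larger ones. This is not the BOLT-LMM implementation; all parameter values are illustrative assumptions, not the paper's.

```python
# Minimal sketch (not BOLT-LMM itself): marker effect sizes drawn from a
# two-component Gaussian mixture, a "non-infinitesimal" architecture where
# most effects are tiny and a small fraction are larger. All parameter
# values below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
M = 10_000          # number of markers (assumed for illustration)
p_causal = 0.01     # mixture weight of the large-effect component (assumed)
sigma_small = 0.001 # s.d. of the near-zero component (assumed)
sigma_large = 0.05  # s.d. of the large-effect component (assumed)

is_causal = rng.random(M) < p_causal
beta = np.where(is_causal,
                rng.normal(0.0, sigma_large, M),
                rng.normal(0.0, sigma_small, M))

# A phenotype simulated under this prior: y = X beta + noise.
N = 1_000
X = rng.standard_normal((N, M))
y = X @ beta + rng.standard_normal(N)
print(f"{is_causal.sum()} of {M} markers drawn from the large-effect component")
```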
Posted Content
Soft Actor-Critic Algorithms and Applications
Tuomas Haarnoja, Aurick Zhou, Kristian Hartikainen, George Tucker, Sehoon Ha, Jie Tan, Vikash Kumar, Henry Zhu, Abhishek Gupta, Pieter Abbeel, Sergey Levine +10 more
TL;DR: Soft Actor-Critic (SAC), the recently introduced off-policy actor-critic algorithm based on the maximum entropy RL framework, achieves state-of-the-art performance, outperforming prior on-policy and off-policy methods in sample efficiency and asymptotic performance.
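To make the maximum entropy framework concrete, here is a minimal Python sketch of the soft state value and the entropy-regularized policy for a discrete-action toy case. SAC itself handles continuous actions with neural networks; the Q-values and temperature below are assumed for illustration only.

```python
# Minimal sketch of the maximum-entropy RL quantities underlying SAC, for a
# discrete-action toy case. alpha is the entropy temperature; Q-values are
# assumed, not learned.
import numpy as np

def soft_value(q_values: np.ndarray, alpha: float) -> float:
    """Soft state value V(s) = alpha * log sum_a exp(Q(s,a)/alpha)."""
    z = q_values / alpha
    return alpha * (z.max() + np.log(np.exp(z - z.max()).sum()))

def maxent_policy(q_values: np.ndarray, alpha: float) -> np.ndarray:
    """Entropy-regularized policy pi(a|s) proportional to exp(Q(s,a)/alpha)."""
    z = q_values / alpha
    z -= z.max()
    p = np.exp(z)
    return p / p.sum()

q = np.array([1.0, 1.5, 0.2])   # illustrative Q(s, .) values (assumed)
alpha = 0.5
pi = maxent_policy(q, alpha)
entropy = -np.sum(pi * np.log(pi))
# Identity: V(s) = E_pi[Q(s,a)] + alpha * H(pi), the entropy-augmented value.
assert np.isclose(soft_value(q, alpha), pi @ q + alpha * entropy)
print(pi, soft_value(q, alpha))
```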
Posted Content
Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems
TL;DR: This tutorial article aims to provide the reader with the conceptual tools needed to get started on research on offline reinforcement learning algorithms: reinforcement learning algorithms that utilize previously collected data, without additional online data collection.
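A minimal Python sketch of the setting the tutorial addresses: training purely from a fixed batch of logged transitions, with no further environment interaction. It is shown here as tabular Q-learning over the batch; the dataset, sizes, and learning rate are illustrative assumptions.

```python
# Minimal sketch of the offline-RL setting: the learner sees only a fixed
# dataset of (s, a, r, s') transitions and never queries the environment.
# Dataset and hyperparameters are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions, gamma = 5, 2, 0.9

# Previously collected transitions (s, a, r, s'), e.g. logged by some
# behavior policy. No further data is gathered during training.
dataset = [(rng.integers(n_states), rng.integers(n_actions),
            rng.random(), rng.integers(n_states)) for _ in range(500)]

Q = np.zeros((n_states, n_actions))
for _ in range(100):                      # sweep over the fixed batch only
    for s, a, r, s_next in dataset:
        target = r + gamma * Q[s_next].max()
        Q[s, a] += 0.1 * (target - Q[s, a])
print(Q)
```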
Posted Content
Conservative Q-Learning for Offline Reinforcement Learning
TL;DR: Conservative Q-learning (CQL) is proposed, which aims to address the limitations of offline RL methods by learning a conservative Q-function such that the expected value of a policy under this Q-function lower-bounds its true value.
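As a sketch of the idea, the discrete-action form of the conservative penalty adds a term alongside the Bellman error that pushes Q down on all actions (via a logsumexp) and up on the action actually seen in the dataset, so the learned Q-function lower-bounds the true value. This is an illustrative reconstruction, not the authors' code; alpha and the Q-values are assumed.

```python
# Minimal sketch of the conservative penalty in CQL for discrete actions:
# TD error plus alpha * (logsumexp_a Q(s,a) - Q(s, a_data)), which
# penalizes out-of-distribution actions relative to dataset actions.
# Values and alpha are illustrative assumptions.
import numpy as np

def cql_loss(q_row: np.ndarray, a_data: int, bellman_target: float,
             alpha: float = 1.0) -> float:
    """CQL-style objective for one (s, a) sample with discrete actions."""
    td_error = 0.5 * (q_row[a_data] - bellman_target) ** 2
    z = q_row - q_row.max()
    logsumexp_q = q_row.max() + np.log(np.exp(z).sum())
    conservative = logsumexp_q - q_row[a_data]   # pushes down OOD actions
    return td_error + alpha * conservative

q_row = np.array([0.3, 1.2, -0.5])  # Q(s, .) for one state (assumed values)
print(cql_loss(q_row, a_data=1, bellman_target=1.0))
```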
Proceedings Article
Regularizing Neural Networks by Penalizing Confident Output Distributions
TL;DR: It is found that both label smoothing and the confidence penalty improve state-of-the-art models across benchmarks without modifying existing hyperparameters, suggesting the wide applicability of these regularizers.
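For concreteness, here is a minimal Python sketch of the confidence penalty: the training loss is cross-entropy minus a coefficient beta times the entropy of the model's output distribution, which discourages over-confident (low-entropy) predictions. The logits and beta below are assumed for illustration.

```python
# Minimal sketch of the confidence penalty: loss = cross-entropy
# - beta * H(p), where H(p) is the entropy of the softmax output.
# Logits and beta are illustrative assumptions.
import numpy as np

def confidence_penalized_loss(logits: np.ndarray, label: int,
                              beta: float = 0.1) -> float:
    z = logits - logits.max()
    p = np.exp(z) / np.exp(z).sum()          # softmax output distribution
    cross_entropy = -np.log(p[label])
    entropy = -np.sum(p * np.log(p))         # H(p(y|x))
    return cross_entropy - beta * entropy    # penalize confident outputs

print(confidence_penalized_loss(np.array([2.0, 0.5, -1.0]), label=0))
```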