Matthew W. Hoffman
Researcher at Google
Publications - 54
Citations - 4483
Matthew W. Hoffman is an academic researcher at Google. He has contributed to research on the topics of Bayesian optimization and reinforcement learning. He has an h-index of 27, having co-authored 54 publications that have received 3,746 citations. Previous affiliations of Matthew W. Hoffman include the University of British Columbia and the University of Cambridge.
Papers
Proceedings Article
Learning to learn by gradient descent by gradient descent
Marcin Andrychowicz,Misha Denil,Sergio Gomez,Matthew W. Hoffman,David Pfau,Tom Schaul,Brendan Shillingford,Nando de Freitas +7 more
TL;DR: This paper shows how the design of an optimization algorithm can be cast as a learning problem, allowing the algorithm to learn to exploit structure in the problems of interest in an automatic way.
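The core idea — treating the design of an update rule as a learning problem — can be illustrated with a much smaller sketch than the paper's LSTM optimizer. In this toy version (all names and the quadratic task family are illustrative assumptions, not the paper's setup), the "learned optimizer" is a single scalar step size `theta`, and meta-training minimizes the summed optimizee loss over an unrolled trajectory by gradient descent on `theta`:

```python
import numpy as np

def unroll(theta, c, steps=20):
    """Apply the parameterised update rule x <- x - theta * grad to the
    optimizee f(x) = (x - c)^2 and return the summed loss over the
    trajectory, which serves as the meta-objective."""
    x, total = 0.0, 0.0
    for _ in range(steps):
        g = 2.0 * (x - c)              # gradient of the optimizee
        x = x - theta * g              # "learned" update rule: here just a scalar step size
        total += (x - c) ** 2
    return total

# Meta-train theta by gradient descent on the meta-objective, estimating
# the meta-gradient with central finite differences for simplicity
# (the paper backpropagates through the unrolled optimization instead).
rng = np.random.default_rng(0)
theta, lr, eps = 0.01, 1e-3, 1e-4
for _ in range(200):
    c = rng.normal()                   # sample a task from the task distribution
    meta_grad = (unroll(theta + eps, c) - unroll(theta - eps, c)) / (2 * eps)
    theta = float(np.clip(theta - lr * meta_grad, 0.0, 0.5))
```

The learned `theta` optimizes the quadratic family far faster than the initial guess; the paper's contribution is replacing this single scalar with a recurrent network that outputs coordinate-wise updates.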
Posted Content
Predictive Entropy Search for Efficient Global Optimization of Black-box Functions
TL;DR: Predictive Entropy Search (PES), as presented in this paper, selects at each iteration the evaluation point that maximizes the expected information gained about the location of the global maximum.
Proceedings Article
Music Transformer: Generating Music with Long-Term Structure
Cheng-Zhi Anna Huang,Ashish Vaswani,Jakob Uszkoreit,Noam Shazeer,Ian Simon,Curtis Hawthorne,Andrew M. Dai,Matthew W. Hoffman,Monica Dinculescu,Douglas Eck +9 more
TL;DR: It is demonstrated that a Transformer with the modified relative attention mechanism can generate minute-long compositions with compelling structure, generate continuations that coherently elaborate on a given motif, and in a seq2seq setup generate accompaniments conditioned on melodies.
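Relative attention adds a learned logit per relative offset `j - i` on top of the usual content-based logits. The naive sketch below (shapes and names are illustrative; the paper's contribution includes a memory-efficient "skewed" computation that this double loop does not reproduce) shows the quantity being computed:

```python
import numpy as np

def relative_attention(q, k, v, rel_emb):
    """Scaled dot-product attention with learned relative-position logits.
    q, k, v: (T, d) arrays; rel_emb: (2T - 1, d), one embedding per
    relative offset in [-(T-1), T-1]."""
    T, d = q.shape
    logits = q @ k.T / np.sqrt(d)
    # Add a position-dependent logit for each (i, j) pair based on j - i.
    for i in range(T):
        for j in range(T):
            logits[i, j] += q[i] @ rel_emb[j - i + T - 1]
    # Numerically stable softmax over each row.
    weights = np.exp(logits - logits.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v
```

With `rel_emb` set to zeros this reduces to standard attention; the learned offsets are what let the model capture the periodic, long-range structure of music.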
Proceedings Article
Predictive Entropy Search for Efficient Global Optimization of Black-box Functions
TL;DR: This work proposes a novel information-theoretic approach for Bayesian optimization called Predictive Entropy Search (PES), which codifies this intractable acquisition function in terms of the expected reduction in the differential entropy of the predictive distribution.
Proceedings Article
Learning to Learn without Gradient Descent by Gradient Descent
Yutian Chen,Matthew W. Hoffman,Sergio Gomez Colmenarejo,Misha Denil,Timothy P. Lillicrap,Matthew Botvinick,Nando de Freitas +6 more
TL;DR: It is shown that recurrent neural network optimizers trained on simple synthetic functions by gradient descent exhibit a remarkable degree of transfer in that they can be used to efficiently optimize a broad range of derivative-free black-box functions, including Gaussian process bandits, simple control objectives, global optimization benchmarks and hyper-parameter tuning tasks.
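The distinctive point of this setting is the interface: the recurrent optimizer consumes past query/value pairs and proposes the next query, so the black-box objective is never differentiated. The sketch below shows only that interface, with fixed random weights standing in for a trained network (all weights, sizes, and the objective are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)

# A tiny RNN "optimizer" with fixed random weights: it maps the last
# (query, value) pair and its hidden state to the next query.
Wh = rng.normal(scale=0.3, size=(8, 8))
Wx = rng.normal(scale=0.3, size=(8, 2))
Wo = rng.normal(scale=0.3, size=(1, 8))

def propose(h, x, y):
    """One recurrent step: update the hidden state and emit the next query."""
    h = np.tanh(Wh @ h + Wx @ np.array([x, y]))
    return h, (Wo @ h).item()

def black_box(x):
    """Derivative-free objective; only function values are ever revealed."""
    return -(x - 0.7) ** 2

h, x, best = np.zeros(8), 0.0, -np.inf
for _ in range(30):
    y = black_box(x)          # query the black box
    best = max(best, y)
    h, x = propose(h, x, y)   # the RNN decides where to evaluate next
```

In the paper this recurrent proposer is meta-trained on synthetic functions, after which it transfers to GP bandits, control objectives, and hyper-parameter tuning without access to gradients of those objectives.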