Zachary DeVito
Researcher at Stanford University
Publications - 4
Citations - 36081
Zachary DeVito is an academic researcher from Stanford University. His work spans topics including Debugging & Usability. He has an h-index of 3 and has co-authored 3 publications receiving 21128 citations.
Papers
Automatic differentiation in PyTorch
Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Z. Yang, Zachary DeVito, Zeming Lin, Alban Desmaison, Luca Antiga, Adam Lerer +9 more
TL;DR: Describes the automatic differentiation module of PyTorch, a library designed to enable rapid research on machine learning models; the module differentiates purely imperative programs and emphasizes extensibility and low overhead.
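A minimal sketch of the imperative autodiff usage the summary describes, using the public torch.autograd interface; the tensor shape and computation are illustrative, not taken from the paper:

```python
import torch

# Leaf tensor whose operations are recorded by autograd
x = torch.randn(3, requires_grad=True)

# Ordinary imperative Python code; the graph is built as the code runs
y = (x ** 2).sum()

# Reverse-mode differentiation back to the leaves
y.backward()

print(x.grad)  # equals 2 * x
```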
Posted Content
PyTorch: An Imperative Style, High-Performance Deep Learning Library
Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Kopf, Edward Z. Yang, Zachary DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, Soumith Chintala +20 more
TL;DR: PyTorch is a machine learning library that provides an imperative, Pythonic programming style that makes debugging easy and is consistent with other popular scientific computing libraries, while remaining efficient and supporting hardware accelerators such as GPUs.
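A short sketch of the imperative, Pythonic style and GPU support the summary refers to; the model, data shapes, and hyperparameters below are made up for illustration and are not from the paper:

```python
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"

# Eagerly executed module: each line runs immediately and can be inspected
# or debugged with ordinary Python tools.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 1)).to(device)
opt = torch.optim.SGD(model.parameters(), lr=0.1)

inputs = torch.randn(8, 16, device=device)
targets = torch.randn(8, 1, device=device)

loss = nn.functional.mse_loss(model(inputs), targets)
opt.zero_grad()
loss.backward()
opt.step()
```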
Proceedings Article
PyTorch: An Imperative Style, High-Performance Deep Learning Library
Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Kopf, Edward Z. Yang, Zachary DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, Soumith Chintala +20 more
TL;DR: This paper details the principles that drove the implementation of PyTorch and how they are reflected in its architecture, and explains how the careful and pragmatic implementation of the key components of its runtime enables them to work together to achieve compelling performance.
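One runtime detail discussed in the paper is asynchronous execution of GPU operations; the sketch below, with arbitrary tensor sizes, illustrates why explicit synchronization is needed when timing CUDA work (illustrative only, not code from the paper):

```python
import time
import torch

if torch.cuda.is_available():
    a = torch.randn(4096, 4096, device="cuda")
    b = torch.randn(4096, 4096, device="cuda")

    torch.cuda.synchronize()      # drain any queued work before timing
    start = time.time()
    c = a @ b                     # returns immediately; the kernel is queued
    torch.cuda.synchronize()      # wait for the kernel to actually finish
    print(f"matmul took {time.time() - start:.4f}s")
```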
Journal Article
A Theory on Adam Instability in Large-Scale Machine Learning
Igor Molybog, Moya Chen, Zachary DeVito, David Esiobu, Naman Goyal, Punit Singh Koura, Sharan Narang, Andrew M. Poulton, Binh Tang, Diana Liskovich, Yuchen Zhang, Melanie Rae Kambadur, Stephen Roller, Susan Zhang +13 more
TL;DR: This article showed that Adam can enter a state in which the parameter update vector has a relatively large norm and is essentially uncorrelated with the direction of descent on the training loss landscape, leading to divergence.
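For reference, a standard Adam update step is sketched below; the hyperparameter names follow the common formulation, and the comments point at the failure mode the summary describes (this is not code from the paper):

```python
import torch

def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    # Exponential moving averages of the gradient and its square
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad

    # Bias-corrected estimates
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)

    # The update vector: the paper's argument is that this vector can grow
    # relatively large in norm while becoming nearly uncorrelated with the
    # true descent direction, a state that precedes divergence.
    update = lr * m_hat / (torch.sqrt(v_hat) + eps)
    return param - update, m, v
```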