Automatic differentiation in PyTorch

Open Access

TLDR

Describes the automatic differentiation module of PyTorch, a library designed to enable rapid research on machine learning models. The module differentiates purely imperative programs and emphasizes extensibility and low overhead.

Abstract:

In this article, we describe an automatic differentiation module of PyTorch, a library designed to enable rapid research on machine learning models. It builds upon a few projects, most notably Lua Torch, Chainer, and HIPS Autograd [4], and provides a high-performance environment with easy access to automatic differentiation of models executed on different devices (CPU and GPU). To make prototyping easier, PyTorch does not follow the symbolic approach used in many other deep learning frameworks; instead, it differentiates purely imperative programs, with a focus on extensibility and low overhead. Note that this preprint is a draft of certain sections from an upcoming paper covering all PyTorch features.
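
As a rough illustration of what differentiating a purely imperative program can mean, the sketch below records each operation while ordinary Python code runs and then applies the chain rule in reverse over that record. It is a toy written for this summary, not PyTorch's implementation or API: the `Var` class, the `model` function, and every parameter name are hypothetical, and only scalars are handled.

```python
import math


class Var:
    """Scalar that remembers how it was computed (hypothetical helper, not a PyTorch class)."""

    def __init__(self, value, parents=()):
        self.value = value
        self.grad = 0.0
        # Each entry is (input Var, local derivative of this value w.r.t. that input).
        self.parents = parents

    def __add__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

    def tanh(self):
        t = math.tanh(self.value)
        return Var(t, ((self, 1.0 - t * t),))

    def backward(self):
        # Order the recorded graph so inputs come before outputs,
        # then apply the chain rule walking from the output backwards.
        order, seen = [], set()

        def visit(node):
            if id(node) in seen:
                return
            seen.add(id(node))
            for parent, _ in node.parents:
                visit(parent)
            order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            for parent, local in node.parents:
                parent.grad += local * node.grad


def model(x, w, steps):
    # Plain Python control flow: the recorded graph is whatever actually executed.
    h = x
    for _ in range(steps):
        h = (h * w).tanh()
    return h


x, w = Var(0.5), Var(2.0)
y = model(x, w, steps=1)   # y = tanh(x * w)
y.backward()               # fills x.grad and w.grad
```

Because the record is built while the loop runs, changing `steps` changes the computation that gets differentiated, with no separate graph-definition step.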

Citations

Posted Content

RoBERTa: A Robustly Optimized BERT Pretraining Approach

Yinhan Liu, +9 more

- 26 Jul 2019 -

arXiv: Computation and Language

TL;DR: BERT is found to have been significantly undertrained: it can match or exceed the performance of every model published after it, and the best model achieves state-of-the-art results on GLUE, RACE and SQuAD.

Posted Content

PyTorch: An Imperative Style, High-Performance Deep Learning Library

Adam Paszke, +20 more

- 03 Dec 2019 -

arXiv: Learning

TL;DR: PyTorch is a machine learning library that provides an imperative and Pythonic programming style that makes debugging easy and is consistent with other popular scientific computing libraries, while remaining efficient and supporting hardware accelerators such as GPUs.

Proceedings Article

PyTorch: An Imperative Style, High-Performance Deep Learning Library

Adam Paszke, +20 more

TL;DR: This paper details the principles that drove the implementation of PyTorch and how they are reflected in its architecture, and explains how the careful and pragmatic implementation of the key components of its runtime enables them to work together to achieve compelling performance.

Proceedings ArticleDOI

ArcFace: Additive Angular Margin Loss for Deep Face Recognition

Jiankang Deng, +3 more

TL;DR: This paper presents arguably the most extensive experimental evaluation against all recent state-of-the-art face recognition methods on ten face recognition benchmarks, and shows that ArcFace consistently outperforms the state of the art and can be easily implemented with negligible computational overhead.

Posted Content

HuggingFace's Transformers: State-of-the-art Natural Language Processing.

Thomas Wolf, +10 more

- 09 Oct 2019 -

arXiv: Computation and Language

TL;DR: The Transformers library is an open-source library that consists of carefully engineered state-of-the-art Transformer architectures under a unified API and a curated collection of pretrained models made by and available for the community.

References

Posted Content

TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems

Martín Abadi, +39 more

- 01 Jan 2015 -

arXiv: Distributed, Parallel, and Cluste...

TL;DR: Describes the TensorFlow interface and an implementation of that interface built at Google, which has been used for conducting research and for deploying machine learning systems into production across more than a dozen areas of computer science and other fields.

SciPy: Open Source Scientific Tools for Python

Eric Jones, +2 more

Book

Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiation

Andreas Griewank, +1 more

TL;DR: This second edition has been updated and expanded to cover recent developments in applications and theory, including an elegant NP completeness argument by Uwe Naumann and a brief introduction to scarcity, a generalization of sparsity.

ReportDOI

Compiling fast partial derivatives of functions given by algorithms

Bert Speelpenning

TL;DR: The system described, Jake, produces gradients significantly faster than numerical differencing for n > 8 and can handle algorithms Af with arbitrary flow of control.

DyNet: The Dynamic Neural Network Toolkit