
Automatic differentiation in PyTorch

TLDR
An automatic differentiation module of PyTorch is described: a library designed to enable rapid research on machine learning models by differentiating purely imperative programs, with an emphasis on extensibility and low overhead.
Abstract
In this article, we describe an automatic differentiation module of PyTorch, a library designed to enable rapid research on machine learning models. It builds upon a few projects, most notably Lua Torch, Chainer, and HIPS Autograd [4], and provides a high-performance environment with easy access to automatic differentiation of models executed on different devices (CPU and GPU). To make prototyping easier, PyTorch does not follow the symbolic approach used in many other deep learning frameworks; instead, it differentiates purely imperative programs, with an emphasis on extensibility and low overhead. Note that this preprint is a draft of certain sections from an upcoming paper covering all PyTorch features.
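To make the contrast with the symbolic approach concrete, here is a minimal sketch of the define-by-run differentiation the abstract describes; the tensor shapes and variable names are illustrative, not taken from the paper.

```python
# Minimal sketch of define-by-run differentiation (shapes illustrative).
import torch

x = torch.randn(3, requires_grad=True)  # leaf tensors tracked by autograd
w = torch.randn(3, requires_grad=True)

y = (w * x).sum()   # the graph is recorded while this line executes
y.backward()        # reverse-mode AD over the recorded operations

print(x.grad)       # equals w, since d(w.x)/dx = w
print(w.grad)       # equals x
```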



Citations
Posted Content

RoBERTa: A Robustly Optimized BERT Pretraining Approach

TL;DR: It is found that BERT was significantly undertrained and, when pretrained more carefully, can match or exceed the performance of every model published after it; the best model achieves state-of-the-art results on GLUE, RACE, and SQuAD.
Posted Content

PyTorch: An Imperative Style, High-Performance Deep Learning Library

TL;DR: PyTorch is presented as a machine learning library that provides an imperative, Pythonic programming style, makes debugging easy, and is consistent with other popular scientific computing libraries, while remaining efficient and supporting hardware accelerators such as GPUs.
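A hedged sketch of the imperative, device-agnostic style that TL;DR points to; the device-selection logic here is illustrative:

```python
# The same imperative code runs on CPU or GPU; only the device placement
# changes, and ordinary Python debugging tools still apply.
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.randn(4, 4, device=device)
y = x @ x.T                       # executes eagerly on the chosen device
print(y.device, y.sum().item())   # intermediate values can be inspected directly
```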
Proceedings ArticleDOI

ArcFace: Additive Angular Margin Loss for Deep Face Recognition

TL;DR: This paper presents arguably the most extensive experimental evaluation of recent state-of-the-art face recognition methods across ten benchmarks, and shows that ArcFace consistently outperforms the state of the art and can be easily implemented with negligible computational overhead.
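For readers unfamiliar with the loss named in the title, the following is a hedged PyTorch sketch of an additive angular margin loss. The scale s and margin m follow the paper's notation, but the dimensions and the helper name arcface_logits are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch of an additive angular margin (ArcFace-style) loss.
import torch
import torch.nn.functional as F

def arcface_logits(embeddings, weight, labels, s=64.0, m=0.5):
    # cos(theta_j) via L2-normalized embeddings and class weights
    cos = F.normalize(embeddings) @ F.normalize(weight).T
    theta = torch.acos(cos.clamp(-1 + 1e-7, 1 - 1e-7))
    # add the angular margin m only to the target-class angle
    target = F.one_hot(labels, num_classes=weight.shape[0]).bool()
    cos_margin = torch.where(target, torch.cos(theta + m), cos)
    return s * cos_margin  # fed to cross-entropy as usual

emb = torch.randn(8, 128)        # batch of face embeddings (toy sizes)
W = torch.randn(10, 128)         # one weight row per identity
y = torch.randint(0, 10, (8,))
loss = F.cross_entropy(arcface_logits(emb, W, y), y)
```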
Posted Content

HuggingFace's Transformers: State-of-the-art Natural Language Processing.

TL;DR: The Transformers library is an open-source library that provides carefully engineered, state-of-the-art Transformer architectures under a unified API, together with a curated collection of pretrained models made by and available to the community.
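A brief usage sketch of the unified API; pipeline is a real entry point of the library, while the example sentence and the output shown in the comment are illustrative.

```python
# Usage sketch of the unified Transformers API.
from transformers import pipeline

classifier = pipeline("sentiment-analysis")  # pulls a pretrained checkpoint
print(classifier("PyTorch makes autograd research pleasant."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```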
References
Book

Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiation

TL;DR: This second edition has been updated and expanded to cover recent developments in applications and theory, including an elegant NP-completeness argument by Uwe Naumann and a brief introduction to scarcity, a generalization of sparsity.
Posted Content

DyNet: The Dynamic Neural Network Toolkit

TL;DR: DyNet is a toolkit for implementing neural network models based on dynamic declaration of network structure; it has an optimized C++ backend and a lightweight graph representation, and is designed so that users can implement their models in a way that is idiomatic in their preferred programming language.
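Dynamic declaration means the computation graph is rebuilt per input and can therefore depend on the data. Since the other sketches on this page use PyTorch, the same idea is illustrated below in PyTorch rather than DyNet's own API; the toy model is an assumption for illustration.

```python
# Sketch of dynamic declaration: a fresh graph is built on every call,
# and its depth equals the length of this particular input.
import torch

def forward(tokens, w):
    h = torch.zeros(4)
    for t in tokens:               # data-dependent number of steps
        h = torch.tanh(w @ h + t)
    return h.sum()

w = torch.randn(4, 4, requires_grad=True)
forward(torch.randn(7), w).backward()  # a 7-step graph
forward(torch.randn(3), w).backward()  # a 3-step graph; gradients accumulate
print(w.grad.shape)                    # torch.Size([4, 4])
```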
ReportDOI

Compiling fast partial derivatives of functions given by algorithms

TL;DR: The system JAKE described here produces gradients significantly faster than numerical differencing for n > 8 and can handle algorithms A_f with arbitrary flow of control.
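The speed comparison has a simple modern analogue: reverse-mode AD obtains the whole gradient from one backward pass, while central differencing needs two function evaluations per input dimension. A hedged sketch using PyTorch (the function f and the dimensions are illustrative):

```python
# One reverse-mode pass vs. 2n function evaluations for central differences.
import torch

def f(x):
    return (x ** 3).sum()

x = torch.randn(100, dtype=torch.float64, requires_grad=True)
(g_ad,) = torch.autograd.grad(f(x), x)       # full gradient, one backward pass

eps = 1e-6
g_num = torch.empty_like(x)
with torch.no_grad():
    for i in range(x.numel()):               # 2n = 200 evaluations of f
        e = torch.zeros_like(x)
        e[i] = eps
        g_num[i] = (f(x + e) - f(x - e)) / (2 * eps)

print(torch.allclose(g_ad, 3 * x.detach() ** 2))  # analytic check: 3x^2
print(torch.allclose(g_ad, g_num, atol=1e-6))     # numerical agreement
```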