
Automatic differentiation

About: Automatic differentiation is a research topic. Over its lifetime, 2,073 publications have been published within this topic, receiving 55,056 citations. The topic is also known as: algorithmic differentiation.


Papers
28 Oct 2017
TL;DR: Describes the automatic differentiation module of PyTorch, a library designed to enable rapid research on machine learning models. The module differentiates purely imperative programs, with a focus on extensibility and low overhead (a minimal autograd sketch follows this entry).
Abstract: In this article, we describe an automatic differentiation module of PyTorch — a library designed to enable rapid research on machine learning models. It builds upon a few projects, most notably Lua Torch, Chainer, and HIPS Autograd [4], and provides a high performance environment with easy access to automatic differentiation of models executed on different devices (CPU and GPU). To make prototyping easier, PyTorch does not follow the symbolic approach used in many other deep learning frameworks, but focuses on differentiation of purely imperative programs, with a focus on extensibility and low overhead. Note that this preprint is a draft of certain sections from an upcoming paper covering all PyTorch features.

13,268 citations
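As a concrete illustration of the imperative, define-by-run differentiation this entry describes, here is a minimal PyTorch autograd sketch; the tensor values are invented for the example and a standard torch installation is assumed:

```python
# Minimal sketch of PyTorch's imperative autograd; values are illustrative.
import torch

x = torch.tensor([2.0, 3.0], requires_grad=True)  # leaf tensor tracked by autograd
y = (x ** 2).sum()          # ordinary imperative code; operations are recorded
y.backward()                # reverse-mode AD over the recorded graph
print(x.grad)               # dy/dx = 2*x -> tensor([4., 6.])
```

Because differentiation happens as the program runs, ordinary Python control flow (loops, branches) can appear between the tensor operations, which is the contrast with the symbolic approach the abstract mentions.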

Book
2008
TL;DR: This second edition has been updated and expanded to cover recent developments in applications and theory, including an elegant NP-completeness argument by Uwe Naumann and a brief introduction to scarcity, a generalization of sparsity (a forward-mode sketch follows this entry).
Abstract: Algorithmic, or automatic, differentiation (AD) is a growing area of theoretical research and software development concerned with the accurate and efficient evaluation of derivatives for function evaluations given as computer programs. The resulting derivative values are useful for all scientific computations that are based on linear, quadratic, or higher order approximations to nonlinear scalar or vector functions. AD has been applied in particular to optimization, parameter identification, nonlinear equation solving, the numerical integration of differential equations, and combinations of these. Apart from quantifying sensitivities numerically, AD also yields structural dependence information, such as the sparsity pattern and generic rank of Jacobian matrices. The field opens up an exciting opportunity to develop new algorithms that reflect the true cost of accurate derivatives and to use them for improvements in speed and reliability.

This second edition has been updated and expanded to cover recent developments in applications and theory, including an elegant NP-completeness argument by Uwe Naumann and a brief introduction to scarcity, a generalization of sparsity. There is also added material on checkpointing and iterative differentiation. To improve readability, the more detailed analysis of memory and complexity bounds has been relegated to separate, optional chapters.

The book consists of three parts: a stand-alone introduction to the fundamentals of AD and its software; a thorough treatment of methods for sparse problems; and final chapters on program-reversal schedules, higher derivatives, nonsmooth problems, and iterative processes. Each of the 15 chapters concludes with examples and exercises.

Audience: This volume will be valuable to designers of algorithms and software for nonlinear computational problems. Current numerical software users should gain the insight necessary to choose and deploy existing AD software tools to the best advantage.

Contents: Rules; Preface; Prologue; Mathematical Symbols; Chapter 1: Introduction; Chapter 2: A Framework for Evaluating Functions; Chapter 3: Fundamentals of Forward and Reverse; Chapter 4: Memory Issues and Complexity Bounds; Chapter 5: Repeating and Extending Reverse; Chapter 6: Implementation and Software; Chapter 7: Sparse Forward and Reverse; Chapter 8: Exploiting Sparsity by Compression; Chapter 9: Going beyond Forward and Reverse; Chapter 10: Jacobian and Hessian Accumulation; Chapter 11: Observations on Efficiency; Chapter 12: Reversal Schedules and Checkpointing; Chapter 13: Taylor and Tensor Coefficients; Chapter 14: Differentiation without Differentiability; Chapter 15: Implicit and Iterative Differentiation; Epilogue; List of Figures; List of Tables; Assumptions and Definitions; Propositions, Corollaries, and Lemmas; Bibliography; Index

2,920 citations
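The book's opening part covers the fundamentals of forward and reverse mode. As a hedged illustration of forward mode only, here is a minimal dual-number sketch; the Dual class and sin helper are invented for this example and do not follow the book's notation:

```python
# Forward-mode AD via dual numbers: each value carries its tangent along.
from dataclasses import dataclass
import math

@dataclass
class Dual:
    val: float  # function value
    dot: float  # derivative carried alongside the value

    def __add__(self, other):
        return Dual(self.val + other.val, self.dot + other.dot)

    def __mul__(self, other):
        # product rule: (u*v)' = u'*v + u*v'
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)

def sin(u):
    # chain rule: sin(u)' = cos(u) * u'
    return Dual(math.sin(u.val), math.cos(u.val) * u.dot)

# Differentiate f(x) = x * sin(x) at x = 1.5 by seeding dx/dx = 1.
x = Dual(1.5, 1.0)
f = x * sin(x)
print(f.val, f.dot)  # f(1.5) and f'(1.5) = sin(x) + x*cos(x)
```

One forward pass yields the derivative with respect to one seeded input; reverse mode, treated alongside it in the book, instead yields all input sensitivities of one output in a single backward sweep.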

Journal ArticleDOI
TL;DR: Describes the basic components and the underlying philosophy of ADMB, with an emphasis on functionality found in no other statistical software. Its main advantages are flexibility, speed, precision, stability, and built-in methods to quantify uncertainty (an AD-driven likelihood sketch follows this entry).
Abstract: Many criteria for statistical parameter estimation, such as maximum likelihood, are formulated as a nonlinear optimization problem. Automatic Differentiation Model Builder (ADMB) is a programming framework based on automatic differentiation, aimed at highly nonlinear models with a large number of parameters. The benefits of using AD are computational efficiency and high numerical accuracy, both crucial in many practical problems. We describe the basic components and the underlying philosophy of ADMB, with an emphasis on functionality found in no other statistical software. One example of such a feature is the generic implementation of Laplace approximation of high-dimensional integrals for use in latent variable models. We also review the literature in which ADMB has been used, and discuss future development of ADMB as an open source project. Overall, the main advantages of ADMB are flexibility, speed, precision, stability and built-in methods to quantify uncertainty.

1,753 citations
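The following is not ADMB code (ADMB models are written in its own template language); it is a hedged Python sketch of the idea the abstract describes, maximum likelihood as nonlinear optimization driven by AD gradients. The toy data and parameter names are invented for illustration:

```python
import torch

data = torch.tensor([1.2, 0.8, 1.5, 0.9, 1.1])      # toy observations
mu = torch.tensor(0.0, requires_grad=True)          # mean parameter
log_sigma = torch.tensor(0.0, requires_grad=True)   # log std dev, keeps sigma > 0
opt = torch.optim.Adam([mu, log_sigma], lr=0.05)

for _ in range(300):
    opt.zero_grad()
    sigma = log_sigma.exp()
    # negative log-likelihood of a normal model, up to an additive constant
    nll = (torch.log(sigma) + 0.5 * ((data - mu) / sigma) ** 2).sum()
    nll.backward()   # exact gradients via reverse-mode AD
    opt.step()

print(mu.item(), log_sigma.exp().item())  # approaches the sample mean and MLE std
```

The benefit the abstract emphasizes shows up here: the gradient is exact to machine precision rather than a finite-difference estimate, which matters for highly nonlinear models with many parameters.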

Journal ArticleDOI
TL;DR: The described method for adjoint code generation is based on a few basic principles that permit simple construction rules for adjoint statements and complete adjoint subprograms; the tangent linear and adjoint model compiler (TAMC) is an implementation of the method (a hand-coded adjoint sketch follows this entry).
Abstract: Adjoint models are increasingly being developed for use in meteorology and oceanography. Typical applications are data assimilation, model tuning, sensitivity analysis, and determination of singular vectors. The adjoint model computes the gradient of a cost function with respect to control variables. Generation of adjoint code may be seen as the special case of differentiation of algorithms in reverse mode, where the dependent function is a scalar. The described method for adjoint code generation is based on a few basic principles, which permit the establishment of simple construction rules for adjoint statements and complete adjoint subprograms. These rules are presented and illustrated with some examples. Conflicts that occur due to loops and redefinition of variables are also discussed. Direct coding of the adjoint of a more sophisticated model is extremely time consuming and subject to errors. Hence, automatic generation of adjoint code represents a distinct advantage. An implementation of the method, described in this article, is the tangent linear and adjoint model compiler (TAMC).

856 citations
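As a hedged illustration of the construction-rule idea (hand-coded in Python rather than compiler-generated Fortran, with invented variable names): each forward assignment yields corresponding adjoint statements, and the adjoint statements run in reverse order of the forward program.

```python
def forward(a, b):
    t = a * b        # intermediate assignment
    y = t + a        # dependent (cost) variable
    return y

def adjoint(a, b, y_bar=1.0):
    # adjoint statements in reverse order, one rule per forward assignment
    a_bar = b_bar = t_bar = 0.0
    # adjoint of: y = t + a
    t_bar += y_bar
    a_bar += y_bar
    # adjoint of: t = a * b  (needs the original values of a and b)
    a_bar += t_bar * b
    b_bar += t_bar * a
    return a_bar, b_bar  # gradient of y with respect to (a, b)

print(forward(2.0, 3.0))   # y = 2*3 + 2 = 8.0
print(adjoint(2.0, 3.0))   # dy/da = b + 1 = 4.0, dy/db = a = 2.0
```

Note how the multiplication rule needs the forward values of a and b; the loop and variable-redefinition conflicts the abstract mentions arise exactly when such required values have been overwritten before the adjoint sweep runs.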


Network Information
Related Topics (5)
Partial differential equation: 70.8K papers, 1.6M citations (78% related)
Nonlinear system: 208.1K papers, 4M citations (78% related)
Linear system: 59.5K papers, 1.4M citations (77% related)
Optimization problem: 96.4K papers, 2.1M citations (77% related)
Differential equation: 88K papers, 2M citations (77% related)
Performance Metrics
No. of papers in the topic in previous years:

Year    Papers
2023    59
2022    137
2021    172
2020    145
2019    127
2018    130