Algorithm 799: revolve: an implementation of checkpointing for the reverse or adjoint mode of computational differentiation

doi:10.1145/347837.347846

Journal Article

Andreas Griewank, +1 more

- 01 Mar 2000 -

ACM Transactions on Mathematical Softwar...

- Vol. 26, Iss: 1, pp 19-45

Abstract:

In its basic form, the reverse mode of computational differentiation yields the gradient of a scalar-valued function at a cost that is a small multiple of the computational work needed to evaluate the function itself. However, the corresponding memory requirement is proportional to the run-time of the evaluation program. Therefore, the practical applicability of the reverse mode in its original formulation is limited despite the availability of ever larger memory systems. This observation leads to the development of checkpointing schedules to reduce the storage requirements. This article presents the function revolve, which generates checkpointing schedules that are provably optimal with regard to a primary and a secondary criterion. This routine is intended to be used as an explicit “controller” for running a time-dependent applications program.
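The memory trade-off described in the abstract can be illustrated with a minimal sketch. It is not the revolve routine and does not produce the provably optimal schedules of the article: it uses plain recursive bisection, which keeps only about log2(n) states alive at the cost of extra forward steps. The toy model `step`, its derivative `dstep`, and the objective J = x_n are assumptions chosen purely for illustration.

```python
import math


def step(x, dt=0.1):
    """One explicit time step of a toy nonlinear model (hypothetical)."""
    return x + dt * math.sin(x)


def dstep(x, dt=0.1):
    """Derivative of step with respect to its input state."""
    return 1.0 + dt * math.cos(x)


def gradient_store_all(x0, n):
    """Basic reverse mode: keep every intermediate state (memory grows with n)."""
    states = [x0]
    for _ in range(n):
        states.append(step(states[-1]))
    bar = 1.0  # adjoint of the final state, since J = x_n
    for i in reversed(range(n)):
        bar *= dstep(states[i])
    return bar


def gradient_bisection(x0, n):
    """Checkpointed reverse mode by recursive bisection (not revolve's schedule).

    Returns (dJ/dx0, number of forward steps spent recomputing states).
    Only the states held on the recursion stack are alive at any time.
    """
    forward_steps = 0

    def reverse(x_lo, lo, hi, bar):
        # Reverse steps lo..hi-1, given the state x_lo at step lo.
        nonlocal forward_steps
        if hi == lo:
            return bar
        if hi - lo == 1:
            return bar * dstep(x_lo)
        mid = (lo + hi) // 2
        x_mid = x_lo
        for _ in range(lo, mid):  # recompute forward up to the midpoint
            x_mid = step(x_mid)
            forward_steps += 1
        bar = reverse(x_mid, mid, hi, bar)  # reverse the later half first
        return reverse(x_lo, lo, mid, bar)  # then the earlier half

    return reverse(x0, 0, n, 1.0), forward_steps
```

Calling `gradient_bisection(0.7, 64)` should return the same derivative as `gradient_store_all(0.7, 64)` while holding far fewer states at once. The price is the recomputation counted in the second return value, which is the quantity a checkpointing schedule tries to minimize for a given number of checkpoints.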

Citations

Journal Article

Transparent Checkpointing for Automatic Differentiation of Program Loops Through Expression Transformations

Sri Hari Krishna Narayanan

- 01 Jan 2023 -

Lecture Notes in Computer Science

TL;DR: The authors present a checkpointing approach that leverages expression transformations in the Julia programming language and the package ChainRules.jl to automatically and transparently transform loop iterations into differentiated loops. They demonstrate its features on an automatically differentiated MPI implementation of Burgers' equation on the Polaris cluster at the Argonne Leadership Computing Facility.

Posted Content

Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM

Deepak Narayanan, +11 more

- 09 Apr 2021 -

arXiv: Computation and Language

TL;DR: This article shows how different types of parallelism (tensor, pipeline, and data parallelism) can be composed to scale to thousands of GPUs and models with trillions of parameters, and proposes a novel interleaved pipeline parallelism schedule that can improve throughput by 10+% with a memory footprint comparable to existing approaches.

Journal Article

Nonlinear optimal perturbations and formation mechanism of localized wave packet in channel flow

- 01 May 2023 -

Physics of fluids

TL;DR: In this paper, a nonlinear nonmodal optimization method is used to find the minimal-energy perturbation triggering the subcritical transition in a two-dimensional channel flow; this transition is characterized by the emergence of a localized wave packet (LWP).

Proceedings Article

ScaDLES: Scalable Deep Learning over Streaming data at the Edge

TL;DR: ScaDLES, proposed in this paper, trains on streaming data at the edge in an online fashion while also addressing the challenges of limited bandwidth and training with non-IID data.

Proceedings Article

Partitioned solution of the unsteady adjoint equations for the one-dimensional flow in a flexible tube

Joris Degroote, +4 more

TL;DR: In this article, the gradient of an objective function is derived for an elastic tube carrying an incompressible flow of an inviscid fluid, and the unsteady adjoint equations are solved in a partitioned way using quasi-Newton coupling iterations.

References

Book

Numerical methods for conservation laws

Randall J. LeVeque

TL;DR: In this book, the author describes the derivation of conservation laws and applies them to linear systems, including the linear advection equation, the Euler equation, and the Riemann problem.

Book

Optimal Control of Systems Governed by Partial Differential Equations

Jacques-Louis Lions

TL;DR: In this book, the author considers the problem of minimizing the sum of a differentiable and a non-differentiable function in the context of a system governed by a Dirichlet problem.

Book

Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiation

Andreas Griewank, +1 more

TL;DR: This second edition has been updated and expanded to cover recent developments in applications and theory, including an elegant NP-completeness argument by Uwe Naumann and a brief introduction to scarcity, a generalization of sparsity.

Journal Article

Upwind difference schemes for hyperbolic systems of conservation laws

Stanley Osher, +1 more

- 01 Apr 1982 -

Mathematics of Computation

TL;DR: In this article, a new upwind finite difference approximation to systems of nonlinear hyperbolic conservation laws is derived. The scheme has desirable properties for shock calculations, such as unique and sharp shocks.

Journal Article

Achieving logarithmic growth of temporal and spatial complexity in reverse automatic differentiation

Andreas Griewank

- 01 Jan 1992 -

Optimization Methods & Software

TL;DR: It is shown here that, by a recursive scheme related to the multilevel differentiation approach of Volin and Ostrovskii, the growth in both temporal and spatial complexity can be limited to a fixed multiple of log(T).
