Author

Andrea Walther

Bio: Andrea Walther is an academic researcher from the University of Paderborn. The author has contributed to research on the topics of automatic differentiation and the Jacobian matrix and determinant. The author has an h-index of 23 and has co-authored 109 publications receiving 5,497 citations. Previous affiliations of Andrea Walther include Dresden University of Technology and Humboldt University of Berlin.


Papers
Book Chapter (DOI)
28 Aug 2006
TL;DR: A new online procedure for determining the checkpoint distribution on the fly is presented, and the resulting checkpointing approach is integrated in HiFlow, a multipurpose parallel finite-element package with a strong emphasis on computational fluid dynamics, reactive flows, and related subjects.
Abstract: The computation of derivatives for the optimization of time-dependent flow problems is based on the integration of the adjoint differential equation. For this purpose, the knowledge of the complete forward solution is required. Similar information is needed for a posteriori error estimation with respect to a given functional. In the area of flow control, especially for three-dimensional problems, it is usually impossible to store the full forward solution due to the lack of memory capacities. Additionally, adaptive time-stepping procedures are needed for efficient integration schemes in time. Therefore, standard optimal offline checkpointing strategies are usually not well suited in that framework. We present a new online procedure for determining the checkpoint distribution on the fly. Complexity estimates and consequences for storing and retrieving the checkpoints using parallel I/O are discussed. The resulting checkpointing approach is integrated in HiFlow, a multipurpose parallel finite-element package with a strong emphasis on computational fluid dynamics, reactive flows, and related subjects. Using an adjoint-based error control for prototypical three-dimensional flow problems, numerical experiments demonstrate the effectiveness of the proposed approach.
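
A minimal sketch of the underlying idea, using simple equidistant checkpointing rather than the paper's online strategy; the functions step and adjoint_step are hypothetical placeholders for one forward time step and its adjoint:

def checkpointed_adjoint(x0, xbar_T, n_steps, every, step, adjoint_step):
    # Forward sweep: store only every `every`-th state as a checkpoint.
    checkpoints = {0: x0}
    x = x0
    for k in range(n_steps):
        x = step(x, k)
        if (k + 1) % every == 0:
            checkpoints[k + 1] = x

    # Reverse sweep: recompute each segment's states from its checkpoint,
    # then run the adjoint backward through the segment.
    xbar = xbar_T
    for start in sorted(checkpoints, reverse=True):
        stop = min(start + every, n_steps)
        states = [checkpoints[start]]
        for k in range(start, stop):
            states.append(step(states[-1], k))
        for k in range(stop - 1, start - 1, -1):
            xbar = adjoint_step(states[k - start], xbar, k)
    return xbar

This trades memory (one checkpoint per segment) for recomputation (one extra forward pass per segment). The paper's online procedure instead places and replaces checkpoints during the forward sweep, which is essential when adaptive time stepping makes the total number of steps unknown in advance.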

29 citations

Journal Article (DOI)
TL;DR: This work addresses the problem of minimizing objectives from the class of piecewise differentiable functions whose nonsmoothness can be encapsulated in the absolute value function and demonstrates how the local model can be minimized by a bundle-type method, which benefits from the availability of additional gray-box information via the abs-normal form.
Abstract: We address the problem of minimizing objectives from the class of piecewise differentiable functions whose nonsmoothness can be encapsulated in the absolute value function. They possess local piecewise linear approximations with a discrepancy that can be bounded by a quadratic proximal term. This overestimating local model is continuous but generally nonconvex. It can be generated in its abs-normal form by a minor extension of standard algorithmic differentiation tools. Here we demonstrate how the local model can be minimized by a bundle-type method, which benefits from the availability of additional gray-box information via the abs-normal form. In the convex case our algorithm realizes the consistent steepest descent trajectory for which finite convergence was established earlier, specifically covering counterexamples where steepest descent with exact line-search famously fails. The analysis of the abs-normal representation and the design of the optimization algorithm are geared toward the general case, whereas the convergence proof so far covers only the convex case.
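
To make the abs-normal form concrete: the local piecewise linear model mentioned above can be evaluated as in the following sketch, where the data a, b, Z, L, J, Y are assumed given and L is strictly lower triangular, so the switching variables follow by forward substitution. This is only an illustration of the representation, not the paper's bundle method.

import numpy as np

def eval_abs_normal(dx, a, b, Z, L, J, Y):
    # Switching variables: z = b + Z dx + L |z|, solvable row by row
    # because L is strictly lower triangular.
    s = len(b)
    z = np.empty(s)
    for i in range(s):
        z[i] = b[i] + Z[i] @ dx + L[i, :i] @ np.abs(z[:i])
    # Piecewise linear model value: dy = a + J dx + Y |z|.
    return a + J @ dx + Y @ np.abs(z)

# Example: f(x) = |x| has abs-normal data z = x, y = |z|.
print(eval_abs_normal(np.array([-3.0]),
                      a=0.0, b=np.array([0.0]),
                      Z=np.eye(1), L=np.zeros((1, 1)),
                      J=np.zeros((1, 1)), Y=np.eye(1)))   # prints [3.]

The bundle-type method of the paper then minimizes such a model augmented by a quadratic proximal term.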

29 citations

Journal Article (DOI)
TL;DR: This work describes local optimality by first- and second-order necessary and sufficient conditions, which generalize the corresponding Kuhn-Tucker-Karush (KKT) theory for smooth problems and exemplifies the theory on two nonsmooth examples of Nesterov.
Abstract: Any piecewise smooth function that is specified by an evaluation procedure involving smooth elemental functions and piecewise linear functions like min and max can be represented in the so-called abs-normal form. By an extension of algorithmic, or automatic, differentiation, one can then compute certain first- and second-order derivative vectors and matrices that represent a local piecewise linearization and provide additional curvature information. On the basis of these quantities, we characterize local optimality by first- and second-order necessary and sufficient conditions, which generalize the corresponding Kuhn-Tucker-Karush (KKT) theory for smooth problems. The key assumption is the linear independence kink qualification, a generalization of the Linear Independence Constraint Qualification (LICQ) familiar from nonlinear optimization. It implies that the objective locally has a so-called VU-decomposition and renders everything tractable in terms of matrix factorizations and other simple linear algebra operations. Whenever they are violated, the new optimality conditions yield descent directions and thus point the way to a superlinearly convergent generalized Quadratic Program solver, which is currently under development. We exemplify the theory on two nonsmooth examples of Nesterov.
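
For reference, a function in abs-normal form is given by a smooth system of the following shape (a sketch of the standard notation; the partial Jacobian of F with respect to |z| is strictly lower triangular, so the switching vector z is well defined):

\begin{aligned}
z &= F(x, |z|), \\
y &= f(x, |z|).
\end{aligned}

Roughly speaking, the optimality conditions of the paper are stated in terms of the first- and second-order derivative matrices of F and f at a candidate point.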

28 citations

Book Chapter (DOI)
01 Jan 2008
TL;DR: The reverse mode of automatic differentiation allows the computation of gradients at a temporal complexity that is only a small multiple of the temporal complexity of evaluating the function itself, but its memory requirement grows with the operation count; for fixed point iterations this basic approach is not efficient, since the structure of the problem is neglected.
Abstract: The reverse mode of automatic differentiation allows the computation of gradients at a temporal complexity that is only a small multiple of the temporal complexity to evaluate the function itself. However, the memory requirement of the reverse mode in its basic form is proportional to the operation count of the function to be differentiated. For iterative processes consisting of iterations with uniform complexity this means that the memory requirement of the reverse mode grows linearly with the number of iterations. For fixed point iterations this is not efficient, since any structure of the problem is neglected.
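
A toy illustration (not the chapter's method) of why fixed point structure helps: the gradient can be obtained from a second, adjoint fixed point iteration that converges at the same rate as the forward one, so no forward iterates need to be stored. The linear map below is a stand-in for a general contractive iteration.

import numpy as np

# Toy contractive map F(x, u) = A x + u with spectral radius < 1.
A = np.array([[0.5, 0.1], [0.0, 0.4]])

def F(x, u):
    return A @ x + u

# Forward: iterate to the fixed point x* = F(x*, u).
u = np.array([1.0, 2.0])
x = np.zeros(2)
for _ in range(200):
    x = F(x, u)

# Reverse: adjoint fixed point iteration w = (dF/dx)^T w + ybar;
# it converges at the same rate, and no forward iterates are stored.
ybar = np.array([1.0, 0.0])          # seed: gradient of y = x*[0]
w = np.zeros(2)
for _ in range(200):
    w = A.T @ w + ybar

# Since dF/du = I here, the gradient of y w.r.t. u equals w.
print(w)   # matches (I - A)^{-T} ybar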

28 citations

Book Chapter (DOI)
01 Jan 2008
TL;DR: A strategy for the efficient implementation of the reverse mode of AD with trace-based AD tools is developed and implemented with the ADOL-C tool; it combines checkpointing at the outer level with parallel trace generation and evaluation at the inner level.
Abstract: Shared-memory multicore computing platforms are becoming commonplace, and loop parallelization with OpenMP offers an easy way for the user to harness their power. As a result, tools for automatic differentiation (AD) should be able to deal with such codes in a fashion that preserves their parallel nature also for the derivative evaluation. In this paper, we explore this issue using a plasma simulation code. Its structure, which in essence is a time stepping loop with several parallelizable inner loops, is representative of many other computations. Using this code as an example, we develop a strategy for the efficient implementation of the reverse mode of AD with trace-based AD-tools and implement it with the ADOL-C tool. The strategy combines checkpointing at the outer level with parallel trace generation and evaluation at the inner level. We discuss the extensions necessary for ADOL-C to work in a multithreaded environment and the setup necessary for the user code and present performance results on a shared-memory multiprocessor.
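
In outline, the strategy combines the two levels roughly as in the following Python sketch (the actual implementation is in C/C++ with ADOL-C and OpenMP; all names here are hypothetical placeholders):

from concurrent.futures import ThreadPoolExecutor

def reverse_one_time_step(state, state_bar, inner_blocks, adjoint_of_block):
    # The forward state was restored from a checkpoint at the outer level.
    # The independent inner-loop blocks each carry their own trace, so
    # their adjoints can be evaluated concurrently.
    with ThreadPoolExecutor() as pool:
        parts = pool.map(
            lambda block: adjoint_of_block(block, state, state_bar),
            inner_blocks,
        )
    # Accumulate the per-block contributions to the adjoint state.
    total = None
    for p in parts:
        total = p if total is None else total + p
    return total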

25 citations


Cited by
28 Oct 2017
TL;DR: An automatic differentiation module of PyTorch is described: a library designed to enable rapid research on machine learning models, focusing on the differentiation of purely imperative programs with an emphasis on extensibility and low overhead.
Abstract: In this article, we describe an automatic differentiation module of PyTorch — a library designed to enable rapid research on machine learning models. It builds upon a few projects, most notably Lua Torch, Chainer, and HIPS Autograd [4], and provides a high performance environment with easy access to automatic differentiation of models executed on different devices (CPU and GPU). To make prototyping easier, PyTorch does not follow the symbolic approach used in many other deep learning frameworks, but focuses on differentiation of purely imperative programs, with a focus on extensibility and low overhead. Note that this preprint is a draft of certain sections from an upcoming paper covering all PyTorch features.
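
The define-by-run style described above looks like this with the current public API (a minimal example, not taken from the paper):

import torch

# An ordinary imperative computation; autograd records it on the fly.
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = (x ** 2).sum()
y.backward()        # reverse-mode automatic differentiation
print(x.grad)       # tensor([2., 4., 6.]), i.e. dy/dx = 2x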

13,268 citations

Journal Article (DOI)
01 Apr 1988, Nature
TL;DR: In this paper, a sedimentological core and petrographic characterisation of samples from eleven boreholes from the Lower Carboniferous of Bowland Basin (Northwest England) is presented.
Abstract: Deposits of clastic carbonate-dominated (calciclastic) sedimentary slope systems in the rock record have been identified mostly as linearly-consistent carbonate apron deposits, even though most ancient clastic carbonate slope deposits fit the submarine fan systems better. Calciclastic submarine fans are consequently rarely described and are poorly understood. Subsequently, very little is known, especially in mud-dominated calciclastic submarine fan systems. Presented in this study are a sedimentological core and petrographic characterisation of samples from eleven boreholes from the Lower Carboniferous of Bowland Basin (Northwest England) that reveals a >250 m thick calciturbidite complex deposited in a calciclastic submarine fan setting. Seven facies are recognised from core and thin section characterisation and are grouped into three carbonate turbidite sequences. They include: 1) Calciturbidites, comprising mostly high- to low-density, wavy-laminated bioclast-rich facies; 2) low-density densite mudstones, which are characterised by planar laminated and unlaminated mud-dominated facies; and 3) Calcidebrites, which are muddy or hyper-concentrated debris-flow deposits occurring as poorly-sorted, chaotic, mud-supported floatstones. These

9,929 citations

Journal Article (DOI)
TL;DR: The brms package implements Bayesian multilevel models in R using the probabilistic programming language Stan, allowing users to fit linear, robust linear, binomial, Poisson, survival, ordinal, zero-inflated, hurdle, and even non-linear models, all in a multilevel context.
Abstract: The brms package implements Bayesian multilevel models in R using the probabilistic programming language Stan. A wide range of distributions and link functions are supported, allowing users to fit - among others - linear, robust linear, binomial, Poisson, survival, ordinal, zero-inflated, hurdle, and even non-linear models all in a multilevel context. Further modeling options include autocorrelation of the response variable, user defined covariance structures, censored data, as well as meta-analytic standard errors. Prior specifications are flexible and explicitly encourage users to apply prior distributions that actually reflect their beliefs. In addition, model fit can easily be assessed and compared with the Watanabe-Akaike information criterion and leave-one-out cross-validation.
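
As an illustration of the model class, a simple Poisson multilevel model of the kind the package fits (our notation, not the paper's):

\begin{aligned}
y_{ij} &\sim \operatorname{Poisson}(\lambda_{ij}), \\
\log \lambda_{ij} &= \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + u_j,
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2),
\end{aligned}

with user-specified priors on \boldsymbol{\beta} and \sigma_u.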

4,353 citations

Journal Article (DOI)
TL;DR: This work considers approximate Bayesian inference in a popular subset of structured additive regression models, latent Gaussian models, where the latent field is Gaussian, controlled by a few hyperparameters, and the response variables are non-Gaussian; very accurate approximations to the posterior marginals can be computed directly, without Markov chain Monte Carlo sampling.
Abstract: Structured additive regression models are perhaps the most commonly used class of models in statistical applications. It includes, among others, (generalized) linear models, (generalized) additive models, smoothing spline models, state space models, semiparametric regression, spatial and spatiotemporal models, log-Gaussian Cox processes and geostatistical and geoadditive models. We consider approximate Bayesian inference in a popular subset of structured additive regression models, latent Gaussian models, where the latent field is Gaussian, controlled by a few hyperparameters and with non-Gaussian response variables. The posterior marginals are not available in closed form owing to the non-Gaussian response variables. For such models, Markov chain Monte Carlo methods can be implemented, but they are not without problems, in terms of both convergence and computational time. In some practical applications, the extent of these problems is such that Markov chain Monte Carlo sampling is simply not an appropriate tool for routine analysis. We show that, by using an integrated nested Laplace approximation and its simplified version, we can directly compute very accurate approximations to the posterior marginals. The main benefit of these approximations is computational: where Markov chain Monte Carlo algorithms need hours or days to run, our approximations provide more precise estimates in seconds or minutes. Another advantage with our approach is its generality, which makes it possible to perform Bayesian analysis in an automatic, streamlined way, and to compute model comparison criteria and various predictive measures so that models can be compared and the model under study can be challenged.
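
Schematically, the latent Gaussian models treated here have the three-stage hierarchical form below, and the method rests on a Laplace approximation to the hyperparameter posterior (a sketch in the paper's standard notation, with precision matrix Q(\theta) and Gaussian approximation \tilde{\pi}_G):

\begin{aligned}
y_i \mid \mathbf{x}, \boldsymbol{\theta} &\sim \pi(y_i \mid \eta_i, \boldsymbol{\theta}), \\
\mathbf{x} \mid \boldsymbol{\theta} &\sim \mathcal{N}\!\big(\mathbf{0}, \mathbf{Q}(\boldsymbol{\theta})^{-1}\big), \\
\boldsymbol{\theta} &\sim \pi(\boldsymbol{\theta}),
\end{aligned}
\qquad
\tilde{\pi}(\boldsymbol{\theta} \mid \mathbf{y}) \propto
\left.\frac{\pi(\mathbf{x}, \boldsymbol{\theta}, \mathbf{y})}
{\tilde{\pi}_G(\mathbf{x} \mid \boldsymbol{\theta}, \mathbf{y})}\right|_{\mathbf{x}=\mathbf{x}^{*}(\boldsymbol{\theta})}.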

4,164 citations

Journal Article (DOI)
TL;DR: A Bayesian calibration technique is presented which improves on the traditional approach in two respects: the predictions allow for all sources of uncertainty, and they attempt to correct for any inadequacy of the model revealed by a discrepancy between the observed data and the model predictions from even the best-fitting parameter values.
Abstract: We consider prediction and uncertainty analysis for systems which are approximated using complex mathematical models. Such models, implemented as computer codes, are often generic in the sense that by a suitable choice of some of the model's input parameters the code can be used to predict the behaviour of the system in a variety of specific applications. However, in any specific application the values of necessary parameters may be unknown. In this case, physical observations of the system in the specific context are used to learn about the unknown parameters. The process of fitting the model to the observed data by adjusting the parameters is known as calibration. Calibration is typically effected by ad hoc fitting, and after calibration the model is used, with the fitted input values, to predict the future behaviour of the system. We present a Bayesian calibration technique which improves on this traditional approach in two respects. First, the predictions allow for all sources of uncertainty, including the remaining uncertainty over the fitted parameters. Second, they attempt to correct for any inadequacy of the model which is revealed by a discrepancy between the observed data and the model predictions from even the best-fitting parameter values. The method is illustrated by using data from a nuclear radiation release at Tomsk, and from a more complex simulated nuclear accident exercise.
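
The two improvements correspond to the now-standard formulation of this model (notation follows common presentations of the paper: \eta is the simulator with calibration inputs \boldsymbol{\theta}, \delta the model discrepancy, \rho a scale parameter):

z_i = \rho\,\eta(\mathbf{x}_i, \boldsymbol{\theta}) + \delta(\mathbf{x}_i) + \varepsilon_i,
\qquad \varepsilon_i \sim \mathcal{N}(0, \lambda^2),

with Gaussian process priors on \eta and \delta: the posterior over \boldsymbol{\theta} carries the remaining parameter uncertainty, while \delta absorbs the model inadequacy revealed by the data.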

3,745 citations