Journal ArticleDOI

DGM: A deep learning algorithm for solving partial differential equations

TL;DR: A deep learning algorithm similar in spirit to Galerkin methods, using a deep neural network instead of linear combinations of basis functions, is proposed and implemented for American options in up to 100 dimensions.
About: This article was published in the Journal of Computational Physics on 2018-12-15 and is currently open access. It has received 1290 citations to date. The article focuses on the topics: Partial differential equation & Boundary value problem.
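The idea summarized above can be made concrete with a short sketch: train a network u_θ(t, x) so that the PDE residual, initial condition, and boundary conditions, evaluated at randomly sampled points, are all driven to zero, with the derivatives in the residual obtained by automatic differentiation rather than a mesh or basis expansion. The snippet below is a minimal illustration only, written in PyTorch (the paper's own implementation and architecture are not reproduced here), using a 1D heat equation u_t = u_xx with made-up initial/boundary data and placeholder network and optimizer settings.

```python
# Minimal DGM-style sketch (illustrative, not the paper's implementation):
# fit u(t, x) so that the residual u_t - u_xx, the initial condition and
# the boundary conditions vanish at randomly sampled points.
import torch

torch.manual_seed(0)

net = torch.nn.Sequential(                     # small stand-in for u_theta(t, x)
    torch.nn.Linear(2, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 1),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

def pde_residual(t, x):
    """u_t - u_xx via automatic differentiation at sampled (t, x)."""
    u = net(torch.cat([t, x], dim=1))
    u_t = torch.autograd.grad(u.sum(), t, create_graph=True)[0]
    u_x = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    u_xx = torch.autograd.grad(u_x.sum(), x, create_graph=True)[0]
    return u_t - u_xx

for step in range(5000):
    # interior points for the PDE residual
    t = torch.rand(256, 1, requires_grad=True)
    x = torch.rand(256, 1, requires_grad=True)
    # initial condition u(0, x) = sin(pi x) (an arbitrary example)
    x0 = torch.rand(256, 1)
    u0 = net(torch.cat([torch.zeros_like(x0), x0], dim=1))
    # boundary condition u(t, 0) = u(t, 1) = 0
    tb = torch.rand(256, 1)
    ub0 = net(torch.cat([tb, torch.zeros_like(tb)], dim=1))
    ub1 = net(torch.cat([tb, torch.ones_like(tb)], dim=1))

    loss = (pde_residual(t, x) ** 2).mean() \
         + ((u0 - torch.sin(torch.pi * x0)) ** 2).mean() \
         + (ub0 ** 2).mean() + (ub1 ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Note that no training data for u is required: the loss is built entirely from the equation and its side conditions, which is what distinguishes this family of methods from conventional supervised learning.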
Citations
Journal ArticleDOI
01 Jun 2021
TL;DR: Prevailing trends in embedding physics into machine learning are reviewed, current capabilities and limitations are presented, and diverse applications of physics-informed learning are discussed for both forward and inverse problems, including discovering hidden physics and tackling high-dimensional problems.
Abstract: Despite great progress in simulating multiphysics problems using the numerical discretization of partial differential equations (PDEs), one still cannot seamlessly incorporate noisy data into existing algorithms, mesh generation remains complex, and high-dimensional problems governed by parameterized PDEs cannot be tackled. Moreover, solving inverse problems with hidden physics is often prohibitively expensive and requires different formulations and elaborate computer codes. Machine learning has emerged as a promising alternative, but training deep neural networks requires big data, not always available for scientific problems. Instead, such networks can be trained from additional information obtained by enforcing the physical laws (for example, at random points in the continuous space-time domain). Such physics-informed learning integrates (noisy) data and mathematical models, and implements them through neural networks or other kernel-based regression networks. Moreover, it may be possible to design specialized network architectures that automatically satisfy some of the physical invariants for better accuracy, faster training and improved generalization. Here, we review some of the prevailing trends in embedding physics into machine learning, present some of the current capabilities and limitations and discuss diverse applications of physics-informed learning both for forward and inverse problems, including discovering hidden physics and tackling high-dimensional problems. The rapidly developing field of physics-informed learning integrates data and mathematical models seamlessly, enabling accurate inference of realistic and high-dimensional multiphysics problems. This Review discusses the methodology and provides diverse examples and an outlook for further developments.
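The construction described here—fitting (possibly noisy) observations while enforcing the governing equations at random points of the continuous space-time domain—is usually written as a composite loss. A schematic form (the weights λ and counts N are my notation, not the review's) is:

```latex
\mathcal{L}(\theta) =
\frac{\lambda_{\mathrm{data}}}{N_d}\sum_{i=1}^{N_d}\bigl|u_\theta(x_i,t_i)-u_i\bigr|^{2}
\;+\;
\frac{\lambda_{\mathrm{phys}}}{N_r}\sum_{j=1}^{N_r}\bigl|\mathcal{N}[u_\theta](x_j,t_j)\bigr|^{2},
```

where the first term fits the data, the second penalizes the residual of the governing equations N[u] = 0 at randomly drawn collocation points, and the relative weights control how strongly the physics is enforced.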

1,114 citations

Proceedings Article
17 Jun 2020
TL;DR: In this paper, the authors propose to leverage periodic activation functions for implicit neural representations and demonstrate that these networks, dubbed sinusoidal representation networks or Sirens, are ideally suited for representing complex natural signals and their derivatives.
Abstract: Implicitly defined, continuous, differentiable signal representations parameterized by neural networks have emerged as a powerful paradigm, offering many possible benefits over conventional representations. However, current network architectures for such implicit neural representations are incapable of modeling signals with fine detail, and fail to represent a signal's spatial and temporal derivatives, despite the fact that these are essential to many physical signals defined implicitly as the solution to partial differential equations. We propose to leverage periodic activation functions for implicit neural representations and demonstrate that these networks, dubbed sinusoidal representation networks or Sirens, are ideally suited for representing complex natural signals and their derivatives. We analyze Siren activation statistics to propose a principled initialization scheme and demonstrate the representation of images, wavefields, video, sound, and their derivatives. Further, we show how Sirens can be leveraged to solve challenging boundary value problems, such as particular Eikonal equations (yielding signed distance functions), the Poisson equation, and the Helmholtz and wave equations. Lastly, we combine Sirens with hypernetworks to learn priors over the space of Siren functions.
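A sine layer of the kind described is compact to write down. The sketch below follows the commonly cited Siren recipe—sin(ω0·(Wx + b)) with uniformly initialized weights whose range is scaled by sqrt(6/fan_in)/ω0 in hidden layers—but the constants (ω0 = 30, the first-layer range of 1/fan_in) are assumed defaults taken from common implementations and should be checked against the paper.

```python
# Minimal Siren-style sine layer (illustrative; constants are assumed defaults).
import math
import torch

class SineLayer(torch.nn.Module):
    def __init__(self, in_features, out_features, omega_0=30.0, is_first=False):
        super().__init__()
        self.omega_0 = omega_0
        self.linear = torch.nn.Linear(in_features, out_features)
        with torch.no_grad():
            if is_first:
                bound = 1.0 / in_features                      # spread input frequencies
            else:
                bound = math.sqrt(6.0 / in_features) / omega_0  # keep activations stable
            self.linear.weight.uniform_(-bound, bound)

    def forward(self, x):
        return torch.sin(self.omega_0 * self.linear(x))

# e.g. an implicit representation f(x, y) -> value, with a linear output layer
siren = torch.nn.Sequential(
    SineLayer(2, 256, is_first=True),
    SineLayer(256, 256),
    SineLayer(256, 256),
    torch.nn.Linear(256, 1),
)
```

Because sine is smooth, all derivatives of the represented signal exist and can be obtained by automatic differentiation, which is what makes such networks usable inside PDE-type losses.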

1,058 citations

Journal ArticleDOI
TL;DR: Deep learning has achieved remarkable success in diverse applications; however, its use in solving partial differential equations (PDEs) has emerged only recently, as discussed by the authors, who present a comprehensive overview of deep-learning-based methods for PDEs.
Abstract: Deep learning has achieved remarkable success in diverse applications; however, its use in solving partial differential equations (PDEs) has emerged only recently. Here, we present an overview of p...

760 citations

Journal ArticleDOI
TL;DR: This work reviews the recent status of methodologies and techniques related to the construction of digital twins, mostly from a modeling perspective, and provides detailed coverage of the current challenges and enabling technologies along with recommendations and reflections for various stakeholders.
Abstract: Digital twin can be defined as a virtual representation of a physical asset enabled through data and simulators for real-time prediction, optimization, monitoring, controlling, and improved decision making. Recent advances in computational pipelines, multiphysics solvers, artificial intelligence, big data cybernetics, data processing and management tools bring the promise of digital twins and their impact on society closer to reality. Digital twinning is now an important and emerging trend in many applications. Also referred to as a computational megamodel, device shadow, mirrored system, avatar or a synchronized virtual prototype, there can be no doubt that a digital twin plays a transformative role not only in how we design and operate cyber-physical intelligent systems, but also in how we advance the modularity of multi-disciplinary systems to tackle fundamental barriers not addressed by the current, evolutionary modeling practices. In this work, we review the recent status of methodologies and techniques related to the construction of digital twins mostly from a modeling perspective. Our aim is to provide a detailed coverage of the current challenges and enabling technologies along with recommendations and reflections for various stakeholders.

660 citations

Journal ArticleDOI
TL;DR: This paper provides a methodology that incorporates the governing equations of the physical model into the loss/likelihood functions, formulating training as minimization of the reverse Kullback-Leibler (KL) divergence between the model predictive density and the reference conditional density.
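For orientation, the reverse Kullback-Leibler objective referred to above has the generic form below (my notation, not the paper's): the approximate density q_θ is trained on its own samples, so the reference density p only needs to be evaluated—for example through a physics-based likelihood built from the governing equations—rather than sampled.

```latex
\min_{\theta}\; D_{\mathrm{KL}}\bigl(q_\theta \,\|\, p\bigr)
  \;=\; \min_{\theta}\; \mathbb{E}_{z \sim q_\theta}\!\bigl[\log q_\theta(z) - \log p(z)\bigr].
```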

560 citations


Cites background or methods from "DGM: A deep learning algorithm for ..."

  • ...(6) by minimizing the residual loss where the exact derivatives are calculated with automatic differentiation [32, 39, 33, 37]....


  • ...Given one input x = [K(s_1), ..., K(s_{n_s})], most previous works [32, 39, 33, 37] use FC-NNs to represent the solution as...


  • ...analytical and meshfree [33, 34]; (2) the loss function can be derived from the variational form [35, 36]; (3) stochastic gradient descent is used to train the network by randomly sampling mini-batches of inputs (spatial locations and/or time instances) [37, 35]; (4) deeper networks are used to break the curse of dimensionality [38] allowing for several high-dimensional PDEs to be solved with high accuracy and speed [39, 40, 37, 41]; (5) multiscale numerical solvers are enhanced by replacing the linear basis with learned ones with DNNs [42, 43]; (6) surrogate modeling for PDEs [44, 45, 36]....


References
Proceedings Article
01 Jan 2015
TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
Abstract: We introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments. The method is straightforward to implement, is computationally efficient, has little memory requirements, is invariant to diagonal rescaling of the gradients, and is well suited for problems that are large in terms of data and/or parameters. The method is also appropriate for non-stationary objectives and problems with very noisy and/or sparse gradients. The hyper-parameters have intuitive interpretations and typically require little tuning. Some connections to related algorithms, on which Adam was inspired, are discussed. We also analyze the theoretical convergence properties of the algorithm and provide a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework. Empirical results demonstrate that Adam works well in practice and compares favorably to other stochastic optimization methods. Finally, we discuss AdaMax, a variant of Adam based on the infinity norm.
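The "adaptive estimates of lower-order moments" are exponential moving averages of the gradient and its elementwise square; with step size α, decay rates β1, β2, gradient g_t and a small ε, the standard per-parameter update is:

```latex
\begin{aligned}
m_t &= \beta_1 m_{t-1} + (1-\beta_1)\,g_t, \qquad
v_t = \beta_2 v_{t-1} + (1-\beta_2)\,g_t^{2},\\
\hat m_t &= \frac{m_t}{1-\beta_1^{\,t}}, \qquad
\hat v_t = \frac{v_t}{1-\beta_2^{\,t}}, \qquad
\theta_t = \theta_{t-1} - \alpha\,\frac{\hat m_t}{\sqrt{\hat v_t}+\epsilon}.
\end{aligned}
```

The bias-corrected terms compensate for the zero initialization of the moving averages, and the division by the square root of the second-moment estimate gives the invariance to diagonal gradient rescaling mentioned in the abstract.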

111,197 citations


"DGM: A deep learning algorithm for ..." refers methods in this paper

  • ...Parameters are updated using the well-known ADAM algorithm (see [26]) with a decaying learning rate schedule (more details on the learning rate are provided below)....


Journal ArticleDOI
TL;DR: A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
Abstract: Learning to store information over extended time intervals by recurrent backpropagation takes a very long time, mostly because of insufficient, decaying error backflow. We briefly review Hochreiter's (1991) analysis of this problem, then address it by introducing a novel, efficient, gradient based method called long short-term memory (LSTM). Truncating the gradient where this does not do harm, LSTM can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units. Multiplicative gate units learn to open and close access to the constant error flow. LSTM is local in space and time; its computational complexity per time step and weight is O(1). Our experiments with artificial data involve local, distributed, real-valued, and noisy pattern representations. In comparisons with real-time recurrent learning, back propagation through time, recurrent cascade correlation, Elman nets, and neural sequence chunking, LSTM leads to many more successful runs, and learns much faster. LSTM also solves complex, artificial long-time-lag tasks that have never been solved by previous recurrent network algorithms.
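The "multiplicative gate units" and constant-error flow described above correspond, in the now-standard formulation (which includes the forget gate added in later work, so this is not exactly the 1997 variant), to:

```latex
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i), \qquad
f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f), \qquad
o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o),\\
c_t &= f_t \odot c_{t-1} + i_t \odot \tanh(W_c x_t + U_c h_{t-1} + b_c), \qquad
h_t = o_t \odot \tanh(c_t),
\end{aligned}
```

where the additive update of the cell state c_t is the "constant error carousel" along which gradients can flow over many time steps with little attenuation.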

72,897 citations


"DGM: A deep learning algorithm for ..." refers background in this paper

  • ...2) is similar to the architecture for LSTM networks (see [23]) and highway networks (see [46])....


Posted Content
TL;DR: In this article, the authors introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments.
Abstract: We introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments. The method is straightforward to implement, is computationally efficient, has little memory requirements, is invariant to diagonal rescaling of the gradients, and is well suited for problems that are large in terms of data and/or parameters. The method is also appropriate for non-stationary objectives and problems with very noisy and/or sparse gradients. The hyper-parameters have intuitive interpretations and typically require little tuning. Some connections to related algorithms, on which Adam was inspired, are discussed. We also analyze the theoretical convergence properties of the algorithm and provide a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework. Empirical results demonstrate that Adam works well in practice and compares favorably to other stochastic optimization methods. Finally, we discuss AdaMax, a variant of Adam based on the infinity norm.

23,486 citations

Book
07 Jan 2013
TL;DR: This book develops the theory of second-order elliptic partial differential equations, covering maximum principles, Schauder and L^p estimates, Sobolev spaces, the Dirichlet problem for Laplace's and Poisson's equations, and Schauder and Leray-Schauder fixed point methods for quasilinear equations, including operators in divergence form.
Abstract: Chapter 1. Introduction
Part I: Linear Equations
Chapter 2. Laplace's Equation 2.1 The Mean Value Inequalities 2.2 Maximum and Minimum Principle 2.3 The Harnack Inequality 2.4 Green's Representation 2.5 The Poisson Integral 2.6 Convergence Theorems 2.7 Interior Estimates of Derivatives 2.8 The Dirichlet Problem; the Method of Subharmonic Functions 2.9 Capacity Problems
Chapter 3. The Classical Maximum Principle 3.1 The Weak Maximum Principle 3.2 The Strong Maximum Principle 3.3 Apriori Bounds 3.4 Gradient Estimates for Poisson's Equation 3.5 A Harnack Inequality 3.6 Operators in Divergence Form Notes Problems
Chapter 4. Poisson's Equation and Newtonian Potential 4.1 Holder Continuity 4.2 The Dirichlet Problem for Poisson's Equation 4.3 Holder Estimates for the Second Derivatives 4.4 Estimates at the Boundary 4.5 Holder Estimates for the First Derivatives Notes Problems
Chapter 5. Banach and Hilbert Spaces 5.1 The Contraction Mapping 5.2 The Method of Continuity 5.3 The Fredholm Alternative 5.4 Dual Spaces and Adjoints 5.5 Hilbert Spaces 5.6 The Projection Theorem 5.7 The Riesz Representation Theorem 5.8 The Lax-Milgram Theorem 5.9 The Fredholm Alternative in Hilbert Spaces 5.10 Weak Compactness Notes Problems
Chapter 6. Classical Solutions; the Schauder Approach 6.1 The Schauder Interior Estimates 6.2 Boundary and Global Estimates 6.3 The Dirichlet Problem 6.4 Interior and Boundary Regularity 6.5 An Alternative Approach 6.6 Non-Uniformly Elliptic Equations 6.7 Other Boundary Conditions; the Oblique Derivative Problem 6.8 Appendix 1: Interpolation Inequalities 6.9 Appendix 2: Extension Lemmas Notes Problems
Chapter 7. Sobolev Spaces 7.1 L^p Spaces 7.2 Regularization and Approximation by Smooth Functions 7.3 Weak Derivatives 7.4 The Chain Rule 7.5 The W^(k,p) Spaces 7.6 Density Theorems 7.7 Imbedding Theorems 7.8 Potential Estimates and Imbedding Theorems 7.9 The Morrey and John-Nirenberg Estimates 7.10 Compactness Results 7.11 Difference Quotients 7.12 Extension and Interpolation Notes Problems
Chapter 8. Generalized Solutions and Regularity 8.1 The Weak Maximum Principle 8.2 Solvability of the Dirichlet Problem 8.3 Differentiability of Weak Solutions 8.4 Global Regularity 8.5 Global Boundedness of Weak Solutions 8.6 Local Properties of Weak Solutions 8.7 The Strong Maximum Principle 8.8 The Harnack Inequality 8.9 Holder Continuity 8.10 Local Estimates at the Boundary 8.11 Holder Estimates for the First Derivatives 8.12 The Eigenvalue Problem Notes Problems
Chapter 9. Strong Solutions 9.1 Maximum Principles for Strong Solutions 9.2 L^p Estimates: Preliminary Analysis 9.3 The Marcinkiewicz Interpolation Theorem 9.4 The Calderon-Zygmund Inequality 9.5 L^p Estimates 9.6 The Dirichlet Problem 9.7 A Local Maximum Principle 9.8 Holder and Harnack Estimates 9.9 Local Estimates at the Boundary Notes Problems
Part II: Quasilinear Equations
Chapter 10. Maximum and Comparison Principles 10.1 The Comparison Principle 10.2 Maximum Principles 10.3 A Counterexample 10.4 Comparison Principles for Divergence Form Operators 10.5 Maximum Principles for Divergence Form Operators Notes Problems
Chapter 11. Topological Fixed Point Theorems and Their Application 11.1 The Schauder Fixed Point Theorem 11.2 The Leray-Schauder Theorem: a Special Case 11.3 An Application 11.4 The Leray-Schauder Fixed Point Theorem 11.5 Variational Problems Notes
Chapter 12. Equations in Two Variables 12.1 Quasiconformal Mappings 12.2 Holder Gradient Estimates for Linear Equations 12.3 The Dirichlet Problem for Uniformly Elliptic Equations 12.4 Non-Uniformly Elliptic Equations Notes Problems
Chapter 13. Holder Estimates for

18,443 citations


"DGM: A deep learning algorithm for ..." refers background in this paper

  • ...If g ≠ 0 such that g is the trace of some appropriately smooth function, say φ, then one can reduce the inhomogeneous boundary conditions on ∂Ω_T to the homogeneous one by introducing in place of u the new function u − φ, see Section 4 of Chapter V in [27] or Chapter 8 of [19] for details on such considerations.... (This reduction is spelled out in the short sketch after this excerpt.)

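The reduction quoted in the excerpt above is short enough to spell out. For a generic linear operator L (schematic notation; the cited texts treat the relevant elliptic and parabolic settings rigorously), if φ is an appropriately smooth function whose trace on the boundary equals g, then subtracting it homogenizes the boundary condition:

```latex
\begin{cases}
L u = f & \text{in } \Omega_T,\\
u = g & \text{on } \partial\Omega_T,
\end{cases}
\qquad
v := u - \varphi,\ \ \varphi\big|_{\partial\Omega_T} = g
\quad\Longrightarrow\quad
\begin{cases}
L v = f - L\varphi & \text{in } \Omega_T,\\
v = 0 & \text{on } \partial\Omega_T.
\end{cases}
```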

Journal ArticleDOI
TL;DR: It is demonstrated that finite linear combinations of compositions of a fixed, univariate function and a set of affine functionals can uniformly approximate any continuous function of n real variables with support in the unit hypercube.
Abstract: In this paper we demonstrate that finite linear combinations of compositions of a fixed, univariate function and a set of affine functionals can uniformly approximate any continuous function of n real variables with support in the unit hypercube; only mild conditions are imposed on the univariate function. Our results settle an open question about representability in the class of single hidden layer neural networks. In particular, we show that arbitrary decision regions can be arbitrarily well approximated by continuous feedforward neural networks with only a single internal, hidden layer and any continuous sigmoidal nonlinearity. The paper discusses approximation properties of other possible types of nonlinearities that might be implemented by artificial neural networks.
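In symbols, the "finite linear combinations of compositions of a fixed, univariate function and a set of affine functionals" are sums of the form

```latex
G(x) \;=\; \sum_{j=1}^{N} \alpha_j\, \sigma\!\bigl(w_j^{\top} x + b_j\bigr),
\qquad x \in [0,1]^{n},
```

and the result states that such sums are dense in the continuous functions on the unit hypercube: for any continuous target f and ε > 0 there exist N, α_j, w_j, b_j with sup_x |f(x) − G(x)| < ε, for any continuous sigmoidal σ. This is the single-hidden-layer universal approximation property referred to in the abstract.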

12,286 citations