Journal ArticleDOI

DGM: A deep learning algorithm for solving partial differential equations

TL;DR: A deep learning algorithm similar in spirit to Galerkin methods, using a deep neural network instead of linear combinations of basis functions, is proposed and implemented for American options in up to 100 dimensions.
About: This article was published in the Journal of Computational Physics on 2018-12-15 and is currently open access. It has received 1290 citations to date. The article focuses on the topics: Partial differential equation & Boundary value problem.
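The idea summarized above can be made concrete with a short sketch: train a network u_θ(t, x) so that the PDE residual, initial condition, and boundary conditions, evaluated at randomly sampled points, are all driven to zero, with the derivatives in the residual obtained by automatic differentiation rather than a mesh or basis expansion. The snippet below is a minimal illustration only, written in PyTorch (the paper's own implementation and architecture are not reproduced here), using a 1D heat equation u_t = u_xx with made-up initial/boundary data and placeholder network and optimizer settings.

```python
# Minimal DGM-style sketch (illustrative, not the paper's implementation):
# fit u(t, x) so that the residual u_t - u_xx, the initial condition and
# the boundary conditions vanish at randomly sampled points.
import torch

torch.manual_seed(0)

net = torch.nn.Sequential(                     # small stand-in for u_theta(t, x)
    torch.nn.Linear(2, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 1),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

def pde_residual(t, x):
    """u_t - u_xx via automatic differentiation at sampled (t, x)."""
    u = net(torch.cat([t, x], dim=1))
    u_t = torch.autograd.grad(u.sum(), t, create_graph=True)[0]
    u_x = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    u_xx = torch.autograd.grad(u_x.sum(), x, create_graph=True)[0]
    return u_t - u_xx

for step in range(5000):
    # interior points for the PDE residual
    t = torch.rand(256, 1, requires_grad=True)
    x = torch.rand(256, 1, requires_grad=True)
    # initial condition u(0, x) = sin(pi x) (an arbitrary example)
    x0 = torch.rand(256, 1)
    u0 = net(torch.cat([torch.zeros_like(x0), x0], dim=1))
    # boundary condition u(t, 0) = u(t, 1) = 0
    tb = torch.rand(256, 1)
    ub0 = net(torch.cat([tb, torch.zeros_like(tb)], dim=1))
    ub1 = net(torch.cat([tb, torch.ones_like(tb)], dim=1))

    loss = (pde_residual(t, x) ** 2).mean() \
         + ((u0 - torch.sin(torch.pi * x0)) ** 2).mean() \
         + (ub0 ** 2).mean() + (ub1 ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Note that no training data for u is required: the loss is built entirely from the equation and its side conditions, which is what distinguishes this family of methods from conventional supervised learning.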
Citations
Journal ArticleDOI
01 Jun 2021
TL;DR: Prevailing trends in embedding physics into machine learning are reviewed, current capabilities and limitations are presented, and diverse applications of physics-informed learning are discussed for both forward and inverse problems, including discovering hidden physics and tackling high-dimensional problems.
Abstract: Despite great progress in simulating multiphysics problems using the numerical discretization of partial differential equations (PDEs), one still cannot seamlessly incorporate noisy data into existing algorithms, mesh generation remains complex, and high-dimensional problems governed by parameterized PDEs cannot be tackled. Moreover, solving inverse problems with hidden physics is often prohibitively expensive and requires different formulations and elaborate computer codes. Machine learning has emerged as a promising alternative, but training deep neural networks requires big data, not always available for scientific problems. Instead, such networks can be trained from additional information obtained by enforcing the physical laws (for example, at random points in the continuous space-time domain). Such physics-informed learning integrates (noisy) data and mathematical models, and implements them through neural networks or other kernel-based regression networks. Moreover, it may be possible to design specialized network architectures that automatically satisfy some of the physical invariants for better accuracy, faster training and improved generalization. Here, we review some of the prevailing trends in embedding physics into machine learning, present some of the current capabilities and limitations and discuss diverse applications of physics-informed learning both for forward and inverse problems, including discovering hidden physics and tackling high-dimensional problems. The rapidly developing field of physics-informed learning integrates data and mathematical models seamlessly, enabling accurate inference of realistic and high-dimensional multiphysics problems. This Review discusses the methodology and provides diverse examples and an outlook for further developments.
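The construction described here—fitting (possibly noisy) observations while enforcing the governing equations at random points of the continuous space-time domain—is usually written as a composite loss. A schematic form (the weights λ and counts N are my notation, not the review's) is:

```latex
\mathcal{L}(\theta) =
\frac{\lambda_{\mathrm{data}}}{N_d}\sum_{i=1}^{N_d}\bigl|u_\theta(x_i,t_i)-u_i\bigr|^{2}
\;+\;
\frac{\lambda_{\mathrm{phys}}}{N_r}\sum_{j=1}^{N_r}\bigl|\mathcal{N}[u_\theta](x_j,t_j)\bigr|^{2},
```

where the first term fits the data, the second penalizes the residual of the governing equations N[u] = 0 at randomly drawn collocation points, and the relative weights control how strongly the physics is enforced.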

1,114 citations

Proceedings Article
17 Jun 2020
TL;DR: In this paper, the authors propose to leverage periodic activation functions for implicit neural representations and demonstrate that these networks, dubbed sinusoidal representation networks or Sirens, are ideally suited for representing complex natural signals and their derivatives.
Abstract: Implicitly defined, continuous, differentiable signal representations parameterized by neural networks have emerged as a powerful paradigm, offering many possible benefits over conventional representations. However, current network architectures for such implicit neural representations are incapable of modeling signals with fine detail, and fail to represent a signal's spatial and temporal derivatives, despite the fact that these are essential to many physical signals defined implicitly as the solution to partial differential equations. We propose to leverage periodic activation functions for implicit neural representations and demonstrate that these networks, dubbed sinusoidal representation networks or Sirens, are ideally suited for representing complex natural signals and their derivatives. We analyze Siren activation statistics to propose a principled initialization scheme and demonstrate the representation of images, wavefields, video, sound, and their derivatives. Further, we show how Sirens can be leveraged to solve challenging boundary value problems, such as particular Eikonal equations (yielding signed distance functions), the Poisson equation, and the Helmholtz and wave equations. Lastly, we combine Sirens with hypernetworks to learn priors over the space of Siren functions.
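A sine layer of the kind described is compact to write down. The sketch below follows the commonly cited Siren recipe—sin(ω0·(Wx + b)) with uniformly initialized weights whose range is scaled by sqrt(6/fan_in)/ω0 in hidden layers—but the constants (ω0 = 30, the first-layer range of 1/fan_in) are assumed defaults taken from common implementations and should be checked against the paper.

```python
# Minimal Siren-style sine layer (illustrative; constants are assumed defaults).
import math
import torch

class SineLayer(torch.nn.Module):
    def __init__(self, in_features, out_features, omega_0=30.0, is_first=False):
        super().__init__()
        self.omega_0 = omega_0
        self.linear = torch.nn.Linear(in_features, out_features)
        with torch.no_grad():
            if is_first:
                bound = 1.0 / in_features                      # spread input frequencies
            else:
                bound = math.sqrt(6.0 / in_features) / omega_0  # keep activations stable
            self.linear.weight.uniform_(-bound, bound)

    def forward(self, x):
        return torch.sin(self.omega_0 * self.linear(x))

# e.g. an implicit representation f(x, y) -> value, with a linear output layer
siren = torch.nn.Sequential(
    SineLayer(2, 256, is_first=True),
    SineLayer(256, 256),
    SineLayer(256, 256),
    torch.nn.Linear(256, 1),
)
```

Because sine is smooth, all derivatives of the represented signal exist and can be obtained by automatic differentiation, which is what makes such networks usable inside PDE-type losses.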

1,058 citations

Journal ArticleDOI
TL;DR: Deep learning has achieved remarkable success in diverse applications; however, its use in solving partial differential equations (PDEs) has emerged only recently, as discussed by the authors, who present a comprehensive overview of deep-learning-based methods for PDEs.
Abstract: Deep learning has achieved remarkable success in diverse applications; however, its use in solving partial differential equations (PDEs) has emerged only recently. Here, we present an overview of p...

760 citations

Journal ArticleDOI
TL;DR: This work reviews the recent status of methodologies and techniques related to the construction of digital twins, mostly from a modeling perspective, and provides detailed coverage of the current challenges and enabling technologies along with recommendations and reflections for various stakeholders.
Abstract: Digital twin can be defined as a virtual representation of a physical asset enabled through data and simulators for real-time prediction, optimization, monitoring, controlling, and improved decision making. Recent advances in computational pipelines, multiphysics solvers, artificial intelligence, big data cybernetics, data processing and management tools bring the promise of digital twins and their impact on society closer to reality. Digital twinning is now an important and emerging trend in many applications. Also referred to as a computational megamodel, device shadow, mirrored system, avatar or a synchronized virtual prototype, there can be no doubt that a digital twin plays a transformative role not only in how we design and operate cyber-physical intelligent systems, but also in how we advance the modularity of multi-disciplinary systems to tackle fundamental barriers not addressed by the current, evolutionary modeling practices. In this work, we review the recent status of methodologies and techniques related to the construction of digital twins mostly from a modeling perspective. Our aim is to provide a detailed coverage of the current challenges and enabling technologies along with recommendations and reflections for various stakeholders.

660 citations

Journal ArticleDOI
TL;DR: This paper provides a methodology that incorporates the governing equations of the physical model into the loss/likelihood functions, formulating training as minimization of the reverse Kullback-Leibler (KL) divergence between the model predictive density and the reference conditional density.
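For orientation, the reverse Kullback-Leibler objective referred to above has the generic form below (my notation, not the paper's): the approximate density q_θ is trained on its own samples, so the reference density p only needs to be evaluated—for example through a physics-based likelihood built from the governing equations—rather than sampled.

```latex
\min_{\theta}\; D_{\mathrm{KL}}\bigl(q_\theta \,\|\, p\bigr)
  \;=\; \min_{\theta}\; \mathbb{E}_{z \sim q_\theta}\!\bigl[\log q_\theta(z) - \log p(z)\bigr].
```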

560 citations


Cites background or methods from "DGM: A deep learning algorithm for ..."

  • ...(6) by minimizing the residual loss where the exact derivatives are calculated with automatic differentiation [32, 39, 33, 37]....


  • ...Given one input x = [K(s_1), ..., K(s_{n_s})], most previous works [32, 39, 33, 37] use FC-NNs to represent the solution as...


  • ...analytical and meshfree [33, 34]; (2) the loss function can be derived from the variational form [35, 36]; (3) stochastic gradient descent is used to train the network by randomly sampling mini-batches of inputs (spatial locations and/or time instances) [37, 35]; (4) deeper networks are used to break the curse of dimensionality [38] allowing for several high-dimensional PDEs to be solved with high accuracy and speed [39, 40, 37, 41]; (5) multiscale numerical solvers are enhanced by replacing the linear basis with learned ones with DNNs [42, 43]; (6) surrogate modeling for PDEs [44, 45, 36]....


References
Proceedings Article
01 Jan 2015
TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
Abstract: We introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments. The method is straightforward to implement, is computationally efficient, has little memory requirements, is invariant to diagonal rescaling of the gradients, and is well suited for problems that are large in terms of data and/or parameters. The method is also appropriate for non-stationary objectives and problems with very noisy and/or sparse gradients. The hyper-parameters have intuitive interpretations and typically require little tuning. Some connections to related algorithms, on which Adam was inspired, are discussed. We also analyze the theoretical convergence properties of the algorithm and provide a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework. Empirical results demonstrate that Adam works well in practice and compares favorably to other stochastic optimization methods. Finally, we discuss AdaMax, a variant of Adam based on the infinity norm.
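The "adaptive estimates of lower-order moments" are exponential moving averages of the gradient and its elementwise square; with step size α, decay rates β1, β2, gradient g_t and a small ε, the standard per-parameter update is:

```latex
\begin{aligned}
m_t &= \beta_1 m_{t-1} + (1-\beta_1)\,g_t, \qquad
v_t = \beta_2 v_{t-1} + (1-\beta_2)\,g_t^{2},\\
\hat m_t &= \frac{m_t}{1-\beta_1^{\,t}}, \qquad
\hat v_t = \frac{v_t}{1-\beta_2^{\,t}}, \qquad
\theta_t = \theta_{t-1} - \alpha\,\frac{\hat m_t}{\sqrt{\hat v_t}+\epsilon}.
\end{aligned}
```

The bias-corrected terms compensate for the zero initialization of the moving averages, and the division by the square root of the second-moment estimate gives the invariance to diagonal gradient rescaling mentioned in the abstract.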

111,197 citations


"DGM: A deep learning algorithm for ..." refers methods in this paper

  • ...Parameters are updated using the well-known ADAM algorithm (see [26]) with a decaying learning rate schedule (more details on the learning rate are provided below)....


Journal ArticleDOI
TL;DR: A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
Abstract: Learning to store information over extended time intervals by recurrent backpropagation takes a very long time, mostly because of insufficient, decaying error backflow. We briefly review Hochreiter's (1991) analysis of this problem, then address it by introducing a novel, efficient, gradient based method called long short-term memory (LSTM). Truncating the gradient where this does not do harm, LSTM can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units. Multiplicative gate units learn to open and close access to the constant error flow. LSTM is local in space and time; its computational complexity per time step and weight is O(1). Our experiments with artificial data involve local, distributed, real-valued, and noisy pattern representations. In comparisons with real-time recurrent learning, back propagation through time, recurrent cascade correlation, Elman nets, and neural sequence chunking, LSTM leads to many more successful runs, and learns much faster. LSTM also solves complex, artificial long-time-lag tasks that have never been solved by previous recurrent network algorithms.
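The "multiplicative gate units" and constant-error flow described above correspond, in the now-standard formulation (which includes the forget gate added in later work, so this is not exactly the 1997 variant), to:

```latex
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i), \qquad
f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f), \qquad
o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o),\\
c_t &= f_t \odot c_{t-1} + i_t \odot \tanh(W_c x_t + U_c h_{t-1} + b_c), \qquad
h_t = o_t \odot \tanh(c_t),
\end{aligned}
```

where the additive update of the cell state c_t is the "constant error carousel" along which gradients can flow over many time steps with little attenuation.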

72,897 citations


"DGM: A deep learning algorithm for ..." refers background in this paper

  • ...2) is similar to the architecture for LSTM networks (see [23]) and highway networks (see [46])....


Posted Content
TL;DR: In this article, the authors introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments.
Abstract: We introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments. The method is straightforward to implement, is computationally efficient, has little memory requirements, is invariant to diagonal rescaling of the gradients, and is well suited for problems that are large in terms of data and/or parameters. The method is also appropriate for non-stationary objectives and problems with very noisy and/or sparse gradients. The hyper-parameters have intuitive interpretations and typically require little tuning. Some connections to related algorithms, on which Adam was inspired, are discussed. We also analyze the theoretical convergence properties of the algorithm and provide a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework. Empirical results demonstrate that Adam works well in practice and compares favorably to other stochastic optimization methods. Finally, we discuss AdaMax, a variant of Adam based on the infinity norm.

23,486 citations

Book
07 Jan 2013
TL;DR: This book develops the theory of second-order elliptic partial differential equations, covering maximum principles, Schauder and L^p estimates, Sobolev spaces, the Dirichlet problem for Laplace's and Poisson's equations, and Schauder and Leray-Schauder fixed point methods for quasilinear equations, including operators in divergence form.
Abstract: Chapter 1. Introduction
Part I: Linear Equations
Chapter 2. Laplace's Equation 2.1 The Mean Value Inequalities 2.2 Maximum and Minimum Principle 2.3 The Harnack Inequality 2.4 Green's Representation 2.5 The Poisson Integral 2.6 Convergence Theorems 2.7 Interior Estimates of Derivatives 2.8 The Dirichlet Problem; the Method of Subharmonic Functions 2.9 Capacity Problems
Chapter 3. The Classical Maximum Principle 3.1 The Weak Maximum Principle 3.2 The Strong Maximum Principle 3.3 Apriori Bounds 3.4 Gradient Estimates for Poisson's Equation 3.5 A Harnack Inequality 3.6 Operators in Divergence Form Notes Problems
Chapter 4. Poisson's Equation and Newtonian Potential 4.1 Holder Continuity 4.2 The Dirichlet Problem for Poisson's Equation 4.3 Holder Estimates for the Second Derivatives 4.4 Estimates at the Boundary 4.5 Holder Estimates for the First Derivatives Notes Problems
Chapter 5. Banach and Hilbert Spaces 5.1 The Contraction Mapping 5.2 The Method of Continuity 5.3 The Fredholm Alternative 5.4 Dual Spaces and Adjoints 5.5 Hilbert Spaces 5.6 The Projection Theorem 5.7 The Riesz Representation Theorem 5.8 The Lax-Milgram Theorem 5.9 The Fredholm Alternative in Hilbert Spaces 5.10 Weak Compactness Notes Problems
Chapter 6. Classical Solutions; the Schauder Approach 6.1 The Schauder Interior Estimates 6.2 Boundary and Global Estimates 6.3 The Dirichlet Problem 6.4 Interior and Boundary Regularity 6.5 An Alternative Approach 6.6 Non-Uniformly Elliptic Equations 6.7 Other Boundary Conditions; the Oblique Derivative Problem 6.8 Appendix 1: Interpolation Inequalities 6.9 Appendix 2: Extension Lemmas Notes Problems
Chapter 7. Sobolev Spaces 7.1 L^p Spaces 7.2 Regularization and Approximation by Smooth Functions 7.3 Weak Derivatives 7.4 The Chain Rule 7.5 The W^(k,p) Spaces 7.6 Density Theorems 7.7 Imbedding Theorems 7.8 Potential Estimates and Imbedding Theorems 7.9 The Morrey and John-Nirenberg Estimates 7.10 Compactness Results 7.11 Difference Quotients 7.12 Extension and Interpolation Notes Problems
Chapter 8. Generalized Solutions and Regularity 8.1 The Weak Maximum Principle 8.2 Solvability of the Dirichlet Problem 8.3 Differentiability of Weak Solutions 8.4 Global Regularity 8.5 Global Boundedness of Weak Solutions 8.6 Local Properties of Weak Solutions 8.7 The Strong Maximum Principle 8.8 The Harnack Inequality 8.9 Holder Continuity 8.10 Local Estimates at the Boundary 8.11 Holder Estimates for the First Derivatives 8.12 The Eigenvalue Problem Notes Problems
Chapter 9. Strong Solutions 9.1 Maximum Principles for Strong Solutions 9.2 L^p Estimates: Preliminary Analysis 9.3 The Marcinkiewicz Interpolation Theorem 9.4 The Calderon-Zygmund Inequality 9.5 L^p Estimates 9.6 The Dirichlet Problem 9.7 A Local Maximum Principle 9.8 Holder and Harnack Estimates 9.9 Local Estimates at the Boundary Notes Problems
Part II: Quasilinear Equations
Chapter 10. Maximum and Comparison Principles 10.1 The Comparison Principle 10.2 Maximum Principles 10.3 A Counterexample 10.4 Comparison Principles for Divergence Form Operators 10.5 Maximum Principles for Divergence Form Operators Notes Problems
Chapter 11. Topological Fixed Point Theorems and Their Application 11.1 The Schauder Fixed Point Theorem 11.2 The Leray-Schauder Theorem: a Special Case 11.3 An Application 11.4 The Leray-Schauder Fixed Point Theorem 11.5 Variational Problems Notes
Chapter 12. Equations in Two Variables 12.1 Quasiconformal Mappings 12.2 Holder Gradient Estimates for Linear Equations 12.3 The Dirichlet Problem for Uniformly Elliptic Equations 12.4 Non-Uniformly Elliptic Equations Notes Problems
Chapter 13. Holder Estimates for

18,443 citations


"DGM: A deep learning algorithm for ..." refers background in this paper

  • ...If g ≠ 0 such that g is the trace of some appropriately smooth function, say φ, then one can reduce the inhomogeneous boundary conditions on ∂Ω_T to the homogeneous one by introducing in place of u the new function u − φ, see Section 4 of Chapter V in [27] or Chapter 8 of [19] for details on such considerations.... (This reduction is spelled out in the short sketch after this excerpt.)

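The reduction quoted in the excerpt above is short enough to spell out. For a generic linear operator L (schematic notation; the cited texts treat the relevant elliptic and parabolic settings rigorously), if φ is an appropriately smooth function whose trace on the boundary equals g, then subtracting it homogenizes the boundary condition:

```latex
\begin{cases}
L u = f & \text{in } \Omega_T,\\
u = g & \text{on } \partial\Omega_T,
\end{cases}
\qquad
v := u - \varphi,\ \ \varphi\big|_{\partial\Omega_T} = g
\quad\Longrightarrow\quad
\begin{cases}
L v = f - L\varphi & \text{in } \Omega_T,\\
v = 0 & \text{on } \partial\Omega_T.
\end{cases}
```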

Journal ArticleDOI
TL;DR: It is demonstrated that finite linear combinations of compositions of a fixed, univariate function and a set of affine functionals can uniformly approximate any continuous function of n real variables with support in the unit hypercube.
Abstract: In this paper we demonstrate that finite linear combinations of compositions of a fixed, univariate function and a set of affine functionals can uniformly approximate any continuous function of n real variables with support in the unit hypercube; only mild conditions are imposed on the univariate function. Our results settle an open question about representability in the class of single hidden layer neural networks. In particular, we show that arbitrary decision regions can be arbitrarily well approximated by continuous feedforward neural networks with only a single internal, hidden layer and any continuous sigmoidal nonlinearity. The paper discusses approximation properties of other possible types of nonlinearities that might be implemented by artificial neural networks.
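In symbols, the "finite linear combinations of compositions of a fixed, univariate function and a set of affine functionals" are sums of the form

```latex
G(x) \;=\; \sum_{j=1}^{N} \alpha_j\, \sigma\!\bigl(w_j^{\top} x + b_j\bigr),
\qquad x \in [0,1]^{n},
```

and the result states that such sums are dense in the continuous functions on the unit hypercube: for any continuous target f and ε > 0 there exist N, α_j, w_j, b_j with sup_x |f(x) − G(x)| < ε, for any continuous sigmoidal σ. This is the single-hidden-layer universal approximation property referred to in the abstract.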

12,286 citations