Author

Yeonjong Shin

Bio: Yeonjong Shin is an academic researcher from Brown University. The author has contributed to research on topics including artificial neural networks and gradient descent, has an h-index of 11, and has co-authored 24 publications receiving 432 citations. Previous affiliations of Yeonjong Shin include Ohio State University and University of Utah.

Papers
Journal ArticleDOI
TL;DR: In this article, the authors prove that a deep ReLU network will eventually die in probability as the depth goes to infinity, and propose a randomized asymmetric initialization procedure that provably prevents the dying ReLU.
Abstract: The dying ReLU refers to the problem when ReLU neurons become inactive and only output 0 for any input. There are many empirical and heuristic explanations of why ReLU neurons die. However, little is known about its theoretical analysis. In this paper, we rigorously prove that a deep ReLU network will eventually die in probability as the depth goes to infinity. Several methods have been proposed to alleviate the dying ReLU. Perhaps one of the simplest treatments is to modify the initialization procedure. One common way of initializing weights and biases uses symmetric probability distributions, which suffers from the dying ReLU. We thus propose a new initialization procedure, namely, a randomized asymmetric initialization. We prove that the new initialization can effectively prevent the dying ReLU. All parameters required for the new initialization are theoretically designed. Numerical examples are provided to demonstrate the effectiveness of the new initialization procedure.

208 citations
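
As an illustration of the general idea, here is a minimal sketch of an asymmetric initialization for a ReLU layer: for each neuron, one randomly chosen incoming parameter (a weight or the bias) is redrawn from a positive-only distribution so that every neuron has a chance of a positive pre-activation at initialization. The distributions and scalings below are placeholders, not the theoretically designed values from the paper.

```python
import numpy as np

def randomized_asymmetric_init(fan_in, fan_out, rng=None):
    """Hedged sketch of a randomized asymmetric initialization for one ReLU layer."""
    if rng is None:
        rng = np.random.default_rng()
    # Symmetric part: He-style Gaussian weights and zero biases.
    W = rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_out, fan_in))
    b = np.zeros(fan_out)
    params = np.concatenate([W, b[:, None]], axis=1)   # shape (fan_out, fan_in + 1)
    # Asymmetric part: for each neuron, redraw one randomly chosen parameter
    # (weight or bias) from a positive-only distribution.
    idx = rng.integers(0, fan_in + 1, size=fan_out)
    params[np.arange(fan_out), idx] = rng.beta(2.0, 1.0, size=fan_out)
    return params[:, :fan_in], params[:, -1]           # weights, biases

W1, b1 = randomized_asymmetric_init(fan_in=64, fan_out=128)
```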

Journal ArticleDOI
TL;DR: This is the first theoretical work that shows the consistency of PINNs, and it is shown that the sequence of minimizers strongly converges to the PDE solution in $C^0$.
Abstract: Physics informed neural networks (PINNs) are deep learning based techniques for solving partial differential equations (PDEs) encountered in computational science and engineering. Guided by data and physical laws, PINNs find a neural network that approximates the solution to a system of PDEs. Such a neural network is obtained by minimizing a loss function in which any prior knowledge of PDEs and data are encoded. Despite its remarkable empirical success in one, two or three dimensional problems, there is little theoretical justification for PINNs. As the number of data grows, PINNs generate a sequence of minimizers which correspond to a sequence of neural networks. We want to answer the question: Does the sequence of minimizers converge to the solution to the PDE? We consider two classes of PDEs: linear second-order elliptic and parabolic. By adapting the Schauder approach and the maximum principle, we show that the sequence of minimizers strongly converges to the PDE solution in $C^0$. Furthermore, we show that if each minimizer satisfies the initial/boundary conditions, the convergence mode becomes $H^1$. Computational examples are provided to illustrate our theoretical findings. To the best of our knowledge, this is the first theoretical work that shows the consistency of PINNs.

153 citations
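
For readers unfamiliar with the setup, the sketch below shows a typical PINN loss for the 1D Poisson problem -u''(x) = f(x) on (0, 1) with zero boundary values: a PDE-residual term at interior collocation points plus a boundary term. The network, right-hand side, and training loop are illustrative placeholders, not the configurations analyzed in the paper.

```python
import math
import torch

# Small fully-connected network u_theta(x); sizes are illustrative.
net = torch.nn.Sequential(
    torch.nn.Linear(1, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 1),
)
f = lambda x: (math.pi ** 2) * torch.sin(math.pi * x)   # exact solution: sin(pi x)

def pinn_loss(x_interior, x_boundary):
    x = x_interior.requires_grad_(True)
    u = net(x)
    du = torch.autograd.grad(u, x, torch.ones_like(u), create_graph=True)[0]
    d2u = torch.autograd.grad(du, x, torch.ones_like(du), create_graph=True)[0]
    residual = -d2u - f(x)          # PDE residual at the collocation points
    boundary = net(x_boundary)      # enforce u = 0 at x = 0 and x = 1
    return (residual ** 2).mean() + (boundary ** 2).mean()

optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)
for _ in range(2000):
    optimizer.zero_grad()
    loss = pinn_loss(torch.rand(128, 1), torch.tensor([[0.0], [1.0]]))
    loss.backward()
    optimizer.step()
```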

Journal ArticleDOI
TL;DR: This paper rigorously proves that a deep ReLU network will eventually die in probability as the depth goes to infinity, and proposes a new initialization procedure, namely, a randomized asymmetric initialization, which can effectively prevent the dying ReLU.
Abstract: The dying ReLU refers to the problem when ReLU neurons become inactive and only output 0 for any input. There are many empirical and heuristic explanations of why ReLU neurons die. However, little is known about its theoretical analysis. In this paper, we rigorously prove that a deep ReLU network will eventually die in probability as the depth goes to infinity. Several methods have been proposed to alleviate the dying ReLU. Perhaps one of the simplest treatments is to modify the initialization procedure. One common way of initializing weights and biases uses symmetric probability distributions, which suffers from the dying ReLU. We thus propose a new initialization procedure, namely, a randomized asymmetric initialization. We prove that the new initialization can effectively prevent the dying ReLU. All parameters required for the new initialization are theoretically designed. Numerical examples are provided to demonstrate the effectiveness of the new initialization procedure.

100 citations

Posted Content
03 Apr 2020
TL;DR: This is the first theoretical work that shows the consistency of the PINNs methodology and shows that if each minimizer satisfies the initial/boundary conditions, the convergence mode can be improved to $H^1$.
Abstract: Physics informed neural networks (PINNs) are deep learning based techniques for solving partial differential equations (PDEs). Guided by data and physical laws, PINNs find a neural network that approximates the solution to a system of PDEs. Such a neural network is obtained by minimizing a loss function in which any prior knowledge of PDEs and data are encoded. Despite its remarkable empirical success, there is little theoretical justification for PINNs. In this paper, we establish a mathematical foundation of the PINNs methodology. As the number of data grows, PINNs generate a sequence of minimizers which correspond to a sequence of neural networks. We want to answer the question: Does the sequence of minimizers converge to the solution to the PDE? This question is also related to the generalization of PINNs. We consider two classes of PDEs: elliptic and parabolic. By adapting the Schauder approach, we show that the sequence of minimizers strongly converges to the PDE solution in $L^2$. Furthermore, we show that if each minimizer satisfies the initial/boundary conditions, the convergence mode can be improved to $H^1$. Computational examples are provided to illustrate our theoretical findings. To the best of our knowledge, this is the first theoretical work that shows the consistency of the PINNs methodology.

66 citations

Journal ArticleDOI
TL;DR: This paper presents a quasi-optimal sample set for ordinary least squares (OLS) regression, together with its efficient implementation via a greedy algorithm and several numerical examples demonstrating its efficacy.
Abstract: In this paper we present a quasi-optimal sample set for ordinary least squares (OLS) regression. The quasi-optimal set is designed in such a way that, for a given number of samples, it can deliver the regression result as close as possible to the result obtained by a (much) larger set of candidate samples. The quasi-optimal set is determined by maximizing a quantity measuring the mutual column orthogonality and the determinant of the model matrix. This procedure is nonadaptive, in the sense that it does not depend on the sample data. This is useful in practice, as it allows one to determine, prior to the potentially expensive data collection procedure, where to sample the underlying system. In addition to presenting the theoretical motivation of the quasi-optimal set, we also present its efficient implementation via a greedy algorithm, along with several numerical examples to demonstrate its efficacy. Since the quasi-optimal set allows one to obtain a near optimal regression result under any affordable nu...

51 citations
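
To make the greedy implementation concrete, the sketch below selects rows of a candidate design matrix one at a time so as to maximize a determinant-based criterion of the selected submatrix. The paper's actual criterion combines mutual column orthogonality with the determinant; the regularized log-determinant used here is only a stand-in.

```python
import numpy as np

def greedy_select(A, m, eps=1e-10):
    """Greedily pick m rows of A that maximize log det(A_S^T A_S + eps*I)."""
    n, p = A.shape
    selected = []
    for _ in range(m):
        best_gain, best_i = -np.inf, None
        for i in range(n):
            if i in selected:
                continue
            S = A[selected + [i]]
            gain = np.linalg.slogdet(S.T @ S + eps * np.eye(p))[1]
            if gain > best_gain:
                best_gain, best_i = gain, i
        selected.append(best_i)
    return selected

A = np.random.randn(200, 5)        # 200 candidate samples, 5 basis functions
b = np.random.randn(200)           # synthetic observations
S = greedy_select(A, m=15)         # quasi-optimal subset of 15 samples
coef, *_ = np.linalg.lstsq(A[S], b[S], rcond=None)   # OLS on the selected subset
```

Note that the selection uses only the model matrix A, never the data b, which mirrors the nonadaptive property described in the abstract.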


Cited by
Journal ArticleDOI
01 Jun 2021
TL;DR: This review surveys prevailing trends in embedding physics into machine learning, presents current capabilities and limitations, and discusses diverse applications of physics-informed learning for both forward and inverse problems, including discovering hidden physics and tackling high-dimensional problems.
Abstract: Despite great progress in simulating multiphysics problems using the numerical discretization of partial differential equations (PDEs), one still cannot seamlessly incorporate noisy data into existing algorithms, mesh generation remains complex, and high-dimensional problems governed by parameterized PDEs cannot be tackled. Moreover, solving inverse problems with hidden physics is often prohibitively expensive and requires different formulations and elaborate computer codes. Machine learning has emerged as a promising alternative, but training deep neural networks requires big data, not always available for scientific problems. Instead, such networks can be trained from additional information obtained by enforcing the physical laws (for example, at random points in the continuous space-time domain). Such physics-informed learning integrates (noisy) data and mathematical models, and implements them through neural networks or other kernel-based regression networks. Moreover, it may be possible to design specialized network architectures that automatically satisfy some of the physical invariants for better accuracy, faster training and improved generalization. Here, we review some of the prevailing trends in embedding physics into machine learning, present some of the current capabilities and limitations and discuss diverse applications of physics-informed learning both for forward and inverse problems, including discovering hidden physics and tackling high-dimensional problems. The rapidly developing field of physics-informed learning integrates data and mathematical models seamlessly, enabling accurate inference of realistic and high-dimensional multiphysics problems. This Review discusses the methodology and provides diverse examples and an outlook for further developments.

1,114 citations

Journal ArticleDOI
TL;DR: This book develops the theory of best approximations, covering their existence and uniqueness, approximation operators, polynomial interpolation, minimax and least squares approximation, and spline methods.
Abstract: Preface 1. The approximation problem and existence of best approximations 2. The uniqueness of best approximations 3. Approximation operators and some approximating functions 4. Polynomial interpolation 5. Divided differences 6. The uniform convergence of polynomial approximations 7. The theory of minimax approximation 8. The exchange algorithm 9. The convergence of the exchange algorithm 10. Rational approximation by the exchange algorithm 11. Least squares approximation 12. Properties of orthogonal polynomials 13. Approximation of periodic functions 14. The theory of best L1 approximation 15. An example of L1 approximation and the discrete case 16. The order of convergence of polynomial approximations 17. The uniform boundedness theorem 18. Interpolation by piecewise polynomials 19. B-splines 20. Convergence properties of spline approximations 21. Knot positions and the calculation of spline approximations 22. The Peano kernel theorem 23. Natural and perfect splines 24. Optimal interpolation Appendices Index.

841 citations

Journal ArticleDOI
TL;DR: A new deep neural network called DeepONet can learn various mathematical operators with small generalization error, including explicit operators, such as integrals and fractional Laplacians, as well as implicit operators that represent deterministic and stochastic differential equations.
Abstract: It is widely known that neural networks (NNs) are universal approximators of continuous functions. However, a less known but powerful result is that a NN with a single hidden layer can accurately approximate any nonlinear continuous operator. This universal approximation theorem of operators is suggestive of the structure and potential of deep neural networks (DNNs) in learning continuous operators or complex systems from streams of scattered data. Here, we thus extend this theorem to DNNs. We design a new network with small generalization error, the deep operator network (DeepONet), which consists of a DNN for encoding the discrete input function space (branch net) and another DNN for encoding the domain of the output functions (trunk net). We demonstrate that DeepONet can learn various explicit operators, such as integrals and fractional Laplacians, as well as implicit operators that represent deterministic and stochastic differential equations. We study different formulations of the input function space and its effect on the generalization error for 16 different diverse applications. Neural networks are known as universal approximators of continuous functions, but they can also approximate any mathematical operator (mapping a function to another function), which is an important capability for complex systems such as robotics control. A new deep neural network called DeepONet can learn various mathematical operators with small generalization error.

675 citations

Journal ArticleDOI
TL;DR: This work proposes deep operator networks (DeepONets) to learn operators accurately and efficiently from a relatively small dataset, and demonstrates that DeepONet significantly reduces the generalization error compared to the fully-connected networks.
Abstract: While it is widely known that neural networks are universal approximators of continuous functions, a less known and perhaps more powerful result is that a neural network with a single hidden layer can approximate accurately any nonlinear continuous operator. This universal approximation theorem is suggestive of the potential application of neural networks in learning nonlinear operators from data. However, the theorem guarantees only a small approximation error for a sufficiently large network, and does not consider the important optimization and generalization errors. To realize this theorem in practice, we propose deep operator networks (DeepONets) to learn operators accurately and efficiently from a relatively small dataset. A DeepONet consists of two sub-networks, one for encoding the input function at a fixed number of sensors $x_i, i=1,\dots,m$ (branch net), and another for encoding the locations for the output functions (trunk net). We perform systematic simulations for identifying two types of operators, i.e., dynamic systems and partial differential equations, and demonstrate that DeepONet significantly reduces the generalization error compared to the fully-connected networks. We also derive theoretically the dependence of the approximation error in terms of the number of sensors (where the input function is defined) as well as the input function type, and we verify the theorem with computational results. More importantly, we observe high-order error convergence in our computational tests, namely polynomial rates (from half order to fourth order) and even exponential convergence with respect to the training dataset size.

324 citations
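
The branch/trunk structure described above can be written in a few lines. The sketch below is a minimal DeepONet forward pass in which the operator output at a location y is the inner product of a branch embedding of the input function (sampled at m fixed sensors) and a trunk embedding of y; layer widths and depths are illustrative, not the configurations studied in the paper.

```python
import torch

class DeepONet(torch.nn.Module):
    """Minimal DeepONet sketch: branch net for the sampled input function,
    trunk net for the output location, combined by an inner product."""
    def __init__(self, m, p=64):
        super().__init__()
        self.branch = torch.nn.Sequential(
            torch.nn.Linear(m, 128), torch.nn.ReLU(), torch.nn.Linear(128, p))
        self.trunk = torch.nn.Sequential(
            torch.nn.Linear(1, 128), torch.nn.ReLU(), torch.nn.Linear(128, p))
        self.bias = torch.nn.Parameter(torch.zeros(1))

    def forward(self, u_sensors, y):
        # u_sensors: (batch, m) values of the input function at the m sensors
        # y:         (batch, 1) locations at which the output function is queried
        return (self.branch(u_sensors) * self.trunk(y)).sum(-1, keepdim=True) + self.bias

model = DeepONet(m=100)
G_u_at_y = model(torch.randn(8, 100), torch.rand(8, 1))   # shape (8, 1)
```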

Journal ArticleDOI
TL;DR: The proposed XPINN method generalizes the PINN and cPINN approaches in both applicability and domain decomposition, and lends itself efficiently to parallelized computation.
Abstract: We propose a generalized space-time domain decomposition framework for the physics-informed neural networks (PINNs) to solve nonlinear partial differential equations (PDEs) on arbitrary complex-geometry domains. The proposed framework, named eXtended PINNs (XPINNs), further pushes the boundaries of both PINNs as well as conservative PINNs (cPINNs), which is a recently proposed domain decomposition approach in the PINN framework tailored to conservation laws. Compared to PINN, the XPINN method has large representation and parallelization capacity due to the inherent property of deployment of multiple neural networks in the smaller subdomains. Unlike cPINN, XPINN can be extended to any type of PDEs. Moreover, the domain can be decomposed in any arbitrary way (in space and time), which is not possible in cPINN. Thus, XPINN offers both space and time parallelization, thereby reducing the training cost more effectively. In each subdomain, a separate neural network is employed with optimally selected hyperparameters, e.g., depth/width of the network, number and location of residual points, activation function, optimization method, etc. A deep network can be employed in a subdomain with complex solution, whereas a shallow neural network can be used in a subdomain with relatively simple and smooth solutions. We demonstrate the versatility of XPINN by solving both forward and inverse PDE problems, ranging from one-dimensional to three-dimensional problems, from time-dependent to time-independent problems, and from continuous to discontinuous problems, which clearly shows that the XPINN method is promising in many practical problems. The proposed XPINN method is the generalization of PINN and cPINN approaches, both in terms of applicability as well as domain decomposition approach, which efficiently lends itself to parallelized computation. The XPINN code will be available on https://github.com/AmeyaJagtap/XPINNs.

308 citations
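
As a toy illustration of the decomposition, the sketch below splits a 1D Poisson problem into two subdomains, assigns each its own network, and adds interface terms that penalize the mismatch of the two predicted solutions and of their PDE residuals at the interface. The PDE, penalty weights, and network sizes are placeholders, not the paper's setup.

```python
import math
import torch

def mlp():
    return torch.nn.Sequential(
        torch.nn.Linear(1, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1))

net1, net2 = mlp(), mlp()                               # one network per subdomain
f = lambda x: (math.pi ** 2) * torch.sin(math.pi * x)   # -u'' = f, with u = sin(pi x)

def residual(net, x):
    x = x.clone().requires_grad_(True)
    u = net(x)
    du = torch.autograd.grad(u, x, torch.ones_like(u), create_graph=True)[0]
    d2u = torch.autograd.grad(du, x, torch.ones_like(du), create_graph=True)[0]
    return -d2u - f(x)

def xpinn_loss():
    x1 = 0.5 * torch.rand(64, 1)                # collocation points in [0, 0.5]
    x2 = 0.5 + 0.5 * torch.rand(64, 1)          # collocation points in [0.5, 1]
    xi = torch.full((1, 1), 0.5)                # interface point
    loss_pde = (residual(net1, x1) ** 2).mean() + (residual(net2, x2) ** 2).mean()
    loss_bc = (net1(torch.zeros(1, 1)) ** 2 + net2(torch.ones(1, 1)) ** 2).mean()
    loss_iface = ((net1(xi) - net2(xi)) ** 2                              # solution continuity
                  + (residual(net1, xi) - residual(net2, xi)) ** 2).mean()  # residual continuity
    return loss_pde + loss_bc + loss_iface

optimizer = torch.optim.Adam(list(net1.parameters()) + list(net2.parameters()), lr=1e-3)
for _ in range(2000):
    optimizer.zero_grad()
    xpinn_loss().backward()
    optimizer.step()
```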