Posted Content

Solving high-dimensional parabolic PDEs using the tensor train format

TL;DR: In this article, the authors argue that tensor trains provide an appealing approximation framework for parabolic PDEs: the combination of reformulations in terms of backward stochastic differential equations and regression-type methods in the tensor format holds the promise of leveraging latent low-rank structures enabling both compression and efficient computation.
Abstract: High-dimensional partial differential equations (PDEs) are ubiquitous in economics, science and engineering. However, their numerical treatment poses formidable challenges since traditional grid-based methods tend to be frustrated by the curse of dimensionality. In this paper, we argue that tensor trains provide an appealing approximation framework for parabolic PDEs: the combination of reformulations in terms of backward stochastic differential equations and regression-type methods in the tensor format holds the promise of leveraging latent low-rank structures enabling both compression and efficient computation. Following this paradigm, we develop novel iterative schemes, involving either explicit and fast or implicit and accurate updates. We demonstrate in a number of examples that our methods achieve a favorable trade-off between accuracy and computational efficiency in comparison with state-of-the-art neural network based approaches.
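The compression claim above can be made concrete with a small sketch: a function tabulated on a tensor-product grid in d dimensions has n^d entries, but a tensor train stores it as a chain of small 3-way cores, so storage grows only linearly in d. The following numpy sketch is illustrative only (the core shapes, ranks, and random entries are my own assumptions, not the authors' implementation):

```python
import numpy as np

def tt_eval(cores, idx):
    """Evaluate a tensor stored in tensor-train format at a multi-index.

    cores[k] has shape (r_{k-1}, n_k, r_k); the tensor entry is the chain
    product of the d selected core slices, a sequence of small matrix products.
    """
    v = np.ones((1, 1))
    for core, i in zip(cores, idx):
        v = v @ core[:, i, :]          # (1, r_{k-1}) @ (r_{k-1}, r_k)
    return v[0, 0]

d, n, r = 10, 4, 3                     # 10 dimensions, 4 grid points each, rank 3
rng = np.random.default_rng(0)
ranks = [1] + [r] * (d - 1) + [1]
cores = [rng.standard_normal((ranks[k], n, ranks[k + 1])) for k in range(d)]

# Storage: O(d * n * r^2) numbers instead of n**d entries for the full tensor.
print(sum(c.size for c in cores), "TT parameters vs", n**d, "full entries")
print(tt_eval(cores, [0] * d))
```

Evaluating at a point costs only d small matrix products, which is what makes the regression-type updates over sampled trajectories feasible.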
Citations
More filters
DOI
01 Aug 2021
TL;DR: The potential of iterative diffusion optimisation techniques is investigated, with particular attention to applications in importance sampling and rare event simulation, focusing on problems without diffusion control, with linearly controlled drift, and with running costs that depend quadratically on the control.
Abstract: Optimal control of diffusion processes is intimately connected to the problem of solving certain Hamilton–Jacobi–Bellman equations. Building on recent machine learning inspired approaches towards high-dimensional PDEs, we investigate the potential of iterative diffusion optimisation techniques, in particular considering applications in importance sampling and rare event simulation, and focusing on problems without diffusion control, with linearly controlled drift and running costs that depend quadratically on the control. More generally, our methods apply to nonlinear parabolic PDEs with a certain shift invariance. The choice of an appropriate loss function being a central element in the algorithmic design, we develop a principled framework based on divergences between path measures, encompassing various existing methods. Motivated by connections to forward-backward SDEs, we propose and study the novel log-variance divergence, showing favourable properties of corresponding Monte Carlo estimators. The promise of the developed approach is exemplified by a range of high-dimensional and metastable numerical examples.
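The log-variance divergence proposed above can be illustrated in a toy setting: in one common form it is the variance, under a reference sampling measure, of the log Radon-Nikodym derivative log(dP/dQ), which vanishes exactly when P = Q. A numpy sketch with two Gaussians (my own illustrative choice; the paper works with path measures of diffusions):

```python
import numpy as np

def log_variance_divergence(log_weights):
    """Monte Carlo estimator of the log-variance divergence:
    the sample variance of log(dP/dQ) under the sampling measure."""
    return np.var(log_weights, ddof=1)

rng = np.random.default_rng(1)
x = rng.normal(0.0, 1.0, size=100_000)        # samples from Q = N(0, 1)

def log_dP_dQ(x, mu):
    # log density ratio between P = N(mu, 1) and Q = N(0, 1)
    return mu * x - 0.5 * mu**2

print(log_variance_divergence(log_dP_dQ(x, 0.0)))   # 0: P = Q
print(log_variance_divergence(log_dP_dQ(x, 1.0)))   # ~1: P != Q
```

Unlike a plain KL estimator, this variance-of-log-weights form needs no normalizing constants, one of the favourable estimator properties the abstract alludes to.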

44 citations

Posted Content
TL;DR: In this paper, the authors consider a finite horizon control system with its associated Bellman equation; after time-discretization, they obtain a sequence of short time horizon problems, which they call local optimal control problems, and apply two different methods to solve them, one being the well-known policy iteration, in which a fixed-point iteration is required at every time step.
Abstract: Controlling systems of ordinary differential equations (ODEs) is ubiquitous in science and engineering. For finding an optimal feedback controller, the value function and associated fundamental equations such as the Bellman equation and the Hamilton-Jacobi-Bellman (HJB) equation are essential. The numerical treatment of these equations poses formidable challenges due to their non-linearity and their (possibly) high dimensionality. In this paper we consider a finite horizon control system with associated Bellman equation. After a time-discretization, we obtain a sequence of short time horizon problems which we call local optimal control problems. For solving the local optimal control problems we apply two different methods, one being the well-known policy iteration, where a fixed-point iteration is required for every time step. The other algorithm borrows ideas from Model Predictive Control (MPC), by solving the local optimal control problem via open-loop control methods on a short time horizon, allowing us to replace the fixed-point iteration by an adjoint method. For high-dimensional systems we apply low rank hierarchical tensor product approximation/tree-based tensor formats, in particular tensor trains (TT tensors) and multi-polynomials, together with high-dimensional quadrature, e.g. Monte Carlo. We prove a linear error propagation with respect to the time discretization and give numerical evidence by controlling a diffusion equation with an unstable reaction term and an Allen-Cahn equation.
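The time-discretization described above turns the Bellman equation into a backward sequence of one-step local problems. For a scalar linear-quadratic system each local problem has a closed-form minimizer, which gives a minimal sanity-check model; this example and all its constants are my own illustration, not the paper's setting, and the closed-form step is what policy iteration or the MPC-style open-loop solve would replace in the nonlinear case:

```python
import numpy as np

def backward_bellman(a, b, q, r, p_T, n_steps):
    """Solve a time-discretized Bellman equation backwards for the scalar
    linear system x_{t+1} = a*x_t + b*u_t with stage cost q*x^2 + r*u^2.
    Each 'local optimal control problem' is one step of this recursion,
    with quadratic value function V_t(x) = p_t * x^2.
    """
    p = p_T
    gains = []
    for _ in range(n_steps):
        k = -a * b * p / (r + b**2 * p)       # optimal feedback u_t = k * x_t
        p = q + a**2 * p + a * b * p * k      # Riccati update for p_t
        gains.append(k)
    return p, gains[::-1]                     # value at t=0, gains in forward time

p0, gains = backward_bellman(a=1.1, b=1.0, q=1.0, r=1.0, p_T=0.0, n_steps=50)
print(p0, gains[0])   # p0 near the stationary Riccati solution; stabilizing gain
```

The recursion converges quickly to the stationary Riccati solution, and the resulting closed-loop factor a + b*k has modulus below one, i.e. the unstable system (a = 1.1) is stabilized.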

3 citations

Posted Content
TL;DR: In this paper, a method for approximating optimal feedback laws in nonlinear optimal control, based on low-rank tensor train (TT) decompositions and the Dirac-Frenkel variational principle, is proposed.
Abstract: We present a novel method to approximate optimal feedback laws for nonlinear optimal control based on low-rank tensor train (TT) decompositions. The approach is based on the Dirac-Frenkel variational principle with the modification that the optimisation uses an empirical risk. Compared to current state-of-the-art TT methods, our approach exhibits a greatly reduced computational burden while achieving comparable results. A rigorous description of the numerical scheme and demonstrations of its performance are provided.

1 citation

Posted Content
TL;DR: In this article, the authors propose several randomized algorithms for rounding tensors in the Tensor-Train format, with particular speedups for rounding sums of TT-tensors, and compare their empirical accuracy and computational time with deterministic alternatives.
Abstract: The Tensor-Train (TT) format is a highly compact low-rank representation for high-dimensional tensors. TT is particularly useful when representing approximations to the solutions of certain types of parametrized partial differential equations. For many of these problems, computing the solution explicitly would require an infeasible amount of memory and computational time. While the TT format makes these problems tractable, iterative techniques for solving the PDEs must be adapted to perform arithmetic while maintaining the implicit structure. The fundamental operation used to maintain feasible memory and computational time is called rounding, which truncates the internal ranks of a tensor already in TT format. We propose several randomized algorithms for this task that are generalizations of randomized low-rank matrix approximation algorithms and provide significant reduction in computation compared to deterministic TT-rounding algorithms. Randomization is particularly effective in the case of rounding a sum of TT-tensors (where we observe 20x speedup), which is the bottleneck computation in the adaptation of GMRES to vectors in TT format. We present the randomized algorithms and compare their empirical accuracy and computational time with deterministic alternatives.
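Since the proposed TT-rounding algorithms generalize randomized low-rank matrix approximation, the matrix case conveys the core idea: multiply by a random Gaussian test matrix to sketch the range, orthonormalize, and project. A minimal sketch of that matrix analogue (the TT versions apply this idea core-by-core; the sizes and oversampling choice here are illustrative):

```python
import numpy as np

def randomized_low_rank(A, rank, oversample=5, rng=None):
    """Randomized low-rank approximation: Y = A @ Omega captures the
    dominant range of A with high probability; A is then projected onto
    an orthonormal basis Q of that range, giving A ~ Q @ B."""
    rng = rng or np.random.default_rng()
    omega = rng.standard_normal((A.shape[1], rank + oversample))
    Q, _ = np.linalg.qr(A @ omega)        # orthonormal basis for range(A @ Omega)
    B = Q.T @ A                           # small (rank + oversample) x n factor
    return Q, B

# Test matrix of exact rank 10: the sketch recovers it to machine precision.
rng = np.random.default_rng(2)
A = rng.standard_normal((500, 10)) @ rng.standard_normal((10, 400))
Q, B = randomized_low_rank(A, rank=10, rng=rng)
print(np.linalg.norm(A - Q @ B) / np.linalg.norm(A))   # tiny for an exact-rank matrix
```

The payoff is cost: forming A @ Omega and one thin QR is far cheaper than a full SVD, which is the same trade the paper exploits for TT-rounding.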
Posted Content
TL;DR: In this paper, block sparsity is added to the low-rank tensor framework for least-squares problems; each block-sparsity pattern corresponds to a subspace of homogeneous multivariate polynomials, which is used to adapt the ansatz space to align better with known sample complexity results.
Abstract: Low-rank tensors are an established framework for high-dimensional least-squares problems. We propose to extend this framework by including the concept of block-sparsity. In the context of polynomial regression each sparsity pattern corresponds to some subspace of homogeneous multivariate polynomials. This allows us to adapt the ansatz space to align better with known sample complexity results. The resulting method is tested in numerical experiments and demonstrates improved computational resource utilization and sample efficiency.
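The subspace idea above can be illustrated directly in flat (non-tensorized) form: restricting the regression dictionary to monomials of one fixed total degree yields exactly a space of homogeneous polynomials, which is much smaller than the full polynomial ansatz. This sketch is my own illustration of the subspace correspondence only; the paper imposes the sparsity on the blocks of a tensor-train representation:

```python
import itertools
import numpy as np

def homogeneous_features(X, degree):
    """All monomials of total degree exactly `degree` in the columns of X."""
    d = X.shape[1]
    combos = itertools.combinations_with_replacement(range(d), degree)
    return np.column_stack([np.prod(X[:, c], axis=1) for c in combos])

rng = np.random.default_rng(3)
X = rng.standard_normal((200, 3))
y = X[:, 0] * X[:, 1] + 2.0 * X[:, 2] ** 2        # homogeneous of degree 2

Phi = homogeneous_features(X, degree=2)            # only 6 monomials for d = 3
coef, *_ = np.linalg.lstsq(Phi, y, rcond=None)
print(np.linalg.norm(Phi @ coef - y))              # ~0: target lies in the subspace
```

Because the target is itself homogeneous of degree 2, the restricted ansatz fits it exactly while using far fewer coefficients than a full degree-2 polynomial model, mirroring the sample-efficiency argument.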
References
Proceedings ArticleDOI
21 Jul 2017
TL;DR: DenseNet, introduced in this paper, connects each layer to every other layer in a feed-forward fashion, which alleviates the vanishing-gradient problem, strengthens feature propagation, encourages feature reuse, and substantially reduces the number of parameters.
Abstract: Recent work has shown that convolutional networks can be substantially deeper, more accurate, and efficient to train if they contain shorter connections between layers close to the input and those close to the output. In this paper, we embrace this observation and introduce the Dense Convolutional Network (DenseNet), which connects each layer to every other layer in a feed-forward fashion. Whereas traditional convolutional networks with L layers have L connections—one between each layer and its subsequent layer—our network has L(L+1)/2 direct connections. For each layer, the feature-maps of all preceding layers are used as inputs, and its own feature-maps are used as inputs into all subsequent layers. DenseNets have several compelling advantages: they alleviate the vanishing-gradient problem, strengthen feature propagation, encourage feature reuse, and substantially reduce the number of parameters. We evaluate our proposed architecture on four highly competitive object recognition benchmark tasks (CIFAR-10, CIFAR-100, SVHN, and ImageNet). DenseNets obtain significant improvements over the state-of-the-art on most of them, whilst requiring less memory and computation to achieve high performance. Code and pre-trained models are available at https://github.com/liuzhuang13/DenseNet.
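The L(L+1)/2 connection count quoted above follows because layer k consumes k inputs: the original input plus all earlier feature maps. A toy numpy forward pass showing the concatenation pattern (layer count, widths, and ReLU choice are arbitrary illustrations, not the paper's architecture):

```python
import numpy as np

def dense_block_forward(x, weights):
    """Each layer consumes the concatenation of the input and ALL preceding
    feature maps; its own output is appended to that running list."""
    feats = [x]
    for W in weights:
        h = np.concatenate(feats, axis=-1) @ W     # uses every earlier map
        feats.append(np.maximum(h, 0.0))           # ReLU activation
    return feats

L, width = 4, 8
rng = np.random.default_rng(4)
x = rng.standard_normal((2, width))
# Layer k sees (k + 1) * width input features, so its weight matrix grows with k.
weights = [rng.standard_normal(((k + 1) * width, width)) for k in range(L)]
feats = dense_block_forward(x, weights)

# With L layers, the number of direct connections is L * (L + 1) // 2.
print(len(feats), L * (L + 1) // 2)
```

The growing weight shapes make visible why DenseNets keep individual layers narrow: each layer adds only `width` new features, yet every later layer can still read them.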

27,821 citations

Posted Content
TL;DR: In this article, Adam, a first-order gradient-based optimization algorithm for stochastic objective functions based on adaptive estimates of lower-order moments, is introduced.
Abstract: We introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments. The method is straightforward to implement, is computationally efficient, has little memory requirements, is invariant to diagonal rescaling of the gradients, and is well suited for problems that are large in terms of data and/or parameters. The method is also appropriate for non-stationary objectives and problems with very noisy and/or sparse gradients. The hyper-parameters have intuitive interpretations and typically require little tuning. Some connections to related algorithms, on which Adam was inspired, are discussed. We also analyze the theoretical convergence properties of the algorithm and provide a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework. Empirical results demonstrate that Adam works well in practice and compares favorably to other stochastic optimization methods. Finally, we discuss AdaMax, a variant of Adam based on the infinity norm.
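The update rule summarized above maintains exponential moving averages of the gradient and its elementwise square, then applies a bias-corrected step. A minimal sketch of the update from the paper's Algorithm 1 with its default hyper-parameters, applied here to a toy quadratic of my own choosing:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: update biased first/second moment estimates,
    bias-correct them, then take a rescaled gradient step."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad**2
    m_hat = m / (1 - b1**t)                       # bias correction (t starts at 1)
    v_hat = v / (1 - b2**t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Minimize f(theta) = ||theta||^2 / 2, whose gradient is theta itself.
theta = np.array([1.0, -2.0])
m = v = np.zeros_like(theta)
for t in range(1, 5001):
    theta, m, v = adam_step(theta, theta, m, v, t, lr=0.01)
print(theta)   # close to the minimizer [0, 0]
```

Note the invariance the abstract mentions: the step m_hat / sqrt(v_hat) is roughly ±1 per coordinate regardless of the gradient's scale, so the learning rate directly bounds the step size.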

23,486 citations

Book
01 Jan 1987
TL;DR: In this book, the authors develop Brownian motion and stochastic calculus, including representations of continuous local martingales in terms of Brownian motion, the Markov and strong Markov properties, and a generalized version of the Ito rule.
Abstract: 1 Martingales, Stopping Times, and Filtrations.- 1.1. Stochastic Processes and σ-Fields.- 1.2. Stopping Times.- 1.3. Continuous-Time Martingales.- A. Fundamental inequalities.- B. Convergence results.- C. The optional sampling theorem.- 1.4. The Doob-Meyer Decomposition.- 1.5. Continuous, Square-Integrable Martingales.- 1.6. Solutions to Selected Problems.- 1.7. Notes.- 2 Brownian Motion.- 2.1. Introduction.- 2.2. First Construction of Brownian Motion.- A. The consistency theorem.- B. The Kolmogorov-Čentsov theorem.- 2.3. Second Construction of Brownian Motion.- 2.4. The Space C[0, ∞), Weak Convergence, and Wiener Measure.- A. Weak convergence.- B. Tightness.- C. Convergence of finite-dimensional distributions.- D. The invariance principle and the Wiener measure.- 2.5. The Markov Property.- A. Brownian motion in several dimensions.- B. Markov processes and Markov families.- C. Equivalent formulations of the Markov property.- 2.6. The Strong Markov Property and the Reflection Principle.- A. The reflection principle.- B. Strong Markov processes and families.- C. The strong Markov property for Brownian motion.- 2.7. Brownian Filtrations.- A. Right-continuity of the augmented filtration for a strong Markov process.- B. A "universal" filtration.- C. The Blumenthal zero-one law.- 2.8. Computations Based on Passage Times.- A. Brownian motion and its running maximum.- B. Brownian motion on a half-line.- C. Brownian motion on a finite interval.- D. Distributions involving last exit times.- 2.9. The Brownian Sample Paths.- A. Elementary properties.- B. The zero set and the quadratic variation.- C. Local maxima and points of increase.- D. Nowhere differentiability.- E. Law of the iterated logarithm.- F. Modulus of continuity.- 2.10. Solutions to Selected Problems.- 2.11. Notes.- 3 Stochastic Integration.- 3.1. Introduction.- 3.2. Construction of the Stochastic Integral.- A. Simple processes and approximations.- B. 
Construction and elementary properties of the integral.- C. A characterization of the integral.- D. Integration with respect to continuous, local martingales.- 3.3. The Change-of-Variable Formula.- A. The Ito rule.- B. Martingale characterization of Brownian motion.- C. Bessel processes, questions of recurrence.- D. Martingale moment inequalities.- E. Supplementary exercises.- 3.4. Representations of Continuous Martingales in Terms of Brownian Motion.- A. Continuous local martingales as stochastic integrals with respect to Brownian motion.- B. Continuous local martingales as time-changed Brownian motions.- C. A theorem of F. B. Knight.- D. Brownian martingales as stochastic integrals.- E. Brownian functionals as stochastic integrals.- 3.5. The Girsanov Theorem.- A. The basic result.- B. Proof and ramifications.- C. Brownian motion with drift.- D. The Novikov condition.- 3.6. Local Time and a Generalized Ito Rule for Brownian Motion.- A. Definition of local time and the Tanaka formula.- B. The Trotter existence theorem.- C. Reflected Brownian motion and the Skorohod equation.- D. A generalized Ito rule for convex functions.- E. The Engelbert-Schmidt zero-one law.- 3.7. Local Time for Continuous Semimartingales.- 3.8. Solutions to Selected Problems.- 3.9. Notes.- 4 Brownian Motion and Partial Differential Equations.- 4.1. Introduction.- 4.2. Harmonic Functions and the Dirichlet Problem.- A. The mean-value property.- B. The Dirichlet problem.- C. Conditions for regularity.- D. Integral formulas of Poisson.- E. Supplementary exercises.- 4.3. The One-Dimensional Heat Equation.- A. The Tychonoff uniqueness theorem.- B. Nonnegative solutions of the heat equation.- C. Boundary crossing probabilities for Brownian motion.- D. Mixed initial/boundary value problems.- 4.4. The Formulas of Feynman and Kac.- A. The multidimensional formula.- B. The one-dimensional formula.- 4.5. Solutions to selected problems.- 4.6. Notes.- 5 Stochastic Differential Equations.- 5.1. 
Introduction.- 5.2. Strong Solutions.- A. Definitions.- B. The Ito theory.- C. Comparison results and other refinements.- D. Approximations of stochastic differential equations.- E. Supplementary exercises.- 5.3. Weak Solutions.- A. Two notions of uniqueness.- B. Weak solutions by means of the Girsanov theorem.- C. A digression on regular conditional probabilities.- D. Results of Yamada and Watanabe on weak and strong solutions.- 5.4. The Martingale Problem of Stroock and Varadhan.- A. Some fundamental martingales.- B. Weak solutions and martingale problems.- C. Well-posedness and the strong Markov property.- D. Questions of existence.- E. Questions of uniqueness.- F. Supplementary exercises.- 5.5. A Study of the One-Dimensional Case.- A. The method of time change.- B. The method of removal of drift.- C. Feller's test for explosions.- D. Supplementary exercises.- 5.6. Linear Equations.- A. Gauss-Markov processes.- B. Brownian bridge.- C. The general, one-dimensional, linear equation.- D. Supplementary exercises.- 5.7. Connections with Partial Differential Equations.- A. The Dirichlet problem.- B. The Cauchy problem and a Feynman-Kac representation.- C. Supplementary exercises.- 5.8. Applications to Economics.- A. Portfolio and consumption processes.- B. Option pricing.- C. Optimal consumption and investment (general theory).- D. Optimal consumption and investment (constant coefficients).- 5.9. Solutions to Selected Problems.- 5.10. Notes.- 6 P. Levy's Theory of Brownian Local Time.- 6.1. Introduction.- 6.2. Alternate Representations of Brownian Local Time.- A. The process of passage times.- B. Poisson random measures.- C. Subordinators.- D. The process of passage times revisited.- E. The excursion and downcrossing representations of local time.- 6.3. Two Independent Reflected Brownian Motions.- A. The positive and negative parts of a Brownian motion.- B. The first formula of D. Williams.- C. The joint density of (W(t), L(t), Γ+(t)).- 6.4. 
Elastic Brownian Motion.- A. The Feynman-Kac formulas for elastic Brownian motion.- B. The Ray-Knight description of local time.- C. The second formula of D. Williams.- 6.5. An Application: Transition Probabilities of Brownian Motion with Two-Valued Drift.- 6.6. Solutions to Selected Problems.- 6.7. Notes.

8,639 citations

Journal ArticleDOI
TL;DR: In this article, the authors introduce physics-informed neural networks, which are trained to solve supervised learning tasks while respecting any given laws of physics described by general nonlinear partial differential equations.

5,448 citations

Book
18 Dec 1992
TL;DR: In this paper, an introduction to optimal stochastic control for continuous time Markov processes and to the theory of viscosity solutions is given, as well as a concise introduction to two-controller, zero-sum differential games.
Abstract: This book is intended as an introduction to optimal stochastic control for continuous time Markov processes and to the theory of viscosity solutions. The authors approach stochastic control problems by the method of dynamic programming. The text provides an introduction to dynamic programming for deterministic optimal control problems, as well as to the corresponding theory of viscosity solutions. A new Chapter X gives an introduction to the role of stochastic optimal control in portfolio optimization and in pricing derivatives in incomplete markets. Chapter VI of the First Edition has been completely rewritten, to emphasize the relationships between logarithmic transformations and risk sensitivity. A new Chapter XI gives a concise introduction to two-controller, zero-sum differential games. Also covered are controlled Markov diffusions and viscosity solutions of Hamilton-Jacobi-Bellman equations. The authors have tried, through illustrative examples and selective material, to connect stochastic control theory with other mathematical areas (e.g. large deviations theory) and with applications to engineering, physics, management, and finance. In this Second Edition, new material on applications to mathematical finance has been added. Concise introductions to risk-sensitive control theory, nonlinear H-infinity control and differential games are also included.

3,885 citations

Trending Questions
Can high-dimensional backward stochastic differential equations be solved efficiently using tensor train methods?

Yes, high-dimensional backward stochastic differential equations can be efficiently solved using tensor train methods, as they leverage low-rank structures for compression and computational efficiency.