
Showing papers in "Acta Numerica in 2021"


Journal ArticleDOI
TL;DR: Simple gradient methods easily find near-optimal solutions to non-convex optimization problems and, despite giving a near-perfect fit to training data without any explicit effort to control model complexity, exhibit excellent predictive accuracy; this article surveys recent progress in statistical learning theory that illustrates the principles behind these phenomena in simpler settings.
Abstract: The remarkable practical success of deep learning has revealed some major surprises from a theoretical perspective. In particular, simple gradient methods easily find near-optimal solutions to non-convex optimization problems, and despite giving a near-perfect fit to training data without any explicit effort to control model complexity, these methods exhibit excellent predictive accuracy. We conjecture that specific principles underlie these phenomena: that overparametrization allows gradient methods to find interpolating solutions, that these methods implicitly impose regularization, and that overparametrization leads to benign overfitting, that is, accurate predictions despite overfitting training data. In this article, we survey recent progress in statistical learning theory that provides examples illustrating these principles in simpler settings. We first review classical uniform convergence results and why they fall short of explaining aspects of the behaviour of deep learning methods. We give examples of implicit regularization in simple settings, where gradient methods lead to minimal norm functions that perfectly fit the training data. Then we review prediction methods that exhibit benign overfitting, focusing on regression problems with quadratic loss. For these methods, we can decompose the prediction rule into a simple component that is useful for prediction and a spiky component that is useful for overfitting but, in a favourable setting, does not harm prediction accuracy. We focus specifically on the linear regime for neural networks, where the network can be approximated by a linear model. In this regime, we demonstrate the success of gradient flow, and we consider benign overfitting with two-layer networks, giving an exact asymptotic analysis that precisely demonstrates the impact of overparametrization. We conclude by highlighting the key challenges that arise in extending these insights to realistic deep learning settings.

141 citations
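The claim that gradient methods implicitly regularize, leading to minimum-norm functions that perfectly fit the training data, can already be seen in overparametrized linear regression. Below is a minimal NumPy sketch of this phenomenon (an illustration with assumed toy dimensions, step size and iteration count, not code from the paper): gradient descent started from zero stays in the row space of the data matrix and so converges to the minimum-norm interpolant.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 100                       # fewer samples than parameters: overparametrized
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

# Plain gradient descent on the squared loss, started from zero.
w = np.zeros(d)
lr = 1e-2
for _ in range(50_000):
    w -= lr * X.T @ (X @ w - y) / n

# The minimum-norm interpolant, computed directly via the pseudoinverse.
w_star = np.linalg.pinv(X) @ y

print(np.max(np.abs(X @ w - y)))     # ~0: perfect fit to the training data
print(np.linalg.norm(w - w_star))    # ~0: gradient descent found the min-norm solution
```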


Journal ArticleDOI
TL;DR: In this paper, the author attempts to assemble some pieces of the remarkable and still incomplete mathematical mosaic emerging from the efforts to understand the foundations of deep learning, with interpolation and its sibling over-parametrization as the two key themes.
Abstract: In the past decade the mathematical theory of machine learning has lagged far behind the triumphs of deep neural networks on practical challenges. However, the gap between theory and practice is gradually starting to close. In this paper I will attempt to assemble some pieces of the remarkable and still incomplete mathematical mosaic emerging from the efforts to understand the foundations of deep learning. The two key themes will be interpolation and its sibling over-parametrization. Interpolation corresponds to fitting data, even noisy data, exactly. Over-parametrization enables interpolation and provides flexibility to select a suitable interpolating model. As we will see, just as a physical prism separates colours mixed within a ray of light, the figurative prism of interpolation helps to disentangle generalization and optimization properties within the complex picture of modern machine learning. This article is written in the belief and hope that clearer understanding of these issues will bring us a step closer towards a general theory of deep learning and machine learning.

77 citations
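To make the two key themes concrete, here is a small NumPy sketch (my own illustration, not the paper's; the data and polynomial degrees are arbitrary choices): a polynomial with exactly as many coefficients as data points interpolates noisy samples exactly, while an over-parametrized polynomial admits infinitely many interpolants, among which `lstsq` selects the minimum-norm one, so over-parametrization supplies the flexibility to pick a suitable interpolating model.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10
x = np.linspace(0.0, 1.0, n)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(n)   # noisy samples

# Exactly parametrized: the unique degree-(n-1) interpolating polynomial.
V = np.vander(x, n)
c_unique = np.linalg.solve(V, y)

# Over-parametrized: 3n coefficients, infinitely many interpolants;
# np.linalg.lstsq returns the minimum-norm one.
V_wide = np.vander(x, 3 * n)
c_min_norm, *_ = np.linalg.lstsq(V_wide, y, rcond=None)

print(np.max(np.abs(V @ c_unique - y)))          # ~0: fits the data, noise and all
print(np.max(np.abs(V_wide @ c_min_norm - y)))   # ~0: also interpolates exactly
```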


Journal ArticleDOI
TL;DR: In this paper, the authors address the inference of physics models from data, from the perspectives of inverse problems and model reduction, and highlight several illustrative applications to large-scale complex problems across different domains of science and engineering.
Abstract: This article addresses the inference of physics models from data, from the perspectives of inverse problems and model reduction. These fields develop formulations that integrate data into physics-based models while exploiting the fact that many mathematical models of natural and engineered systems exhibit an intrinsically low-dimensional solution manifold. In inverse problems, we seek to infer uncertain components of the inputs from observations of the outputs, while in model reduction we seek low-dimensional models that explicitly capture the salient features of the input–output map through approximation in a low-dimensional subspace. In both cases, the result is a predictive model that reflects data-driven learning yet deeply embeds the underlying physics, and thus can be used for design, control and decision-making, often with quantified uncertainties. We highlight recent developments in scalable and efficient algorithms for inverse problems and model reduction governed by large-scale models in the form of partial differential equations. Several illustrative applications to large-scale complex problems across different domains of science and engineering are provided.

62 citations
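To illustrate the model-reduction side, here is a minimal sketch (my own construction, not from the paper; the random stand-in system, dimensions and reduced order are assumptions for illustration) of projection-based reduction via proper orthogonal decomposition: snapshots of a full-order linear system are compressed by an SVD, and a Galerkin projection yields a small reduced model that tracks the full trajectory.

```python
import numpy as np

rng = np.random.default_rng(2)

# Stand-in full-order model x' = A x (think: a large PDE discretization).
N = 400
A = -np.eye(N) + 0.05 * rng.standard_normal((N, N))
x0 = rng.standard_normal(N)

# Collect snapshots along a trajectory with forward Euler.
dt, steps = 1e-2, 200
snaps = [x0]
for _ in range(steps):
    snaps.append(snaps[-1] + dt * (A @ snaps[-1]))
S = np.column_stack(snaps)

# POD: the leading left singular vectors span a low-dimensional trial subspace.
U, sigma, _ = np.linalg.svd(S, full_matrices=False)
r = 10
V = U[:, :r]
A_r = V.T @ A @ V                    # Galerkin-projected r-by-r reduced operator

# Re-simulate: full model vs. reduced model lifted back to full space.
x, xr = x0.copy(), V.T @ x0
for _ in range(steps):
    x = x + dt * (A @ x)
    xr = xr + dt * (A_r @ xr)
print(np.linalg.norm(x - V @ xr) / np.linalg.norm(x))   # small relative error
```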


Journal ArticleDOI
TL;DR: A review of numerical homogenization methods for multiscale partial differential equations can be found in this paper, where the authors provide a unified variational framework for their design and numerical analysis.
Abstract: Numerical homogenization is a methodology for the computational solution of multiscale partial differential equations. It aims at reducing complex large-scale problems to simplified numerical models valid on some target scale of interest, thereby accounting for the impact of features on smaller scales that are otherwise not resolved. While constructive approaches in the mathematical theory of homogenization are restricted to problems with a clear scale separation, modern numerical homogenization methods can accurately handle problems with a continuum of scales. This paper reviews such approaches embedded in a historical context and provides a unified variational framework for their design and numerical analysis. Apart from prototypical elliptic model problems, the class of partial differential equations covered here includes wave scattering in heterogeneous media and serves as a template for more general multi-physics problems.

44 citations
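For orientation, the prototypical elliptic model problem mentioned above can be stated in a few lines (standard notation, not specific to this paper): find $u_\varepsilon$ such that

\[
  -\nabla\cdot\bigl(A_\varepsilon(x)\,\nabla u_\varepsilon\bigr) = f \quad\text{in } \Omega,
  \qquad u_\varepsilon = 0 \quad\text{on } \partial\Omega,
\]

where the coefficient $A_\varepsilon$ oscillates on a fine scale $\varepsilon$. In the classical periodic setting $A_\varepsilon(x) = A(x/\varepsilon)$, homogenization theory yields an effective equation $-\nabla\cdot(A_0\nabla u_0) = f$ with a constant homogenized coefficient $A_0$; numerical homogenization instead builds problem-adapted coarse spaces directly from $A_\varepsilon$, without assuming periodicity or scale separation.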


Journal ArticleDOI
TL;DR: This paper reviews the mathematical models of liquid crystals and surveys the development of numerical methods for computing their rich configurations.
Abstract: Liquid crystals are a type of soft matter that is intermediate between crystalline solids and isotropic fluids. The study of liquid crystals has made tremendous progress over the past four decades; it is of great importance for fundamental scientific research and has widespread applications in industry. In this paper we review the mathematical models of liquid crystals and the connections between them, and survey the development of numerical methods for finding the rich configurations of liquid crystals.

34 citations
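As one concrete instance of the models under review (the Landau–de Gennes theory is a standard continuum model for nematic liquid crystals; the paper also covers other model classes), the one-constant Landau–de Gennes free energy of a symmetric, traceless tensor field $Q$ is

\[
  F[Q] = \int_\Omega \Bigl( \frac{L}{2}\,|\nabla Q|^2
  + \frac{a}{2}\,\operatorname{tr}(Q^2)
  - \frac{b}{3}\,\operatorname{tr}(Q^3)
  + \frac{c}{4}\,\bigl(\operatorname{tr}(Q^2)\bigr)^2 \Bigr)\,\mathrm{d}x,
\]

and the rich configurations referred to above arise as minimizers or critical points of such energies, which is what the surveyed numerical methods compute.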


Journal ArticleDOI
TL;DR: As this paper explains, the notion of a tensor captures three great ideas: equivariance, multilinearity and separability; but trying to be three things at once makes the notion difficult to understand.
Abstract: The notion of a tensor captures three great ideas: equivariance, multilinearity, separability. But trying to be three things at once makes the notion difficult to understand. We will explain tensors in an accessible and elementary way through the lens of linear algebra and numerical linear algebra, elucidated with examples from computational and applied mathematics.

15 citations
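Two of the three ideas are easy to exhibit computationally. The following small NumPy sketch (my own illustration, with arbitrary dimensions) treats an order-3 array as a trilinear map, checks linearity in one argument, and forms a separable (rank-1) tensor as an outer product.

```python
import numpy as np

rng = np.random.default_rng(3)
T = rng.standard_normal((3, 4, 5))   # an order-3 tensor
u, u2 = rng.standard_normal(3), rng.standard_normal(3)
v, w = rng.standard_normal(4), rng.standard_normal(5)

# Multilinearity: T(u, v, w) = sum_{ijk} T_ijk u_i v_j w_k, linear in each slot.
def trilinear(T, u, v, w):
    return np.einsum('ijk,i,j,k->', T, u, v, w)

a, b = 2.0, -0.5
lhs = trilinear(T, a * u + b * u2, v, w)
rhs = a * trilinear(T, u, v, w) + b * trilinear(T, u2, v, w)
print(abs(lhs - rhs))                # ~0: linear in the first argument

# Separability: the simplest tensors are outer products, (u ⊗ v ⊗ w)_ijk = u_i v_j w_k.
rank1 = np.einsum('i,j,k->ijk', u, v, w)
print(rank1.shape)                   # (3, 4, 5)
```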


Journal ArticleDOI
TL;DR: This paper presents an overview of the basic theory of optimal transportation, its modern extensions and recent algorithmic advances, with selected modelling and numerical applications illustrating the impact of optimal transportation in numerical analysis.
Abstract: We present an overview of the basic theory, modern optimal transportation extensions and recent algorithmic advances. Selected modelling and numerical applications illustrate the impact of optimal transportation in numerical analysis.

11 citations
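Among the algorithmic advances alluded to, entropic regularization solved by Sinkhorn iterations is one widely used approach. The following NumPy sketch (my own, with arbitrary problem size and regularization strength, not code from the paper) computes an approximate optimal coupling between two discrete measures under a quadratic cost.

```python
import numpy as np

rng = np.random.default_rng(4)

# Two discrete measures on [0, 1] with uniform weights, quadratic cost.
n = 50
x = np.sort(rng.uniform(0.0, 1.0, n))
y = np.sort(rng.uniform(0.0, 1.0, n))
mu = np.full(n, 1.0 / n)
nu = np.full(n, 1.0 / n)
C = (x[:, None] - y[None, :]) ** 2

# Sinkhorn iterations for the entropy-regularized transport problem.
eps = 1e-2                            # regularization strength
K = np.exp(-C / eps)
u = np.ones(n)
for _ in range(2000):
    v = nu / (K.T @ u)
    u = mu / (K @ v)

P = u[:, None] * K * v[None, :]       # approximate optimal transport plan
print(P.sum())                        # ~1: P is a valid coupling
print((C * P).sum())                  # approximate transport cost
```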