
Showing papers in "Acta Numerica in 2021"


Journal ArticleDOI
TL;DR: Simple gradient methods easily find near-optimal solutions to non-convex optimization problems and, despite giving a near-perfect fit to training data without any explicit effort to control model complexity, exhibit excellent predictive accuracy; this article surveys recent progress in statistical learning theory that illustrates the principles behind these phenomena in simpler settings.
Abstract: The remarkable practical success of deep learning has revealed some major surprises from a theoretical perspective. In particular, simple gradient methods easily find near-optimal solutions to non-convex optimization problems, and despite giving a near-perfect fit to training data without any explicit effort to control model complexity, these methods exhibit excellent predictive accuracy. We conjecture that specific principles underlie these phenomena: that overparametrization allows gradient methods to find interpolating solutions, that these methods implicitly impose regularization, and that overparametrization leads to benign overfitting, that is, accurate predictions despite overfitting training data. In this article, we survey recent progress in statistical learning theory that provides examples illustrating these principles in simpler settings. We first review classical uniform convergence results and why they fall short of explaining aspects of the behaviour of deep learning methods. We give examples of implicit regularization in simple settings, where gradient methods lead to minimal norm functions that perfectly fit the training data. Then we review prediction methods that exhibit benign overfitting, focusing on regression problems with quadratic loss. For these methods, we can decompose the prediction rule into a simple component that is useful for prediction and a spiky component that is useful for overfitting but, in a favourable setting, does not harm prediction accuracy. We focus specifically on the linear regime for neural networks, where the network can be approximated by a linear model. In this regime, we demonstrate the success of gradient flow, and we consider benign overfitting with two-layer networks, giving an exact asymptotic analysis that precisely demonstrates the impact of overparametrization. We conclude by highlighting the key challenges that arise in extending these insights to realistic deep learning settings.

141 citations
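The claim that gradient methods implicitly regularize, leading to minimum-norm functions that perfectly fit the training data, can already be seen in overparametrized linear regression. Below is a minimal NumPy sketch of this phenomenon (an illustration with assumed toy dimensions, step size and iteration count, not code from the paper): gradient descent started from zero stays in the row space of the data matrix and so converges to the minimum-norm interpolant.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 100                       # fewer samples than parameters: overparametrized
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

# Plain gradient descent on the squared loss, started from zero.
w = np.zeros(d)
lr = 1e-2
for _ in range(50_000):
    w -= lr * X.T @ (X @ w - y) / n

# The minimum-norm interpolant, computed directly via the pseudoinverse.
w_star = np.linalg.pinv(X) @ y

print(np.max(np.abs(X @ w - y)))     # ~0: perfect fit to the training data
print(np.linalg.norm(w - w_star))    # ~0: gradient descent found the min-norm solution
```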


Journal ArticleDOI
TL;DR: In this paper, the author attempts to assemble some pieces of the remarkable and still incomplete mathematical mosaic emerging from the efforts to understand the foundations of deep learning, with interpolation and its sibling over-parametrization as the two key themes.
Abstract: In the past decade the mathematical theory of machine learning has lagged far behind the triumphs of deep neural networks on practical challenges. However, the gap between theory and practice is gradually starting to close. In this paper I will attempt to assemble some pieces of the remarkable and still incomplete mathematical mosaic emerging from the efforts to understand the foundations of deep learning. The two key themes will be interpolation and its sibling over-parametrization. Interpolation corresponds to fitting data, even noisy data, exactly. Over-parametrization enables interpolation and provides flexibility to select a suitable interpolating model. As we will see, just as a physical prism separates colours mixed within a ray of light, the figurative prism of interpolation helps to disentangle generalization and optimization properties within the complex picture of modern machine learning. This article is written in the belief and hope that clearer understanding of these issues will bring us a step closer towards a general theory of deep learning and machine learning.

77 citations
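To make the two key themes concrete, here is a small NumPy sketch (my own illustration, not the paper's; the data and polynomial degrees are arbitrary choices): a polynomial with exactly as many coefficients as data points interpolates noisy samples exactly, while an over-parametrized polynomial admits infinitely many interpolants, among which `lstsq` selects the minimum-norm one, so over-parametrization supplies the flexibility to pick a suitable interpolating model.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10
x = np.linspace(0.0, 1.0, n)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(n)   # noisy samples

# Exactly parametrized: the unique degree-(n-1) interpolating polynomial.
V = np.vander(x, n)
c_unique = np.linalg.solve(V, y)

# Over-parametrized: 3n coefficients, infinitely many interpolants;
# np.linalg.lstsq returns the minimum-norm one.
V_wide = np.vander(x, 3 * n)
c_min_norm, *_ = np.linalg.lstsq(V_wide, y, rcond=None)

print(np.max(np.abs(V @ c_unique - y)))          # ~0: fits the data, noise and all
print(np.max(np.abs(V_wide @ c_min_norm - y)))   # ~0: also interpolates exactly
```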


Journal ArticleDOI
TL;DR: In this paper, the authors address the inference of physics models from data, from the perspectives of inverse problems and model reduction, and highlight several illustrative applications to large-scale complex problems across different domains of science and engineering.
Abstract: This article addresses the inference of physics models from data, from the perspectives of inverse problems and model reduction. These fields develop formulations that integrate data into physics-based models while exploiting the fact that many mathematical models of natural and engineered systems exhibit an intrinsically low-dimensional solution manifold. In inverse problems, we seek to infer uncertain components of the inputs from observations of the outputs, while in model reduction we seek low-dimensional models that explicitly capture the salient features of the input–output map through approximation in a low-dimensional subspace. In both cases, the result is a predictive model that reflects data-driven learning yet deeply embeds the underlying physics, and thus can be used for design, control and decision-making, often with quantified uncertainties. We highlight recent developments in scalable and efficient algorithms for inverse problems and model reduction governed by large-scale models in the form of partial differential equations. Several illustrative applications to large-scale complex problems across different domains of science and engineering are provided.

62 citations
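To illustrate the model-reduction side, here is a minimal sketch (my own construction, not from the paper; the random stand-in system, dimensions and reduced order are assumptions for illustration) of projection-based reduction via proper orthogonal decomposition: snapshots of a full-order linear system are compressed by an SVD, and a Galerkin projection yields a small reduced model that tracks the full trajectory.

```python
import numpy as np

rng = np.random.default_rng(2)

# Stand-in full-order model x' = A x (think: a large PDE discretization).
N = 400
A = -np.eye(N) + 0.05 * rng.standard_normal((N, N))
x0 = rng.standard_normal(N)

# Collect snapshots along a trajectory with forward Euler.
dt, steps = 1e-2, 200
snaps = [x0]
for _ in range(steps):
    snaps.append(snaps[-1] + dt * (A @ snaps[-1]))
S = np.column_stack(snaps)

# POD: the leading left singular vectors span a low-dimensional trial subspace.
U, sigma, _ = np.linalg.svd(S, full_matrices=False)
r = 10
V = U[:, :r]
A_r = V.T @ A @ V                    # Galerkin-projected r-by-r reduced operator

# Re-simulate: full model vs. reduced model lifted back to full space.
x, xr = x0.copy(), V.T @ x0
for _ in range(steps):
    x = x + dt * (A @ x)
    xr = xr + dt * (A_r @ xr)
print(np.linalg.norm(x - V @ xr) / np.linalg.norm(x))   # small relative error
```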


Journal ArticleDOI
TL;DR: A review of numerical homogenization methods for multiscale partial differential equations can be found in this paper, where the authors provide a unified variational framework for their design and numerical analysis.
Abstract: Numerical homogenization is a methodology for the computational solution of multiscale partial differential equations. It aims at reducing complex large-scale problems to simplified numerical models valid on some target scale of interest, thereby accounting for the impact of features on smaller scales that are otherwise not resolved. While constructive approaches in the mathematical theory of homogenization are restricted to problems with a clear scale separation, modern numerical homogenization methods can accurately handle problems with a continuum of scales. This paper reviews such approaches embedded in a historical context and provides a unified variational framework for their design and numerical analysis. Apart from prototypical elliptic model problems, the class of partial differential equations covered here includes wave scattering in heterogeneous media and serves as a template for more general multi-physics problems.

44 citations
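For orientation, the prototypical elliptic model problem mentioned above can be stated in a few lines (standard notation, not specific to this paper): find $u_\varepsilon$ such that

\[
  -\nabla\cdot\bigl(A_\varepsilon(x)\,\nabla u_\varepsilon\bigr) = f \quad\text{in } \Omega,
  \qquad u_\varepsilon = 0 \quad\text{on } \partial\Omega,
\]

where the coefficient $A_\varepsilon$ oscillates on a fine scale $\varepsilon$. In the classical periodic setting $A_\varepsilon(x) = A(x/\varepsilon)$, homogenization theory yields an effective equation $-\nabla\cdot(A_0\nabla u_0) = f$ with a constant homogenized coefficient $A_0$; numerical homogenization instead builds problem-adapted coarse spaces directly from $A_\varepsilon$, without assuming periodicity or scale separation.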


Journal ArticleDOI
TL;DR: This paper reviews the mathematical models of liquid crystals and surveys the development of numerical methods for computing their rich configurations.
Abstract: Liquid crystals are a type of soft matter that is intermediate between crystalline solids and isotropic fluids. The study of liquid crystals has made tremendous progress over the past four decades; it is of great importance for fundamental scientific research and has widespread applications in industry. In this paper we review the mathematical models of liquid crystals and the connections between them, and survey the development of numerical methods for finding the rich configurations of liquid crystals.

34 citations
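As one concrete instance of the models under review (the Landau–de Gennes theory is a standard continuum model for nematic liquid crystals; the paper also covers other model classes), the one-constant Landau–de Gennes free energy of a symmetric, traceless tensor field $Q$ is

\[
  F[Q] = \int_\Omega \Bigl( \frac{L}{2}\,|\nabla Q|^2
  + \frac{a}{2}\,\operatorname{tr}(Q^2)
  - \frac{b}{3}\,\operatorname{tr}(Q^3)
  + \frac{c}{4}\,\bigl(\operatorname{tr}(Q^2)\bigr)^2 \Bigr)\,\mathrm{d}x,
\]

and the rich configurations referred to above arise as minimizers or critical points of such energies, which is what the surveyed numerical methods compute.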


Journal ArticleDOI
TL;DR: As this paper explains, the notion of a tensor captures three great ideas: equivariance, multilinearity and separability; but trying to be three things at once makes the notion difficult to understand.
Abstract: The notion of a tensor captures three great ideas: equivariance, multilinearity, separability. But trying to be three things at once makes the notion difficult to understand. We will explain tensors in an accessible and elementary way through the lens of linear algebra and numerical linear algebra, elucidated with examples from computational and applied mathematics.

15 citations
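Two of the three ideas are easy to exhibit computationally. The following small NumPy sketch (my own illustration, with arbitrary dimensions) treats an order-3 array as a trilinear map, checks linearity in one argument, and forms a separable (rank-1) tensor as an outer product.

```python
import numpy as np

rng = np.random.default_rng(3)
T = rng.standard_normal((3, 4, 5))   # an order-3 tensor
u, u2 = rng.standard_normal(3), rng.standard_normal(3)
v, w = rng.standard_normal(4), rng.standard_normal(5)

# Multilinearity: T(u, v, w) = sum_{ijk} T_ijk u_i v_j w_k, linear in each slot.
def trilinear(T, u, v, w):
    return np.einsum('ijk,i,j,k->', T, u, v, w)

a, b = 2.0, -0.5
lhs = trilinear(T, a * u + b * u2, v, w)
rhs = a * trilinear(T, u, v, w) + b * trilinear(T, u2, v, w)
print(abs(lhs - rhs))                # ~0: linear in the first argument

# Separability: the simplest tensors are outer products, (u ⊗ v ⊗ w)_ijk = u_i v_j w_k.
rank1 = np.einsum('i,j,k->ijk', u, v, w)
print(rank1.shape)                   # (3, 4, 5)
```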


Journal ArticleDOI
TL;DR: This paper presents an overview of the basic theory of optimal transportation, its modern extensions and recent algorithmic advances, with selected modelling and numerical applications illustrating the impact of optimal transportation in numerical analysis.
Abstract: We present an overview of the basic theory, modern optimal transportation extensions and recent algorithmic advances. Selected modelling and numerical applications illustrate the impact of optimal transportation in numerical analysis.

11 citations
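Among the algorithmic advances alluded to, entropic regularization solved by Sinkhorn iterations is one widely used approach. The following NumPy sketch (my own, with arbitrary problem size and regularization strength, not code from the paper) computes an approximate optimal coupling between two discrete measures under a quadratic cost.

```python
import numpy as np

rng = np.random.default_rng(4)

# Two discrete measures on [0, 1] with uniform weights, quadratic cost.
n = 50
x = np.sort(rng.uniform(0.0, 1.0, n))
y = np.sort(rng.uniform(0.0, 1.0, n))
mu = np.full(n, 1.0 / n)
nu = np.full(n, 1.0 / n)
C = (x[:, None] - y[None, :]) ** 2

# Sinkhorn iterations for the entropy-regularized transport problem.
eps = 1e-2                            # regularization strength
K = np.exp(-C / eps)
u = np.ones(n)
for _ in range(2000):
    v = nu / (K.T @ u)
    u = mu / (K @ v)

P = u[:, None] * K * v[None, :]       # approximate optimal transport plan
print(P.sum())                        # ~1: P is a valid coupling
print((C * P).sum())                  # approximate transport cost
```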