Journal ArticleDOI

Enforcing Analytic Constraints in Neural Networks Emulating Physical Systems.

04 Mar 2021-Physical Review Letters (American Physical Society)-Vol. 126, Iss: 9, pp 098302-098302
TL;DR: This work introduces a systematic way of enforcing nonlinear analytic constraints in neural networks via constraints in the architecture or the loss function, which reduces errors in the subsets of the outputs most impacted by the constraints.
Abstract: Neural networks can emulate nonlinear physical systems with high accuracy, yet they may produce physically inconsistent results when violating fundamental constraints. Here, we introduce a systematic way of enforcing nonlinear analytic constraints in neural networks via constraints in the architecture or the loss function. Applied to convective processes for climate modeling, architectural constraints enforce conservation laws to within machine precision without degrading performance. Enforcing constraints also reduces errors in the subsets of the outputs most impacted by the constraints.
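The abstract contrasts two ways of imposing constraints: hard (in the architecture) and soft (in the loss function). As an illustration, not the paper's exact architecture, here is a minimal NumPy sketch of both for a linear conservation constraint C y = b; the function names and the minimum-norm projection are illustrative assumptions.

```python
import numpy as np

def enforce_linear_constraint(y_raw, C, b):
    """Hard constraint: project raw outputs onto the affine set {y : C @ y = b}.

    Uses the minimum-norm correction y = y_raw - C^T (C C^T)^{-1} (C y_raw - b),
    so the returned vector satisfies the constraint to machine precision.
    """
    residual = C @ y_raw - b
    correction = C.T @ np.linalg.solve(C @ C.T, residual)
    return y_raw - correction

def constraint_penalty(y_raw, C, b, weight=1.0):
    """Soft constraint: add a squared constraint residual to the training loss."""
    return weight * np.sum((C @ y_raw - b) ** 2)
```

A constraint layer of this kind can be appended after the last trainable layer, whereas the penalty leaves the architecture untouched but only encourages, rather than guarantees, conservation.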
Citations
Posted Content
01 Jan 2020
TL;DR: This work provides a structured overview of techniques for integrating machine learning with physics-based modeling, describing the classes of methodologies used to construct physics-guided machine learning models and hybrid physics-machine learning frameworks from a machine learning standpoint.
Abstract: In this manuscript, we provide a structured and comprehensive overview of techniques to integrate machine learning with physics-based modeling. First, we provide a summary of application areas for which these approaches have been applied. Then, we describe classes of methodologies used to construct physics-guided machine learning models and hybrid physics-machine learning frameworks from a machine learning standpoint. With this foundation, we then provide a systematic organization of these existing techniques and discuss ideas for future research.

230 citations


Cites background from "Enforcing Analytic Constraints in N..."


  • ...show that enforcing energy conservation laws improves prediction when emulating cloud processes [23, 24]....


  • ...Most of the existing work uses standard black box ML models for parameterization, but there is an increasing interest in integrating physics in the ML models [23], as it has the potential to make them more robust and generalizable to unseen scenarios as well as reduce the number of training samples needed for training....


Journal ArticleDOI
TL;DR: This commentary is a call to action for the hydrology community to focus on developing a quantitative understanding of where and when hydrological process understanding is valuable in a modeling discipline increasingly dominated by machine learning.
Abstract: We suggest that there is a potential danger to the hydrological sciences community in not recognizing how transformative machine learning will be for the future of hydrological modeling. Given the recent success of machine learning applied to modeling problems, it is unclear what the role of hydrological theory might be in the future. We suggest that a central challenge in hydrology right now should be to clearly delineate where and when hydrological theory adds value to prediction systems. Lessons learned from the history of hydrological modeling motivate several clear next steps toward integrating machine learning into hydrological modeling workflows.

174 citations

Journal ArticleDOI
TL;DR: In this paper, the authors survey systematic approaches to incorporating physics and domain knowledge into ML models and distill these approaches into broad categories, and show how these approaches have been used successfully for emulating, downscaling, and forecasting weather and climate processes.
Abstract: Machine learning (ML) provides novel and powerful ways of accurately and efficiently recognizing complex patterns, emulating nonlinear dynamics, and predicting the spatio-temporal evolution of weather and climate processes. Off-the-shelf ML models, however, do not necessarily obey the fundamental governing laws of physical systems, nor do they generalize well to scenarios on which they have not been trained. We survey systematic approaches to incorporating physics and domain knowledge into ML models and distill these approaches into broad categories. Through 10 case studies, we show how these approaches have been used successfully for emulating, downscaling, and forecasting weather and climate processes. The accomplishments of these studies include greater physical consistency, reduced training time, improved data efficiency, and better generalization. Finally, we synthesize the lessons learned and identify scientific, diagnostic, computational, and resource challenges for developing truly robust and reliable physics-informed ML models for weather and climate processes. This article is part of the theme issue 'Machine learning for weather and climate modelling'.

119 citations

Posted Content
TL;DR: This work proposes to improve accuracy and generalization by incorporating symmetries into deep neural networks by employing a variety of methods each tailored to enforce a different symmetry.
Abstract: Recent work has shown deep learning can accelerate the prediction of physical dynamics relative to numerical solvers. However, limited physical accuracy and an inability to generalize under distributional shift limit its applicability to the real world. We propose to improve accuracy and generalization by incorporating symmetries into convolutional neural networks. Specifically, we employ a variety of methods each tailored to enforce a different symmetry. Our models are both theoretically and experimentally robust to distributional shift by symmetry group transformations and enjoy favorable sample complexity. We demonstrate the advantage of our approach on a variety of physical dynamics including Rayleigh Benard convection and real-world ocean currents and temperatures. Compared with image or text applications, our work is a significant step towards applying equivariant neural networks to high-dimensional systems with complex dynamics. We open-source our simulation, data, and code at \url{this https URL}.
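One generic way to make a model respect a finite symmetry group, shown here as a hypothetical sketch rather than the authors' specific methods, is group averaging: conjugate the model with every group element and average the results, which yields an exactly equivariant predictor.

```python
import numpy as np

def group_averaged(model, transforms, inverse_transforms):
    """Wrap a field-to-field model so it is equivariant to a finite group:
    f_G(x) = mean over g of g^{-1}( f( g(x) ) )."""
    def equivariant_model(x):
        outputs = [inv(model(t(x)))
                   for t, inv in zip(transforms, inverse_transforms)]
        return np.mean(outputs, axis=0)
    return equivariant_model
```

Averaging trades extra forward passes (one per group element) for a symmetry guarantee; the equivariant architectures in the cited work build the symmetry into the layers instead.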

88 citations

Journal ArticleDOI
TL;DR: This work demonstrates the usefulness of the RVM algorithm to reveal closed‐form equations for eddy parameterizations with embedded conservation laws and shows the potential for new physics‐aware interpretable ML turbulence parameterizations for use in ocean climate models.
Abstract: The resolution of climate models is limited by computational cost. Therefore, we must rely on parameterizations to represent processes occurring below the scale resolved by the models. Here, we foc...

83 citations


Cites background from "Enforcing Analytic Constraints in N..."

  • ...However, this new class of ML parameterizations often uses black box algorithms (e.g., neural networks) such that the laws of physics are not necessarily respected unless imposed (Beucler et al., 2019; Ling et al., 2016), and interpreting the data‐driven parameterization becomes intractable....


  • ...The architecture of the FCNN is physically constrained (Beucler et al., 2019) such that the activation maps (i.e., the results) of the third convolution layer represent the elements of a symmetric eddy stress tensor T....


References
Proceedings Article
01 Jan 2015
TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
Abstract: We introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments. The method is straightforward to implement, is computationally efficient, has little memory requirements, is invariant to diagonal rescaling of the gradients, and is well suited for problems that are large in terms of data and/or parameters. The method is also appropriate for non-stationary objectives and problems with very noisy and/or sparse gradients. The hyper-parameters have intuitive interpretations and typically require little tuning. Some connections to related algorithms, on which Adam was inspired, are discussed. We also analyze the theoretical convergence properties of the algorithm and provide a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework. Empirical results demonstrate that Adam works well in practice and compares favorably to other stochastic optimization methods. Finally, we discuss AdaMax, a variant of Adam based on the infinity norm.
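The update rule summarized in the abstract can be written down directly. Below is a plain NumPy sketch of a single Adam step; the default hyperparameters are the commonly cited ones and are an assumption, not taken from this page.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update (t is the 1-based step counter)."""
    m = beta1 * m + (1 - beta1) * grad        # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad ** 2   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)              # bias correction
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```

Because the step is normalized by the second-moment estimate, early updates behave like sign-of-gradient steps of size roughly `lr`, which is the "invariant to diagonal rescaling of the gradients" property the abstract mentions.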

111,197 citations


"Enforcing Analytic Constraints in N..." refers methods in this paper

  • ...We optimized the NN’s weights and biases with the RMSprop optimizer [26] because it was more stable than the Adam optimizer [27] for LCnets, and saved the NN’s state of minimal validation loss over 20 epochs....


Journal ArticleDOI
08 Dec 2014
TL;DR: A new framework for estimating generative models via an adversarial process, in which two models are simultaneously trained: a generative model G that captures the data distribution and a discriminative model D that estimates the probability that a sample came from the training data rather than G.
Abstract: We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G. The training procedure for G is to maximize the probability of D making a mistake. This framework corresponds to a minimax two-player game. In the space of arbitrary functions G and D, a unique solution exists, with G recovering the training data distribution and D equal to ½ everywhere. In the case where G and D are defined by multilayer perceptrons, the entire system can be trained with backpropagation. There is no need for any Markov chains or unrolled approximate inference networks during either training or generation of samples. Experiments demonstrate the potential of the framework through qualitative and quantitative evaluation of the generated samples.
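The minimax game described in the abstract reduces to two binary cross-entropy objectives. The sketch below uses assumed helper names and the non-saturating generator objective (maximize log D(G(z))) discussed in the same paper; at the game's unique solution, D = 1/2 everywhere.

```python
import numpy as np

def bce(probs, labels):
    """Binary cross-entropy, the building block of the GAN value function."""
    return -np.mean(labels * np.log(probs) + (1 - labels) * np.log(1 - probs))

def discriminator_loss(d_real, d_fake):
    """D learns to output 1 on training data and 0 on samples from G."""
    return bce(d_real, np.ones_like(d_real)) + bce(d_fake, np.zeros_like(d_fake))

def generator_loss(d_fake):
    """Non-saturating G objective: push D's output on fakes toward 1."""
    return bce(d_fake, np.ones_like(d_fake))
```

In a training loop, each batch alternates a gradient step on `discriminator_loss` (updating D only) with a step on `generator_loss` (updating G only, with D frozen).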

38,211 citations


"Enforcing Analytic Constraints in N..." refers background in this paper

  • ...This motivates physically-constraining a broader class of machine-learning algorithms, such as generative adversarial networks [28, 29]....


Journal ArticleDOI
TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) is a benchmark in object category classification and detection on hundreds of object categories and millions of images; it has been run annually from 2010 to present, attracting participation from more than fifty institutions.
Abstract: The ImageNet Large Scale Visual Recognition Challenge is a benchmark in object category classification and detection on hundreds of object categories and millions of images. The challenge has been run annually from 2010 to present, attracting participation from more than fifty institutions. This paper describes the creation of this benchmark dataset and the advances in object recognition that have been possible as a result. We discuss the challenges of collecting large-scale ground truth annotation, highlight key breakthroughs in categorical object recognition, provide a detailed analysis of the current state of the field of large-scale image classification and object detection, and compare the state-of-the-art computer vision accuracy with human accuracy. We conclude with lessons learned in the 5 years of the challenge, and propose future directions and improvements.

30,811 citations

Dissertation
01 Jan 2009
TL;DR: In this paper, the authors describe how to train a multi-layer generative model of natural images using a dataset of millions of tiny colour images.
Abstract: In this work we describe how to train a multi-layer generative model of natural images. We use a dataset of millions of tiny colour images, described in the next section. This has been attempted by several groups but without success. The models on which we focus are RBMs (Restricted Boltzmann Machines) and DBNs (Deep Belief Networks). These models learn interesting-looking filters, which we show are more useful to a classifier than the raw pixels. We train the classifier on a labeled subset that we have collected and call the CIFAR-10 dataset.

15,005 citations

Posted Content
TL;DR: The TensorFlow interface and an implementation of that interface that is built at Google are described, which has been used for conducting research and for deploying machine learning systems into production across more than a dozen areas of computer science and other fields.
Abstract: TensorFlow is an interface for expressing machine learning algorithms, and an implementation for executing such algorithms. A computation expressed using TensorFlow can be executed with little or no change on a wide variety of heterogeneous systems, ranging from mobile devices such as phones and tablets up to large-scale distributed systems of hundreds of machines and thousands of computational devices such as GPU cards. The system is flexible and can be used to express a wide variety of algorithms, including training and inference algorithms for deep neural network models, and it has been used for conducting research and for deploying machine learning systems into production across more than a dozen areas of computer science and other fields, including speech recognition, computer vision, robotics, information retrieval, natural language processing, geographic information extraction, and computational drug discovery. This paper describes the TensorFlow interface and an implementation of that interface that we have built at Google. The TensorFlow API and a reference implementation were released as an open-source package under the Apache 2.0 license in November, 2015 and are available at www.tensorflow.org.

10,447 citations


"Enforcing Analytic Constraints in N..." refers methods in this paper

  • ...We implement the three NN types (UCnet, LCnet, ACnet) using the Keras library [25] with the TensorFlow backend [26]....
