Author

Mustafa Mustafa

Bio: Mustafa Mustafa is an academic researcher from Lawrence Berkeley National Laboratory. The author has contributed to research in topics: Feature learning & Deep learning. The author has an h-index of 11 and has co-authored 27 publications receiving 449 citations.

Papers
Proceedings Article
23 Aug 2020
TL;DR: This paper proposes a hybrid approach to predict turbulent flow by learning its highly nonlinear dynamics from spatiotemporal velocity fields of large-scale fluid flow simulations of relevance to turbulence modeling and climate modeling by marrying two well-established turbulent flow simulation techniques with deep learning.
Abstract: While deep learning has shown tremendous success in a wide range of domains, it remains a grand challenge to incorporate physical principles in a systematic manner to the design, training, and inference of such models. In this paper, we aim to predict turbulent flow by learning its highly nonlinear dynamics from spatiotemporal velocity fields of large-scale fluid flow simulations of relevance to turbulence modeling and climate modeling. We adopt a hybrid approach by marrying two well-established turbulent flow simulation techniques with deep learning. Specifically, we introduce trainable spectral filters in a coupled model of Reynolds-averaged Navier-Stokes (RANS) and Large Eddy Simulation (LES), followed by a specialized U-net for prediction. Our approach, which we call Turbulent-Flow Net, is grounded in a principled physics model, yet offers the flexibility of learned representations. We compare our model with state-of-the-art baselines and observe significant reductions in error for predictions 60 frames ahead. Most importantly, our method predicts physical fields that obey desirable physical characteristics, such as conservation of mass, whilst faithfully emulating the turbulent kinetic energy field and spectrum, which are critical for accurate prediction of turbulent flows.

217 citations

Posted Content
TL;DR: AdaHessian is introduced, a second order stochastic optimization algorithm which dynamically incorporates the curvature of the loss function via ADAptive estimates of the Hessian, and it exhibits robustness towards its hyperparameters.
Abstract: We introduce ADAHESSIAN, a second order stochastic optimization algorithm which dynamically incorporates the curvature of the loss function via ADAptive estimates of the HESSIAN. Second order algorithms are among the most powerful optimization algorithms with superior convergence properties as compared to first order methods such as SGD and Adam. The main disadvantage of traditional second order methods is their heavier per iteration computation and poor accuracy as compared to first order methods. To address these, we incorporate several novel approaches in ADAHESSIAN, including: (i) a fast Hutchinson based method to approximate the curvature matrix with low computational overhead; (ii) a root-mean-square exponential moving average to smooth out variations of the Hessian diagonal across different iterations; and (iii) a block diagonal averaging to reduce the variance of Hessian diagonal elements. We show that ADAHESSIAN achieves new state-of-the-art results by a large margin as compared to other adaptive optimization methods, including variants of Adam. In particular, we perform extensive tests on CV, NLP, and recommendation system tasks and find that ADAHESSIAN: (i) achieves 1.80%/1.45% higher accuracy on ResNets20/32 on Cifar10, and 5.55% higher accuracy on ImageNet as compared to Adam; (ii) outperforms AdamW for transformers by 0.13/0.33 BLEU score on IWSLT14/WMT14 and 2.7/1.0 PPL on PTB/Wikitext-103; (iii) outperforms AdamW for SqueezeBert by 0.41 points on GLUE; and (iv) achieves 0.032% better score than Adagrad for DLRM on the Criteo Ad Kaggle dataset. Importantly, we show that the cost per iteration of ADAHESSIAN is comparable to first order methods, and that it exhibits robustness towards its hyperparameters.

126 citations
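
The Hutchinson-based curvature estimate named in the AdaHessian abstract above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `hutchinson_diagonal`, the toy matrix `A`, and the sample count are assumptions chosen for illustration. The sketch relies only on the standard identity diag(H) = E[z ⊙ (Hz)] for random sign vectors z, and it omits the moving average and block averaging the abstract also describes.

```python
import numpy as np


def hutchinson_diagonal(hvp, dim, num_samples=1, rng=None):
    """Estimate diag(H) using only Hessian-vector products.

    hvp: callable returning H @ v for a vector v (hypothetical interface).
    dim: number of parameters.
    """
    rng = np.random.default_rng() if rng is None else rng
    estimate = np.zeros(dim)
    for _ in range(num_samples):
        # Rademacher probe: each entry is +1 or -1 with equal probability.
        z = rng.integers(0, 2, size=dim) * 2.0 - 1.0
        # z * (H z) equals diag(H) in expectation; off-diagonal terms average out.
        estimate += z * hvp(z)
    return estimate / num_samples


# Toy quadratic loss 0.5 * x^T A x, whose Hessian is exactly A (illustrative data).
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 0.5],
              [0.0, 0.5, 2.0]])
diag_est = hutchinson_diagonal(lambda v: A @ v, dim=3, num_samples=2000,
                               rng=np.random.default_rng(0))
```

In a deep learning setting the `hvp` callable would typically come from automatic differentiation rather than an explicit matrix, which is what keeps the per-iteration cost close to that of a first order method.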

Journal Article
TL;DR: In this paper, the authors survey systematic approaches to incorporating physics and domain knowledge into ML models and distill these approaches into broad categories, and show how these approaches have been used successfully for emulating, downscaling, and forecasting weather and climate processes.
Abstract: Machine learning (ML) provides novel and powerful ways of accurately and efficiently recognizing complex patterns, emulating nonlinear dynamics, and predicting the spatio-temporal evolution of weather and climate processes. Off-the-shelf ML models, however, do not necessarily obey the fundamental governing laws of physical systems, nor do they generalize well to scenarios on which they have not been trained. We survey systematic approaches to incorporating physics and domain knowledge into ML models and distill these approaches into broad categories. Through 10 case studies, we show how these approaches have been used successfully for emulating, downscaling, and forecasting weather and climate processes. The accomplishments of these studies include greater physical consistency, reduced training time, improved data efficiency, and better generalization. Finally, we synthesize the lessons learned and identify scientific, diagnostic, computational, and resource challenges for developing truly robust and reliable physics-informed ML models for weather and climate processes. This article is part of the theme issue 'Machine learning for weather and climate modelling'.

119 citations

Journal Article
01 Oct 2017
TL;DR: The Shifter framework, which delivers Docker-like functionality to HPC, is described, and several successful examples of scientists using Shifter to make scientific analysis easily customizable and scalable are profiled.
Abstract: Bringing HEP computing to HPC can be difficult. Software stacks are often very complicated with numerous dependencies that are difficult to get installed on an HPC system. To address this issue, NERSC has created Shifter, a framework that delivers Docker-like functionality to HPC. It works by extracting images from native formats and converting them to a common format that is optimally tuned for the HPC environment. We have used Shifter to deliver the CVMFS software stack for ALICE, ATLAS, and STAR on the supercomputers at NERSC. As well as enabling the distribution of multi-TB sized CVMFS stacks to HPC, this approach also offers performance advantages. Software startup times are significantly reduced and load times scale with minimal variation to 1000s of nodes. We profile several successful examples of scientists using Shifter to make scientific analysis easily customizable and scalable. We will describe the Shifter framework and several efforts in HEP and NP to use Shifter to deliver their software on the Cori HPC system.

102 citations

Journal Article
TL;DR: In this article, the authors apply Generative Adversarial Networks to the problem of generating weak lensing convergence maps and show that their generator network produces maps that are described by, with high statistical confidence, the same summary statistics as the fully simulated maps.
Abstract: Inferring model parameters from experimental data is a grand challenge in many sciences, including cosmology. This often relies critically on high fidelity numerical simulations, which are prohibitively computationally expensive. The application of deep learning techniques to generative modeling is renewing interest in using high dimensional density estimators as computationally inexpensive emulators of fully-fledged simulations. These generative models have the potential to make a dramatic shift in the field of scientific simulations, but for that shift to happen we need to study the performance of such generators in the precision regime needed for science applications. To this end, in this work we apply Generative Adversarial Networks to the problem of generating weak lensing convergence maps. We show that our generator network produces maps that are described by, with high statistical confidence, the same summary statistics as the fully simulated maps.

85 citations


Cited by
Journal Article
08 Dec 2001-BMJ
TL;DR: There is, I think, something ethereal about i, the square root of minus one, which at first seemed an odd beast, an intruder hovering on the edge of reality.
Abstract: There is, I think, something ethereal about i —the square root of minus one. I remember first hearing about it at school. It seemed an odd beast at that time—an intruder hovering on the edge of reality. Usually familiarity dulls this sense of the bizarre, but in the case of i it was the reverse: over the years the sense of its surreal nature intensified. It seemed that it was impossible to write mathematics that described the real world in …

33,785 citations

01 Jan 2006
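TL;DR: Talks at the JFM 50th-anniversary celebration: The mysterious rattleback and its fluid counterpart (Keith Moffatt), Developments in shear instabilities (Patrick Huerre), Falling clouds (Elisabeth Guazzelli), Ecotectural fluid mechanics (Paul Linden), The success of JFM (Herbert Huppert).
Abstract: The Journal of Fluid Mechanics (JFM) was founded in May 1956 by Professor George Batchelor of the University of Cambridge. It enjoys a very high academic reputation in the international fluid mechanics community and is widely recognized as one of the most prestigious journals in the field; its 2005 impact factor of 2.061 was the highest among comparable journals. To mark its 50th anniversary, JFM published a commemorative special issue, volume 554, in May 2006. The issue includes a review co-written by the current editors (Professor S. H. Davis of Northwestern University, USA, and Professor T. J. Pedley of the University of Cambridge, UK), "Editorial: JFM at 50", which takes JFM as its backdrop to give a concise retrospective and outlook on the development of fluid mechanics over the past 50 years from a distinctive perspective, and which compiles a series of illuminating and interesting statistics. A celebration of the 50th anniversary was held on 21 July 2006 at the Department of Applied Mathematics and Theoretical Physics (DAMTP), University of Cambridge. At 2 p.m., past and present JFM editors and guests gathered, and Professor Pedley gave the opening address. Five talks followed: The mysterious rattleback and its fluid counterpart (Keith Moffatt), Developments in shear instabilities (Patrick Huerre), Falling clouds (Elisabeth Guazzelli), Ecotectural fluid mechanics (Paul Linden), and The success of JFM (Herbert Huppert). Professor Davis gave the closing remarks.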

767 citations

Posted Content
TL;DR: This work forms a new neural operator by parameterizing the integral kernel directly in Fourier space, allowing for an expressive and efficient architecture and shows state-of-the-art performance compared to existing neural network methodologies.
Abstract: The classical development of neural networks has primarily focused on learning mappings between finite-dimensional Euclidean spaces. Recently, this has been generalized to neural operators that learn mappings between function spaces. For partial differential equations (PDEs), neural operators directly learn the mapping from any functional parametric dependence to the solution. Thus, they learn an entire family of PDEs, in contrast to classical methods which solve one instance of the equation. In this work, we formulate a new neural operator by parameterizing the integral kernel directly in Fourier space, allowing for an expressive and efficient architecture. We perform experiments on Burgers' equation, Darcy flow, and Navier-Stokes equation. The Fourier neural operator is the first ML-based method to successfully model turbulent flows with zero-shot super-resolution. It is up to three orders of magnitude faster compared to traditional PDE solvers. Additionally, it achieves superior accuracy compared to previous learning-based solvers under fixed resolution.

762 citations
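
The idea of parameterizing a kernel directly in Fourier space, as the abstract above describes, can be sketched in one dimension with NumPy. This is a simplified illustration under stated assumptions, not the paper's architecture: the function name `spectral_conv_1d` is hypothetical, there is a single channel, the weights are fixed rather than trained, and the nonlinearity and any additional layers are left out.

```python
import numpy as np


def spectral_conv_1d(u, weights):
    """Apply a kernel defined by complex weights on the lowest Fourier modes.

    u: real samples of a periodic function on a uniform grid, shape (n,).
    weights: complex array of shape (modes,), with modes <= n // 2 + 1.
    """
    n = u.shape[0]
    modes = weights.shape[0]
    u_hat = np.fft.rfft(u)                     # transform to Fourier space
    out_hat = np.zeros_like(u_hat)
    out_hat[:modes] = weights * u_hat[:modes]  # act on retained modes, drop the rest
    return np.fft.irfft(out_hat, n=n)          # back to physical space


# Illustrative input: a low-frequency signal plus a higher-frequency component.
x = np.arange(64) / 64
u = np.sin(2 * np.pi * x) + 0.5 * np.sin(2 * np.pi * 5 * x)
w = np.ones(3, dtype=complex)                  # keep modes 0, 1, 2 unchanged
out = spectral_conv_1d(u, w)
```

Because the weights are indexed by mode number rather than by grid point, the same `w` can be applied to the same function sampled on a finer grid, which is the property that resolution-independent evaluation builds on.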

Journal Article
TL;DR: This survey presents a recent time-slide comprehensive overview with comparisons as well as trends in development and usage of cutting-edge Artificial Intelligence software that is capable of scaling computation effectively and efficiently in the era of Big Data.
Abstract: The combined impact of new computing resources and techniques with an increasing avalanche of large datasets is transforming many research areas and may lead to technological breakthroughs that can be used by billions of people. In recent years, Machine Learning and especially its subfield Deep Learning have seen impressive advances. Techniques developed within these two fields are now able to analyze and learn from huge amounts of real world examples in disparate formats. While the number of Machine Learning algorithms is extensive and growing, their implementations through frameworks and libraries are also extensive and growing. The software development in this field is fast paced, with a large amount of open-source software coming from academia, industry, start-ups or wider open-source communities. This survey presents a recent time-slide comprehensive overview with comparisons as well as trends in development and usage of cutting-edge Artificial Intelligence software. It also provides an overview of massive parallelism support that is capable of scaling computation effectively and efficiently in the era of Big Data.

443 citations