
Improving Precipitation Estimation Using Convolutional Neural Network

01 Mar 2019-Water Resources Research (John Wiley & Sons, Ltd)-Vol. 55, Iss: 3, pp 2301-2321
TL;DR: This study offers a novel approach to enhance numerical precipitation estimation and provides important implications for improving precipitation-related parameterization schemes using a data-driven approach.
Authors: Pan, B; Hsu, K; AghaKouchak, A; Sorooshian, S

Abstract: Precipitation process is generally considered to be poorly represented in numerical weather/climate models. Statistical downscaling (SD) methods, which relate precipitation with model resolved dynamics, often provide more accurate precipitation estimates compared to model's raw precipitation products. We introduce the convolutional neural network model to foster this aspect of SD for daily precipitation prediction. Specifically, we restrict the predictors to the variables that are directly resolved by discretizing the atmospheric dynamics equations. In this sense, our model works as an alternative to the existing precipitation-related parameterization schemes for numerical precipitation estimation. We train the model to learn precipitation-related dynamical features from the surrounding dynamical fields by optimizing a hierarchical set of spatial convolution kernels. We test the model at 14 geogrid points across the contiguous United States. Results show that provided with enough data, precipitation estimates from the convolutional neural network model outperform the reanalysis precipitation products, as well as SD products using linear regression, nearest neighbor, random forest, or fully connected deep neural network. Evaluation for the test set suggests that the improvements can be seamlessly transferred to numerical weather modeling for improving precipitation prediction. Based on the default network, we examine the impact of the network architectures on model performance. Also, we offer simple visualization and analyzing approaches to interpret the models and their results. Our study contributes to the following two aspects: First, we offer a novel approach to enhance numerical precipitation estimation; second, the proposed model provides important implications for improving precipitation-related parameterization schemes using a data-driven approach.

Summary (4 min read)

1. Introduction

  • The modeling of the atmosphere is typically based on a particular set of partial differential equations, which is derived by applying the conservation laws and thermodynamic laws on the continuous “control volume” of the atmosphere (Bjerknes, 1906; Holton & Hakim, 2012).
  • Precipitation estimation involves explicit and implicit representations of the cloud physics, such as the water vapor convection, phase change, and particle coalescence.
  • In numerical models, such unresolved processes are inferred from the resolved dynamics on the computational grid (Kalnay, 2003).
  • Accordingly, the model input/output, resolution, usage, and complexity of parameterization schemes and SD are different.
  • The model is described and tested thereafter.

2.1. Statistical Downscaling

  • Following the survey in Maraun et al. (2010), SD approaches are classified into perfect prognosis (PP), model output statistics (MOS), and weather generators.
  • The simplest form is linear regression, which estimates precipitation using an optimized linear combination of the local circulation features (Hannachi et al., 2007; Jeong et al., 2012; Li & Smith, 2009; Murphy, 2000).
  • The predictors usually consist of the raw variables or the leading principal components (PCs) of the moisture, pressure, and wind field (Wilby & Wigley, 2000).
  • A typical application of MOS is to correct the biases of the numerical model's raw precipitation estimates (Jakob Themeßl et al., 2011).

2.2. DNNs and Their Applications for Physical Processes

  • DNNs belong to the domain of ML, which covers a general scope of computer-aided statistical modeling.
  • On the other hand, DNNs, together with a broader family of representation learning approaches (Bengio et al., 2013), offer an “end-to-end” modeling workflow: the feature extraction process is integrated into the modeling process, which allows the model to learn customized features rather than being subject to pre-engineered features.
  • For the modeling of the natural physical processes, where the authors have established principled solutions through analytic descriptions of the scientist's prior knowledge of the underlying processes (de Bezenac et al., 2017), dynamical simulations are preferred to ML-based approaches.
  • While many recent research works have started to explore the applicability of DNN for parameterizing the unresolved processes in fluid and geofluid modeling (Ling et al., 2016; Rasp et al., 2018), it remains a question how DNN can translate the big data of observations and numerical simulations into precipitation estimation improvements (Pan et al., 2017).

3. Problem Formulation

  • To formulate the precipitation estimation problem, the authors first clarify the context by introducing a real-world precipitation scenario.
  • The well-established models offer accessible concepts for describing the circulation-precipitation connection.
  • Dynamically, the precipitation process in Figure 1 is associated with the extratropical cyclone.
  • In equation (1), E denotes the expected value, P the precipitation estimates, X the predictors, and C the local climate condition.
  • The authors point out two common deficiencies when applying this conventional approach for weather-scale precipitation estimation.

4.1. Convolutional Neural Network

  • CNNs share many similarities with regular neural networks.
  • For a regular neural network, a statistical connection between the inputs and the outputs is constructed through hierarchical connected layers of neurons.
  • Each convolution operation is performed by computing the element-wise dot product between the tensor and different patches of the input, which is represented as a c × x × y array.
  • Following previous works (Wilby & Wigley, 2000), the predictors consist of the circulation constraint and moisture constraint.
  • To estimate the total daily precipitation, the authors usually have several snapshots of its surrounding dynamical field at different hours through the day.
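
The convolution operation described above — an element-wise dot product between a kernel and successive patches of a c × x × y input — can be sketched as follows (a minimal NumPy illustration of the operation, not the authors' implementation; the channel and grid sizes are invented for the example):

```python
import numpy as np

def conv2d_single(field, kernel):
    """Valid 2-D cross-correlation of one multi-channel field with one kernel.

    field : (c, x, y) input, e.g. stacked GPH/PW grids
    kernel: (c, kx, ky) convolution kernel shared across the spatial domain
    Returns an ((x-kx+1), (y-ky+1)) feature map.
    """
    c, x, y = field.shape
    _, kx, ky = kernel.shape
    out = np.empty((x - kx + 1, y - ky + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = field[:, i:i + kx, j:j + ky]
            out[i, j] = np.sum(patch * kernel)  # element-wise dot product
    return out

field = np.random.rand(4, 8, 8)   # e.g. 4 channels: 3 GPH levels + PW
kernel = np.random.rand(4, 3, 3)
fmap = conv2d_single(field, kernel)
print(fmap.shape)  # a 6 × 6 feature map
```

A real CNN applies many such kernels per layer and learns their weights by backpropagation; this loop only shows what a single kernel computes.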

4.2.1. Regularization

  • DNNs usually have much more complicated structures and more parameters than conventional ML algorithms, which make it possible for models to perform exceptionally well on the training data but predict the test data poorly.
  • Regularization refers to the strategies to avoid overfitting and make the model generalize better to unseen data.
  • The idea of dropout is to assign a probability of existence to the neurons and their associated connections.
  • This prevents neurons from coadapting and has shown significant improvements in reducing overfitting (Srivastava et al., 2014).
  • Batch normalization addresses the problem of internal covariate shift in training DNNs.
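
The two regularization strategies mentioned here can be sketched in a few lines (an illustrative NumPy version of inverted dropout and a simplified batch normalization without the learnable scale/shift parameters; not the paper's code):

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(a, p_keep=0.8, train=True):
    """Inverted dropout: zero each neuron with probability 1 - p_keep and
    rescale the survivors so the expected activation is unchanged at test time."""
    if not train:
        return a
    mask = rng.random(a.shape) < p_keep
    return a * mask / p_keep

def batch_norm(a, eps=1e-5):
    """Normalize each feature over the mini-batch (axis 0) to zero mean and
    unit variance; the learnable scale/shift of full batch norm is omitted."""
    mu = a.mean(axis=0)
    var = a.var(axis=0)
    return (a - mu) / np.sqrt(var + eps)

batch = rng.normal(size=(32, 16))            # 32 samples, 16 hidden units
h = batch_norm(dropout(batch, p_keep=0.8))   # per-feature mean ~0, std ~1
print(h.shape)
```

At test time `dropout(..., train=False)` is an identity, and batch norm would use running statistics accumulated during training rather than the mini-batch statistics shown here.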

4.2.2. Loss Function and Skill Metrics

  • The root mean square error (RMSE) between the precipitation simulations and observations is used as the loss function here: RMSE = √((1/n) Σ (Pobser − Psimu)²). (2) Here Pobser denotes the observed daily precipitation records, and Psimu denotes the simulated daily precipitation records.
  • The Pearson correlation coefficient (r) between simulated and observed daily precipitation is also used as a supplementary skill metric for measuring model performance: r = cov(Pobser, Psimu) / (σPobser · σPsimu). (3) Here cov denotes covariance, and σ denotes standard deviation.
  • The method requires estimating the partial derivative of the loss function with respect to each parameter in the network, including those from both the convolutional and dense layers.
  • The parameters are then adjusted along the gradient descent direction by a predefined stride, which is named the “learning rate.”
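
Equations (2) and (3) translate directly into code (a sketch with made-up daily totals standing in for the CPC and simulated records):

```python
import numpy as np

def rmse(p_obs, p_sim):
    """Root mean square error, equation (2)."""
    return np.sqrt(np.mean((p_obs - p_sim) ** 2))

def pearson_r(p_obs, p_sim):
    """Pearson correlation coefficient, equation (3)."""
    cov = np.mean((p_obs - p_obs.mean()) * (p_sim - p_sim.mean()))
    return cov / (p_obs.std() * p_sim.std())

obs = np.array([0.0, 2.0, 5.0, 1.0, 0.0])   # hypothetical daily totals (mm)
sim = np.array([0.5, 1.5, 4.0, 1.0, 0.2])
print(rmse(obs, sim), pearson_r(obs, sim))
```

RMSE is minimized during training, while r serves only as a diagnostic; note that both use the population (ddof = 0) standard deviation, matching the covariance normalization in equation (3).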

4.3. Model Implementation

  • The authors implement the network using the Wolfram Mathematica V11.3 Deep Learning Platform (Wolfram, 2018).
  • The authors use the Nvidia Quadro P5000 GPU (Graphics Processing Unit) to accelerate model training.

5.1. Data

  • The predictors used for building the network models are the GPH and PW field data from the National Centers for Environmental Prediction (NCEP) North American Regional Reanalysis (NARR) data set (Mesinger et al., 2006).
  • The data set is generated by regional downscaling of the NCEP Global Reanalysis for the North America region, using the NCEP Eta Model and the 3-D Variational Data Assimilation System.
  • The data set covers 1979 to near present and is provided every 3 hr, with a spatial resolution of 32 km and 45 vertical layers.
  • Besides the pressure and moisture data, the precipitation product from the NARR is used as baseline here.
  • This poses a significant challenge for the DNN model to provide comparable precipitation estimates.

5.2. Experiments Design

  • To test the applicability of the model for different climate conditions, the authors selected 14 sample grids that roughly cover the characteristic climate divisions of the contiguous United States.
  • Here μ and σ are scalar values that are calculated based on the flattened circulation field for the entire data set.
  • The training and validation sets are used to calibrate the model parameters and prevent overfitting.
  • The network simulation results are evaluated against the CPC precipitation records, using skill metrics of RMSE and r.
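
The normalization step — subtracting the scalar field mean μ and dividing by the scalar field standard deviation σ computed over the flattened field — might look like this (illustrative; the toy array stands in for a NARR geopotential-height stack):

```python
import numpy as np

def normalize(field, mu=None, sigma=None):
    """Standardize a circulation field: (field - mu) / sigma.

    mu and sigma are scalars computed over the flattened field for the
    entire data set, as described in the text; pass them explicitly to
    reuse training-set statistics on new data.
    """
    if mu is None:
        mu = field.mean()
    if sigma is None:
        sigma = field.std()
    return (field - mu) / sigma, mu, sigma

# toy stand-in for a (days, x, y) geopotential-height stack
gph = np.random.default_rng(1).normal(loc=5500.0, scale=60.0, size=(100, 25, 25))
z, mu, sigma = normalize(gph)
print(z.mean(), z.std())
```

Using one scalar pair per field (rather than per grid cell) preserves the spatial structure of the circulation pattern, which is what the convolution kernels are meant to learn.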

6. Results

  • The CNN estimated precipitation (PCNN) and the NARR estimated precipitation are compared against the CPC precipitation records.
  • Without a careful tuning of hyperparameters, the CNN models perform relatively well compared to the NARR precipitation product.
  • As indicated by the two skill scores, PCNN outperforms PNARR for most sample points from the west and east coast, where precipitation is more copious than the other areas.
  • Different implementations of CNN show similar skills.
  • The observed overfitting may be due to the limited training data available at some sample points.

7.1. Network Architecture

  • The results above are achieved using the same default network architecture as presented in Figure 4.
  • Here the authors focus on two dominant configurations in CNN design, namely, the receptive field and the network depth.
  • For processing convenience, the authors use a single geogrid to carry out the experiments.
  • The authors keep the other two network configurations the same as the default setting.
  • The above experiments verified that an explicit encoding of local spatial circulation structures enhances the estimation of precipitation.

7.1.2. Network Depth: Shallow or Deep?

  • The network depth can be roughly represented as the number of layers in the neural network.
  • These layers learn representations of the data with multiple levels of abstraction (LeCun et al., 2015).
  • The shallower CNN model is constructed by removing the latter two convolutional layers and the last pooling layer from the default network in Figure 4.
  • Compared to the deeper network models, the model with a single convolutional layer achieves significantly lower skill scores in estimating precipitation.
  • The model with 5 convolutional layers achieves optimal performance for the training and test set.

7.2. Model Interpretations

  • The network models applied here involve much more complicated structures and more parameters compared to the existing SD approaches.
  • In response to this requirement, many approaches for understanding CNNs have been developed in recent years (Erhan et al., 2009; Simonyan et al., 2013; Zeiler & Fergus, 2014).
  • Zeiler and Fergus (2014) offered an excellent example in illustrating how layer activation can be used for interpreting and diagnosing CNNs.
  • Similar distinctions within the same channel for the two events can be depicted in Conv 2.

7.2.2. Perturbation Sensitivity

  • For image classification problems, the occlusion sensitivity analysis tells the impact of different portions of the image on the classification result.
  • The rescaling matrix is multiplied with different portions of the input.
  • The relation between perturbation location and model output change is visualized in Figure 7.
  • This is the area where the target geogrid point lies.
  • The surrounding dynamics also provide important context for inferring precipitation.

7.3. Comparison Experiments

  • Previous sections have compared the CNN precipitation estimates with (1) NARR precipitation product and (2) precipitation estimates using fully connected DNN.
  • For each of the models, the authors adopt the same input variables as for the CNN, with optional feature extraction before feeding the input to the model.
  • The best performance in the comparison experiments is achieved by the linear regression model using input of the leading 16 PCs of the circulation field (r = 0.81, RMSE = 6.98).
  • The skill can be further improved if the authors apply convolution and pooling operations to the input.
  • To sum up, the comparison experiments empirically suggest that CNN is competitive in making precipitation estimations based on the resolved surrounding atmospheric dynamics.
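
The strongest baseline in these comparison experiments — linear regression on the leading 16 PCs of the circulation field — can be sketched as follows (synthetic low-rank data stand in for the NARR predictors; this reproduces the method, not the paper's r = 0.81 result):

```python
import numpy as np

rng = np.random.default_rng(0)

# synthetic stand-ins: n days of a flattened circulation field driven by a
# few latent modes, and a precipitation-like target depending on those modes
n, d, k, r_lat = 500, 625, 16, 5
L = rng.normal(size=(n, r_lat))                  # latent circulation modes
W = rng.normal(size=(r_lat, d))                  # spatial loading patterns
X = L @ W + 0.1 * rng.normal(size=(n, d))        # observed field + noise
y = L @ np.array([2.0, -1.0, 0.5, 0.0, 1.0]) + 0.1 * rng.normal(size=n)

# leading-k principal components via SVD of the centered predictor matrix
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
pcs = Xc @ Vt[:k].T                              # (n, k) PC scores

# ordinary least squares on the PC scores (with intercept column)
A = np.hstack([pcs, np.ones((n, 1))])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
y_hat = A @ coef
r = np.corrcoef(y, y_hat)[0, 1]
print(round(float(r), 3))
```

The PC truncation is the hand-engineered feature extraction step that the CNN replaces with learned convolution kernels, which is the crux of the comparison.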

8. Conclusion

  • Precipitation estimation provides fundamental information to better understand the land-atmosphere water budget, improve water resources management, and aid in preparation for increasingly extreme hydrometeorological events.
  • The authors introduce the CNN model to overcome these two deficiencies in improving precipitation estimation.
  • The authors focus on a single geogrid point to examine the influence of the network architecture on model performance.
  • By varying the network depth, the authors found that deep networks generally have better performance compared to shallow networks.
  • The performance improvement provides important implications for improving precipitation-related parameterization schemes using a data-driven approach.


UC Irvine
UC Irvine Previously Published Works
Title
Improving Precipitation Estimation Using Convolutional Neural Network
Permalink
https://escholarship.org/uc/item/8nb145xd
Journal
Water Resources Research, 55(3)
ISSN
0043-1397
Authors
Pan, B
Hsu, K
AghaKouchak, A
et al.
Publication Date
2019-03-01
DOI
10.1029/2018WR024090
Copyright Information
This work is made available under the terms of a Creative Commons Attribution License,
available at https://creativecommons.org/licenses/by/4.0/
Peer reviewed
eScholarship.org Powered by the California Digital Library
University of California

Improving Precipitation Estimation Using Convolutional
Neural Network
Baoxiang Pan¹, Kuolin Hsu¹, Amir AghaKouchak¹,², and Soroosh Sorooshian¹,²

¹Center for Hydrometeorology and Remote Sensing, University of California, Irvine, CA, USA, ²Department of Earth
System Science, University of California, Irvine, CA, USA
Abstract Precipitation process is generally considered to be poorly represented in numerical
weather/climate models. Statistical downscaling (SD) methods, which relate precipitation with model
resolved dynamics, often provide more accurate precipitation estimates compared to model's raw
precipitation products. We introduce the convolutional neural network model to foster this aspect of SD
for daily precipitation prediction. Specifically, we restrict the predictors to the variables that are directly
resolved by discretizing the atmospheric dynamics equations. In this sense, our model works as an
alternative to the existing precipitation-related parameterization schemes for numerical precipitation
estimation. We train the model to learn precipitation-related dynamical features from the surrounding
dynamical fields by optimizing a hierarchical set of spatial convolution kernels. We test the model at
14 geogrid points across the contiguous United States. Results show that provided with enough data,
precipitation estimates from the convolutional neural network model outperform the reanalysis
precipitation products, as well as SD products using linear regression, nearest neighbor, random forest, or
fully connected deep neural network. Evaluation for the test set suggests that the improvements can be
seamlessly transferred to numerical weather modeling for improving precipitation prediction. Based on
the default network, we examine the impact of the network architectures on model performance. Also,
we offer simple visualization and analyzing approaches to interpret the models and their results. Our
study contributes to the following two aspects: First, we offer a novel approach to enhance numerical
precipitation estimation; second, the proposed model provides important implications for improving
precipitation-related parameterization schemes using a data-driven approach.
Plain Language Summary The precipitation process is not well simulated in numerical
weather models, since it takes place at scales beyond the resolution of current models. We develop a
statistical model using deep learning technique to improve the estimation of precipitation in numerical
weather models.
1. Introduction
The modeling of the atmosphere is typically based on a particular set of partial differential equations, which
is derived by applying the conservation laws and thermodynamic laws on the continuous “control volume”
of the atmosphere (Bjerknes, 1906; Holton & Hakim, 2012). With the rapid growth of computing power, we
can discretize and resolve these equations on increasingly finer computing grids. However, there remain
many critical subgrid scale processes that are not explicitly resolved.
A notable example is the precipitation process. Precipitation estimation involves explicit and implicit
representations of the cloud physics, such as the water vapor convection, phase change, and particle
coalescence. These processes take place at millimeter to molecule scales, which far surpass the resolution of
current numerical models (O(1 km)∕O(10 km)−O(100 km) for weather/climate models). Also, the assumptions
of thermodynamic equilibrium and continuity lose their validity in describing some of the microscopic
processes (Stensrud, 2009), making it necessary to adopt supplementary equations for physically solid
simulations.
In numerical models, such unresolved processes are inferred from the resolved dynamics on the
computational grid (Kalnay, 2003). This process is known as parameterization. Specific to precipitation, the directly
related parameterization schemes are cloud microphysics and subgrid convection. Given the intrinsic
RESEARCH ARTICLE
10.1029/2018WR024090
Special Section:
Big Data & Machine Learning in
Water Sciences: Recent Progress
and Their Use in Advancing
Science
Key Points:
We offer a novel approach to enhance
numerical precipitation estimation
using deep convolutional neural
network (CNN)
The model provides important
implications for improving
precipitation-related
parameterization schemes
using a data-driven approach
The CNN model outperforms
existing precipitation statistical
downscaling approaches by learning
dominant spatial dynamics features
Supporting Information:
Supporting Information S1
Correspondence to:
B. Pan,
baoxianp@uci.edu
Citation:
Pan, B., Hsu, K., AghaKouchak, A.,
& Sorooshian, S. (2019). Improving
precipitation estimation using
convolutional neural network. Water
Resources Research, 55, 2301–2321.
https://doi.org/10.1029/2018WR024090
Received 19 SEP 2018
Accepted 9 JAN 2019
Accepted article online 15 JAN 2019
Published online 22 MAR 2019
©2019. American Geophysical Union.
All Rights Reserved.
PAN ET AL. 2301

Water Resources Research 10.1029/2018WR024090
complexity of the cloud and precipitation process, the equations and their associated parameters in these
parameterization schemes are generally of high structural and parametric uncertainties (Draper, 1995). As a
result, models' precipitation products are usually considered less reliable compared to the directly resolved
variables, such as pressure and temperature (Betts et al., 1998; Bukovsky & Karoly, 2007; Higgins et al., 1996;
Tian et al., 2017; Vitart, 2004).
Statistical downscaling (SD) methods are also used for the purpose of inferring the poorly represented
processes from the resolved dynamics and other data sources. However, SD has distinct objectives compared
to the parameterization schemes. The main purpose of parameterization is to depict the subgrid scale
processes for realistic atmosphere modeling. The primary concern for SD, as indicated by the name, is to resolve
the scale discrepancy between the existing model simulations and application requirements (Maraun et al.,
2010). Accordingly, the model input/output, resolution, usage, and complexity of parameterization schemes
and SD are different.
Besides the scaling issue, another aspect of SD is noted in practices: Compared to raw outputs or the
dynamically downscaled outputs from numerical models, SD occasionally provides more accurate estimates
of the unresolved processes. This is because SD is customized for specific objective, region, and climate
condition. The data-driven model with carefully designed model architecture and calibrated parameters
may outperform the default parameterization schemes in relating the unresolved processes to resolved
circulation. This phenomenon offers valuable implications for improving the relevant parameterization
schemes and opportunities for enhancing the prediction of the parameterized processes (Rasp et al., 2018;
Schneider et al., 2017).
Here we focus on fostering this aspect of SD for weather-scale precipitation forecast. Specifically, we propose
to improve the accuracy of daily precipitation estimates through relating the precipitation process with the
circulation data that are explicitly resolved in the atmospheric primitive equations. Compared to conven-
tional SD applications, the task here poses much higher requirements on model resolution and accuracy.
Recent developments in machine learning (ML) techniques, especially the branch of deep neural networks
(DNNs), offer an opportunity for describing and predicting such complicated physical processes using a
combination of big data and advanced model architectures. Here we illustrate how a particular form of DNN,
named convolutional neural network (CNN; LeCun et al., 1998), can be adapted to address the precipitation
estimation problem.
The rest of the paper is organized as follows: We start with a brief review of relevant works. Then, we
formulate the problem and illustrate the model requirements for this application.
The model is described and tested thereafter. We show the model results and provide methods for analyzing
and interpreting the models. We compare the model performance with some of the widely adopted SD
approaches. Conclusions are drawn at last.
2. Related Works
Many studies have been conducted on improving precipitation prediction accuracy with statistical
approaches. We review the relevant SD methods, for which the objective and methodology are closely related
to our work here. Also, we briefly review the basic concepts of DNNs, with a special emphasis on their
applications in physical processes.
2.1. Statistical Downscaling
Following the survey in Maraun et al. (2010), SD approaches are classified into perfect prognosis (PP),
model output statistics (MOS), and weather generators. Since the objective for our study is deterministic
precipitation prediction, we focus on SD approaches that make deterministic estimates of precipitation or
its estimation biases. This includes PP and MOS. The weather generator models are not reviewed here.
PP models construct statistical relations between the large-scale predictors and local scale predictands
(Fowler et al., 2007; Maraun et al., 2010). Both the predictors and predictands are considered to be
realistically simulated or observed, hence the name of “perfect.” Along with the advancement of general circulation
models (GCMs), many precipitation PP methods have been developed. The simplest form is linear
regression, which estimates precipitation using an optimized linear combination of the local circulation features
(Hannachi et al., 2007; Jeong et al., 2012; Li & Smith, 2009; Murphy, 2000). The predictors usually consist of

the raw variables or the leading principal components (PCs) of the moisture, pressure, and wind field (Wilby
& Wigley, 2000). Besides the linear models, there are also approaches that utilize the nonlinear features
of relevant circulation field, such as self-organizing map (Hope, 2006), support vector machine (Tripathi
et al., 2006), nearest neighbor (Gangopadhyay et al., 2005), random forest (Hutengs & Vohland, 2016), and
artificial neural network (ANN; Schoof & Pryor, 2001).
MOS stands for the practice of using statistical approaches to enhance the model's prediction accuracy
(Glahn & Lowry, 1972). Compared to PP, MOS is more frequently used in regional circulation models (RCMs)
than in GCMs (Maraun et al., 2010). Also, the predictors of MOS are numerical models' raw outputs, which
are not assumed to be perfectly estimated. For instance, a typical application of MOS is to correct the biases
of the numerical model's raw precipitation estimates (Jakob Themeßl et al., 2011). It should be noted that
the validity and universality of precipitation MOS rely on the consistency of precipitation estimation biases,
which is usually not guaranteed, given the continuous improvements of numerical models.
The performances of the above-mentioned SD approaches have been compared with dynamical downscaling
results (Ayar et al., 2016; Gutmann et al., 2012; Haylock et al., 2006; Murphy, 1999; Schmidli et al., 2007;
Tang et al., 2016). For instance, an intercomparison of six SD models and five RCMs for Europe indicated
that PP and MOS models achieved higher skill scores in estimating certain aspects of precipitation, such
as the occurrence and intensity (Ayar et al., 2016). On the other hand, another comparison study showed
clear advantage of RCMs for estimating precipitation over complex terrain (Schmidli et al., 2007). Overall,
the performance of SD depends on many factors, including the selection of predictors, the model and its
implementation, the available data, and the climate condition.
2.2. DNNs and Their Applications for Physical Processes
DNNs belong to the domain of ML, which covers a general scope of computer-aided statistical modeling.
DNNs differ from traditional ML approaches in their modeling workflow. In a canonical ML modeling
process, the raw form data, which quantify certain attributes of the study object, should be transformed into a
suitable feature vector before being effectively processed for the learning objective (Goodfellow et al., 2016;
LeCun et al., 2015). The feature extraction process is typically performed separately from the modeling
process. Despite the expert knowledge and engineering work required for the feature extraction process,
a predefined feature extractor captures little useful information beyond our prior knowledge. This issue is
particularly severe for high-dimensional problems, where it is difficult to have foresight into the intricate but
important data structures.
On the other hand, DNNs, together with a broader family of representation learning approaches (Bengio
et al., 2013), offer an “end-to-end” modeling workflow: The feature extraction process is integrated into
the modeling process, which allows the model to learn customized features rather than being subject to the
pre-engineered features.
DNNs learn to customize features through building multiple levels of representation of the data, which are
achieved by composing simple but nonlinear modules (named as neurons) that each transform the
representation at one level into a representation at a higher, slightly more abstract level (LeCun et al., 2015).
The differentiability of the hierarchical model allows applying the gradient descent algorithm to tune the
neurons' parameters in order to make the model exhibit desired behavior. This process is widely known as
backpropagation training (Rumelhart et al., 1985; Werbos, 1982). In addition to these basic concepts,
modern DNNs involve numerous network architecture variations, training algorithms and tricks, regularization
methods, among others. A comprehensive review is beyond the scope of this work and can be found in
LeCun et al. (2015), Schmidhuber (2015), and Goodfellow et al. (2016). A transdisciplinary review of DNN
relevant to water resources-related research can be found in Shen (2018) and Shen et al. (2018).
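
The gradient descent update underlying backpropagation training reduces, for each parameter w, to w ← w − η · ∂L/∂w, where η is the learning rate. A minimal single-layer illustration (assumed mean squared error loss and synthetic noise-free data; the networks discussed here are of course far larger and need the chain rule to propagate these gradients through many layers):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(200, 3))
w_true = np.array([1.5, -2.0, 0.5])
y = x @ w_true                               # noise-free targets

w = np.zeros(3)                              # parameters to learn
eta = 0.1                                    # learning rate: the update stride
for _ in range(200):
    y_hat = x @ w
    grad = 2 * x.T @ (y_hat - y) / len(y)    # dL/dw for mean squared error
    w -= eta * grad                          # gradient descent step
print(w)
```

With a differentiable loss, the same two lines (gradient, then update) drive the training of every convolutional and dense layer; choosing η too large makes the iteration diverge rather than converge.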
DNNs have dramatically improved the state of the art in applications that cannot be adequately solved with
a deterministic rule-based solution, such as visual recognition (Krizhevsky et al., 2012), speech recognition
(Amodei et al., 2016), video prediction (Lotter et al., 2016), and natural language processing (Socher et al.,
2011). For the modeling of the natural physical processes, where we have established principled solutions
through analytic descriptions of the scientist's prior knowledge of the underlying processes (de Bezenac
et al., 2017), dynamical simulations are preferred to ML-based approaches. However, recent developments
showed that provided with (1) a large amount of data and (2) well-designed network architectures that encode

Figure 1. (a) The case study area of a 32 km × 32 km geogrid centered at 46°N, 122°W. Its surrounding circulation field is delineated with the 800 km × 800 km
red polygon. (b) The geogrid's daily precipitation time series from 1979 to 2017. The red thick line represents the gage-based precipitation records from the
National Oceanic and Atmospheric Administration Climate Prediction Center (CPC); the blue slim line represents the model reanalysis records from the
National Centers for Environmental Prediction North American Regional Reanalysis Project (NARR). Data details are given in section 5.1. (c) The every 3-hr
snapshots of the circulation profile for the storm event that happened on 7 November 2006. The geopotential height (GPH) at 1,000, 850, and 500 hPa and the
total column precipitable water are obtained from NARR. Data are normalized by subtracting the field mean (μ) and dividing by the field standard deviation (σ).
the physical background knowledge, DNNs are competitive with numerical methods in simulating complex
natural processes.
Generally, two motivations are found for adopting a data-driven model besides the classical dynamical
simulation. The first is computing efficiency. The computationally demanding components in numerical
simulations can be replaced by data-driven model counterparts to accelerate the simulation without significant
loss of accuracy. Examples include using DNNs to simulate the Eulerian fluid (Tompson et al., 2016) and
to predict the pressure field evolution in fluid flow (Wiewel et al., 2018). The other concern is to represent
the unresolved processes beyond the original numerical simulation. For instance, Gentine et al. (2018)
trained a neural network to represent the subgrid scale convection process in atmospheric modeling. The
trained model was coupled in GCMs and skillfully predicted many of the convective heating, moistening,
and radiative features. Xie et al. (2018) applied a conditional generative adversarial network to generate
spatiotemporal coherent high-resolution fluid flow based on its low-resolution estimates.
For the applications mentioned above, a particular DNN architecture named CNN acts as a core building block. Compared to conventional neural networks, CNNs have significantly enhanced our capacity to process structured high-dimensional data. This is achieved by exploiting the inner structure of the data to reduce model structural redundancy and foster effective information extraction. Geophysical data are intrinsically structured in space and time. The huge geophysical data sets from remote sensing observations, numerical simulations, and their composites offer a rich resource for the application of DNNs (Tao et al., 2016). CNNs have found applications in detecting extreme weather in climate data sets (Liu et al., 2016) and in precipitation nowcasting (Shi et al., 2017; Xingjian et al., 2015). More related to our objective,
Vandal et al. (2017) developed a super-resolution convolutional neural network for precipitation SD. The low-resolution precipitation field (1°) and elevation field data were fed into the super-resolution convolutional neural network to produce the high-resolution precipitation field (1/8°). We note that many of these
geophysical CNN applications made little use of atmospheric dynamical modeling products, which offer physically solid and comprehensive information about the atmosphere. While many recent studies have started to explore the applicability of DNNs for parameterizing unresolved processes in fluid and geofluid modeling (Ling et al., 2016; Rasp et al., 2018), it remains an open question how DNNs can translate the big data of observations and numerical simulations into improved precipitation estimates (Pan et al., 2017).
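The convolution mechanism at the heart of these models can be sketched in a few lines; this is a minimal, illustrative numpy example (not the paper's actual network), showing how a single kernel extracts a local spatial feature from a gridded field:

```python
import numpy as np

def conv2d_valid(field, kernel):
    """'Valid' 2D cross-correlation: slide the kernel over the field and
    take the inner product at each position (no padding, stride 1)."""
    kh, kw = kernel.shape
    h, w = field.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(field[i:i + kh, j:j + kw] * kernel)
    return out

# A toy field with a single vertical edge; a horizontal-gradient kernel
# responds strongly only at the edge, illustrating local feature extraction.
field = np.zeros((6, 6))
field[:, 3:] = 1.0
kernel = np.array([[-1.0, 1.0]])  # detects increases from left to right
response = conv2d_valid(field, kernel)
```

Because the same kernel is applied at every position, the response depends only on the local pattern, not on where it occurs in the domain; a CNN learns a hierarchy of such kernels by backpropagation instead of hand-designing them.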

Citations
Journal ArticleDOI
TL;DR: This study provides a comprehensive review of state-of-the-art deep learning approaches used in the water industry for generation, prediction, enhancement, and classification tasks, and serves as a guide for how to utilize available deep learning methods for future water resources challenges.

185 citations

Journal ArticleDOI

110 citations


Cites methods from "Improving Precipitation Estimation ..."

  • ...Pan et al. (2019) introduced the CNN model to predict daily precipitation, tested it at 14 sites across the contiguous United States, and showed that, provided with sufficient data, precipitation estimates from the CNN method are better than the reanalysis precipitation products and…


Journal ArticleDOI
TL;DR: A comprehensive assessment of deep learning techniques for continental-scale statistical downscaling, building on the VALUE validation framework, shows that, while the added value of CNNs is mostly limited to the reproduction of extremes for temperature, these techniques do outperform the classic ones in the case of precipitation for most aspects considered.
Abstract: . Deep learning techniques (in particular convolutional neural networks, CNNs) have recently emerged as a promising approach for statistical downscaling due to their ability to learn spatial features from huge spatiotemporal datasets. However, existing studies are based on complex models, applied to particular case studies and using simple validation frameworks, which makes a proper assessment of the (possible) added value offered by these techniques difficult. As a result, these models are usually seen as black boxes, generating distrust among the climate community, particularly in climate change applications. In this paper we undertake a comprehensive assessment of deep learning techniques for continental-scale statistical downscaling, building on the VALUE validation framework. In particular, different CNN models of increasing complexity are applied to downscale temperature and precipitation over Europe, comparing them with a few standard benchmark methods from VALUE (linear and generalized linear models) which have been traditionally used for this purpose. Besides analyzing the adequacy of different components and topologies, we also focus on their extrapolation capability, a critical point for their potential application in climate change studies. To do this, we use a warm test period as a surrogate for possible future climate conditions. Our results show that, while the added value of CNNs is mostly limited to the reproduction of extremes for temperature, these techniques do outperform the classic ones in the case of precipitation for most aspects considered. This overall good performance, together with the fact that they can be suitably applied to large regions (e.g., continents) without worrying about the spatial features being considered as predictors, can foster the use of statistical approaches in international initiatives such as Coordinated Regional Climate Downscaling Experiment (CORDEX).

104 citations

Journal ArticleDOI
TL;DR: Accurate and timely precipitation estimates are critical for monitoring and forecasting natural disasters such as floods; despite the availability of high-resolution satellite information, precipitation cannot be accurately estimated due to high computational complexity.
Abstract: Accurate and timely precipitation estimates are critical for monitoring and forecasting natural disasters such as floods. Despite having high-resolution satellite information, precipitation...

91 citations

Journal ArticleDOI
01 May 2019-Water
TL;DR: Wang et al. as mentioned in this paper developed a deep neural network composed of a convolution and Long Short Term Memory (LSTM) recurrent module to estimate precipitation based on well-resolved atmospheric dynamical fields.
Abstract: Precipitation downscaling is widely employed for enhancing the resolution and accuracy of precipitation products from general circulation models (GCMs). In this study, we propose a novel statistical downscaling method to foster GCMs' precipitation prediction resolution and accuracy for the monsoon region. We develop a deep neural network composed of a convolution and Long Short Term Memory (LSTM) recurrent module to estimate precipitation based on well-resolved atmospheric dynamical fields. The proposed model is compared against the GCM precipitation product and classical downscaling methods in the Xiangjiang River Basin in South China. Results show considerable improvement compared to the European Centre for Medium-Range Weather Forecasts (ECMWF)-Interim reanalysis precipitation. Also, the model outperforms benchmark downscaling approaches, including (1) quantile mapping, (2) the support vector machine, and (3) the convolutional neural network. To test the robustness of the model and its applicability in practical forecasting, we apply the trained network for precipitation prediction forced by retrospective forecasts from the ECMWF model. Compared to the ECMWF precipitation forecast, our model makes better use of the resolved dynamical field for more accurate precipitation prediction at lead times from 1 day up to 2 weeks. This superiority decreases with the forecast lead time, as the GCM's skill in predicting atmospheric dynamics is diminished by the chaotic effect. Finally, we build a distributed hydrological model and force it with different sources of precipitation inputs. Hydrological simulation forced with the neural network precipitation estimation shows a significant advantage over simulation forced with the original ERA-Interim precipitation (with the NSE value increasing from 0.06 to 0.64), and the performance is only slightly worse than the observed-precipitation-forced simulation (NSE = 0.82). This further proves the value of the proposed downscaling method and suggests its potential for hydrological forecasts.

78 citations

References
Journal ArticleDOI
01 Oct 2001
TL;DR: Internal estimates monitor error, strength, and correlation and these are used to show the response to increasing the number of features used in the forest, and are also applicable to regression.
Abstract: Random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. The generalization error for forests converges a.s. to a limit as the number of trees in the forest becomes large. The generalization error of a forest of tree classifiers depends on the strength of the individual trees in the forest and the correlation between them. Using a random selection of features to split each node yields error rates that compare favorably to Adaboost (Y. Freund & R. Schapire, Machine Learning: Proceedings of the Thirteenth International Conference, 1996, 148–156), but are more robust with respect to noise. Internal estimates monitor error, strength, and correlation and these are used to show the response to increasing the number of features used in the splitting. Internal estimates are also used to measure variable importance. These ideas are also applicable to regression.

79,257 citations


"Improving Precipitation Estimation ..." refers methods in this paper

  • ...Details of the models and feature extraction are given in the supporting information (Breiman, 2001; Louppe, 2014; Wolfram, 2018; Pearson, 1901; Shlens, 2014)....


Journal ArticleDOI
28 May 2015-Nature
TL;DR: Deep learning is making major advances in solving problems that have resisted the best attempts of the artificial intelligence community for many years, and will have many more successes in the near future because it requires very little engineering by hand and can easily take advantage of increases in the amount of available computation and data.
Abstract: Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have dramatically improved the state-of-the-art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics. Deep learning discovers intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer. Deep convolutional nets have brought about breakthroughs in processing images, video, speech and audio, whereas recurrent nets have shone light on sequential data such as text and speech.

46,982 citations


"Improving Precipitation Estimation ..." refers background in this paper

  • ...Through down-sampling, the higher layer convolutions work on extracted local features, which enables learning higher-level abstractions on the expanded receptive field (LeCun et al., 2015)....


  • ...…features through building multiple levels of representation of the data, which are achieved by composing simple but nonlinear modules (named as neurons) that each transform the representation at one level into a representation at a higher, slightly more abstract level (LeCun et al., 2015)....


  • ...A comprehensive review is beyond the scope of this work and can be found in LeCun et al. (2015), Schmidhuber (2015), and Goodfellow et al. (2016)....


  • ...These layers learn representations of the data with multiple levels of abstraction (LeCun et al., 2015)....


  • ...In a canonical ML modeling process, the raw form data, which quantify certain attributes of the study object, should be transformed into a suitable feature vector before being effectively processed for the learning objective (Goodfellow et al., 2016; LeCun et al., 2015)....


Journal ArticleDOI
01 Jan 1998
TL;DR: In this article, a graph transformer network (GTN) is proposed for handwritten character recognition, which can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters.
Abstract: Multilayer neural networks trained with the back-propagation algorithm constitute the best example of a successful gradient based learning technique. Given an appropriate network architecture, gradient-based learning algorithms can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters, with minimal preprocessing. This paper reviews various methods applied to handwritten character recognition and compares them on a standard handwritten digit recognition task. Convolutional neural networks, which are specifically designed to deal with the variability of 2D shapes, are shown to outperform all other techniques. Real-life document recognition systems are composed of multiple modules including field extraction, segmentation recognition, and language modeling. A new learning paradigm, called graph transformer networks (GTN), allows such multimodule systems to be trained globally using gradient-based methods so as to minimize an overall performance measure. Two systems for online handwriting recognition are described. Experiments demonstrate the advantage of global training, and the flexibility of graph transformer networks. A graph transformer network for reading a bank cheque is also described. It uses convolutional neural network character recognizers combined with global training techniques to provide record accuracy on business and personal cheques. It is deployed commercially and reads several million cheques per day.

42,067 citations

Proceedings ArticleDOI
07 Jun 2015
TL;DR: Inception as mentioned in this paper is a deep convolutional neural network architecture that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).
Abstract: We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14). The main hallmark of this architecture is the improved utilization of the computing resources inside the network. By a carefully crafted design, we increased the depth and width of the network while keeping the computational budget constant. To optimize quality, the architectural decisions were based on the Hebbian principle and the intuition of multi-scale processing. One particular incarnation used in our submission for ILSVRC14 is called GoogLeNet, a 22 layers deep network, the quality of which is assessed in the context of classification and detection.

40,257 citations

Book
18 Nov 2016
TL;DR: Deep learning as mentioned in this paper is a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts, and it is used in many applications such as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames.
Abstract: Deep learning is a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts. Because the computer gathers knowledge from experience, there is no need for a human computer operator to formally specify all the knowledge that the computer needs. The hierarchy of concepts allows the computer to learn complicated concepts by building them out of simpler ones; a graph of these hierarchies would be many layers deep. This book introduces a broad range of topics in deep learning. The text offers mathematical and conceptual background, covering relevant concepts in linear algebra, probability theory and information theory, numerical computation, and machine learning. It describes deep learning techniques used by practitioners in industry, including deep feedforward networks, regularization, optimization algorithms, convolutional networks, sequence modeling, and practical methodology; and it surveys such applications as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames. Finally, the book offers research perspectives, covering such theoretical topics as linear factor models, autoencoders, representation learning, structured probabilistic models, Monte Carlo methods, the partition function, approximate inference, and deep generative models. Deep Learning can be used by undergraduate or graduate students planning careers in either industry or research, and by software engineers who want to begin using deep learning in their products or platforms. A website offers supplementary material for both readers and instructors.

38,208 citations

Frequently Asked Questions (16)
Q1. What are the contributions mentioned in the paper "Improving precipitation estimation using convolutional neural network" ?

The authors introduce the convolutional neural network model to foster this aspect of SD for daily precipitation prediction. In this sense, their model works as an alternative to the existing precipitation-related parameterization schemes for numerical precipitation estimation. Based on the default network, the authors examine the impact of the network architectures on model performance. Their study contributes to the following two aspects: First, the authors offer a novel approach to enhance numerical precipitation estimation ; second, the proposed model provides important implications for improving precipitation-related parameterization schemes using a data-driven approach. The authors develop a statistical model using deep learning technique to improve the estimation of precipitation in numerical weather models. Evaluation for the test set suggests that the improvements can be seamlessly transferred to numerical weather modeling for improving precipitation prediction. 

In the following studies, the authors plan to make more comprehensive examination on the impact of different information processing unions in the network. Also, the authors wish to explore novel network architectures and advanced regularization approaches to support more accurate and high-resolution precipitation estimation. 

To disentangle the impact of the cyclone's geometric shape and position, the authors adopt the convolution mechanism in the network modeling. 

The best performance in the comparison experiments is achieved by the linear regression model using input of the leading 16 PCs of the circulation field (r = 0.81, RMSE = 6.98). 

The authors include the dropout (Srivastava et al., 2014) and batch normalization (Ioffe & Szegedy, 2015) modules to enhance the model's performance. 

The kernels used to extract the salient features from the resolved dynamical field are optimized by backpropagating the precipitation estimation error through the convolutional layers. 

The computationally demanding components in numerical simulations can be replaced by data-driven model counterparts to accelerate the simulation without significant loss of accuracy. 

The authors carry out simulations using input composed of the leading 2, 8, 16, 64, and 256 PCs of the circulation field data, as well as simulations using the raw circulation field data. 

To guarantee the model's robustness with respect to parameter initialization, the authors carry out several implementations with different parameter initializations. 

To test the applicability of the model for different climate conditions, the authors selected 14 sample grids that roughly cover the characteristic climate divisions of the contiguous United States. 

The kernel size of the included convolutional layers is set to 20 × c × 4 × 4, where c is the channel number of the previous layer. 
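As a quick check on the quoted kernel size, the weight count and "valid" output shape of one such layer follow directly from the dimensions; a sketch under assumed stride-1, no-padding settings (the actual network's padding and input sizes may differ):

```python
# Shape bookkeeping for a convolutional layer with kernel 20 x c x 4 x 4
# (20 output channels, c input channels, 4x4 spatial window).
def conv_layer_stats(c, h, w, out_channels=20, k=4):
    """Weight count and 'valid' output shape for one such layer,
    assuming stride 1 and no padding."""
    n_weights = out_channels * c * k * k
    out_shape = (out_channels, h - k + 1, w - k + 1)
    return n_weights, out_shape

# e.g., a hypothetical 4-channel 16x16 input field:
n_w, shape = conv_layer_stats(c=4, h=16, w=16)
```

So a 4-channel input yields 20 × 4 × 4 × 4 = 1,280 weights for the layer, and each added input channel grows the count by 320.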

The predictors used for building the network models are the GPH and PW field data from the National Centers for Environmental Prediction (NCEP) North American Regional Reanalysis (NARR) data set (Mesinger et al., 2006). 

By varying the receptive field of the convolutional layers, the authors verify that the CNN model outperforms conventional fully connected ANN SD in estimating precipitation through explicit encoding of local spatial circulation structures. 

For the middle part of the continent, the CNN model shows slightly worse performance, which can be attributed to model overfitting when there are limited precipitation samples for training the model. 

The deeper CNN models are constructed by adding two/four extra convolutional layers before the first pooling layer for the default network architecture in Figure 4. 

This is achieved by utilizing the inner structure of the data to reduce the model structural redundancy and foster effective information extraction. 
