Identifying, attributing, and overcoming common data quality issues of manned station observations
Stefan Hunziker, Stefanie Gubler, Juan Calle, Isabel Moreno, Marcos Andrade, Fernando Velarde, Laura Ticona, Gualberto Carrasco, Yaruska Castellón, Clara Oria, Mischa Croci-Maspoli, Thomas Konzelmann, Mario Rohrer, Stefan Brönnimann
TL;DR: In this article, the authors identify and attribute the most important common data quality issues in Bolivian and Peruvian temperature and precipitation datasets, and find that a large fraction of these issues can be traced back to measurement errors by the observers.
Abstract:
In situ climatological observations are essential for studies related to climate trends and extreme events. However, in many regions of the globe, observational records are affected by a large number of data quality issues. Assessing and controlling the quality of such datasets is an important, often overlooked aspect of climate research. Besides the measurement data themselves, metadata are important for a comprehensive data quality assessment. Metadata are often missing, but may partly be reconstructed by suitable actions such as station inspections. This study identifies and attributes the most important common data quality issues in Bolivian and Peruvian temperature and precipitation datasets. The same or similar errors are found in many other predominantly manned station networks worldwide. A large fraction of these issues can be traced back to measurement errors by the observers. Therefore, the most effective way to prevent errors is to strengthen the training of observers and to establish a near real-time quality control (QC) procedure. Many common data quality issues are rarely detected by standard QC approaches. Data visualization, however, is an effective tool to identify and attribute those issues, and therefore enables data users to potentially correct errors and to decide which applications are not affected by specific problems. The resulting increase in usable station records is particularly important in areas where station networks are sparse. In such networks, adequate selection and treatment of time series based on a comprehensive QC procedure may contribute more to improving data homogeneity than statistical data homogenization methods.