Identifying, attributing, and overcoming common data quality issues of manned station observations

TLDR
In this article, the authors identify and attribute the most important common data quality issues in Bolivian and Peruvian temperature and precipitation datasets, and find that a large fraction of these issues can be traced back to measurement errors by the observers.
Abstract
In situ climatological observations are essential for studies of climate trends and extreme events. However, in many regions of the globe, observational records are affected by a large number of data quality issues. Assessing and controlling the quality of such datasets is an important, often overlooked aspect of climate research. In addition to the measurement data themselves, metadata are important for a comprehensive data quality assessment. Metadata are often missing, but they may be partly reconstructed through suitable actions such as station inspections. This study identifies and attributes the most important common data quality issues in Bolivian and Peruvian temperature and precipitation datasets. The same or similar errors are found in many other predominantly manned station networks worldwide. A large fraction of these issues can be traced back to measurement errors by the observers. Therefore, the most effective way to prevent errors is to strengthen the training of observers and to establish a near real-time quality control (QC) procedure. Many common data quality issues are hardly detected by usual QC approaches. Data visualization, however, is an effective tool for identifying and attributing those issues; it therefore enables data users to potentially correct errors and to decide for which purposes the data remain usable despite specific problems. The resulting increase in usable station records is particularly important in areas where station networks are sparse. In such networks, adequate selection and treatment of time series based on a comprehensive QC procedure may contribute more to improving data homogeneity than statistical data homogenization methods.
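
To make the idea of a near real-time QC procedure concrete, the following is a minimal sketch of the kind of basic checks that could run as each daily observation arrives. It is not the procedure used in the study: the `DailyObs` record, the `qc_flags` function, the flag names, and the temperature limits are all hypothetical choices made for illustration, and real limits would have to be set per station and climate.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DailyObs:
    """One daily record from a manned station (hypothetical layout)."""
    date: str
    tmax: Optional[float]    # daily maximum temperature, deg C
    tmin: Optional[float]    # daily minimum temperature, deg C
    precip: Optional[float]  # daily precipitation total, mm


def qc_flags(obs: DailyObs,
             lower: float = -40.0,
             upper: float = 50.0) -> List[str]:
    """Return a list of QC flags for a single observation.

    The limits `lower` and `upper` are illustrative assumptions,
    not values taken from the article.
    """
    flags: List[str] = []

    # Precipitation totals cannot be negative.
    if obs.precip is not None and obs.precip < 0:
        flags.append("negative_precip")

    # Internal consistency: the daily maximum must not be below the minimum.
    if obs.tmax is not None and obs.tmin is not None and obs.tmax < obs.tmin:
        flags.append("tmax_below_tmin")

    # Gross range check against assumed plausibility limits.
    for name, value in (("tmax", obs.tmax), ("tmin", obs.tmin)):
        if value is not None and not (lower <= value <= upper):
            flags.append(f"{name}_out_of_range")

    return flags


if __name__ == "__main__":
    suspicious = DailyObs(date="2001-01-15", tmax=3.2, tmin=12.0, precip=-1.0)
    print(qc_flags(suspicious))
```

Checks of this kind only catch gross errors. As the abstract notes, many common issues are hardly detected by usual QC approaches, which is why it recommends visual inspection of the time series alongside such automated tests.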
