Identifying, attributing, and overcoming common data quality issues of manned station observations

doi:10.1002/JOC.5037

Open Access Journal Article


Stefan Hunziker, +15 more

- 01 Sep 2017 -

International Journal of Climatology

- Vol. 37, Iss: 11, pp 4131-4145


TLDR

In this article, the authors identify and attribute the most important common data quality issues in Bolivian and Peruvian temperature and precipitation datasets, and find that a large fraction of these issues can be traced back to measurement errors by the observers.

Abstract:

In situ climatological observations are essential for studies related to climate trends and extreme events. However, in many regions of the globe, observational records are affected by a large number of data quality issues. Assessing and controlling the quality of such datasets is an important, often overlooked aspect of climate research. Besides analysing the measurement data, metadata are important for a comprehensive data quality assessment. However, metadata are often missing, but may partly be reconstructed by suitable actions such as station inspections. This study identifies and attributes the most important common data quality issues in Bolivian and Peruvian temperature and precipitation datasets. The same or similar errors are found in many other predominantly manned station networks worldwide. A large fraction of these issues can be traced back to measurement errors by the observers. Therefore, the most effective way to prevent errors is to strengthen the training of observers and to establish a near real-time quality control (QC) procedure. Many common data quality issues are hardly detected by usual QC approaches. Data visualization, however, is an effective tool to identify and attribute those issues, and therefore enables data users to potentially correct errors and to decide which purposes are not affected by specific problems. The resulting increase in usable station records is particularly important in areas where station networks are sparse. In such networks, adequate selection and treatment of time series based on a comprehensive QC procedure may contribute to improving data homogeneity more than statistical data homogenization methods.
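The kind of near real-time QC check the abstract recommends can be illustrated with a minimal sketch. This is not the authors' procedure. The record layout (keys `tmax`, `tmin`, `precip`) and both function names are assumptions made for the example. The first function flags physically implausible daily values. The second tabulates the first decimal digit of the reported values, so that rounding habits of an observer become visible when the shares are plotted or inspected.

```python
from collections import Counter


def flag_basic_issues(records):
    """Return (index, reason) pairs for physically implausible daily records.

    Each record is a dict with the hypothetical keys 'tmax' and 'tmin'
    (degrees Celsius) and 'precip' (mm); None marks a missing value.
    """
    flags = []
    for i, rec in enumerate(records):
        tmax = rec.get("tmax")
        tmin = rec.get("tmin")
        precip = rec.get("precip")
        # A daily maximum below the daily minimum cannot be a valid reading.
        if tmax is not None and tmin is not None and tmax < tmin:
            flags.append((i, "tmax below tmin"))
        # Precipitation totals cannot be negative.
        if precip is not None and precip < 0:
            flags.append((i, "negative precipitation"))
    return flags


def decimal_digit_shares(values):
    """Return the share of each first decimal digit (0-9) among non-missing values.

    A strongly uneven distribution (for example mostly .0 and .5) hints at
    rounding by the observer.
    """
    digits = [int(round(abs(v) * 10)) % 10 for v in values if v is not None]
    if not digits:
        return {}
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(10)}
```

Such checks only catch a small subset of the issues described in the abstract. Their output is best treated as a prompt for visual inspection of the time series, not as an automatic correction.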

Citations


Journal Article

Evaluation of Gridded Precipitation Datasets over Arid Regions of Pakistan

Kamal Ahmed, +4 more

- 26 Jan 2019 -

Water

TL;DR: This paper evaluates the performance of four widely used gauge-based gridded precipitation data products, namely Global Precipitation Climatology Centre (GPCC), Climatic Research Unit (CRU), Asian Precipitation Highly Resolved Observational Data Integration towards Evaluation (APHRODITE), and Center for Climatic Research-University of Delaware (UDel), at stations located in semi-arid, arid, and hyper-arid regions in the Balochistan province of Pakistan.


Journal Article

Machine learning for site-adaptation and solar radiation forecasting

Gabriel Narvaez, +3 more

- 01 Apr 2021 -

Renewable Energy

TL;DR: A case study with real data shows the benefits of the proposed methodology, which uses machine and deep learning techniques to integrate data from different sources and to construct precise solar radiation forecasting models in regions where solar energy systems are required.


Journal Article

Land Surface Air Temperature Variations Across the Globe Updated to 2019: The CRUTEM5 Data Set

Timothy J. Osborn, +7 more

- 27 Jan 2021 -

Journal of Geophysical Research

TL;DR: Climatic Research Unit temperature version 5 (CRUTEM5) is an extensive revision of the land surface air temperature data set. The underlying compilation of monthly temperature records has expanded from 5,583 to 10,639 stations, and the number of stations with sufficient data to be used in the gridded data set has grown from 4,842 to 7,983.


Journal Article

Recent changes in the precipitation-driving processes over the southern tropical Andes/western Amazon

Hans Segura, +5 more

- 01 Mar 2020 -

Climate Dynamics

TL;DR: In this paper, the authors used the ERA-Interim data set to identify the first mode of interannual DJF precipitation variability (PC1-Andes) over the past 35 years.


Journal Article

Construction of a high-resolution gridded rainfall dataset for Peru from 1981 to the present day

Cesar Aybar, +5 more

- 03 Apr 2020 -

Hydrological Sciences Journal-journal De...

TL;DR: A new gridded rainfall dataset available for Peru is introduced, called PISCOp V2.1 (Peruvian Interpolated data of SENAMHI’s Climatological and Hydrological Observations).


