Journal ArticleDOI

The world's user-generated road map is more than 80% complete.

10 Aug 2017-PLOS ONE (Public Library of Science)-Vol. 12, Iss: 8
TL;DR: Two complementary, independent methods are used to assess the completeness of OSM road data in each country in the world and find that globally, OSM is ∼83% complete, and more than 40% of countries—including several in the developing world—have a fully mapped street network.
Abstract: OpenStreetMap, a crowdsourced geographic database, provides the only global-level, openly licensed source of geospatial road data, and the only national-level source in many countries. However, researchers, policy makers, and citizens who want to make use of OpenStreetMap (OSM) have little information about whether it can be relied upon in a particular geographic setting. In this paper, we use two complementary, independent methods to assess the completeness of OSM road data in each country in the world. First, we undertake a visual assessment of OSM data against satellite imagery, which provides the input for estimates based on a multilevel regression and poststratification model. Second, we fit sigmoid curves to the cumulative length of contributions, and use them to estimate the saturation level for each country. Both techniques may have more general use for assessing the development and saturation of crowd-sourced data. Our results show that in many places, researchers and policymakers can rely on the completeness of OSM, or will soon be able to do so. We find (i) that globally, OSM is ∼83% complete, and more than 40% of countries (including several in the developing world) have a fully mapped street network; (ii) that well-governed countries with good Internet access tend to be more complete, and that completeness has a U-shaped relationship with population density: both sparsely populated areas and dense cities are the best mapped; and (iii) that existing global datasets used by the World Bank undercount roads by more than 30%.
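The paper's second method, fitting sigmoid curves to cumulative contributions, can be sketched as a logistic fit whose upper asymptote is the estimated saturation (total) road length. The data below are synthetic and the use of scipy is an assumption; this is an illustration of the idea, not the paper's actual pipeline.

```python
# Sketch: fit a logistic curve to cumulative mapped road length over time
# and read off its saturation level. Data are synthetic (hypothetical country).
import numpy as np
from scipy.optimize import curve_fit

def logistic(t, L, k, t0):
    """Logistic growth: L is the saturation level (total road length),
    k the growth rate, t0 the inflection point."""
    return L / (1.0 + np.exp(-k * (t - t0)))

t = np.arange(0, 12)                      # years since mapping began
y = logistic(t, L=50_000, k=0.9, t0=5.0)  # true saturation: 50,000 km

params, _ = curve_fit(logistic, t, y, p0=[y.max(), 1.0, t.mean()])
L_hat = params[0]                         # estimated saturation level
completeness = y[-1] / L_hat              # fraction of saturation reached so far
```

Completeness for a country is then the currently mapped length divided by the estimated saturation level `L_hat`.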


Citations
Journal ArticleDOI
TL;DR: The Global Roads Inventory Project (GRIP) as discussed by the authors gathered, harmonized and integrated nearly 60 geospatial datasets into a global roads dataset covering 222 countries and over 21 million km of roads, two to three times the total length in the currently best available country-based global roads datasets.
Abstract: Georeferenced information on road infrastructure is essential for spatial planning, socio-economic assessments and environmental impact analyses. Yet current global road maps are typically outdated or characterized by spatial bias in coverage. In the Global Roads Inventory Project we gathered, harmonized and integrated nearly 60 geospatial datasets on road infrastructure into a global roads dataset. The resulting dataset covers 222 countries and includes over 21 million km of roads, which is two to three times the total length in the currently best available country-based global roads datasets. We then related total road length per country to country area, population density, GDP and OECD membership, resulting in a regression model with adjusted R² of 0.90, and found that the highest road densities are associated with densely populated and wealthier countries. Applying our regression model to future population densities and GDP estimates from the Shared Socioeconomic Pathway (SSP) scenarios, we obtained a tentative estimate of 3.0–4.7 million km additional road length for the year 2050. Large increases in road length were projected for developing nations in some of the world's last remaining wilderness areas, such as the Amazon, the Congo basin and New Guinea. This highlights the need for accurate spatial road datasets to underpin strategic spatial planning in order to reduce the impacts of roads in remaining pristine ecosystems.
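The country-level regression described above can be sketched as an ordinary least-squares fit of log road length on log-transformed covariates. All numbers below are synthetic (the coefficients are invented for illustration, not taken from the paper), and numpy is assumed to be available.

```python
# Sketch of a GRIP-style country regression: log road length regressed on
# log area, log population density, and log GDP. All data are synthetic.
import numpy as np

rng = np.random.default_rng(0)
n = 200
log_area = rng.normal(12, 1.5, n)
log_popdens = rng.normal(4, 1.0, n)
log_gdp = rng.normal(9, 1.2, n)

# Hypothetical "true" relationship used to generate the response.
log_roads = 1.0 * log_area + 0.5 * log_popdens + 0.3 * log_gdp \
            + rng.normal(0, 0.1, n)

# OLS via least squares; first column is the intercept.
X = np.column_stack([np.ones(n), log_area, log_popdens, log_gdp])
beta, *_ = np.linalg.lstsq(X, log_roads, rcond=None)

# R^2 of the fit.
resid = log_roads - X @ beta
r2 = 1 - resid.var() / log_roads.var()
```

Projecting future road length then amounts to evaluating the fitted model at the SSP scenarios' projected covariate values.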

309 citations

Journal ArticleDOI
TL;DR: A cumulative measure of human modification of terrestrial lands based on modeling the physical extents of 13 anthropogenic stressors and their estimated impacts using spatially explicit global datasets with a median year of 2016 suggests that most of the world is in a state of intermediate modification and moderately modified ecoregions warrant elevated attention.
Abstract: An increasing number of international initiatives aim to reconcile development with conservation. Crucial to successful implementation of these initiatives is a comprehensive understanding of the current ecological condition of landscapes and their spatial distributions. Here, we provide a cumulative measure of human modification of terrestrial lands based on modeling the physical extents of 13 anthropogenic stressors and their estimated impacts using spatially explicit global datasets with a median year of 2016. We quantified the degree of land modification and the amount and spatial configuration of low modified lands (i.e., natural areas relatively free from human alteration) across all ecoregions and biomes. We identified that fewer unmodified lands remain than previously reported and that most of the world is in a state of intermediate modification, with 52% of ecoregions classified as moderately modified. Given that these moderately modified ecoregions fall within critical land use thresholds, we propose that they warrant elevated attention and require proactive spatial planning to maintain biodiversity and ecosystem function before important environmental values are lost.

306 citations

Journal ArticleDOI
TL;DR: It is shown that most fundamental Arctic infrastructure and population will be at high hazard risk even if the Paris Agreement target is achieved, and the engineering structures at risk by 2050 are quantified.
Abstract: Degradation of near-surface permafrost can pose a serious threat to the utilization of natural resources, and to the sustainable development of Arctic communities. Here we identify at unprecedentedly high spatial resolution infrastructure hazard areas in the Northern Hemisphere's permafrost regions under projected climatic changes and quantify fundamental engineering structures at risk by 2050. We show that nearly four million people and 70% of current infrastructure in the permafrost domain are in areas with high potential for thaw of near-surface permafrost. Our results demonstrate that one-third of pan-Arctic infrastructure and 45% of the hydrocarbon extraction fields in the Russian Arctic are in regions where thaw-related ground instability can cause severe damage to the built environment. Alarmingly, these figures are not reduced substantially even if the climate change targets of the Paris Agreement are reached.

279 citations

Journal ArticleDOI
TL;DR: The authors assess damage to networked infrastructure at the asset level for a wide range of hazards, revealing global Expected Annual Damages of $3.1 to $22 billion and a particular vulnerability of transport infrastructure in Small Island Developing States.
Abstract: Transport infrastructure is exposed to natural hazards all around the world. Here we present the first global estimates of multi-hazard exposure and risk to road and rail infrastructure. Results reveal that ~27% of all global road and railway assets are exposed to at least one hazard and ~7.5% of all assets are exposed to a 1/100 year flood event. Global Expected Annual Damages (EAD) due to direct damage to road and railway assets range from 3.1 to 22 billion US dollars, of which ~73% is caused by surface and river flooding. Global EAD are small relative to global GDP (~0.02%). However, in some countries EAD reach 0.5 to 1% of GDP annually, which is the same order of magnitude as national transport infrastructure budgets. A cost-benefit analysis suggests that increasing flood protection would have positive returns on ~60% of roads exposed to a 1/100 year flood event.
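Expected Annual Damages of the kind reported above are the integral of event damage over annual exceedance probability; with damages estimated at a handful of return periods, that integral is commonly approximated by the trapezoidal rule. The return periods and damage values below are hypothetical.

```python
# Sketch: Expected Annual Damages (EAD) as the trapezoidal integral of
# damage over annual exceedance probability. All values are hypothetical.
return_periods = [10, 50, 100, 500]   # years
damages = [1.0, 4.0, 7.0, 15.0]       # damage per event, e.g. billion USD

# Annual exceedance probability is the reciprocal of the return period.
probs = [1.0 / rp for rp in return_periods]

# Integrate damage over probability (probs are descending, so pair
# adjacent points and sum trapezoid areas).
ead = 0.0
for i in range(len(probs) - 1):
    ead += 0.5 * (damages[i] + damages[i + 1]) * (probs[i] - probs[i + 1])
```

Comparing EAD before and after raising protection standards is the basis of the cost-benefit analysis the abstract mentions.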

238 citations

Journal ArticleDOI
06 Mar 2020-bioRxiv
TL;DR: This work generates the first globally-consistent, continuous index of forest condition as determined by degree of anthropogenic modification, by integrating data on observed and inferred human pressures and an index of lost connectivity.
Abstract: Many global environmental agendas, including halting biodiversity loss, reversing land degradation, and limiting climate change, depend upon retaining forests with high ecological integrity, yet the scale and degree of forest modification remain poorly quantified and mapped. By integrating data on observed and inferred human pressures and an index of lost connectivity, we generate a globally consistent, continuous index of forest condition as determined by the degree of anthropogenic modification. Globally, only 17.4 million km² of forest (40.5%) has high landscape-level integrity (mostly found in Canada, Russia, the Amazon, Central Africa, and New Guinea) and only 27% of this area is found in nationally designated protected areas. Of the forest inside protected areas, only 56% has high landscape-level integrity. Ambitious policies that prioritize the retention of forest integrity, especially in the most intact areas, are now urgently needed alongside current efforts aimed at halting deforestation and restoring the integrity of forests globally.

141 citations

References
Book
01 Jan 2006
TL;DR: Data Analysis Using Regression and Multilevel/Hierarchical Models is a comprehensive manual for the applied researcher who wants to perform data analysis using linear and nonlinear regression and multilevel models.
Abstract: Data Analysis Using Regression and Multilevel/Hierarchical Models is a comprehensive manual for the applied researcher who wants to perform data analysis using linear and nonlinear regression and multilevel models. The book introduces a wide variety of models, whilst at the same time instructing the reader in how to fit these models using available software packages. The book illustrates the concepts by working through scores of real data examples that have arisen from the authors' own applied research, with programming codes provided for each one. Topics covered include causal inference, including regression, poststratification, matching, regression discontinuity, and instrumental variables, as well as multilevel logistic regression and missing-data imputation. Practical tips regarding building, fitting, and understanding are provided throughout.
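The poststratification step of MRP, which this reference covers, reduces to re-weighting cell-level model estimates by each cell's share of the target population. The cells and numbers below are made up for illustration (in the citing paper, cells would be geographic strata and the estimates OSM completeness).

```python
# Sketch of poststratification in MRP: combine cell-level model estimates
# using each cell's population share. All numbers are hypothetical.
cells = {
    # cell name: (model estimate of completeness, population share)
    "urban_dense":  (0.95, 0.30),
    "urban_sparse": (0.85, 0.25),
    "rural_dense":  (0.80, 0.20),
    "rural_sparse": (0.60, 0.25),
}

# Shares must sum to 1 for the weighted average to be a population estimate.
assert abs(sum(share for _, share in cells.values()) - 1.0) < 1e-9

national = sum(est * share for est, share in cells.values())
```

The multilevel regression supplies the per-cell estimates; poststratification corrects for the fact that the visually assessed sample may not match the population distribution across cells.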

9,098 citations


"The world's user-generated road map..." refers background in this paper

  • ...The first step of MRP is the multilevel regression, as in [43]....


Journal ArticleDOI
TL;DR: Stan as discussed by the authors is a probabilistic programming language for specifying statistical models, in which a program imperatively defines a log probability function over parameters conditioned on specified data and constants; Stan also computes log densities, gradients, and Hessians, which can be used in alternative algorithms such as variational Bayes, expectation propagation, and marginal inference using approximate integration.
Abstract: Stan is a probabilistic programming language for specifying statistical models. A Stan program imperatively defines a log probability function over parameters conditioned on specified data and constants. As of version 2.14.0, Stan provides full Bayesian inference for continuous-variable models through Markov chain Monte Carlo methods such as the No-U-Turn sampler, an adaptive form of Hamiltonian Monte Carlo sampling. Penalized maximum likelihood estimates are calculated using optimization methods such as the limited memory Broyden-Fletcher-Goldfarb-Shanno algorithm. Stan is also a platform for computing log densities and their gradients and Hessians, which can be used in alternative algorithms such as variational Bayes, expectation propagation, and marginal inference using approximate integration. To this end, Stan is set up so that the densities, gradients, and Hessians, along with intermediate quantities of the algorithm such as acceptance probabilities, are easily accessible. Stan can be called from the command line using the cmdstan package, through R using the rstan package, and through Python using the pystan package. All three interfaces support sampling and optimization-based inference with diagnostics and posterior analysis. rstan and pystan also provide access to log probabilities, gradients, Hessians, parameter transforms, and specialized plotting.
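The "imperatively defines a log probability function" idea can be made concrete with a minimal Stan program. The toy linear regression below is for illustration only (it is not the citing paper's model), held as a Python string as it would be when passed to PyStan.

```python
# A minimal Stan program: the model block imperatively increments the log
# probability of parameters (alpha, beta, sigma) given data (x, y).
# Toy example for illustration; not the citing paper's actual model.
model_code = """
data {
  int<lower=0> N;
  vector[N] x;
  vector[N] y;
}
parameters {
  real alpha;
  real beta;
  real<lower=0> sigma;
}
model {
  y ~ normal(alpha + beta * x, sigma);  // adds normal log density to target
}
"""

# With PyStan installed, this would be compiled and sampled, e.g. (PyStan 2 API):
#   import pystan
#   fit = pystan.StanModel(model_code=model_code).sampling(data=data_dict)
```

The same program string works unchanged from the command line (cmdstan) or R (rstan), which is the cross-interface portability the abstract describes.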

4,947 citations



"The world's user-generated road map..." refers methods in this paper

  • ...The model is estimated in a Bayesian framework using the open-source PyStan software [46]....


Journal ArticleDOI
TL;DR: Analysis of the quality of OpenStreetMap information focuses on London and England, since OSM started in London in August 2004 and therefore the study of these geographies provides the best understanding of the achievements and difficulties of VGI.
Abstract: Within the framework of Web 2.0 mapping applications, the most striking example of a geographical application is the OpenStreetMap (OSM) project. OSM aims to create a free digital map of the world and is implemented through the engagement of participants in a mode similar to software development in Open Source projects. The information is collected by many participants, collated on a central database, and distributed in multiple digital formats through the World Wide Web. This type of information was termed 'Volunteered Geographical Information' (VGI) by Goodchild, 2007. However, to date there has been no systematic analysis of the quality of VGI. This study aims to fill this gap by analysing OSM information. The examination focuses on analysis of its quality through a comparison with Ordnance Survey (OS) datasets. The analysis focuses on London and England, since OSM started in London in August 2004 and therefore the study of these geographies provides the best understanding of the achievements and difficulties of VGI. The analysis shows that OSM information can be fairly accurate: on average within about 6 m of the position recorded by the OS, and with approximately 80% overlap of motorway objects between the two datasets. In the space of four years, OSM has captured about 29% of the area of England, of which approximately 24% are digitised lines without a complete set of attributes. The paper concludes with a discussion of the implications of the findings to the study of VGI as well as suggesting future research directions.

1,493 citations

Journal ArticleDOI
TL;DR: The quality of French OpenStreetMap data is studied to provide a larger set of spatial data quality element assessments, and raises questions such as the heterogeneity of processes, scales of production, and the compliance to standardized and accepted specifications.
Abstract: The concept of Volunteered Geographic Information (VGI) has recently emerged from the new Web 2.0 technologies. The OpenStreetMap project is currently the most significant example of a system based on VGI. It aims at producing free vector geographic databases using contributions from Internet users. Spatial data quality becomes a key consideration in this context of freely downloadable geographic databases. This article studies the quality of French OpenStreetMap data. It extends the work of Haklay to France, provides a larger set of spatial data quality element assessments (i.e. geometric, attribute, semantic and temporal accuracy, logical consistency, completeness, lineage, and usage), and uses different methods of quality control. The outcome of the study raises questions such as the heterogeneity of processes, scales of production, and the compliance to standardized and accepted specifications. In order to improve data quality, a balance has to be struck between the contributors' freedom and their respect of specifications. The development of appropriate solutions to provide this balance is an important research issue in the domain of user-generated content.

631 citations


"The world's user-generated road map..." refers result in this paper

  • ...While early assessments found significant gaps [18, 19, 30, 31], more recent studies of European countries have found that the network is virtually complete, and is comparable to or better than official or proprietary data sources [17, 22]....
