Journal ArticleDOI

Reproducibility and replicability: opportunities and challenges for geospatial research

TL;DR: The article highlights open questions, opportunities, and potential new directions in geospatial research related to R&R, and stresses that the path ahead will likely require a mixture of computational, geospatial, and behavioral research that collectively addresses the many sides of reproducibility and replicability issues.
Abstract: A cornerstone of the scientific method, the ability to reproduce and replicate the results of research has gained widespread attention across the sciences in recent years. A corresponding burst of ...
Citations
Journal ArticleDOI
TL;DR: This paper introduces the concept of weak replicability, discusses possible approaches to its measurement, and considers how the principle of spatial heterogeneity might be addressed in the context of artificial intelligence.
Abstract: Replicability takes on special meaning when researching phenomena that are embedded in space and time, including phenomena distributed on the surface and near surface of the Earth. Two principles, spatial dependence and spatial heterogeneity, are generally characteristic of such phenomena. Various practices have evolved in dealing with spatial heterogeneity, including the use of place-based models. We review the rapidly emerging applications of artificial intelligence to phenomena distributed in space and time and speculate on how the principle of spatial heterogeneity might be addressed. We introduce a concept of weak replicability and discuss possible approaches to its measurement.
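
The two spatial principles named here are usually made operational through concrete statistics; spatial dependence, for instance, is commonly quantified with Moran's I. As a minimal illustration (my example, not drawn from the paper itself), the sketch below computes global Moran's I with NumPy on invented toy data.

```python
import numpy as np

def morans_i(values: np.ndarray, weights: np.ndarray) -> float:
    """Global Moran's I, a standard measure of spatial dependence.

    values  : 1-D array of observations at n locations
    weights : n x n spatial weights matrix (w[i, j] > 0 when locations
              i and j are neighbours; diagonal assumed zero)
    """
    n = values.size
    z = values - values.mean()                 # deviations from the mean
    num = (weights * np.outer(z, z)).sum()     # sum_ij w_ij * z_i * z_j
    den = (z ** 2).sum()
    return (n / weights.sum()) * (num / den)

# Toy example: four locations in a chain, rook-style neighbour weights.
vals = np.array([1.0, 2.0, 2.5, 4.0])
w = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
print(round(morans_i(vals, w), 3))  # positive value: nearby sites are alike
```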

26 citations

Journal ArticleDOI
05 Aug 2021-PLOS ONE
TL;DR: Wang et al. develop a scalable online platform for extracting, analyzing, and sharing multi-source, multi-scale human mobility flows, in response to the soaring need for human mobility data, especially during disaster events such as the COVID-19 pandemic, and the associated big data challenges.
Abstract: In response to the soaring need for human mobility data, especially during disaster events such as the COVID-19 pandemic, and the associated big data challenges, we develop a scalable online platform for extracting, analyzing, and sharing multi-source multi-scale human mobility flows. Within the platform, an origin-destination-time (ODT) data model is proposed to work with scalable query engines to handle heterogeneous mobility data in large volumes with extensive spatial coverage, which allows for efficient extraction, query, and aggregation of billion-level origin-destination (OD) flows in parallel at the server-side. An interactive spatial web portal, ODT Flow Explorer, is developed to allow users to explore multi-source mobility datasets with user-defined spatiotemporal scales. To promote reproducibility and replicability, we further develop ODT Flow REST APIs that provide researchers with the flexibility to access the data programmatically via workflows, codes, and programs. Demonstrations are provided to illustrate the potential of the APIs integrating with scientific workflows and with the Jupyter Notebook environment. We believe the platform coupled with the derived multi-scale mobility data can assist human mobility monitoring and analysis during disaster events such as the ongoing COVID-19 pandemic and benefit both scientific communities and the general public in understanding human mobility dynamics.
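
As a rough sketch of the programmatic access described above (workflows, codes, and programs), the snippet below queries a flow-extraction REST endpoint from Python and loads the result into a DataFrame. The host, path, and parameter names are placeholders of mine, not the documented ODT Flow API.

```python
import pandas as pd
import requests

# Placeholder endpoint and parameters -- illustrative only, not the
# actual ODT Flow REST API, whose paths are defined by the platform.
BASE_URL = "https://example.org/odtflow/api"

params = {
    "origin": "36061",        # hypothetical origin-region identifier
    "scale": "county",        # user-defined spatial scale
    "start": "2020-03-01",    # time window start
    "end": "2020-03-31",      # time window end
}

resp = requests.get(f"{BASE_URL}/flows", params=params, timeout=30)
resp.raise_for_status()

# Each record is one origin-destination-time (ODT) flow.
flows = pd.DataFrame(resp.json())
print(flows.head())
```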

17 citations

Journal ArticleDOI
TL;DR: The goal is to increase the reproducibility of terrain rendering algorithms and techniques across different scales and landscapes by introducing elevation models of varying terrain types, available to the user at no cost, with minimal common data imperfections.
Abstract: This paper proposes elevation models to promote, evaluate, and compare various terrain representation techniques. Our goal is to increase the reproducibility of terrain rendering algorithms and tec...

16 citations

Journal ArticleDOI
TL;DR: In this article, the authors propose a training data model for AI in Earth Observation (EO) to allow documentation, storage, and sharing of geospatial training data in a distributed infrastructure.
Abstract: Artificial Intelligence Machine Learning (AI/ML), in particular Deep Learning (DL), is reorienting and transforming Earth Observation (EO). A consistent data model for delivery of training data will support the FAIR data principles (findable, accessible, interoperable, reusable) and enable Web-based use of training data in a spatial data infrastructure (SDI). Existing training datasets, including open source benchmark datasets, are usually packaged into public or personal repositories and lack discoverability and accessibility. Moreover, there is no unified method to describe the training data. Here we propose a training data model for AI in EO to allow documentation, storage, and sharing of geospatial training data in a distributed infrastructure. We present design rationales, information models, and an encoding method. Several scenarios illustrate the intended uses and benefits for EO DL applications in an open Web environment. The relationship with Open Geospatial Consortium (OGC) standards is also discussed, as is the impact on an AI-ready SDI.
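
To make the idea of a shareable training-data record concrete, here is a hedged sketch of one dataset entry serialized to JSON. Every field name below is my assumption for illustration; the paper's information model and encoding method define the actual schema.

```python
import json

# Illustrative only: field names are assumptions, not the paper's schema.
training_dataset = {
    "id": "urn:example:eo-training:landcover-v1",
    "name": "Sentinel-2 land-cover patches",
    "task": "semantic_segmentation",       # DL task the labels support
    "classes": ["water", "forest", "urban", "cropland"],
    "samples": [
        {
            "image": "https://example.org/patches/0001.tif",  # placeholder URL
            "label": "https://example.org/labels/0001.tif",   # placeholder URL
            "extent": [10.0, 45.0, 10.1, 45.1],               # lon/lat bbox
            "crs": "EPSG:4326",
            "acquired": "2021-06-15",
        }
    ],
    "license": "CC-BY-4.0",   # explicit reuse terms aid the R in FAIR
}
print(json.dumps(training_dataset, indent=2))
```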

10 citations

Journal ArticleDOI
TL;DR: Zhang et al. propose a geographically reproducible approach to urban scene sensing based on large-scale pre-trained models. By coupling data to location, the approach bridges different data formats and enables objective text-image descriptions of urban scenes in physical space from the human perspective.

8 citations

References
01 Jan 2014
TL;DR: This paper presents a model of how one group of actors managed the tension between divergent viewpoints and the need for generalizable findings, drawing on the work of amateurs, professionals, administrators, and others connected to the Museum of Vertebrate Zoology at the University of California, Berkeley, during its early years.
Abstract: Scientific work is heterogeneous, requiring many different actors and viewpoints. It also requires cooperation. The two create tension between divergent viewpoints and the need for generalizable findings. We present a model of how one group of actors managed this tension. It draws on the work of amateurs, professionals, administrators and others connected to the Museum of Vertebrate Zoology at the University of California, Berkeley, during its early years. Extending the Latour-Callon model of interessement, two major activities are central for translating between viewpoints: standardization of methods, and the development of 'boundary objects'. Boundary objects are both adaptable to different viewpoints and robust enough to maintain identity across them. We distinguish four types of boundary objects: repositories, ideal types, coincident boundaries and standardized forms.

7,800 citations

Journal ArticleDOI
TL;DR: This paper introduces the FAIR Data Principles, a set of data-reuse principles that put specific emphasis on enhancing the ability of machines to automatically find and use data, in addition to supporting reuse by individuals.
Abstract: There is an urgent need to improve the infrastructure supporting the reuse of scholarly data. A diverse set of stakeholders—representing academia, industry, funding agencies, and scholarly publishers—have come together to design and jointly endorse a concise and measurable set of principles that we refer to as the FAIR Data Principles. The intent is that these may act as a guideline for those wishing to enhance the reusability of their data holdings. Distinct from peer initiatives that focus on the human scholar, the FAIR Principles put specific emphasis on enhancing the ability of machines to automatically find and use the data, in addition to supporting its reuse by individuals. This Comment is the first formal publication of the FAIR Principles, and includes the rationale behind them, and some exemplar implementations in the community.
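
One common way to act on that machine-actionability emphasis is to publish schema.org/Dataset metadata as JSON-LD alongside the data, so crawlers and search services can find and interpret it. The record below is an illustrative pairing with the principles, not a scheme prescribed by this paper; the identifier and URLs are placeholders.

```python
import json

# Illustrative schema.org/Dataset record; DOI and URLs are placeholders.
record = {
    "@context": "https://schema.org",
    "@type": "Dataset",
    "name": "Example mobility flows",
    "identifier": "https://doi.org/10.xxxx/example",   # placeholder DOI
    "description": "Daily origin-destination flows at county scale.",
    "license": "https://creativecommons.org/licenses/by/4.0/",
    "distribution": {
        "@type": "DataDownload",
        "contentUrl": "https://example.org/data/flows.csv",  # placeholder
        "encodingFormat": "text/csv",
    },
}
print(json.dumps(record, indent=2))  # embed as JSON-LD in a landing page
```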

7,602 citations

Journal ArticleDOI
TL;DR: Quantitative procedures for computing the tolerance for filed and future null results are reported and illustrated, and the implications are discussed.
Abstract: For any given research area, one cannot tell how many studies have been conducted but never reported. The extreme view of the "file drawer problem" is that journals are filled with the 5% of the studies that show Type I errors, while the file drawers are filled with the 95% of the studies that show nonsignificant results. Quantitative procedures for computing the tolerance for filed and future null results are reported and illustrated, and the implications are discussed.
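
The "tolerance" computed here is Rosenthal's fail-safe N: how many unpublished null studies would have to sit in file drawers before a set of significant results stopped being significant overall. A minimal sketch of the calculation via the Stouffer combined Z:

```python
import numpy as np
from scipy.stats import norm

def fail_safe_n(z_scores) -> float:
    """Rosenthal's fail-safe N.

    The Stouffer combined Z of k observed studies plus X null studies
    (mean Z of zero) is sum(Z) / sqrt(k + X). Setting this equal to the
    one-tailed 5% critical value and solving for X gives
    X = sum(Z)**2 / 1.645**2 - k.
    """
    z = np.asarray(z_scores, dtype=float)
    z_crit = norm.ppf(0.95)                  # ~1.645
    return z.sum() ** 2 / z_crit ** 2 - z.size

# Example: five studies, each just significant at Z = 1.7.
print(round(fail_safe_n([1.7] * 5), 1))     # ~21.7 hidden nulls tolerated
```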

7,159 citations

15 Aug 2006
TL;DR: The author argues that most published research findings may be false, discusses the implications for the conduct and interpretation of research, and suggests that claimed research findings may often be simply accurate measures of the prevailing bias.
Abstract: There is increasing concern that most current published research findings are false. The probability that a research claim is true may depend on study power and bias, the number of other studies on the same question, and, importantly, the ratio of true to no relationships among the relationships probed in each scientific field. In this framework, a research finding is less likely to be true when the studies conducted in a field are smaller; when effect sizes are smaller; when there is a greater number and lesser pre-selection of tested relationships; where there is greater flexibility in designs, definitions, outcomes, and analytical modes; when there is greater financial and other interest and prejudice; and when more teams are involved in a scientific field in chase of statistical significance. Simulations show that for most study designs and settings, it is more likely for a research claim to be false than true. Moreover, for many current scientific fields, claimed research findings may often be simply accurate measures of the prevailing bias. In this essay, I discuss the implications of these problems for the conduct and interpretation of research.
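
The framework in this abstract boils down to a positive predictive value (PPV): the post-study probability that a claimed finding is true, as a function of power, the significance level, the prior odds R of a true relationship, and a bias term u. A minimal sketch of that calculation follows; the example numbers are invented to depict a small, underpowered field.

```python
def ppv(power: float, alpha: float, R: float, u: float = 0.0) -> float:
    """Post-study probability that a claimed research finding is true.

    power : 1 - beta, the statistical power of the study
    alpha : Type I error rate (conventionally 0.05)
    R     : prior odds that a probed relationship is true
    u     : proportion of analyses that are driven by bias (0 = none)
    """
    beta = 1.0 - power
    num = power * R + u * beta * R
    den = R + alpha - beta * R + u - u * alpha + u * beta * R
    return num / den

# Underpowered studies, long-shot hypotheses, modest bias:
print(round(ppv(power=0.20, alpha=0.05, R=0.10, u=0.10), 3))  # ~0.16 < 0.5
```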

5,003 citations
