Showing papers in "International Journal of Geographical Information Science in 2022"


Journal Article
TL;DR: A novel multi-view bidirectional spatiotemporal graph network called Multi-BiSTGN is proposed to impute urban traffic data with complex missing patterns; it outperformed ten existing baselines under different missing types and missing rates.
Abstract: Accurate estimation of missing traffic data is one of the essential components in intelligent transportation systems (ITS). The non-Euclidean data structure and complex missing traffic flow patterns make it challenging to capture nonlinear spatiotemporal correlations of missing traffic flow, which are critical for the imputation of missing traffic data. In this study, we propose a novel multi-view bidirectional spatiotemporal graph network called Multi-BiSTGN to impute urban traffic data with complex missing patterns. First, three spatiotemporal graph sequences are constructed to comprehensively describe traffic conditions from different temporal correlation views, i.e. temporal closeness view, daily periodicity view, and weekly periodicity view. Then, three bidirectional spatiotemporal graph networks are fused by a parametric-matrix-based method to obtain the final imputation results. To train the Multi-BiSTGN model, a novel loss function that considers the interactions between the three temporal correlation views is designed to optimize the model parameters. The proposed model was validated on real-world traffic datasets collected in Wuhan, China. Experimental results showed that Multi-BiSTGN outperformed ten existing baselines under different missing types (random missing, block missing, and mixed missing) and missing rates.

22 citations


Journal Article
TL;DR: A graph convolutional network with neighborhood information, named the neighbour supporting graph convolutional neural network, is applied to learn spatial relationships among land-use types and improve the accuracy of urban scene classification.
Abstract: Urban scenes consist of visual and semantic features and exhibit spatial relationships among land-use types (e.g. industrial areas are far away from the residential zones). This study applied a graph convolutional network with neighborhood information (henceforth named the neighbour supporting graph convolutional neural network) to learn spatial relationships for urban scene classification. Furthermore, a co-occurrence analysis with visual and semantic features was performed to improve the accuracy of urban scene classification. We tested the proposed method on the fifth ring road area of Beijing, achieving an overall classification accuracy of 0.827 and a Kappa coefficient of 0.769. In comparison with other methods, such as support vector machine, random forest, and general graph convolutional network, the case study showed that the proposed method improved urban scene classification accuracy by about 10%.

18 citations


Journal Article
TL;DR: In this paper, the authors present a new method to create spatial data using a generative adversarial network (GAN), which uses coarse and widely available geospatial data to create maps of less available features at the finer scale in the built environment.
Abstract: We present a new method to create spatial data using a generative adversarial network (GAN). Our contribution uses coarse and widely available geospatial data to create maps of less available features at the finer scale in the built environment, bypassing their traditional acquisition techniques (e.g. satellite imagery or land surveying). In the work, we employ land use data and road networks as input to generate building footprints and conduct experiments in 9 cities around the world. The method, which we implement in a tool we release openly, enables the translation of one geospatial dataset to another with high fidelity and morphological accuracy. It may be especially useful in locations missing detailed and high-resolution data and those that are mapped with uncertain or heterogeneous quality, such as much of OpenStreetMap. The quality of the results is influenced by the urban form and scale. In most cases, the experiments suggest promising performance as the method tends to truthfully indicate the locations, amount, and shape of buildings. The work has the potential to support several applications, such as energy, climate, and urban morphology studies in areas previously lacking required data or inpainting geospatial data in regions with incomplete data.

17 citations


Journal Article
TL;DR: Wang et al. propose a methodological framework to detect dynamical communities in multilayer spatial interaction networks and examine their spatiotemporal patterns: random walks are used to merge network layers with different weights, the Leiden technique is used to derive dynamical communities, and exploratory analytic methods are adopted to examine spatiotemporal patterns.
Abstract: Detecting network communities has recently attracted extensive studies in many fields. However, little attention has been paid to the detection and analysis of dynamical communities. This study proposes a methodological framework to detect dynamical communities in multilayer spatial interaction networks and examine their spatiotemporal patterns. Random walks are used to merge network layers with different weights, the Leiden technique is used for deriving dynamical communities and exploratory analytic methods are adopted to examine spatiotemporal patterns. To verify our methods, experiments were conducted in Wuhan, China, where trajectory data were used to construct the time-dependent multilayer networks. (1) We derived a set of spatiotemporally cohesive and comparable dynamical communities on each day for one week; (2) They exhibit interesting clustering patterns according to the similarity of their growth curves; (3) They display distinct life courses of occurrence, expansion, stability, contraction and disappearance, and their dynamical interactions are vividly depicted; (4) They manifest mixed land use patterns via transfers of human activities. Thus, our methods can enrich research on dynamical organization of urban space and may be applicable in other contexts, while experimental results can provide decision-making support for sustainable urban management.

17 citations


Journal Article
TL;DR: Location encoding is the process of encoding a single point location into an embedding space such that the embedding is learning-friendly for downstream machine learning models; this paper provides a systematic review of it.
Abstract: A common need for artificial intelligence models in the broader geosciences is to encode various types of spatial data, such as points, polylines, polygons, graphs, or rasters, in a hidden embedding space so that they can be readily incorporated into deep learning models. One fundamental step is to encode a single point location into an embedding space, such that this embedding is learning-friendly for downstream machine learning models. We call this process location encoding. However, a systematic review of location encoding, its potential applications, and the key challenges that need to be addressed is lacking. This paper aims to fill this gap. We first provide a formal definition of location encoding, and discuss the necessity of it for GeoAI research. Next, we provide a comprehensive survey about the current landscape of location encoding research. We classify location encoding models into different categories based on their inputs and encoding methods, and compare them based on whether they are parametric, multi-scale, distance preserving, and direction aware. We demonstrate that existing location encoders can be unified under one formulation framework. We also discuss the application of location encoding. Finally, we point out several challenges that need to be solved in the future.

16 citations
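
To make the idea of a multi-scale location encoder concrete, here is a minimal sketch of a non-parametric sinusoidal encoding in Python. It is an illustration only and is not taken from the paper: the function name `sinusoidal_location_encoding`, its parameters, and the choice of geometrically spaced wavelengths are all assumptions.

```python
import numpy as np

def sinusoidal_location_encoding(xy, num_scales=4, min_wavelength=1.0, max_wavelength=1000.0):
    """Encode 2D points as sin/cos features at several spatial scales.

    Hypothetical helper for illustration; not the formulation used in the paper.
    xy: array-like of shape (n, 2). Returns an array of shape (n, 4 * num_scales).
    """
    xy = np.asarray(xy, dtype=float)
    # Wavelengths spaced geometrically between the smallest and largest scale.
    wavelengths = np.geomspace(min_wavelength, max_wavelength, num_scales)
    # Broadcast each coordinate against every wavelength: shape (n, 2, num_scales).
    scaled = xy[:, :, None] * (2.0 * np.pi / wavelengths)
    # Sine and cosine features per coordinate: shape (n, 2, 2 * num_scales).
    feats = np.concatenate([np.sin(scaled), np.cos(scaled)], axis=-1)
    return feats.reshape(xy.shape[0], -1)

points = [[0.0, 0.0], [12.5, -3.0], [500.0, 250.0]]
embedding = sinusoidal_location_encoding(points, num_scales=4)
print(embedding.shape)
```

Each point becomes a fixed-length vector with values in [-1, 1], which is the kind of bounded input a downstream neural network can consume more easily than raw coordinates.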


Journal Article
TL;DR: A novel approach is presented for estimating the proportional distributions of function types in an urban area by learning semantics-preserved embeddings of points-of-interest (POIs), using spatially explicit random walks to learn spatial co-occurrence patterns and a manifold learning algorithm to capture categorical semantics.
Abstract: We present a novel approach for estimating the proportional distributions of function types (i.e. functional distributions) in an urban area through learning semantics preserved embeddings of points-of-interest (POIs). Specifically, we represent POIs as low-dimensional vectors to capture (1) the spatial co-occurrence patterns of POIs and (2) the semantics conveyed by the POI hierarchical categories (i.e. categorical semantics). The proposed approach utilizes spatially explicit random walks in a POI network to learn spatial co-occurrence patterns, and a manifold learning algorithm to capture categorical semantics. The learned POI vector embeddings are then aggregated to generate regional embeddings with long short-term memory (LSTM) and attention mechanisms, to take account of the different levels of importance among the POIs in a region. Finally, a multilayer perceptron (MLP) maps regional embeddings to functional distributions. A case study in Xiamen Island, China implements and evaluates the proposed approach. The results indicate that our approach outperforms several competitive baseline models in all evaluation measures, and yields a relatively high consistency between the estimation and ground truth. In addition, a comprehensive error analysis unveils several intrinsic limitations of POI data for this task, e.g. ambiguous linkage between POIs and functions.

15 citations


Journal Article
TL;DR: This review summarizes eight common tasks and their solutions in social media content analysis for natural disasters, and groups and analyzes studies that make further use of the extracted information, either standalone or in combination with other sources.
Abstract: The idea of ‘citizens as sensors’ has gradually become a reality over the past decade. Today, Volunteered Geographic Information (VGI) from citizens is highly involved in acquiring information on natural disasters. In particular, the rapid development of deep learning techniques in computer vision and natural language processing in recent years has allowed more information related to natural disasters to be extracted from social media, such as the severity of building damage and flood water levels. Meanwhile, many recent studies have integrated information extracted from social media with that from other sources, such as remote sensing and sensor networks, to provide comprehensive and detailed information on natural disasters. Therefore, it is of great significance to review the existing work, given the rapid development of this field. In this review, we summarized eight common tasks and their solutions in social media content analysis for natural disasters. We also grouped and analyzed studies that make further use of this extracted information, either standalone or in combination with other sources. Based on the review, we identified and discussed challenges and opportunities.

12 citations


Journal Article
TL;DR: This work sets the foundation of the new research line on 3D urban morphology by providing a comprehensive set of 3D metrics, implementing them in openly released software, generating an open dataset containing 2D and 3D metrics for 823,000 buildings in the Netherlands, and demonstrating a use case where clusters and architectural patterns are analysed through time.
Abstract: Urban morphology is important in a broad range of investigations across the fields of city planning, transportation, climate, energy, and urban data science. Characterising buildings with a set of numerical metrics is fundamental to studying the urban form. Despite the rapid developments in 3D geoinformation science, and the growing 3D data availability, most studies simplify buildings to their 2D footprint, and when taking their height into account, they at most assume one height value per building, i.e. simple 3D. We take the first step in elevating building metrics into full/true 3D, uncovering the use of higher levels of detail, and taking into account the detailed shape of a building. We set the foundation of the new research line on 3D urban morphology by providing a comprehensive set of 3D metrics, implementing them in openly released software, generating an open dataset containing 2D and 3D metrics for 823,000 buildings in the Netherlands, and demonstrating a use case where clusters and architectural patterns are analysed through time. Our experiments suggest the added value of 3D metrics to complement existing counterparts, reducing ambiguity, and providing advanced insights. Furthermore, we provide a comparative analysis using different levels of detail of 3D building models.

11 citations


Journal Article
TL;DR: In this article, the authors propose a training data model for AI in Earth Observation (EO) to allow documentation, storage, and sharing of geospatial training data in a distributed infrastructure.
Abstract: Artificial Intelligence/Machine Learning (AI/ML), in particular Deep Learning (DL), is reorienting and transforming Earth Observation (EO). A consistent data model for delivery of training data will support the FAIR data principles (findable, accessible, interoperable, reusable) and enable Web-based use of training data in a spatial data infrastructure (SDI). Existing training datasets, including open source benchmark datasets, are usually packaged into public or personal repositories and lack discoverability and accessibility. Moreover, there is no unified method to describe the training data. Here we propose a training data model for AI in EO to allow documentation, storage, and sharing of geospatial training data in a distributed infrastructure. We present design rationales, information models, and an encoding method. Several scenarios illustrate the intended uses and benefits for EO DL applications in an open Web environment. The relationship with Open Geospatial Consortium (OGC) standards is also discussed, as is the impact on an AI-ready SDI.

10 citations


Journal Article
TL;DR: In this paper, the authors propose type-based and regression-based approaches (and their subtypes) for assessing OpenStreetMap (OSM) building completeness with population data, design measures and methods to evaluate them, and conclude that using population data as reference building data is an effective method for assessing OSM building completeness.
Abstract: OpenStreetMap (OSM) is currently an important source for building data, despite the existence of potential quality issues. Previous studies have assessed OSM data quality by comparing it with reference building data, which may not otherwise be readily available. This study assessed OSM building completeness using population data, and investigated the effectiveness of using population data for building reference data. We proposed various approaches, including type-based and regression-based approaches and their subtypes, and designed measures and methods to evaluate these approaches. Our evaluation examined four study areas in two countries, using global population data sets at three spatial resolutions (1-km, 100-m, and 30-m). Results showed that the type-based approach correctly classified approximately 80–99% of the assessed grid cells. The regression-based approach resulted in a high linear correlation (0.7 or greater) between the population counts and the reference building count/building area size, with the strongest correlation present for the 1-km population dataset. We conclude that the use of population data as reference building data is an effective method for the assessment of OSM building completeness. The paper concludes with the advantages and limitations of using both the type-based and the regression-based approaches.

9 citations
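
The regression-based idea in the abstract above, relating population counts per grid cell to building counts, can be sketched roughly as follows. This is an assumption-laden illustration rather than the authors' procedure: the function names, the simple linear fit, and the completeness ratio (OSM count divided by the count expected from population) are all hypothetical.

```python
import numpy as np

def fit_population_model(population, reference_buildings):
    """Correlate and linearly regress reference building counts on population.

    Hypothetical helper; returns (pearson_r, slope, intercept).
    """
    pop = np.asarray(population, dtype=float)
    ref = np.asarray(reference_buildings, dtype=float)
    r = np.corrcoef(pop, ref)[0, 1]             # linear (Pearson) correlation
    slope, intercept = np.polyfit(pop, ref, 1)  # degree-1 least-squares fit
    return r, slope, intercept

def estimated_completeness(population, osm_buildings, slope, intercept):
    """Ratio of mapped OSM buildings to the count expected from population.

    Assumed measure for illustration only; cells with a non-positive
    expected count are returned as NaN.
    """
    pop = np.asarray(population, dtype=float)
    osm = np.asarray(osm_buildings, dtype=float)
    expected = slope * pop + intercept
    ratio = np.full(expected.shape, np.nan)
    valid = expected > 0
    ratio[valid] = osm[valid] / expected[valid]
    return np.clip(ratio, 0.0, 1.0)

# Synthetic grid cells: one building per four residents in the reference data.
pop = [100, 200, 400, 800]
ref = [25, 50, 100, 200]
r, slope, intercept = fit_population_model(pop, ref)
print(estimated_completeness(pop, [12.5, 25, 50, 100], slope, intercept))
```

In practice the fit would be calibrated in areas where reference data exist and then applied where they do not, which is the situation the abstract describes.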


Journal Article
TL;DR: In this paper, a hierarchical data-mining model is proposed to identify building function types using accessible auxiliary data; it is applied to a case study in which residential building property is assessed to address missing residential POIs.
Abstract: Building function type is an important parameter for urban planning and disaster management. However, existing identification methods do not always correctly recognize all building functions because of missing point of interest (POI) data in private areas. In this study, we proposed a hierarchical data-mining model to identify building function types using accessible auxiliary data, which was then applied to a case study. Residential building property was assessed to address missing residential POIs. The building functions were assigned to one of five different types, or a mixed-function type. Standard deviation and mean values extracted from remotely sensed images, distances to major roads, and building shape parameters were used to infer the function types of buildings without assigned function types. The proposed model was able to identify 65% of buildings not previously assigned as residential through the POI, with an overall accuracy of 87%. In addition, all buildings were successfully assigned a function type of residential, commercial, office, warehouse, public service, or mixed-function, with an overall accuracy of 85% for unclassified buildings. Our results demonstrated that missing POI data in private areas could be addressed by integration with multisource data using a simple method.

Journal Article
TL;DR: Wang et al. propose a graph neural network constrained by environmental consistency (GNN-EC), which represents the study area as a graph of nodes and edges and aggregates node information in the graph for landslide susceptibility evaluation (LSE).
Abstract: In complex and heterogeneous geoenvironments, landslides exhibit varying features in different environments, and data in landslide inventories are imbalanced. Existing data-driven landslide susceptibility evaluation (LSE) methods overlook environmental heterogeneity and cannot reliably predict regions with few samples. Alternatively, global random negative sampling strategies may produce imbalanced positive and negative samples in some environments, contributing to inaccurate predictions. This article proposes a graph neural network (GNN) constrained by environmental consistency (GNN-EC) to overcome these problems. The GNN-EC consists of graphs with nodes and edges. A graph represents the environmental relationships in the study area. Nodes are geographic units delineated from terrain polygon approximation. Edges capture the relationships between node-pairs. Additionally, the weights of edges reflect the similarity between two node environments. A GNN aggregates node information in the graph for LSE. Our experiment showed that the proposed method outperformed the common machine learning methods, increasing prediction accuracy by approximately 7%, 5–6% and 3–4% compared to the artificial neural network (ANN), the support vector machine (SVM) and the random forest (RF), respectively. Moreover, our method can maintain high prediction accuracy, even with a small training set.

Journal Article
TL;DR: The 3D City Index is a comprehensive four-category framework for evaluating 3D city models and deriving quantitative and qualitative insights; it can also be applied to other instances in spatial data infrastructures.
Abstract: 3D city models are omnipresent in urban management and simulations. However, instruments for their evaluation have been limited. Furthermore, current instances are scattered worldwide and developed independently, hampering their comparison and the understanding of practices. While there are developed assessment frameworks in open data, such efforts are generic and not applied to geospatial data. We establish a holistic and comprehensive four-category framework ‘3D City Index’, encompassing 47 criteria to identify key properties of 3D city models, enabling their assessment and benchmarking, and suggesting usability. We evaluate 40 authoritative 3D city models and derive quantitative and qualitative insights. The framework implementation enables a comprehensive and structured understanding of the landscape of semantic 3D geospatial data, and doubles as an evaluated collection of open 3D city models. For example, datasets differ substantially in their characteristics, having heterogeneous properties influenced by their different purposes. There are further applications of this first endeavour to standardise the characterisation of 3D data: monitoring developments and trends in 3D city modelling, and enabling researchers and practitioners to find the most appropriate datasets for their needs. The work is designed to measure datasets continuously and can also be applied to other instances in spatial data infrastructures.

Journal Article
TL;DR: In this article, the authors review the use of computer vision models and artificial neural networks in geographical analysis, focusing on the representation and comparison of spatial patterns, and provide semantic linking across domains using similar model constructs through the lens of scale.
Abstract: Comparison of landscapes and patterns is a long-standing challenge in spatial analysis research. Recently, new models and tools developed for non-geographic image data are being used to study geographic problems involving classification or prediction. Specifically, computer vision models and artificial neural networks have been deployed in an ever-growing number of geographical analyses. In this paper, we review the use of these models in geographical analysis, focusing on the representation and comparison of spatial patterns. We review artificial neural networks and provide semantic linking across domains using similar model constructs through the lens of scale. We note that scale, a contextual element in geographical research, is typically considered a model parameter in computer vision. Scale impacts both computer vision techniques and traditional pixel-based or object-oriented analysis, yet computer vision methods such as CNNs are relatively robust to small-scale variations due to their capability to learn multiscale features via spatial filtering and the formation of scale-space tensors across layers. Parameterization of computer vision models to represent multiscale patterns however remains ad hoc. A typology of scales, therefore, provides a framework for mapping model constructs to develop guidelines for parameterizing and evaluating computer vision models in a geographic context.

Journal Article
TL;DR: In this article, the authors propose a subsymbolic approach that uses neural networks to facilitate qualitative spatial reasoning, and explore its use for reasoning over ternary projective relations such as ‘between’.
Abstract: Qualitative spatial reasoning has been a core research topic in GIScience and AI for decades. It has been adopted in a wide range of applications such as wayfinding, question answering, and robotics. Most developed spatial inference engines use symbolic representation and reasoning, which focuses on small and densely connected data sets, and struggles to deal with noise and vagueness. However, with more sensors becoming available, reasoning over spatial relations on large-scale and noisy geospatial data sets requires more robust alternatives. This paper, therefore, proposes a subsymbolic approach using neural networks to facilitate qualitative spatial reasoning. More specifically, we focus on higher-order spatial relations as those have been largely ignored due to the binary nature of the underlying representations, e.g. knowledge graphs. We specifically explore the use of neural networks to reason over ternary projective relations such as ‘between’. We consider multiple types of spatial constraints, including higher-order relatedness and the conceptual neighborhood of ternary projective relations, to make the proposed model spatially explicit. We present evaluation results demonstrating that the proposed spatially explicit method substantially outperforms the existing baseline by about 20%.

Journal Article
TL;DR: In this article, the authors compared commonly used spatial and spatio-temporal methods for determining social media users' country of residence, using Instagram posts from visitors to the Kruger National Park in South Africa.
Abstract: Identifying users’ place of residence is an important step in many social media analysis workflows. Various techniques for detecting home locations from social media data have been proposed, but their reliability has rarely been validated using ground truth data. In this article, we compared commonly used spatial and spatio-temporal methods to determine social media users’ country of residence. We applied diverse methods to a global data set of publicly shared geo-located Instagram posts from visitors to the Kruger National Park in South Africa. We evaluated the performance of each method using both individual-level expert assessment for a sample of users and aggregate-level official visitor statistics. Based on the individual-level assessment, a simple spatio-temporal approach performed best for detecting the country of residence. Results show why aggregate-level official statistics are not the best indicators for evaluating method performance. We also show how social media usage, such as the number of countries visited and posting activity over time, affect the performance of methods. In addition to a methodological contribution, this work contributes to the discussion about spatial and temporal biases in mobile big data.

Journal Article
TL;DR: In this article, a geographically and temporally weighted co-location quotient, which includes global and local computation, a method to calculate a spatiotemporal weight matrix, and a significance test using Monte Carlo simulation, is used to identify spatio-temporal crime patterns across Greater Manchester.
Abstract: Incident data, a form of big data frequently used in urban studies, are characterized by point features with high spatial and temporal resolution and categorical values. In contrast to panel data, such spatial data pooled over time reflect multi-directional spatial effects but only unidirectional temporal effects, which are challenging to analyze. This paper presents an innovative approach to address this challenge – a geographically and temporally weighted co-location quotient which includes global and local computation, a method to calculate a spatiotemporal weight matrix and a significance test using Monte Carlo simulation. This new approach is used to identify spatio-temporal crime patterns across Greater Manchester in 2016 from open source recorded crime data. The results show that this approach is suitable for the analysis and visualization of spatio-temporal dependence and heterogeneity in categorical spatial data pooled over time. It is particularly useful for detecting symmetrical spatio-temporal co-location patterns and mapping local clusters. The method also addresses the unbalanced temporal scale problem caused by unidirectional temporal data representation and explores potential impacts. The empirical evidence of the spatiotemporal crime patterns might usefully be deployed to inform the development of criminological theory by helping to disentangle the relationships between crime and the urban environment.
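
For readers unfamiliar with the co-location quotient, the sketch below computes the plain global, nearest-neighbour version for two different categories. It is not the geographically and temporally weighted variant proposed in the paper (no spatiotemporal weight matrix, no Monte Carlo test); the function name `global_clq` and the toy data are assumptions, and ties between equally near neighbours are ignored.

```python
import numpy as np

def global_clq(coords, labels, a, b):
    """Global co-location quotient of category a towards category b (a != b).

    CLQ = (N_ab / N_a) / (N_b / (N - 1)), where N_ab counts the a-points whose
    nearest neighbour is a b-point. Values above 1 suggest that a-points are
    attracted to b-points. Hypothetical helper for illustration only.
    """
    if a == b:
        raise ValueError("this sketch only handles two different categories")
    coords = np.asarray(coords, dtype=float)
    labels = np.asarray(labels)
    n = len(labels)
    # Pairwise Euclidean distances; a point is never its own neighbour.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)
    nearest = dist.argmin(axis=1)
    is_a = labels == a
    n_a = is_a.sum()
    n_b = (labels == b).sum()
    n_ab = (labels[nearest[is_a]] == b).sum()
    return (n_ab / n_a) / (n_b / (n - 1))

# Two tight A-B pairs far apart: every A has a B as its nearest neighbour.
coords = [[0.0, 0.0], [0.1, 0.0], [10.0, 0.0], [10.1, 0.0]]
labels = ["A", "B", "A", "B"]
print(global_clq(coords, labels, "A", "B"))
```

A weighted variant would replace the hard nearest-neighbour count with kernel weights over space and time, which is where the spatiotemporal weight matrix mentioned in the abstract comes in.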

Journal Article
TL;DR: In this article, a Generalized Heterogeneity Model (GHM) is proposed for characterizing local and stratified heterogeneity within variables and improving interpolation accuracy; it is demonstrated by predicting the spatial distributions of marine chlorophyll in Townsville, Queensland, Australia.
Abstract: Spatial heterogeneity refers to uneven distributions of geographical variables. Spatial interpolation methods that utilize spatial heterogeneity are sensitive to the way in which spatial heterogeneity is characterized. This study developed a Generalized Heterogeneity Model (GHM) for characterizing local and stratified heterogeneity within variables and improving interpolation accuracy. GHM first divides a study area into multiple spatial strata according to the sample values and locations of a variable. Then, GHM estimates simultaneously the spatial variations of the variable within and between the spatial strata. Finally, GHM interpolates unbiased estimates and uncertainty at unsampled locations. We demonstrated the GHM by predicting the spatial distributions of marine chlorophyll in Townsville, Queensland, Australia. Results show that GHM improved both the overall interpolation accuracy across the study area and along strata boundaries compared with previous interpolation models. GHM also avoided bull’s eye patterns and abrupt changes along strata boundaries. In future studies, GHM has the potential to be integrated with machine learning and advanced algorithms to improve spatial prediction accuracy for studies in broader fields.


Journal Article
TL;DR: A graph-based deep learning approach for interchange detection is proposed: shape measures and contextual properties of individual road segments serve as node features in a graph, a graph convolutional network classifies the segments, and an adaptive clustering approach groups the detected interchange segments into interchanges.
Abstract: Detecting interchanges in road networks benefits many applications, such as vehicle navigation and map generalization. Traditional approaches use manually defined rules based on geometric properties, topological properties, or both, and thus can present challenges for structurally complex interchanges. To overcome this drawback, we propose a graph-based deep learning approach for interchange detection. First, we model the road network as a graph in which the nodes represent road segments, and the edges represent their connections. The proposed approach computes the shape measures and contextual properties of individual road segments for features characterizing the associated nodes in the graph. Next, a semi-supervised approach uses these features and limited labeled interchanges to train a graph convolutional network that classifies these road segments into interchange and non-interchange segments. Finally, an adaptive clustering approach groups the detected interchange segments into interchanges. Our experiment with the road networks of Beijing and Wuhan achieved a classification accuracy >95% at a label rate of 10%. Moreover, the interchange detection precision and recall were 79.6% and 75.7% on the Beijing dataset and 80.6% and 74.8% on the Wuhan dataset, respectively, which were 18.3–36.1% and 17.4–19.4% higher than those of the existing approaches based on characteristic node clustering.

Journal Article
TL;DR: In this paper, the authors examine bandwidth at a conceptual, operational and empirical level within the framework of geographically weighted regression, one of the more frequently employed local spatial models, and outline how bandwidth relates to three characteristics of spatial processes: variation, dependence and strength.
Abstract: Models designed to capture spatially varying processes are now employed extensively in the social and environmental sciences. The main strength of such models is their ability to represent relationships that vary across locations through locally varying parameter estimates. However, local models of spatial processes also provide information on the nature of these spatially varying relationships through the estimation of a ‘bandwidth’ parameter. This paper examines bandwidth at a conceptual, operational and empirical level within the framework of geographically weighted regression, one of the more frequently employed local spatial models. We outline how bandwidth relates to three characteristics of spatial processes: variation, dependence and strength.
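
As a rough illustration of what the bandwidth parameter does operationally, the sketch below computes Gaussian kernel weights and a single local weighted least-squares fit, the basic step of geographically weighted regression. The function names, the fixed Gaussian kernel, and the toy data are assumptions made for illustration; the paper itself does not prescribe this code.

```python
import numpy as np

def gaussian_weights(distances, bandwidth):
    """Gaussian kernel: weight decays with distance, more slowly for larger bandwidths."""
    d = np.asarray(distances, dtype=float)
    return np.exp(-0.5 * (d / bandwidth) ** 2)

def local_coefficients(X, y, coords, target, bandwidth):
    """Weighted least-squares fit centred on one target location.

    Hypothetical helper; returns [intercept, slope_1, ...] for that location.
    """
    coords = np.asarray(coords, dtype=float)
    y = np.asarray(y, dtype=float)
    d = np.linalg.norm(coords - np.asarray(target, dtype=float), axis=1)
    w = gaussian_weights(d, bandwidth)
    design = np.column_stack([np.ones(len(y)), X])
    # Scaling rows by sqrt(w) turns weighted least squares into ordinary least squares.
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(design * sw[:, None], y * sw, rcond=None)
    return beta

coords = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0], [4.0, 0.0]])
x = coords[:, 0]
y = 2.0 + 3.0 * x
print(local_coefficients(x, y, coords, target=[2.0, 0.0], bandwidth=1.0))
```

A small bandwidth lets only nearby observations influence each local fit, while a large bandwidth makes the local fits approach a single global regression.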

Journal ArticleDOI
TL;DR: In this paper, the authors use the case of mountain road generalisation to explore the potential of a specific deep learning approach, generative adversarial networks (GAN), to generate images that depict road maps generalised at the 1:250k scale from images depicting road maps of the same area using un-generalised 1:25k data.
Abstract: Map generalisation is a process that transforms geographic information for a cartographic representation at a specific scale. The goal is to produce legible and informative maps even at small scales from a detailed dataset. The potential of deep learning to help in this task is still unknown. This article examines the use case of mountain road generalisation, to explore the potential of a specific deep learning approach: generative adversarial networks (GAN). Our goal is to generate images that depict road maps generalised at the 1:250k scale, from images that depict road maps of the same area using un-generalised 1:25k data. This paper not only shows the potential of deep learning to generate generalised mountain roads, but also analyses how the process of deep learning generalisation works, compares supervised and unsupervised learning and explores possible improvements. With this experiment we have exhibited an unsupervised model that is able to generate generalised maps evaluated as being as good as the reference, and reviewed some possible improvements for deep learning-based generalisation, including training set management and the definition of a new road connectivity loss. All our results are evaluated visually using a four-question process and validated by a user test conducted on 113 individuals.

Journal ArticleDOI
TL;DR: The authors extend a bivariate-point clustering method into a density-based clustering method for bivariate flows that identifies bivariate-flow clusters with different flow density combinations.
Abstract: Geographical flows reflect the movements, spatial interactions or connections among locations and are generally abstracted as origin-destination (OD) flows. In this context, clustering is a spatial pattern describing a group of flows with adjacent O and D points. For data composed of two types of flows (bivariate-flow data), a bivariate-flow cluster is a cluster comprising two types of flows, at least one of which exhibits a clustering pattern. In a bivariate-flow cluster, varying flow density combinations imply different meanings. For instance, a cluster with high-density travel flows on both weekdays (type A) and weekends (type B) may be associated with entertainment, whereas high-density flows on weekdays and sparse flows on weekends may reveal work-related travel. However, identifying bivariate-flow clusters with different flow density combinations is still an unsolved problem. To this end, we extend a bivariate-point clustering method and propose a density-based clustering method for bivariate flows. The simulation experiments verify model robustness. In a case study, we apply this method to extract clusters of bivariate-flow data comprising Beijing taxi OD flows of different periods, and identify clusters of work-related, entertainment, tourism, or egress and return travels. These results demonstrate the capability of our method in detecting bivariate-flow clusters.

Journal ArticleDOI
TL;DR: In this paper , a hierarchical compressed linear reference (CLR) model is proposed to transform network-constrained time geographic entities from 3D (x, y, t) space into two-dimensional (2D) space.
Abstract: Abstract The availability of Spatiotemporal Big Data has provided a golden opportunity for time geographical studies that have long been constrained by the lack of individual-level data. However, how to store, manage, and query a huge number of time geographic entities effectively and efficiently with complex spatiotemporal characteristics and relationships poses a significant challenge to contemporary GIS platforms. In this article, a hierarchical compressed linear reference (CLR) model is proposed to transform network-constrained time geographic entities from three-dimensional (3D) (x, y, t) space into two-dimensional (2D) space. Accordingly, time geographic entities can be represented as 2D spatial entities and stored in a classical spatial database. The proposed CLR model supports a hierarchical linear reference system (LRS) including not only underlying a link-based LRS but also multiple higher-level route-based LRSs. In addition, an LRS-based spatiotemporal index structure is developed to index both time geographic entities and the corresponding hierarchical network. The results of computational experiments on large datasets of space–time paths and prisms show that the proposed hierarchical CLR model is effective at storing and managing time geographic entities in road networks. The developed index structure achieves satisfactory query performance in milliseconds on large datasets of time geographic entities.

Journal ArticleDOI
TL;DR: In this article, the usability of an adaptive indoor route guidance system is tested by tracking the wayfinding and gaze behavior of the users, and the difference in wayfinding and gaze behavior between all route instruction types is compared.
Abstract: Every route instruction type (e.g. map, symbol, photo) induces a specific cognitive load. However, when these types are used at different decision points in a building, the building configuration of these points also influences the induced cognitive load. Therefore, the process of route guidance results in an interaction between the instruction type and the decision point, which determines the induced cognitive load. One way of reducing cognitive load during route guidance is by using adaptive systems that show specific route instruction types at specific decision points. Therefore, in this VR experiment, the usability of such an adaptive indoor route guidance system is tested by tracking the wayfinding and gaze behavior of the users. First, the difference in wayfinding and gaze behavior between all route instruction types is compared. Next, the building configuration at the decision points is quantified through the architectural theory of space syntax, and the correlation with the wayfinding and gaze behavior is determined. Our findings indicate that adapting the route instruction type does make a difference for the user.

Journal ArticleDOI
TL;DR: The authors propose a dynamic temporal graph neural network considering missing values (D-TGNM) to mine traffic flow patterns in missing data scenarios for traffic flow prediction.
Abstract: Accurate traffic flow prediction on the urban road network is an indispensable function of Intelligent Transportation Systems (ITS), which is of great significance for urban traffic planning. However, the current traffic flow prediction methods still face many challenges, such as missing values and dynamic spatial relationships in traffic flow. In this study, a dynamic temporal graph neural network considering missing values (D-TGNM) is proposed for traffic flow prediction. First, inspired by the Bidirectional Encoder Representations from Transformers (BERT), we extend the classic BERT model into a model called Traffic BERT to learn the dynamic spatial associations on the road structure. Second, we propose a temporal graph neural network considering missing values (TGNM) to mine traffic flow patterns in missing data scenarios for traffic flow prediction. Finally, the proposed D-TGNM model is obtained by integrating the dynamic spatial associations learned by Traffic BERT into the TGNM model. To train the D-TGNM model, we design a novel loss function that considers both the missing values problem and the prediction problem in traffic flow. The proposed model was validated on an actual traffic dataset collected in Wuhan, China. Experimental results showed that D-TGNM achieved good prediction results under four missing data scenarios (15% random missing, 15% block missing, 30% random missing, and 30% block missing), and outperformed ten existing state-of-the-art baselines.

Journal ArticleDOI
TL;DR: The proposed infrastructure, GeoCube, extends the capacity of data cubes to multi-source big vector and raster data, improves EO data cube management, and keeps connections with the business intelligence cube, which provides supplementary information for EO data cube processing.
Abstract: Data management and analysis are challenging with big Earth observation (EO) data. Expanding upon the rising promises of data cubes for analysis-ready big EO data, we propose a new geospatial infrastructure layered over a data cube to facilitate big EO data management and analysis. Compared to previous work on data cubes, the proposed infrastructure, GeoCube, extends the capacity of data cubes to multi-source big vector and raster data. GeoCube is developed through three major efforts: formalizing cube dimensions for multi-source geospatial data, processing geospatial data queries along these dimensions, and organizing cube data for high-performance geoprocessing. This strategy improves EO data cube management and keeps connections with the business intelligence cube, which provides supplementary information for EO data cube processing. The paper highlights the major efforts and key research contributions to online analytical processing for dimension formalization, distributed cube objects for tiles, and artificial intelligence enabled prediction of computational intensity for data cube processing. Case studies with data from Landsat, Gaofen, and OpenStreetMap demonstrate the capabilities and applicability of the proposed infrastructure.

Journal ArticleDOI
TL;DR: In this article, a machine learning approach is proposed for mining spatial relations in Chinese geological texts, and the extracted spatial relations are classified into three major categories (topological relations, absolute directional relations and relative directional relations) and 14 subcategories.
Abstract: Texts have become an important spatial data resource. Interpretation of unstructured geoscience texts using natural language processing methods can effectively facilitate the discovery and retrieval of geographic information. Yet studies on the extraction of spatial information from textual geoscience data are limited compared to digital geoscience data. In this work, a machine learning approach is proposed for mining spatial relations in Chinese geological texts. The approach views spatial relation extraction as a sequence labeling problem, avoids the division of relation categories, and enables mining fine-grained spatial relations. The extracted geological texts commonly describe three-dimensional spatial relations among regions, strata, and lithologies. The extracted spatial relations are classified into three major categories (topological relations, absolute directional relations and relative directional relations) and 14 subcategories. We validated the proposed model with a test dataset, constructed visual displays of the extracted spatial relations on different topics, and quantified the uncertainty in the process from spatial entity recognition to spatial relation extraction. With the detailed portrayal of these spatial relations, this study provides support for solving theoretical and practical problems of cognition, prediction, decision-making, and evaluation in geoscience.

Journal ArticleDOI
TL;DR: This paper examined how different professional groups (agricultural scientists or health and nutrition experts) interpret information, presented this way, when making a decision about interventions to address human selenium deficiency.
Abstract: Spatial information, inferred from samples, is needed for decision-making, but is uncertain. One way to convey uncertain information is with probabilities (e.g. that a value falls below a critical threshold). We examined how different professional groups (agricultural scientists or health and nutrition experts) interpret information, presented this way, when making a decision about interventions to address human selenium (Se) deficiency. The information provided was a map, either of the probability that Se concentration in local staple grain falls below a nutritionally-significant threshold (negative framing) or of the probability that grain Se concentration is above the threshold (positive framing). There was evidence for an effect of professional group and of framing on the decision process. Negative framing led to more conservative decisions; intervention was recommended at a smaller probability that the grain Se is inadequate than if the question were framed positively, and the decisions were more comparable between professional groups under negative framing. Our results show the importance of framing in probabilistic presentations of uncertainty, and of the background of the interpreter. Our experimental approach could be used to elicit threshold probabilities which represent the preferences of stakeholder communities to support them in the interpretation of uncertain information.

Journal ArticleDOI
Qiang Liu, Jie Yang, Min Deng, Wenkai Liu, Rui Xu 
TL;DR: The authors propose a novel bivariate flow clustering method (BiFlowAMOEBA) by improving a multidirectional optimum ecotope-based algorithm (AMOEBA), which embeds a local Getis-Ord statistic in an iterative procedure to detect irregular-shaped clusters.
Abstract: A bivariate flow cluster is a group of two types of spatial flows, where both types of flows have high (or low) values, or one type of flow has a high value while the other has a low value. Identifying bivariate flow clusters aids in understanding the complex interactions between different flow patterns. Detecting bivariate flow clusters remains challenging because statistics for quantitatively assessing bivariate flow clusters are lacking and the shapes and sizes of clusters vary. This study proposes a novel bivariate flow clustering method (BiFlowAMOEBA) by improving a multidirectional optimum ecotope-based algorithm (AMOEBA), which embeds a local Getis-Ord statistic in an iterative procedure to detect irregular-shaped clusters. We define a bivariate local Getis-Ord statistic for quantitatively assessing bivariate flow clusters, use a hierarchical clustering strategy to construct clusters, and evaluate the statistical significance of clusters using a Monte Carlo simulation. Experimental results on simulated datasets show that BiFlowAMOEBA can identify bivariate flow clusters of different shapes more accurately and completely than two state-of-the-art methods. Two case studies show that BiFlowAMOEBA not only helps unveil the interactions between public transport and taxi services but also identifies competition patterns between taxis and ride-hailing services.