
Showing papers on "Spatial analysis published in 2016"




Journal ArticleDOI
TL;DR: The multiscale convolutional neural network has two merits: high-level spatial features can be effectively learned using the hierarchical learning structure, and the multiscale learning scheme can capture contextual information at different scales.
Abstract: It is widely agreed that spatial features can be combined with spectral properties to improve interpretation performance on very-high-resolution (VHR) images in urban areas. However, many existing methods for extracting spatial features can only generate low-level features and consider limited scales, leading to unsatisfactory classification results. In this study, a multiscale convolutional neural network (MCNN) algorithm is presented to learn spatially related deep features for hyperspectral remote sensing imagery classification. Unlike traditional methods for extracting spatial features, the MCNN first transforms the original data sets into a pyramid structure containing spatial information at multiple scales, and then automatically extracts high-level spatial features using multiscale training data sets. Specifically, the MCNN has two merits: (1) high-level spatial features can be effectively learned by using the hierarchical learning structure and (2) the multiscale learning scheme can capture contextual information at different scales. To evaluate the effectiveness of the proposed approach, the MCNN was applied to classify well-known hyperspectral data sets and compared with traditional methods. The experimental results showed a significant increase in classification accuracy, especially for urban areas.
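The pyramid construction described in the abstract can be sketched as follows (a hypothetical illustration using simple 2×2 average pooling, not the authors' implementation):

```python
import numpy as np

def build_pyramid(image, n_levels=3):
    """Build a multiscale pyramid by repeated 2x2 average pooling.

    image: 2D numpy array; each level halves the resolution.
    Returns a list [original, half-res, quarter-res, ...].
    """
    levels = [image]
    for _ in range(n_levels - 1):
        h, w = levels[-1].shape
        coarse = levels[-1][: h - h % 2, : w - w % 2]  # trim to even size
        coarse = coarse.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
        levels.append(coarse)
    return levels

pyramid = build_pyramid(np.arange(64, dtype=float).reshape(8, 8))
print([p.shape for p in pyramid])  # [(8, 8), (4, 4), (2, 2)]
```

Each level of such a pyramid would then feed a convolutional branch, so that features are learned from context windows of increasing physical extent.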

323 citations


Journal ArticleDOI
TL;DR: It is demonstrated that cloud cover varies strongly in its geographic heterogeneity and that the direct, observation-based nature of cloud-derived metrics can improve predictions of habitat, ecosystem, and species distributions with reduced spatial autocorrelation compared to commonly used interpolated climate data.
Abstract: Cloud cover can influence numerous important ecological processes, including reproduction, growth, survival, and behavior, yet our assessment of its importance at the appropriate spatial scales has remained remarkably limited. If captured over a large extent yet at sufficiently fine spatial grain, cloud cover dynamics may provide key information for delineating a variety of habitat types and predicting species distributions. Here, we develop new near-global, fine-grain (≈1 km) monthly cloud frequencies from 15 y of twice-daily Moderate Resolution Imaging Spectroradiometer (MODIS) satellite images that expose spatiotemporal cloud cover dynamics of previously undocumented global complexity. We demonstrate that cloud cover varies strongly in its geographic heterogeneity and that the direct, observation-based nature of cloud-derived metrics can improve predictions of habitat, ecosystem, and species distributions with reduced spatial autocorrelation compared to commonly used interpolated climate data. These findings support the fundamental role of remote sensing as an effective lens through which to understand and globally monitor the fine-grain spatial variability of key biodiversity and ecosystem properties.

269 citations


Proceedings ArticleDOI
14 Jun 2016
TL;DR: Simba is a scalable and efficient in-memory spatial query processing and analytics system for big spatial data that extends the Spark SQL engine to support rich spatial queries and analytics through both SQL and the DataFrame API.
Abstract: Large spatial data has become ubiquitous. As a result, it is critical to provide fast, scalable, and high-throughput spatial queries and analytics for numerous applications in location-based services (LBS). Traditional spatial databases and spatial analytics systems are disk-based and optimized for IO efficiency. But increasingly, data are stored and processed in memory to achieve low latency, and CPU time becomes the new bottleneck. We present the Simba (Spatial In-Memory Big data Analytics) system that offers scalable and efficient in-memory spatial query processing and analytics for big spatial data. Simba is based on Spark and runs over a cluster of commodity machines. In particular, Simba extends the Spark SQL engine to support rich spatial queries and analytics through both SQL and the DataFrame API. It introduces indexes over RDDs in order to work with big spatial data and complex spatial operations. Lastly, Simba implements an effective query optimizer, which leverages its indexes and novel spatial-aware optimizations, to achieve both low latency and high throughput. Extensive experiments over large data sets demonstrate Simba's superior performance compared against other spatial analytics systems.
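The kind of indexed range query Simba accelerates can be illustrated with a toy uniform grid index (a self-contained sketch; Simba's actual RDD-level indexes and optimizer are not reproduced here):

```python
from collections import defaultdict

class GridIndex:
    """Toy uniform grid index for 2D points (illustrative only)."""
    def __init__(self, cell_size):
        self.cell = cell_size
        self.buckets = defaultdict(list)

    def _key(self, x, y):
        return (int(x // self.cell), int(y // self.cell))

    def insert(self, x, y):
        self.buckets[self._key(x, y)].append((x, y))

    def range_query(self, xmin, ymin, xmax, ymax):
        """Return points inside the axis-aligned rectangle,
        visiting only the grid cells the rectangle overlaps."""
        kx0, ky0 = self._key(xmin, ymin)
        kx1, ky1 = self._key(xmax, ymax)
        hits = []
        for kx in range(kx0, kx1 + 1):
            for ky in range(ky0, ky1 + 1):
                for (x, y) in self.buckets.get((kx, ky), []):
                    if xmin <= x <= xmax and ymin <= y <= ymax:
                        hits.append((x, y))
        return hits

idx = GridIndex(cell_size=10.0)
for p in [(1, 1), (5, 5), (12, 3), (25, 25)]:
    idx.insert(*p)
print(sorted(idx.range_query(0, 0, 13, 6)))  # [(1, 1), (5, 5), (12, 3)]
```

The pruning step (skipping cells outside the query rectangle) is the essential idea; production systems replace the flat grid with R-trees or other hierarchical indexes.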

228 citations


Journal ArticleDOI
TL;DR: A random distribution of soil eukaryotes with respect to space and environment in the absence of environmental gradients at the local scale is indicated, reflecting the dominant role of drift and homogenizing dispersal.
Abstract: A central challenge in ecology is to understand the relative importance of processes that shape diversity patterns. Compared with aboveground biota, little is known about spatial patterns and processes in soil organisms. Here we examine the spatial structure of communities of small soil eukaryotes to elucidate the underlying stochastic and deterministic processes in the absence of environmental gradients at a local scale. Specifically, we focus on the fine-scale spatial autocorrelation of prominent taxonomic and functional groups of eukaryotic microbes. We collected 123 soil samples in a nested design at distances ranging from 0.01 to 64 m from three boreal forest sites and used 454 pyrosequencing analysis of Internal Transcribed Spacer for detecting Operational Taxonomic Units of major eukaryotic groups simultaneously. Among the main taxonomic groups, we found significant but weak spatial variability only in the communities of Fungi and Rhizaria. Within Fungi, ectomycorrhizas and pathogens exhibited stronger spatial structure compared with saprotrophs and corresponded to vegetation. For the groups with significant spatial structure, autocorrelation occurred at a very fine scale (<2 m). Both dispersal limitation and environmental selection had a weak effect on communities as reflected in negative or null deviation of communities, which was also supported by multivariate analysis, that is, environment, spatial processes and their shared effects explained on average <10% of variance. Taken together, these results indicate a random distribution of soil eukaryotes with respect to space and environment in the absence of environmental gradients at the local scale, reflecting the dominant role of drift and homogenizing dispersal.
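The fine-scale spatial autocorrelation examined in this study can be illustrated with a generic statistic such as Moran's I (a hypothetical NumPy sketch, not necessarily the estimator the authors used):

```python
import numpy as np

def morans_i(values, coords, max_dist):
    """Moran's I with binary neighbour weights (0 < distance <= max_dist)."""
    values = np.asarray(values, dtype=float)
    coords = np.asarray(coords, dtype=float)
    n = len(values)
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    w = ((d > 0) & (d <= max_dist)).astype(float)  # binary spatial weights
    z = values - values.mean()                     # deviations from the mean
    return n * np.sum(w * np.outer(z, z)) / (w.sum() * np.sum(z ** 2))

# Two tight clusters with similar values inside each -> strong autocorrelation.
I = morans_i([1, 2, 9, 10], [(0, 0), (1, 0), (10, 0), (11, 0)], max_dist=2.0)
print(round(I, 3))  # 0.969
```

Values near +1 indicate spatial clustering of similar community compositions; values near 0, as reported for most groups in this study, indicate an essentially random spatial arrangement.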

221 citations



Journal ArticleDOI
TL;DR: Downscaled ten-band 10 m Sentinel-2 datasets represent important and promising products for a wide range of applications in remote sensing and have potential for blending with the upcoming Sentinel-3 data for fine spatio-temporal resolution monitoring at the global scale.

159 citations


Journal ArticleDOI
TL;DR: It is illustrated that the inclusion of the spatial latent factors greatly increases the predictive performance of the modelling approach with a case study of 55 species of butterfly recorded on a 10 km × 10 km grid in Great Britain consisting of 2609 grid cells.
Abstract: 1. Modern species distribution models account for spatial autocorrelation in order to obtain unbiased statistical inference on the effects of covariates, to improve the model's predictive ability through spatial interpolation and to gain insight in the spatial processes shaping the data. Somewhat analogously, hierarchical approaches to community-level data have been developed to gain insights into community-level processes and to improve species-level inference by borrowing information from other species that are either ecologically or phylogenetically related to the focal species. 2. We unify spatial and community-level structures by developing spatially explicit joint species distribution models. The models utilize spatially structured latent factors to model missing covariates as well as species-to-species associations in a statistically and computationally effective manner. 3. We illustrate that the inclusion of the spatial latent factors greatly increases the predictive performance of the modelling approach with a case study of 55 species of butterfly recorded on a 10 km × 10 km grid in Great Britain consisting of 2609 grid cells.

156 citations


Journal ArticleDOI
01 Sep 2016
TL;DR: This work builds two new layers over Spark, namely a query scheduler and a query executor, and embeds an efficient spatial Bloom filter into LocationSpark's indexes to avoid unnecessary network communication overhead when processing overlapped spatial data.
Abstract: We present LocationSpark, a spatial data processing system built on top of Apache Spark, a widely used distributed data processing system. LocationSpark offers a rich set of spatial query operators, e.g., range search, kNN, spatio-textual operation, spatial-join, and kNN-join. To achieve high performance, LocationSpark employs various spatial indexes for in-memory data, and guarantees that immutable spatial indexes have low overhead with fault tolerance. In addition, we build two new layers over Spark, namely a query scheduler and a query executor. The query scheduler is responsible for mitigating skew in spatial queries, while the query executor selects the best plan based on the indexes and the nature of the spatial queries. Furthermore, to avoid unnecessary network communication overhead when processing overlapped spatial data, we embed an efficient spatial Bloom filter into LocationSpark's indexes. Finally, LocationSpark tracks frequently accessed spatial data, and dynamically flushes less frequently accessed data to disk. We evaluate our system on real workloads and demonstrate that it achieves an order of magnitude performance gain over a baseline framework.
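The Bloom filter used to skip needless network calls can be sketched in a few lines (a minimal generic implementation, not LocationSpark's spatial variant):

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: false positives possible, no false negatives."""
    def __init__(self, n_bits=1024, n_hashes=3):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = 0  # bit array packed into one integer

    def _positions(self, item):
        for i in range(self.n_hashes):
            h = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(h, 16) % self.n_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        return all(self.bits >> pos & 1 for pos in self._positions(item))

bf = BloomFilter()
bf.add("partition-7")  # e.g. record that a partition holds some spatial region
print(bf.might_contain("partition-7"))  # True
```

A query only contacts a remote partition when `might_contain` returns True, so definite misses never incur network traffic, at the cost of occasional false positives.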

150 citations


Journal ArticleDOI
TL;DR: A standard canon of SPPA techniques in ecology has largely been identified, and most of the earlier technical issues that occupied ecologists, such as edge correction, have been solved; however, the majority of studies underuse the methodological potential offered by modern SPPA.
Abstract: Over the last two decades spatial point pattern analysis (SPPA) has become increasingly popular in ecological research. To direct future work in this area we review studies using SPPA techniques in ecology and related disciplines. We first summarize the key elements of SPPA in ecology (i.e. data types, summary statistics and their estimation, null models, comparison of data and models, and consideration of heterogeneity); second, we review how ecologists have used these key elements; and finally, we identify practical difficulties that are still commonly encountered and point to new methods that allow current key questions in ecology to be effectively addressed. Our review of 308 articles published over the period 1992–2012 reveals that a standard canon of SPPA techniques in ecology has been largely identified and that most of the earlier technical issues that occupied ecologists, such as edge correction, have been solved. However, the majority of studies underused the methodological potential offered by modern SPPA. More advanced techniques of SPPA offer the potential to address a variety of highly relevant ecological questions. For example, inhomogeneous summary statistics can quantify the impact of heterogeneous environments, mark correlation functions can include trait and phylogenetic information in the analysis of multivariate spatial patterns, and more refined point process models can be used to realistically characterize the structure of a wide range of patterns. Additionally, recent advances in fitting spatially-explicit simulation models of community dynamics to point pattern summary statistics hold the promise for solving the longstanding problem of linking pattern to process. 
All these newer developments allow ecologists to keep up with the increasing availability of spatial data sets provided by newer technologies, which allow point patterns and environmental variables to be mapped over large spatial extents at increasingly higher image resolutions.
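Among the summary statistics this review covers, Ripley's K is the canonical example; a naive estimator (no edge correction, for illustration only) looks like this:

```python
import numpy as np

def ripley_k(points, r, area):
    """Naive Ripley's K estimate at radius r (no edge correction)."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
    pairs = np.sum((d > 0) & (d <= r))  # ordered pairs within distance r
    lam = n / area                      # intensity (points per unit area)
    return pairs / (lam * n)

pts = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5)]
k = ripley_k(pts, r=0.8, area=1.0)
print(k)  # 0.32
```

Under complete spatial randomness K(r) ≈ πr², so comparing the estimate against that baseline reveals clustering (larger) or regularity (smaller); the edge-correction issue the review mentions arises because points near the study-area boundary have truncated neighbourhoods.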

140 citations


Journal ArticleDOI
TL;DR: Citizen science datasets, which rely on untrained amateurs, are more heavily prone to spatial biases from infrastructure and human population density, and objectives and protocols of mass-participating projects should be designed with this in mind.
Abstract: Aim To understand how the integration of contextual spatial data on land cover and human infrastructure can help reduce spatial bias in sampling effort, and improve the utilization of citizen science-based species recording schemes. By comparing four different citizen science projects, we explore how the sampling design's complexity affects the role of these spatial biases. Location Denmark, Europe. Methods We used a point process model to estimate the effect of land cover and human infrastructure on the intensity of observations from four different citizen science species recording schemes. We then use these results to predict areas of under- and oversampling as well as relative biodiversity ‘hotspots’ and ‘deserts’, accounting for common spatial biases introduced in unstructured sampling designs. Results We demonstrate that the explanatory power of spatial biases such as infrastructure and human population density increased as the complexity of the sampling schemes decreased. Despite a low absolute sampling effort in agricultural landscapes, these areas still appeared oversampled compared to the observed species richness. Conversely, forests and grassland appeared undersampled despite higher absolute sampling efforts. We also present a novel and effective analytical approach to address spatial biases in unstructured sampling schemes and a new way to address such biases, when more structured sampling is not an option. Main conclusions We show that citizen science datasets, which rely on untrained amateurs, are more heavily prone to spatial biases from infrastructure and human population density. Objectives and protocols of mass-participating projects should thus be designed with this in mind. Our results suggest that, where contextual data is available, modelling the intensity of individual observation can help understand and quantify how spatial biases affect the observed biological patterns.

Journal ArticleDOI
TL;DR: A tensor based method considering the full spatial–temporal information of traffic flow, is proposed to fuse the traffic flow data from multiple detecting locations and achieves a better imputation performance than the method without spatial information.
Abstract: Missing and suspicious traffic data is a major problem for intelligent transportation systems, adversely affecting a diverse variety of transportation applications. Several missing traffic data imputation methods have been proposed in the last decade. It remains an open problem how to make full use of spatial information from upstream/downstream detectors to improve imputation performance. In this paper, a tensor-based method considering the full spatial–temporal information of traffic flow is proposed to fuse the traffic flow data from multiple detecting locations. The traffic flow data is reconstructed in a 4-way tensor pattern, and the low-n-rank tensor completion algorithm is applied to impute missing data. This novel approach not only fully utilizes the spatial information from neighboring locations, but also can impute missing data in different locations under a unified framework. Experiments demonstrate that the proposed method achieves a better imputation performance than the method without spatial information. The experimental results show that the proposed method can address the extreme case where the data of a long period of one or several weeks are completely missing.
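The idea behind low-rank completion can be shown on a simplified 2-way (matrix) analogue of the paper's 4-way tensor method: iteratively replace missing entries with a low-rank SVD reconstruction (a hypothetical sketch, not the low-n-rank algorithm itself):

```python
import numpy as np

def hard_impute(X, mask, rank=1, n_iter=100):
    """Fill entries where mask is False with a rank-`rank` SVD approximation,
    keeping the observed entries fixed at every iteration."""
    filled = np.where(mask, X, 0.0)
    for _ in range(n_iter):
        u, s, vt = np.linalg.svd(filled, full_matrices=False)
        low_rank = (u[:, :rank] * s[:rank]) @ vt[:rank]
        filled = np.where(mask, X, low_rank)  # re-impose observed data
    return filled

# Rank-1 "locations x time" traffic matrix with one missing reading.
X = np.outer([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
mask = np.ones_like(X, dtype=bool)
mask[1, 2] = False  # pretend this reading is missing (true value is 12)
est = hard_impute(X, mask)
print(round(float(est[1, 2]), 2))  # close to 12.0
```

Because nearby detectors produce correlated flows, the data matrix (or tensor) is approximately low rank, which is exactly what lets the missing entry be recovered from the remaining observations.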

Journal ArticleDOI
TL;DR: The spatial configuration of high-level hospitals in Shenzhen is not well balanced, and further optimization is urgently needed.

Journal ArticleDOI
TL;DR: These findings show how patterns of geographic and environmental space use correspond to two sides of the same coin, linked by the movement responses of individuals to environmental heterogeneity, and set the basis for new theoretical and methodological advances in movement ecology.
Abstract: Summary 1. Animal space use has been studied by focusing either on geographic (e.g. home ranges, species’ distribution) or on environmental (e.g. habitat use and selection) space. However, all patterns of space use emerge from individual movements, which are the primary means by which animals change their environment. 2. Individuals increase their use of a given area by adjusting two key movement components: the duration of their visit and/or the frequency of revisits. Thus, in spatially heterogeneous environments, animals exploit known, high-quality resource areas by increasing their residence time (RT) in and/or decreasing their time to return (TtoR) to these areas. We expected that spatial variation in these two movement properties should lead to observed patterns of space use in both geographic and environmental spaces. We derived a set of nine predictions linking spatial distribution of movement properties to emerging space-use patterns. We predicted that, at a given scale, high variation in RT and TtoR among habitats leads to strong habitat selection and that long RT and short TtoR result in a small home range size. 3. We tested these predictions using moose (Alces alces) GPS tracking data. We first modelled the relationship between landscape characteristics and movement properties. Then, we investigated how the spatial distribution of predicted movement properties (i.e. spatial autocorrelation, mean, and variance of RT and TtoR) influences home range size and hierarchical habitat selection. 4. In landscapes with high spatial autocorrelation of RT and TtoR, a high variation in both RT and TtoR occurred in home ranges. As expected, home range location was highly selective in such landscapes (i.e. second-order habitat selection); RT was higher and TtoR lower within the selected home range than outside, and moose home ranges were small. 
Within home ranges, a higher variation in both RT and TtoR was associated with higher selectivity among habitat types (i.e. third-order habitat selection). 5. Our findings show how patterns of geographic and environmental space use correspond to the two sides of a coin, linked by movement responses of individuals to environmental heterogeneity. By demonstrating the potential to assess the consequences of altering RT or TtoR (e.g. through human disturbance or climatic changes) on home range size and habitat selection, our work sets the basis for new theoretical and methodological advances in movement ecology.

Journal ArticleDOI
TL;DR: A linear conditional Gaussian (LCG) Bayesian network (BN) model is utilized to consider both spatial and temporal dimensions of traffic as well as speed information for short-term traffic flow prediction, indicating that the prediction accuracy will increase significantly when both spatial data and speed data are included.
Abstract: Summary Traffic flow prediction is an essential part of intelligent transportation systems (ITS). Most of the previous traffic flow prediction work treated traffic flow as a time series process only, ignoring the spatial relationship from the upstream flows or the correlation with other traffic attributes like speed and density. In this paper, we utilize a linear conditional Gaussian (LCG) Bayesian network (BN) model to consider both spatial and temporal dimensions of traffic as well as speed information for short-term traffic flow prediction. The LCG BN allows both continuous and discrete variables, which enables the consideration of categorical variables in traffic flow prediction. A microscopic traffic simulation dataset is used to test the performance of the proposed model compared to other popular approaches under different predicting time intervals. In addition, the authors investigate the importance of spatial data and speed data in flow prediction by comparing models with different levels of information. The results indicate that the prediction accuracy will increase significantly when both spatial data and speed data are included. Copyright © 2016 John Wiley & Sons, Ltd.
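In a linear conditional Gaussian model, the conditional mean of a continuous node is linear in its continuous parents, so the regression part can be illustrated with ordinary least squares (a toy sketch with made-up coefficients, not the paper's network or data):

```python
import numpy as np

# Toy data: downstream flow as a noise-free linear function of upstream flow
# and speed, so the fitted conditional mean recovers the coefficients exactly.
upstream = np.array([100., 120., 90., 110., 130.])
speed    = np.array([60., 55., 65., 58., 50.])
flow     = 0.8 * upstream - 0.5 * speed + 10.0

# E[flow | upstream, speed] = b1*upstream + b2*speed + b0 in an LCG model;
# estimate (b1, b2, b0) by least squares.
X = np.column_stack([upstream, speed, np.ones_like(upstream)])
coef, *_ = np.linalg.lstsq(X, flow, rcond=None)
print(np.round(coef, 3))  # ≈ [0.8, -0.5, 10.0]
```

The full LCG BN additionally allows discrete parents (e.g. time-of-day categories), giving a separate linear-Gaussian regime per discrete configuration.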

Journal ArticleDOI
TL;DR: Data mining, statistical analysis, and semantic analysis methods are explored to obtain valuable information on public opinion and requirements based on Chinese microblog data to enhance situation awareness and help the government offer more effective assistance.
Abstract: With the advances of information communication technologies, it is critical to improve the efficiency and accuracy of emergency management systems through modern data processing techniques. Geographic information system (GIS) models and simulation capabilities are used to exercise response and recovery plans during non-disaster times. They help the decision-makers understand near real-time possibilities during an event. In this paper, a participatory sensing-based model for mining spatial information of urban emergency events is introduced. Firstly, basic definitions of the proposed method are given. Secondly, positive samples are selected to mine the spatial information of urban emergency events. Thirdly, location and GIS information are extracted from positive samples. At last, the real spatial information is determined based on address and GIS information. Moreover, this study explores data mining, statistical analysis, and semantic analysis methods to obtain valuable information on public opinion and requirements based on Chinese microblog data. Typhoon Chan-hom is used as an example. Semantic analysis on microblog data is conducted and high-frequency keywords in different provinces are extracted for different stages of the event. With the geo-tagged and time-tagged data, the collected microblog data can be classified into different categories. Correspondingly, public opinion and requirements can be obtained from the spatial and temporal perspectives to enhance situation awareness and help the government offer more effective assistance.

BookDOI
06 Oct 2016
TL;DR: This is a revised and updated second edition, including new chapters on temporal and point uncertainty models, as well as on sampling and deterministic modeling, and several alternatives to the Kriging methodology are presented.
Abstract: This is a revised and updated second edition, including new chapters on temporal and point uncertainty models, as well as on sampling and deterministic modeling. It is a comprehensive presentation of spatial modeling techniques used in the earth sciences, outlining original techniques developed by the author. Data collection in the earth sciences is difficult and expensive, but simple, rational and logical approaches help the reader to appreciate the fundamentals of advanced methodologies. Gathering accurate geological, hydrogeological, meteorological and hydrological information, together with risk assessments, requires special care. Spatial simulation methodologies in the earth sciences are essential, then, if we want to understand the variability in features such as fracture frequencies, rock quality, and grain size distribution in rock and porous media. This book outlines in a detailed yet accessible way the main spatial modeling techniques, in particular the Kriging methodology. It also presents many unique physical approaches, field cases, and sample interpretations. Since Kriging's origin in the 1960s it has been developed into a number of new methods such as cumulative SV (CSV), point CSV (PCSV), and spatial dependence function, which have been applied in different aspects of the earth sciences. Each one of these techniques is explained in this book, as well as how they are used to model earth science phenomena such as geology, earthquakes, meteorology, and hydrology. In addition to Kriging and its variants, several alternatives to the Kriging methodology are presented and the necessary steps in their applications are clearly explained. Simple spatial variation prediction methodologies are also revised with up-to-date literature, and the ways in which they relate to more advanced spatial modeling methodologies are explained.
The book is a valuable resource for students, researchers and professionals of a broad range of disciplines including geology, geography, hydrology, meteorology, environment, image processing, spatial modeling and related topics. Keywords: data mining, geostatistics, Kriging, regional uncertainty, spatial dependence, spatial modeling, geographic data, geoscience, hydrology, image processing.
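The empirical semivariogram, the starting point of the Kriging workflow this book covers, can be computed in a few lines (a generic NumPy sketch, not code from the book):

```python
import numpy as np

def semivariogram(coords, values, bins):
    """Empirical semivariogram: gamma(h) = mean of 0.5*(z_i - z_j)^2 over
    point pairs whose separation distance falls in each bin [lo, hi)."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    iu = np.triu_indices(len(values), k=1)  # count each pair once
    dists = d[iu]
    sqdiff = 0.5 * (values[iu[0]] - values[iu[1]]) ** 2
    gamma = []
    for lo, hi in zip(bins[:-1], bins[1:]):
        sel = (dists >= lo) & (dists < hi)
        gamma.append(sqdiff[sel].mean() if sel.any() else np.nan)
    return np.array(gamma)

coords = [[0, 0], [1, 0], [2, 0], [3, 0]]
gamma = semivariogram(coords, [0.0, 1.0, 2.0, 3.0], [0.5, 1.5, 2.5, 3.5])
print(gamma)  # [0.5 2.  4.5]
```

Fitting a model (spherical, exponential, Gaussian) to this curve yields the spatial dependence structure that Kriging then uses to weight neighbouring observations.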

Proceedings ArticleDOI
14 Mar 2016
TL;DR: This study introduces hyperlocal spatial crowdsourcing, where all workers who are located within the spatiotemporal vicinity of a task are eligible to perform the task, e.g., reporting the precipitation level at their area and time.
Abstract: Spatial Crowdsourcing (SC) is a novel platform that engages individuals in the act of collecting various types of spatial data. This method of data collection can significantly reduce cost and turnover time, and is particularly useful in environmental sensing, where traditional means fail to provide fine-grained field data. In this study, we introduce hyperlocal spatial crowdsourcing, where all workers who are located within the spatiotemporal vicinity of a task are eligible to perform the task, e.g., reporting the precipitation level at their area and time. In this setting, there is often a budget constraint, either for every time period or for the entire campaign, on the number of workers to activate to perform tasks. The challenge is thus to maximize the number of assigned tasks under the budget constraint, despite the dynamic arrivals of workers and tasks as well as their co-location relationship. We study two problem variants in this paper: budget is constrained for every timestamp, i.e. fixed, and budget is constrained for the entire campaign, i.e. dynamic. For each variant, we study the complexity of its offline version and then propose several heuristics for the online version which exploit the spatial and temporal knowledge acquired over time. Extensive experiments with real-world and synthetic datasets show the effectiveness and efficiency of our proposed solutions.
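A simple online baseline for the task-assignment problem described above can be sketched as follows (a hypothetical greedy heuristic for illustration, not one of the paper's proposed algorithms):

```python
def assign_tasks(workers, tasks, radius, budget):
    """Greedily activate each arriving worker for the nearest unassigned
    task within `radius` (Manhattan distance), until the budget runs out."""
    assigned = {}                          # worker index -> task index
    remaining = set(range(len(tasks)))
    for wi, (wx, wy) in enumerate(workers):
        if budget == 0:
            break
        candidates = [(abs(wx - tasks[ti][0]) + abs(wy - tasks[ti][1]), ti)
                      for ti in remaining]
        candidates = [(dist, ti) for dist, ti in candidates if dist <= radius]
        if candidates:
            _, ti = min(candidates)        # nearest eligible task
            assigned[wi] = ti
            remaining.discard(ti)
            budget -= 1                    # activating a worker costs budget
    return assigned

print(assign_tasks(workers=[(0, 0), (5, 5), (9, 9)],
                   tasks=[(1, 0), (5, 6), (9, 8)],
                   radius=2, budget=2))  # {0: 0, 1: 1}
```

The difficulty the paper addresses is precisely that such greedy choices can be suboptimal when future worker and task arrivals are unknown, which motivates heuristics that learn the spatiotemporal arrival patterns.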

Journal ArticleDOI
TL;DR: This article presents a methodology to acquire and process high-resolution data for coastal zones using a vertical take-off and landing (VTOL) unmanned aerial vehicle (UAV) with a small commercial camera attached, and shows that the presented methodology is a robust tool for the classification, 3D visualization, and mapping of coastal morphology.
Abstract: Spatial data acquisition is a critical process for the identification of the coastline and coastal zones for scientists involved in the study of coastal morphology. The availability of very high-resolution digital surface models (DSMs) and orthophoto maps is of increasing interest to all scientists, especially those monitoring small variations in the earth’s surface, such as coastline morphology. In this article, we present a methodology to acquire and process high-resolution data for coastal zones, acquired by a vertical take-off and landing (VTOL) unmanned aerial vehicle (UAV) with a small commercial camera attached. The proposed methodology integrated computer vision algorithms for 3D representation with image processing techniques for analysis. The computer vision algorithms used the structure from motion (SfM) approach while the image processing techniques used geographic object-based image analysis (GEOBIA) with fuzzy classification. The SfM pipeline was used to construct the DSMs and orthophotos with a measurement precision in the order of centimeters. Consequently, GEOBIA was used to create objects by grouping pixels that had the same spectral characteristics together and extracting statistical features from them. The objects produced were classified by fuzzy classification using the statistical features as input. The classification output classes included beach composition (sand, rubble, and rocks) and sub-surface classes (seagrass, sand, algae, and rocks). The methodology was applied to two case studies of coastal areas with different compositions: a sandy beach with a large face and a rubble beach with a small face. Both are threatened by beach erosion and have been degraded by the action of sea storms. Results show that the coastline, which is the low limit of the swash zone, was detected successfully by both the 3D representations and the image classifications.
Furthermore, several traces representing previous sea states were successfully recognized in the case of the sandy beach, while the erosion and beach crests were detected in the case of the rubble beach. The achieved level of detail of the 3D representations revealed new beach characteristics, including erosion crests, berm zones, and sand dunes. In conclusion, the UAV SfM workflow provides information in a spatial resolution that permits the study of coastal changes with confidence and provides accurate 3D visualizations of the beach zones, even for areas with complex topography. The overall results show that the presented methodology is a robust tool for the classification, 3D visualization, and mapping of coastal morphology.

Journal ArticleDOI
TL;DR: Twenty years of modeling work spanning multivariate spatial analysis, gradient analysis, Bayesian nonparametric spatial ideas, directional data, extremes, data fusion, and large spatial and spatio-temporal datasets are reviewed.
Abstract: Spatial analysis has grown at a remarkable rate over the past two decades. Fueled by sophisticated GIS software and inexpensive and fast computation, collection of data with spatially referenced information has increased. Recognizing that such information can improve data analysis has led to an explosion of modeling and model fitting. The contribution of this paper is to illustrate how Gaussian processes have emerged as, arguably, the most valuable tool in the toolkit for geostatistical modeling. Apart from the simplest versions, geostatistical modeling can be viewed as a hierarchical specification with Gaussian processes introduced appropriately at different levels of the specification. This naturally leads to adopting a Bayesian framework for inference and suitable Gibbs sampling/Markov chain Monte Carlo for model fitting. Here, we review twenty years of modeling work spanning multivariate spatial analysis, gradient analysis, Bayesian nonparametric spatial ideas, directional data, extremes, data fusion, and large spatial and spatio-temporal datasets. We demonstrate that Gaussian processes are the key ingredients in all of this work. Most of the content is focused on modeling with examples being limited due to length constraints for the article. Altogether, we are able to conclude that spatial statistics and Gaussian processes do, indeed, make a beautiful marriage.
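The central object of this review, the Gaussian process, gives a closed-form predictive mean; a minimal sketch with a squared-exponential kernel (illustrative hyperparameters, not from the paper):

```python
import numpy as np

def gp_predict(X_train, y_train, X_test, length=1.0, noise=1e-6):
    """Gaussian-process predictive mean with a squared-exponential kernel:
    mean = K(test, train) @ [K(train, train) + noise*I]^-1 @ y."""
    def k(a, b):
        d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=2)
        return np.exp(-0.5 * d2 / length ** 2)
    K = k(X_train, X_train) + noise * np.eye(len(X_train))
    return k(X_test, X_train) @ np.linalg.solve(K, y_train)

X = np.array([[0.0], [1.0], [2.0]])   # 1D spatial locations
y = np.array([1.0, 2.0, 0.5])         # observed values
print(np.round(gp_predict(X, y, X), 3))  # ≈ [1.0, 2.0, 0.5] (interpolates)
```

With negligible noise this reduces to simple kriging; in the hierarchical models the review surveys, the same GP machinery appears at latent levels of the specification rather than directly on the data.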

Journal ArticleDOI
TL;DR: This paper designs a symmetric-key searchable encryption scheme that can support geometric range queries on encrypted spatial data, and formally defines and proves the security of the scheme with indistinguishability under selective chosen-plaintext attacks.
Abstract: Geometric range search is a fundamental primitive for spatial data analysis in SQL and NoSQL databases. It has extensive applications in location-based services, computer-aided design, and computational geometry. Due to the dramatic increase in data size, it is necessary for companies and organizations to outsource their spatial data sets to third-party cloud services (e.g., Amazon) in order to reduce storage and query processing costs, but, meanwhile, with the promise of no privacy leakage to the third party. Searchable encryption is a technique to perform meaningful queries on encrypted data without revealing privacy. However, geometric range search on spatial data has not been fully investigated nor supported by existing searchable encryption schemes. In this paper, we design a symmetric-key searchable encryption scheme that can support geometric range queries on encrypted spatial data. One of our major contributions is that our design is a general approach, which can support different types of geometric range queries. In other words, our design on encrypted data is independent from the shapes of geometric range queries. Moreover, we further extend our scheme with the additional use of tree structures to achieve search complexity that is faster than linear. We formally define and prove the security of our scheme with indistinguishability under selective chosen-plaintext attacks, and demonstrate the performance of our scheme with experiments in a real cloud platform (Amazon EC2).
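The flavour of searching encrypted spatial data can be conveyed with a toy keyed-token scheme over grid cells (purely illustrative: this is NOT the paper's construction, supports only cell-granularity matching, and leaks far more than a scheme with a security proof):

```python
import hashlib
import hmac

KEY = b"client-secret-key"  # hypothetical key known only to the data owner

def cell_token(x, y, cell=1.0):
    """Pseudorandom token for the grid cell containing (x, y)."""
    cid = f"{int(x // cell)}:{int(y // cell)}"
    return hmac.new(KEY, cid.encode(), hashlib.sha256).hexdigest()

# The owner uploads tokens instead of coordinates; the server can match a
# queried cell against stored tokens without learning any raw locations.
stored = {cell_token(3.2, 4.7), cell_token(8.1, 2.5)}
print(cell_token(3.9, 4.1) in stored)  # True  (same cell as (3.2, 4.7))
print(cell_token(5.0, 5.0) in stored)  # False (different cell)
```

The paper's contribution goes well beyond this: its scheme handles arbitrary geometric range shapes on encrypted data, achieves sublinear search via tree structures, and is proven secure under selective chosen-plaintext attacks.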

Journal ArticleDOI
TL;DR: In this article, quantile regression forests (an elaboration of random forests) are used to investigate the potential of high resolution auxiliary information alone to support the generation of accurate and interpretable geochemical maps.

Journal ArticleDOI
TL;DR: The proposed approach is able to cover different analysis scenarios by means of a fully adaptive processing chain (based on three steps) for hyperspectral image classification, exhibiting good classification performance even when the number of training samples available a priori is very limited.
Abstract: This paper presents a new approach for accurate spatial–spectral classification of hyperspectral images, which consists of three main steps. First, a pixelwise classifier, i.e., the probabilistic-kernel collaborative representation classification (PKCRC), is proposed to obtain a set of classification probability maps using the spectral information contained in the original data. This is achieved by means of a kernel extension of collaborative representation (CR) classification. Then, an adaptive weighted graph (AWG)-based postprocessing model is utilized to include the spatial information by refining the obtained pixelwise probability maps. Furthermore, to deal with scenarios dominated by limited training samples, we modify the postprocessing model by fixing the probabilistic outputs of training samples to integrate the spatial and label information. The proposed approach is able to cover different analysis scenarios by means of a fully adaptive processing chain (based on these three steps) for hyperspectral image classification. All the techniques that make up the proposed approach have closed-form analytic solutions and are easy to implement and compute, exhibiting potential benefits for hyperspectral image classification under different conditions. Specifically, the proposed method is experimentally evaluated using two real hyperspectral imagery data sets, exhibiting good classification performance even when the number of training samples available a priori is very limited.
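The pixelwise step builds on collaborative representation (CR) classification. As a hedged sketch of the plain CR rule only (without the paper's probabilistic-kernel extension or graph postprocessing), each pixel spectrum is represented over every class's training dictionary with a ridge penalty, and the class with the smallest reconstruction residual wins; all data below are hypothetical toy values.

```python
import numpy as np

def cr_classify(class_dicts, y, lam=0.01):
    """Plain collaborative-representation rule: ridge-represent pixel y over
    each class dictionary and pick the class with the smallest residual."""
    residuals = []
    for X in class_dicts:  # X: (n_bands, n_class_samples)
        alpha = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)
        residuals.append(np.linalg.norm(y - X @ alpha))
    return int(np.argmin(residuals))

# hypothetical two-band, two-class toy spectra (columns are training samples)
X0 = np.array([[1.0, 0.9], [0.0, 0.1]])
X1 = np.array([[0.0, 0.1], [1.0, 0.9]])
label = cr_classify([X0, X1], np.array([0.95, 0.05]))
```

The ridge penalty is what discriminates: the wrong class can only reconstruct the pixel with large, heavily penalized coefficients, so its residual stays large.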

Journal ArticleDOI
TL;DR: Experimental results show that the proposed improved FCM method is effective and outperforms existing methods, including other FCM variants, for segmentation of IR ship images.
Abstract: Segmentation of infrared (IR) ship images is always a challenging task because of intensity inhomogeneity and noise. Fuzzy C-means (FCM) clustering is a classical method widely used in image segmentation. However, it has some shortcomings, such as not considering spatial information and being sensitive to noise. In this paper, an improved FCM method based on spatial information is proposed for IR ship target segmentation. The improvements include two parts: 1) adding nonlocal spatial information based on the ship target and 2) using the spatial shape information of the contour of the ship target to refine the local spatial constraint via a Markov random field. In addition, the results of K-means clustering are used to initialize the improved FCM method. Experimental results show that the improved method is effective and performs better than existing methods, including other FCM variants, for segmentation of IR ship images.
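For reference, the sketch below implements plain fuzzy C-means on 1-D pixel intensities, the baseline that the paper augments with nonlocal and Markov-random-field spatial terms. The toy data and parameters are illustrative.

```python
import numpy as np

def fcm(data, n_clusters, m=2.0, n_iter=100, seed=0):
    """Plain fuzzy C-means on a 1-D intensity vector: alternate between
    membership-weighted center updates and membership updates."""
    rng = np.random.default_rng(seed)
    u = rng.random((n_clusters, data.size))
    u /= u.sum(axis=0)                        # memberships sum to 1 per pixel
    for _ in range(n_iter):
        w = u ** m                            # fuzzified memberships
        centers = (w @ data) / w.sum(axis=1)  # weighted cluster means
        dist = np.abs(data[None, :] - centers[:, None]) + 1e-12
        u = dist ** (-2.0 / (m - 1))          # standard FCM membership update
        u /= u.sum(axis=0)
    return centers, u

pixels = np.array([0.0, 0.1, 0.05, 0.9, 1.0, 0.95])
centers, u = fcm(pixels, 2)
labels = u.argmax(axis=0)                     # hard labels from fuzzy memberships
```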

Journal ArticleDOI
TL;DR: The results show that combining multispectral and SAR data improves the overall performance of several classifiers, with random forest (RF) performing the best overall.
Abstract: There is an urgent need for more detailed spatial information on cities globally that has been acquired using a standard method to facilitate comparison and the transfer of scientific and practical knowledge between places. As part of the World Urban Database and Access Portal Tools (WUDAPT) initiative, a simple workflow has been developed to perform this task. Using freely available satellite imagery (Landsat) and software (SAGA), WUDAPT characterizes settlements using the local climate zone (LCZ) scheme, which decomposes the city into distinctive neighborhoods (>1 km²) based on typical properties (e.g., green proportion and built fraction). In this paper, the methodology is extended to examine the effect of adding synthetic aperture radar (SAR) data, which is now freely available from Sentinel 1, for generating LCZs. Using the city of Khartoum as a case study, the results show that combining multispectral and SAR data improves the overall performance of several classifiers, with random forest (RF) performing the best overall.

Journal ArticleDOI
TL;DR: In this paper, the authors apply a spatial regression model to explain the amount of land use change as a function of land-use factors in the cellular automata model, and the results are further compared with that of an OLS-based CA model.

Journal ArticleDOI
TL;DR: The driving mechanism of urban forest land surface temperature (LST) was revealed through a combination of multi-source spatial data and spatial statistical analysis of clustering regions, and the main contributing factors were dominant tree species and elevation.

Journal ArticleDOI
TL;DR: A simulation experiment shows that, with a well-prepared candidate eigenvector set, ESF can effectively account for spatial autocorrelation and achieve computational efficiency, and a nonlinear equation is proposed for constructing an ideal candidate eigenvector set based on the results of the simulation experiment.
Abstract: Because eigenvector spatial filtering (ESF) provides a relatively simple and successful method to account for spatial autocorrelation in regression, increasingly it has been adopted in various fields. Although ESF can be easily implemented with a stepwise procedure, such as traditional stepwise regression, its computational efficiency can be further improved. Two major computational components in ESF are extracting eigenvectors and identifying a subset of these eigenvectors. This paper focuses on how a subset of eigenvectors can be efficiently and effectively identified. A simulation experiment summarized in this paper shows that, with a well-prepared candidate eigenvector set, ESF can effectively account for spatial autocorrelation and achieve computational efficiency. This paper further proposes a nonlinear equation for constructing an ideal candidate eigenvector set based on the results of the simulation experiment.
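The first computational component, extracting the eigenvectors, can be sketched as an eigendecomposition of the doubly centered spatial weights matrix MWM, where M = I - (1/n)11'; eigenvectors with large positive eigenvalues represent strongly positively autocorrelated map patterns and form the candidate set. The toy path-graph weights below are hypothetical.

```python
import numpy as np

def moran_eigenvectors(W):
    """Eigenvectors of MWM with M = I - 11'/n, ordered from the most
    positively spatially autocorrelated pattern downward; these are the
    candidate spatial filters in ESF."""
    n = W.shape[0]
    Ws = (W + W.T) / 2.0                   # symmetrize the weights
    M = np.eye(n) - np.ones((n, n)) / n    # mean-centering projector
    vals, vecs = np.linalg.eigh(M @ Ws @ M)
    order = np.argsort(vals)[::-1]         # descending eigenvalues
    return vals[order], vecs[:, order]

# binary adjacency for a hypothetical 4-node path graph
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
vals, vecs = moran_eigenvectors(W)
```

Eigenvectors with nonzero eigenvalues sum to zero (they are orthogonal to the constant vector), which is what lets them enter a regression as mean-zero spatial pattern covariates.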

Journal ArticleDOI
TL;DR: The general evolution of the GIS architecture is presented, including the two main parallel GIS architectures based on high-performance computing clusters and Hadoop clusters, and the current spatial data partition strategies, key methods for realizing parallel GIS through data decomposition, and progress on specialized parallel GIS algorithms are summarized.
Abstract: With the increasing interest in large-scale, high-resolution, and real-time geographic information system (GIS) applications and spatial big data processing, traditional GIS is not efficient enough to handle the required loads due to limited computational capabilities. Various attempts have been made to adopt high-performance computation techniques from different applications, such as designs of advanced architectures, strategies of data partition, and direct parallelization of spatial analysis algorithms, to address such challenges. This paper surveys the current state of parallel GIS with respect to parallel GIS architectures, parallel processing strategies, and relevant topics. We present the general evolution of the GIS architecture, which includes two main parallel GIS architectures based on high-performance computing clusters and Hadoop clusters. We then summarize current spatial data partition strategies, key methods for realizing parallel GIS from the perspective of data decomposition, and progress on specialized parallel GIS algorithms. We use the parallel processing of GRASS as a case study. We also identify key problems and potential future research directions for parallel GIS.
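As a toy illustration of the data-decomposition strategy the survey discusses, the sketch below splits a raster into horizontal strips and processes them concurrently before gathering the results. Threads are used only to keep the example self-contained; cluster-scale parallel GIS would use processes, MPI ranks, or Hadoop/MapReduce tasks, and the function names here are hypothetical.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def tile_statistic(tile):
    """Stand-in for a per-tile spatial operation (here just a local mean)."""
    return float(tile.mean())

def process_by_rows(raster, n_parts=2):
    """Row-wise domain decomposition: split the raster into horizontal
    strips, process each strip concurrently, then gather the results."""
    strips = np.array_split(raster, n_parts, axis=0)
    with ThreadPoolExecutor() as pool:
        return list(pool.map(tile_statistic, strips))

raster = np.arange(16, dtype=float).reshape(4, 4)
means = process_by_rows(raster, n_parts=2)
```

Row-wise striping works for pointwise and strip-local operations; neighborhood operations (e.g., focal statistics) additionally need halo rows exchanged between partitions, which is one of the key partitioning issues the survey covers.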

Journal ArticleDOI
TL;DR: A novel method using independent component analysis (ICA) and edge-preserving filtering (EPF) via an ensemble strategy for the classification of hyperspectral data using several subsets randomly selected from the original feature space to produce spatial features.
Abstract: To obtain accurate classification results for hyperspectral images, both spectral and spatial information should be fully exploited in the classification process. In this paper, we propose a novel method using independent component analysis (ICA) and edge-preserving filtering (EPF) via an ensemble strategy for the classification of hyperspectral data. First, several subsets are randomly selected from the original feature space. Second, ICA is used to extract spectrally independent components, which are then processed with an effective EPF method to produce spatial features. Two strategies (i.e., parallel and concatenated) are presented to include the spatial features in the analysis. The spectral–spatial features are then classified with a random forest or a rotation forest classifier. Experimental results on two real hyperspectral data sets demonstrate the effectiveness of the proposed methods. A sensitivity analysis of the new classifiers is also performed.
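The abstract does not commit to one particular edge-preserving filter, but a bilateral filter is a common choice and conveys the idea: neighbors are averaged with weights that decay with both spatial distance and intensity difference, so smoothing halts at strong edges. The naive implementation below is a hedged sketch on a toy step-edge image.

```python
import numpy as np

def bilateral_filter(img, sigma_s=1.0, sigma_r=0.1, radius=2):
    """Naive bilateral filter: each output pixel is a mean of its neighbors
    weighted by spatial closeness AND intensity similarity, so averaging
    across a strong edge is suppressed."""
    h, w = img.shape
    out = np.empty_like(img)
    for i in range(h):
        for j in range(w):
            i0, i1 = max(0, i - radius), min(h, i + radius + 1)
            j0, j1 = max(0, j - radius), min(w, j + radius + 1)
            patch = img[i0:i1, j0:j1]
            yy, xx = np.mgrid[i0:i1, j0:j1]
            w_sp = np.exp(-((yy - i) ** 2 + (xx - j) ** 2) / (2 * sigma_s ** 2))
            w_rg = np.exp(-(patch - img[i, j]) ** 2 / (2 * sigma_r ** 2))
            k = w_sp * w_rg
            out[i, j] = (k * patch).sum() / k.sum()
    return out

step = np.zeros((4, 8))
step[:, 4:] = 1.0                      # sharp vertical edge
smoothed = bilateral_filter(step)
```

With a unit-height edge and sigma_r=0.1, cross-edge range weights are about exp(-50), so the edge survives filtering essentially intact while flat regions would still be smoothed.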